• 1 Post
  • 99 Comments
Joined 9 months ago
Cake day: June 9th, 2024

  • Universality, basically: almost everyone, everywhere has an email account or can get one for free, and every OS and every device has a giant pile of mail clients for you to choose from.

    And I mean, email is a simple, well-understood, and reliable tech stack: I host an internal mail server for notifications and updates and shit, and it’s fast and works perfectly.

    It’s only when you suddenly need to email someone OTHER than your local shit that it turns to complete shit.



  • Debian stable is great: it’s, well, stable. It’s well supported, has an extremely long support window, and the distro has a pretty stellar track record of not doing anything stupid.

    It’s very much in the install-once-and-forget-it category, just gotta do updates.

    I run everything in containers for management (but I’m also running something like 90 containers, so a little more complex than your setup) and am firmly of the opinion that, unless you have a compelling reason to NOT run something in a container, just use the containerized version.


  • I’m the same way. If it’s split license, then it’s a matter of when and not if it’s going to have some MBA come along and enshittify it.

    There are just way, way too many prior examples of exactly that happening for me to be willing to trust any project that’s doing it: the split means they’re going to monetize it, and then they have all the incentive in the world to shit all over the “free” userbase to try to get them to convert.


  • Mind you the way some of these articles sound is that the whole Xbox gaming division is on its knees and doesn’t sell a machine or make a penny.

    It does feel like they’re using excessively narrow definitions and picking facts to create the narrative they want.

    Okay, the smaller of Microsoft’s gaming platforms isn’t selling as well as the PS5, while the other one is growing so fast even Sony is porting all of their exclusive games to it.

    Doesn’t really feel like a complete market failure to me?





  • The format is the tape in the drive, or the disk or whatever.

    Tape existed 50 years ago: nothing modern and in production can read those tapes.

    The problem is, given a big enough time window, the literal drives to read it will simply no longer exist, and you won’t be able to access even non-rotted media because of that.

    As for data integrity, there are a lot of options: for example, you can make an MD5 sum of each file, then do it again later and see if anything is different.

    The only caveat here is that you have to make sure the checksums themselves get stored somewhere that’s not JUST on the drive, because if the drive DOES corrupt itself and your only record of the “good” hashes is on that drive, well, you can’t necessarily trust those hashes either.
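
    The checksum approach above can be sketched roughly like this. It is a minimal sketch, not a finished tool: the function names and the JSON manifest format are made up for illustration, and it assumes the manifest gets written somewhere other than the drive being checked. MD5 is used because that’s what the comment mentions; it is fine for spotting accidental corruption, and swapping in `"sha256"` is a one-argument change.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path, algorithm="md5"):
    """Hash one file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="md5"):
    """Map each file's path (relative to root) to its current checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest, manifest_path):
    """Write the checksums out. manifest_path should NOT live on the archive drive."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))


def verify(root, manifest_path, algorithm="md5"):
    """Re-hash everything under root and compare against the saved manifest."""
    recorded = json.loads(Path(manifest_path).read_text())
    current = build_manifest(root, algorithm)
    return {
        "changed": sorted(n for n in recorded if n in current and current[n] != recorded[n]),
        "missing": sorted(n for n in recorded if n not in current),
        "new": sorted(n for n in current if n not in recorded),
    }
```

    You’d run `save_manifest(build_manifest(archive_dir), manifest_file)` once right after writing the archive, then call `verify(archive_dir, manifest_file)` on whatever schedule you test the media; both paths here are placeholders. Anything listed under `changed` or `missing` is a file to restore from another copy.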


  • So, 50 years isn’t a reasonable goal unless you have a pretty big budget for this. Essentially no media is likely to survive that long and still be readable unless it’s stored in a vault under perfectly climate-controlled conditions. And even if the media is fine, finding an ancient drive to read a format that no longer exists is not a guaranteed proposition.

    You frankly should be expecting to have to replace everything every couple of years, and maybe more often if your routine tests of the media show it’s started rotting.

    Long-term archival storage really isn’t just a matter of dumping it to some media, locking it up, and never looking at it again.

    Alternately, you could just make someone else pay for all of this, and shove all of this to something like Glacier and make the media Amazon’s problem. (Assuming Amazon is around that long and that nothing catches fire.)


  • I’m using blu-ray disks for the 3rd copy, but I’m not backing up nearly as much data as you are.

    The only problem with optical media is that, at this point, you should only expect it to be readable for a couple of years, best case, and probably not even that as the tier 1 guys all stop making it and you’re left with the dregs.

    You almost certainly want some sort of tape option, assuming you want long retention periods and are only likely to add incremental changes to a large dataset.

    Edit: I know there’s longer-life archival optical media, but for what that costs, uh, you want tape if at all possible.




  • Buy multiple drives, set up some sort of RAID, set up some sort of backup. Then set up a 2nd backup.

    Done.

    All drives from all manufacturers are going to fail at more or less the same rate (see: backblaze’s stats) and trying to buy a specific thing to avoid the death which is coming for all drives is, mostly, futile: at the absolute best you might see a single specific model to avoid, but that doesn’t mean entire product lines are bad.

    I’m using some WD red drives which are pushing 8 years old, and some Seagate exos drives which are pushing 4, and so far no issues on any of the 7 drives.


  • Make sure, if you use hardware RAID, you know what happens if your controller dies.

    Is the data in a format you can easily access? Do you need a specific RAID controller to be able to read it in the future? How are you going to get a new controller if you need one?

    That’s a big reason why people nudge you to software RAID: if you’re using md and doing a mirror, then that’ll work on any damn drive controller on earth that Linux can talk to, and you don’t need to worry about how you’re getting your data back if a controller dies on you.


  • As with all things email, they probably really wanted to make sure that the mails were delivered and thus were using a commercial MTA to ensure that.

    I’d wager, even at 20 or 30 or 40k a year, that’s way less than it’d cost to host infra and have at least two if not three engineers available 24/7 to maintain critical infra.

    Looking at my mail, over the years I’ve gotten a couple hundred emails from them about certificates and expirations (and other things), and if you assume there are a couple million sites using these certs, I could easily see how you’d end up in a situation where the cost scales very, very slowly, until it’s suddenly a major drain.



  • Very very little. It’s a billion tiny little bits of text, and if you have image caching enabled, then all those thumbnails.

    My personal instance doesn’t cache images since I’m the only one using it (which means a cached image does nobody any good), and I use somewhere under 20 GB a month, though I don’t have entirely specific numbers, just before-Lemmy and after-Lemmy aggregates.