Friday, January 3, 2020

Cold Storage in the Cloud

I've got 500GB of crappy useless data on an external drive, just as you almost surely do. Let's see, there's email from the 1990s, music and videos I deliberately winnowed from my active media files years ago, early versions of iPhone apps, disk images of CD-ROMs (remember them?), ancient backups of Desktop, Documents and iTunes Media files, etc. etc.

It's crap, and it lives on a wheezing external drive that's three years past its natural life span. It's only a matter of time before the drive starts doing the click of death, auguring the final demise of this data. Which would be okay, I suppose, but why let data disappear when storage is cheap? This is America, baby!

It's certainly not worth buying a second hard drive, so thoughts turn heavenward, to The Cloud. "Off-site" back-up is always smart (in case of fire or theft here at The Hovel), and, anyway, the last thing I need is yet another whirling USB drive next to my desk spewing heat and contributing to my power cord carbonara.

And it dawns on me that virtually every computer user over the age of 40 is probably in the exact same predicament: glancing worriedly at a wheezing aged hard drive containing unimportant files, and scheming about storing them in the cloud for pennies. I feel un-alone.

Yet, not for the first time, Adam Smith has let me down. The invisible hand of the market has provided no well-trodden path. It's still the wild west out there for cloud storage. Having taken the deep dive, I'll share what I've learned.

The main problem is that the tech industry has decided, as usual, that consumers want to do dumb expensive stuff and geeks want to do smart cheap stuff...and what I want falls smack in the middle.

If I wanted a friendly, intrusive backup program constantly synching - i.e. serving as a sort of Time Machine (Apple's backup software) to the Cloud - a thousand companies would eagerly take my money (there's consensus that Backblaze Unlimited Backup, "The World's Easiest Cloud Backup", is one of the better options).

I don't want that. I want to park 500GB and forget it. For cheap.

There's Dropbox, but they have a maximum file size of 50GB using their web site and 350GB using their API (i.e. various hook-in services). And, for privacy, I intend to create a 500GB encrypted disk image and upload it as a single lump. Yes, this would be unwieldy to grab files from, but, again, this is just bulk storage, so I won't be grabbing much, if ever.
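To put numbers on how far over those limits a 500GB lump is, here's a quick sketch. It only uses the two limits quoted above. Splitting the image into pieces is a hypothetical workaround, not something I plan to do, and `pieces_needed` is just an illustrative name.

```python
import math

IMAGE_GB = 500  # size of the encrypted disk image described above

# Per-file limits as quoted in this post, in GB
LIMITS_GB = {"Dropbox web site": 50, "Dropbox API": 350}

def pieces_needed(image_gb, limit_gb):
    """Smallest number of pieces such that each piece is at or under the limit."""
    return math.ceil(image_gb / limit_gb)

for route, limit in LIMITS_GB.items():
    print(f"{route}: {pieces_needed(IMAGE_GB, limit)} piece(s) of at most {limit}GB")
```

So even via the API the lump wouldn't go up in one piece, which is why Dropbox is out for the single-file approach.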

We're still in the consumer range, more or less. Plenty of services will stow this beast for $10-$15/month plus a penny or two per GB to download. But I don't want to pay $180/year for redundant offsite backup of utter garbage. And that's what pulls me out of the realm of "consumers want to do dumb expensive stuff" and firmly into "geeks want to do smart cheap stuff," where the waters are choppy and poorly lit.

I used the term "bulk storage" to describe my vast lump of garbage data, but the industry term is "cold storage". And the coldest of cold storage has long been Amazon Web Services' "Glacier" storage. However, they've recently introduced something colder still, which they call "Glacier Deep Archive". This lets you park 500GB for just 50 freaking cents per month. But they're not exaggerating about the breath-freezing coldness. You must give them 12 hours' notice if you ever want to download the data, and the download costs a steep 9¢/GB (they call it an "egress charge"). Which is probably ok because, like I said, this data is super unnecessary and I'm just being a pack rat. Amazon knows that, and this service is for people in exactly my situation (ok, and, of course, IT managers who want to migrate from tape backups). I could store my garbagey lump for virtually free, and pay a decent amount in the unlikely event I ever actually need it.

Problem: Amazon Web Services is maddeningly technical; the ultimate example of geeks doing smart cheap stuff. Same for Google Cloud Storage and Backblaze's professional product, B2 (which offers a nice simple web interface for files under 50GB, but with large files you're forced deeply into UNIX/Terminal territory - why they don't simply create an AppleScript to handle the rote tasks is beyond me).

Geekiness aside, B2 may be the pricing sweet spot: it's roughly five times the storage cost of Glacier Deep Archive (i.e. about $3/month for 500GB) but roughly a tenth of the download charge (a penny per GB). Do bear in mind, though, that while Amazon Web Services and Google Cloud will almost surely be here 10-15 years from now, Backblaze might...but I'm less certain. OTOH, the impenetrable geekery makes it moot.
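For anyone who wants to check the arithmetic, here's a rough sketch using only the prices quoted in this post (50¢/month and 9¢/GB for Glacier Deep Archive, $3/month and 1¢/GB for B2). Real bills will include request fees, minimum storage durations and price changes that I'm ignoring, and `total_cents` is just a name I made up.

```python
SIZE_GB = 500

# service: (storage cents per month for 500GB, download cents per GB),
# both taken from the figures quoted in this post, not from a price list
PRICES = {
    "Glacier Deep Archive": (50, 9),
    "Backblaze B2": (300, 1),
}

def total_cents(service, years, full_restores=0):
    """Storage for `years`, plus `full_restores` downloads of the whole lump."""
    monthly_cents, download_cents_per_gb = PRICES[service]
    storage = monthly_cents * 12 * years
    downloads = download_cents_per_gb * SIZE_GB * full_restores
    return storage + downloads

for service in PRICES:
    for restores in (0, 1, 8):
        dollars = total_cents(service, years=10, full_restores=restores) / 100
        print(f"{service}: 10 years, {restores} full restore(s): ${dollars:.2f}")
```

On these numbers, ten years of Deep Archive with one full restore runs about $105 versus about $365 for B2. B2 only pulls ahead if you download the entire lump eight or more times in a decade, which is not how anyone treats garbage data.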

One observation. Since I want to park a single 500GB blob and likely never do anything with it, and Amazon Web Service Frigid Frostbite MoFo is insanely cheap, it would make sense to hire a geek to walk me through the process. So that might be a smart solution right there.

Here's what I've decided. I'm going to use ARQ Backup software (Mac or PC) costing a one-time $50. It acts as a friendly, polished front end to all the major cloud services, including Dropbox, which is a very nice plus (the Dropbox app is super inflexible these days). ARQ is very actively developed, but reportedly quite processor-intensive - don't expect to do much more with your computer while the app is running. Here’s an in-depth MacWorld review of ARQ Backup from 2017 (which also sheds light on cloud backup, generally).

ARQ doesn’t appear to handle Glacier Deep Archive yet, but I’ll bet it soon will. Meanwhile it does work with normal AWS Glacier, if you want still pretty crazy-cheap storage with costly downloads.

But I'll hook ARQ Backup up to B2. And here's the thing: having gone this far to find viable cheap cold storage, it looks like ARQ is so easy and powerful that I might want to use it for less hypothermic synch/backup/storage as well. I own a couple more drives with slightly more essential data, so maybe I'll sign up for a couple TBs from B2 for extra redundancy. I'll need to take a close look at privacy/security before I use their encryption rather than encrypting on my side (the latter requires the one-big-lump-of-data approach, which is less viable with data I might need more flexible access to). Although...hmm...as redundant backup (I will also carefully maintain it on my external drives), I may go big-encrypted-lump with these, as well, and swap it out with an updated version bimonthly. 


Potential point of confusion: just as there's the more famous Backblaze consumer product as well as the geeky Backblaze B2 discussed here, ARQ seems to make most of their dough selling storage to consumers. I'm not talking about ARQ's storage/backup monthly plans above (which fall under my description of the myriad relatively expensive and smart consumer-side offerings). I'm talking specifically about the ARQ Backup app.

4 comments:

  1. I've worked in leading-edge tech companies, co-founded and sold a tech/fin svcs consulting company and, most recently, have done some consulting as an AWS architect. Faced with a similar problem but probably possessing deeper technical skills, I considered my options and decided to do nothing.

    The reality for me and most people is that we'll never *need* to access that data. When was the last time you copied a file from the drive? How would your life have changed if you had to recreate the file(s)?

    I keep my tax returns and a couple of other personal docs on an encrypted Google Drive managed by Boxcryptor, and if the USB drive turns to salt, so be it.

  2. Thanks for posting, Jeff.

    I have been made very happy at various moments by my ability to produce some random email from 1991, or pull the audio source of an old video project. The odds are insanely low that any given file will prove useful, but is it worth $36/year to preserve and protect the entirety of my first 20 years of content creation and collection safely backed up off-site? For sure (though mileage certainly varies).

    As a helluva side benefit, I'd also enjoy much greater Dropbox power and flexibility via the ARQ front end (and DB frequently drives me nuts), plus an established infrastructure to start moving higher-priority backups into the cloud for off-site redundancy. I'm about as diligent as possible re: on-site drive backup, but I'm terrifyingly at the mercy of Western Digital's build quality, and of the gods of Fire and Theft.

    This stuff's important enough to me to be worth pursuing a long-term viable cheap solution, and I'm glad to have found one. For <$50/year, I consider it a no-brainer.

  3. Jim,

    Makes sense if you continue to access the data. As for backups, 20 years ago I built a network-attached storage device (NAS) and ran background backups for everyone in the family. At some point, I got tired of managing the hardware/software and just moved all our data into the Google cloud. All my files are now cloud-resident, with no local copies on my PC. No more backups. Next stop, Chromebook.

    My Gen X nephew pointed out it's much harder for people using video/music editing software to move to the cloud. Still, given how fast businesses are pushing the cloud tech envelope, the consumer wave can't be far behind.

    best,

  4. Yeah, your way is the future. I’m just getting there...
