Comment by jp57

Comment by jp57 5 days ago

> Glacier restores are also no longer painfully slow.

I had a theory (based on no evidence I'm aware of except knowing how Amazon operates) that the original Glacier service operated out of an Amazon fulfillment center somewhere. When you put in a request for your data, a picker would go to a shelf, pick up some removable media, take it back, and slot it into a drive in a rack.

This, BTW, is how tape backups on timesharing machines used to work once upon a time. You'd put in a request for a tape and the operator in the machine room would have to go get it from a shelf and mount it on the tape drive.

danudey 5 days ago

The most likely explanation is that they used a tape robot, such as the one seen here:

https://www.reddit.com/r/DataHoarder/comments/12um0ga/the_ro...

Which is basically exactly what you described but the picker is a robot.

Data requests go into a queue; when your request comes up, the robot looks up the data you requested, finds the tape and the offset, fetches the tape and inserts it into the drive, fast-forwards it to the offset, reads the file to temporary storage, rewinds the tape, ejects it, and puts it back. The latency of offline storage is in fetching/replacing the cassette and in forwarding/rewinding the tape, plus waiting for an available drive.

Realistically, the systems probably fetch the next request from the queue, look up the tape it's on, and then process every request from that tape so they're not swapping the same tape in and out twenty times for twenty requests.
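
To make that batching idea concrete, here is a minimal sketch. It is only an illustration of the guess above, not a description of how Glacier or any real tape library schedules work; `Request`, `plan_mounts`, and the tape/offset values are all made-up names for the example.

```python
from collections import deque
from typing import NamedTuple


class Request(NamedTuple):
    """Hypothetical retrieval request; field names are assumptions."""
    request_id: str
    tape_id: str
    offset: int  # position of the file on the tape


def plan_mounts(requests: list[Request]) -> list[tuple[str, list[Request]]]:
    """Return (tape_id, requests served during that mount) in processing order.

    The head of the FIFO queue decides which tape gets mounted next; every
    other pending request for that same tape is served during the same mount,
    ordered by offset so the tape only winds forward once.
    """
    pending = deque(requests)
    plan = []
    while pending:
        head = pending.popleft()
        same_tape = [r for r in pending if r.tape_id == head.tape_id]
        pending = deque(r for r in pending if r.tape_id != head.tape_id)
        batch = sorted([head, *same_tape], key=lambda r: r.offset)
        plan.append((head.tape_id, batch))
    return plan


requests = [
    Request("r1", "tape-A", 900),
    Request("r2", "tape-B", 10),
    Request("r3", "tape-A", 100),
    Request("r4", "tape-A", 500),
]
for tape_id, batch in plan_mounts(requests):
    print(tape_id, [r.request_id for r in batch])
```

With four requests across two tapes, this mounts each tape once instead of swapping tape-A in and out three times. The obvious trade-off is that r2 waits longer than strict FIFO order would make it wait.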


philistine 5 days ago

I've read very definitive discussions on here that Glacier never used tape. It has always been powered-off hard disks.

- UltraSane 5 days ago
  
  For truly write-once, read-never data, tape is the optimal storage method. It is exactly what the LTO standard was designed to do and it does it very well. You can be confident that you will be able to read every bit of data from a 30 year old tape, probably even 50 years old. It has the lowest bit error rate of any technology I am aware of. LTO-9 is better than 1 uncorrectable bit error in 10^20 user bits, which is 1 bit error in 12.5 exabytes. There is also the substantial advantage that tapes on a shelf are completely immune to ransomware. As a sysadmin I get that warm fuzzy feeling when critical data is backed up on a good LTO tape library.
  
  
  dabiged 5 days ago
  
  As someone who does tape recovery on very, very old tape, I largely concur with this, with a couple of caveats.
  1. Do not encrypt your tapes if you want the data back in 30/50 years. We have had so many companies lose encryption keys and turn their tapes into paperweights because the company they bought out 17 years ago had poor key management.
  2. The typical failure case on tape is physical damage, not bit errors. This can be via blunt force trauma (e.g. dropping, or sometimes crushing) or via poor storage (e.g. mould/mildew).
  3. Not all tape formats are created equal. I have seen far higher failure rates on tape formats that are repeatedly accessed, updated, and ejected than on your old-style write once, read none pattern.
  
  
  count 5 days ago
  
  Call it bad luck, but I’ve never had a fully successful restore. Drives eat tapes, drives are damaged and write bad data, robot arms die or malfunction. Tapes have NEVER worked for me. SANs and remote disk though, rock solid.
  That said, I don’t miss any of that stuff, gimme S3 any day :)
  
  
  UltraSane 4 days ago
  
  You do realize that isn't normal at all? LTO tape is still used by thousands of companies to back up many exabytes of data. I know it once saved Google from permanent loss of Gmail data from a bug. You should really get a refund for your tape drives.
  
  
  meepmorp 4 days ago
  
  Aren't LTO formats only backward compatible with the immediate prior version?
  
- danudey 3 days ago
  
  That's... interesting. I wonder what the wear-and-tear on an HDD is to spin it up/power it back down again.
  

Twirrim 5 days ago

I can't talk about it, but I've yet to see an accurate guess at how Glacier was originally designed. I think I'm in safe territory to say Glacier operated out of the same data centers as every other AWS service.

It's been a long time, and features launched since I left make clear some changes have happened, but I'll still tread a little carefully (though no one probably cares there anymore):

One of the most crucial things to do in all walks of engineering and product management is to learn how to manage customer expectations. If you say customers can only upload 10 images, and then allow them to upload 12, they will come to expect that you will always let them upload 12. Sometimes it's really valuable to manage expectations so that you give yourself space for future changes that you may want to make. It's a lot easier to go from supporting 10 images to 20 than the reverse.


donavanm 4 days ago

I'm like 90% sure I've seen folks (unofficially) disclose the original storage and API decisions over the years, in roughly accurate terms. Personally I think the multi-dimensional striping/erasure code ideas are way more interesting than the “it's just a tape library” speculation/arguments. That and the real lessons learned around product differentiation as supporting technologies converge.

kelnos 4 days ago

> I can't talk about it, but I've yet to see an accurate guess at how Glacier was originally designed.

It feels odd that this is some sort of secret. Why can't you talk about it?

- Twirrim 4 days ago
  
  I signed NDAs. I wish Glacier was more open about their history, because it's honestly interesting, and they have a number of notable innovations in how they approach things.
  
  
  Dylan16807 4 days ago
  
  Well, assuming your NDA is a reasonable length, I hope you talk about it later.
  (And if Amazon is making unreasonable-length NDAs, I hope they lose a lot of money over it.)
  
mh- 5 days ago

..oh. That's clever. Thanks for posting this.


jp57 4 days ago

I think folks have missed what would have been clever about the implementation I (apparently) dreamt up. It's not that "it's just a tape library", it's that it would have used the existing FC and picker infrastructure that Amazon had already built, with some racks containing drives for removable media. I was thinking that it would not have been some special facility purely for Glacier, but rather one or more regular FCs would just have had some shelves with Glacier media (not necessarily tapes).

Then the existing pickers would get special instructions on their handhelds: Go get item number NNNN from Row/shelf/bin X/Y/Z and take it to [machine-M] and slot it in, etc.


browningstreet 5 days ago

Yeah, but they've been robotic for decades since.


christina97 5 days ago

They would definitely be using robots given how uniform hard drives are. The only reason warehouses still have humans is heterogeneity (different sizes, different textures, different squishiness, etc).
