AWS has started upgrading the software behind S3 storage cloud

Sharding system coded in 40,000+ lines of Rust is changing the way cloud colossus ensures data durability


Amazon Web Services has released a paper detailing the operations of its Simple Storage Service (S3), and in doing so revealed that new software to power the service is "being gradually deployed within our current service".

Titled Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 [PDF], the paper states that AWS is implementing "ShardStore" – tech described as "a new key-value storage node implementation for the Amazon S3 cloud object storage service".

The document also reveals that S3 currently holds "hundreds of petabytes of customer data".

"At the core of S3 are storage node servers that persist object data on hard disks," the paper explains. "These storage nodes are key-value stores that hold shards of object data, replicated by the control plane across multiple nodes for durability.

"Each storage node stores shards of customer objects, which are replicated across multiple nodes for durability, and so storage nodes need not replicate their stored data internally," it adds.

AWS also describes a concept called "crash consistency" that it employs to prevent data loss and achieve eleven nines of data durability – meaning the service is designed to preserve 99.999999999 per cent of data.
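
To put eleven nines in perspective, here is a back-of-the-envelope calculation. It assumes the figure is read as an annual, per-object durability target and that losses are independent of one another, neither of which is spelled out above. The object count is an arbitrary example.

```python
from fractions import Fraction

# Assumption: "eleven nines" is an annual, per-object durability target.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # 99.999999999 per cent


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year, assuming independent losses."""
    return object_count * (1 - DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


# Hypothetical example: a customer storing ten million objects.
objects = 10_000_000
print(float(expected_annual_losses(objects)))  # expected losses per year
print(years_per_expected_loss(objects))        # years per expected loss
```

Under those assumptions, ten million stored objects would lose a single object about once every 10,000 years on average.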

Replicating data across nodes helps AWS to achieve that reliability and means that losing one node won't destroy data.
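
A minimal sketch of that arrangement follows. It is an illustration only: the class names, the replica count of three, and the round-robin placement are assumptions, not details from the paper. Each node is a plain key-value store with no internal redundancy, and a control plane writes every shard to several nodes.

```python
class StorageNode:
    """A storage node modelled as a plain key-value store (shard id -> bytes)."""

    def __init__(self, name):
        self.name = name
        self.shards = {}
        self.alive = True

    def put(self, shard_id, data):
        self.shards[shard_id] = data

    def get(self, shard_id):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.shards[shard_id]


class ControlPlane:
    """Hypothetical control plane that copies each shard to several nodes."""

    def __init__(self, nodes, replicas=3):
        if replicas > len(nodes):
            raise ValueError("not enough nodes for the requested replica count")
        self.nodes = nodes
        self.replicas = replicas
        self.placement = {}  # shard id -> list of nodes holding a copy

    def put(self, shard_id, data):
        chosen = self.placement.get(shard_id)
        if chosen is None:
            # Assumed placement policy: simple round robin over the fleet.
            start = len(self.placement) % len(self.nodes)
            chosen = [self.nodes[(start + i) % len(self.nodes)]
                      for i in range(self.replicas)]
            self.placement[shard_id] = chosen
        for node in chosen:
            node.put(shard_id, data)

    def get(self, shard_id):
        # Try each replica in turn; one dead node does not lose the shard.
        for node in self.placement[shard_id]:
            try:
                return node.get(shard_id)
            except ConnectionError:
                continue
        raise IOError(f"all replicas of {shard_id} are unavailable")


fleet = [StorageNode(f"node-{i}") for i in range(4)]
plane = ControlPlane(fleet, replicas=3)
plane.put("shard-a", b"customer bytes")
fleet[0].alive = False          # simulate losing one node
print(plane.get("shard-a"))     # still readable from another replica
```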

"Recovering from a crash that loses an entire storage node's data creates large amounts of repair network traffic and IO load across the storage node fleet," the paper explains. "Crash consistency also ensures that the storage node recovers to a safe state after a crash, and so does not exhibit unexpected behavior that may require manual operator intervention."

ShardStore keeps track of all those objects. Its keys are shard identifiers and values are shards of customer object data. The importance of ShardStore data means it, too, is distributed across different nodes and disks.

The paper gives no indication of whether users will perceive any change as ShardStore is implemented, but it does mention that the software is "API-compatible with our existing storage node software, and so requests can be served by either ShardStore or our existing key-value stores". The Reg can't imagine the change to ShardStore would be disruptive to users – a downtime requirement would see AWS laughed out of the cloud.
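
What API compatibility buys can be sketched as two stores behind one interface. Everything here is hypothetical, including the interface, the class names, and the request format. The point is only that a caller issuing the same requests cannot tell which implementation answered.

```python
from abc import ABC, abstractmethod
from typing import Optional


class StorageNodeAPI(ABC):
    """Hypothetical request interface shared by both implementations."""

    @abstractmethod
    def put(self, shard_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, shard_id: str) -> Optional[bytes]: ...

    @abstractmethod
    def delete(self, shard_id: str) -> None: ...


class ExistingStore(StorageNodeAPI):
    """Stand-in for the older software: a simple in-memory map."""

    def __init__(self):
        self._shards = {}

    def put(self, shard_id, data):
        self._shards[shard_id] = data

    def get(self, shard_id):
        return self._shards.get(shard_id)

    def delete(self, shard_id):
        self._shards.pop(shard_id, None)


class NewStore(StorageNodeAPI):
    """Stand-in for the newer software: append-only log plus an index."""

    def __init__(self):
        self._log = []
        self._index = {}

    def put(self, shard_id, data):
        self._index[shard_id] = len(self._log)
        self._log.append(data)

    def get(self, shard_id):
        position = self._index.get(shard_id)
        return None if position is None else self._log[position]

    def delete(self, shard_id):
        self._index.pop(shard_id, None)


def serve(node: StorageNodeAPI, requests):
    """Run a list of (verb, shard_id[, data]) requests and collect responses."""
    responses = []
    for verb, shard_id, *payload in requests:
        if verb == "put":
            node.put(shard_id, payload[0])
            responses.append("ok")
        elif verb == "get":
            responses.append(node.get(shard_id))
        elif verb == "delete":
            node.delete(shard_id)
            responses.append("ok")
        else:
            raise ValueError(f"unknown verb: {verb}")
    return responses
```

Because both classes honour the same interface, a fleet could in principle move nodes from one to the other gradually without callers changing anything.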

The paper describes how AWS used lightweight formal methods – a technique for using automation to verify that software meets its spec – to ensure ShardStore is doing its job. Most of the word count is therefore dedicated to explaining how AWS tested the 40,000-plus lines of Rust that make up ShardStore, and the many acts of deep storage wonkery the software performs to keep S3 alive.
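
One lightweight way to check software against a spec is to write the spec as a tiny, obviously correct reference model, then fire random operation sequences at both the model and the real implementation and compare the answers. The sketch below illustrates that general idea only. The stores, the deliberately buggy variant, and the trial counts are all made up for this example and are not taken from ShardStore.

```python
import random


class ReferenceModel:
    """The 'spec': an obviously correct key-value store backed by a dict."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def delete(self, key):
        self.data.pop(key, None)


class PairListStore:
    """Hypothetical implementation under test: a list of (key, value) pairs."""

    def __init__(self):
        self.pairs = []

    def put(self, key, value):
        self.delete(key)  # drop any older value first
        self.pairs.append((key, value))

    def get(self, key):
        for stored_key, value in self.pairs:
            if stored_key == key:
                return value
        return None

    def delete(self, key):
        self.pairs = [(k, v) for k, v in self.pairs if k != key]


class BuggyStore(PairListStore):
    """Same store with a planted bug: put() forgets to drop the old value."""

    def put(self, key, value):
        self.pairs.append((key, value))


def random_ops(rng, length):
    """Generate a random sequence of put/get/delete operations."""
    ops = []
    for _ in range(length):
        kind = rng.choice(["put", "get", "delete"])
        key = rng.choice(["a", "b", "c"])
        ops.append((kind, key, rng.randrange(256)) if kind == "put" else (kind, key))
    return ops


def check_conformance(make_impl, trials=200, length=30, seed=0):
    """Return the first op sequence where impl and model disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        ops = random_ops(rng, length)
        impl, model = make_impl(), ReferenceModel()
        for op in ops:
            kind, key = op[0], op[1]
            if kind == "put":
                impl.put(key, op[2])
                model.put(key, op[2])
            elif kind == "delete":
                impl.delete(key)
                model.delete(key)
            elif impl.get(key) != model.get(key):
                return ops  # counterexample found
    return None
```

Calling `check_conformance(PairListStore)` should find no disagreement, while `check_conformance(BuggyStore)` should return a failing sequence, since a stale read needs only two different puts to one key followed by a get.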

In conclusion, the authors report that AWS's experience with lightweight formal methods has been "positive, with a number of issues prevented from reaching production and substantial adoption by the ShardStore engineering team".

The authors included several AWS staffers as well as folks from the University of Texas at Austin, the University of Washington, and Swiss public research university ETH Zurich. ®

Biting the hand that feeds IT © 1998–2021