Interview: At its virtual re:Invent conference, AWS presented faster block storage based on its custom Nitro hardware, strong consistency in its longstanding S3 (Simple Storage Service), and a solution for users forced to buy more storage capacity when all they needed was greater throughput.
Although storage is less trendy than areas like artificial intelligence, getting it right has a big impact on performance and cost. Just after re:Invent, we spoke to Storage VP Mai-Lan Tomsen Bukovec about the range of options on AWS, including new ones just announced. Do customers struggle to make the best choices?
“It depends,” said Bukovec, tactfully. She pointed out that, in the case of S3, customers can now opt for Intelligent-Tiering, which automatically moves objects to cheaper archive storage if they are not accessed for a period. “The whole storage class is optimized… to give you dynamic pricing based on the access of that object so you don't have to think about how you pick your storage,” she said.
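On the API side, Intelligent-Tiering can be applied per object at upload time or bucket-wide through a lifecycle rule. Below is a minimal sketch of the latter approach: the dictionary is the request body that boto3's real put_bucket_lifecycle_configuration() call expects, though the bucket name in the comment is a placeholder.

```python
# Sketch: route all new objects in a bucket into S3 Intelligent-Tiering
# via a lifecycle rule. The dict below is the LifecycleConfiguration body
# that boto3's put_bucket_lifecycle_configuration() takes.

lifecycle_config = {
    "Rules": [
        {
            "ID": "move-to-intelligent-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = whole bucket
            "Transitions": [
                # Days=0 transitions objects as soon as the rule applies
                {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
            ],
        }
    ]
}

# With boto3 installed and credentials configured, this would be applied as:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",  # placeholder name
#       LifecycleConfiguration=lifecycle_config)
print(lifecycle_config["Rules"][0]["Transitions"][0]["StorageClass"])
```

From there S3 itself tracks access patterns and moves objects between tiers, which is the "don't have to think about it" behaviour Bukovec describes.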
That may be a good solution for S3, but S3 is only one of several storage services on offer, each with numerous options for how it is configured.
At a high level, there is S3 object storage, Elastic File System (EFS) which implements the venerable NFS (Network File System) protocol, and Elastic Block Store (EBS) which is low level and behaves more like locally attached storage.
AWS also has FSx for Windows File Server, for Windows file shares; FSx for Lustre, which implements a distributed file system designed for HPC (High Performance Computing); and AWS Storage Gateway, which connects AWS storage to on-premises networks.
Another range is the Snow family of products, hardware appliances for edge computing or shipping large volumes of data to AWS.
Drill down a bit and more options appear. EBS, for example, has GP2 (General Purpose SSD), GP3 (updated general purpose SSD), io1 (Provisioned IOPS SSD using Nitro, the family of custom AWS network cards), io2 (an update to io1), and, in preview, io2 Block Express, which further improves on io2 performance. There are also st1 and sc1 hard-drive-based volumes, which are slower but lower cost.
While choice is good, it does put the burden on the customer to work out what is right for their particular application, though changing your mind or adapting to new demands is easier when it is no longer a matter of discarding old hardware and buying new.
“A lot of customers are moving to the separation of storage and compute,” said Bukovec, “because they want to take storage operations, which could be optimized for cost or performance, and make them independent of having any change on the application layer.”
The new high performance block storage types mean that more types of on-premises applications can be migrated. “Block Express has the performance of a SAN (Storage Area Network), because we were able to go all the way down to the network protocol and offload storage operations into the Nitro card,” said Bukovec. “We’re looking at 256K IOPS and 4,000 MB/s throughput,” she added.
Another recent change is the ability to scale up IOPS without having to purchase more storage. Previously, some customers were forced to deploy capacity they did not need for storage types like GP2 just to get more throughput. “You can scale up to 16,000 IOPS and 1000 MB/s for an additional fee,” Bukovec said.
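In API terms this decoupling maps to EC2's ModifyVolume operation, which for GP3 volumes accepts IOPS and throughput settings independently of size. A hedged sketch of the keyword arguments boto3's ec2.modify_volume() takes, with a made-up volume ID:

```python
# Sketch: raise a gp3 volume to its 16,000 IOPS / 1,000 MB/s ceiling
# without buying any extra capacity. The volume ID is a placeholder.

modify_params = {
    "VolumeId": "vol-0123456789abcdef0",  # placeholder ID
    "VolumeType": "gp3",
    "Iops": 16_000,        # provisioned IOPS, independent of volume size
    "Throughput": 1_000,   # throughput in MB/s, a gp3-only parameter
}

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("ec2").modify_volume(**modify_params)
print(modify_params["Iops"], modify_params["Throughput"])
```

The same call can also convert an existing GP2 volume to GP3 in place, which is how customers escape the capacity-for-throughput trade-off described above.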
At the other end of the scale, hard drives are not dead. “The HDD volume types are popular because of the price point,” said Bukovec. “One of the things that we launched just recently was to lower the minimum size from 500 GB to 125 GB for an HDD volume.”
As for S3, it may be one of the oldest AWS storage services, but it has evolved substantially from its beginnings in 2006.
Early last month the company introduced strong consistency, whereas previously it only guaranteed eventual consistency: the result of a LIST operation might not necessarily include the very latest changes. “We went down into the guts of S3, over 200 different microservices, and we changed the system,” said Bukovec.
“We have mathematicians that sit with their engineering teams and we build mathematical proofs on combinations of state,” she added. “That helped us validate where the race conditions might occur, so we could make sure it never shows up.”
What is coming in future for AWS Storage?
AWS spokespeople are skilled at saying little about forward plans. "We'll constantly find ways we can add cost efficiency through either new storage classes or just lower cost," said Bukovec. She also referred to cloud SAN and intelligent monitoring as likely focuses. Self-optimizing features like S3 Intelligent-Tiering may make figuring out the best options easier, though at the expense of manual control over cost and performance.
The ransomware scourge is in a sense a storage problem: as Reg readers know, the attack works by encrypting data and asking for payment to decrypt it.
Does AWS have a solution? “There’s a couple of different ways we’re helping customers with that,” said Bukovec. “Part of the problem with ransomware and other types of security threat is not actually knowing where your data is. Prevention of things like ransomware starts from understanding your data. We launched something called S3 Storage Lens and you can actually get that view organization-wide across all of your storage and all of your accounts across all of your regions.”
Understanding is one thing but what is the next step? “We have a couple of features that we launched years ago, like object lock, which locks the object so nobody can delete it,” said Bukovec. “We built that for the financial industry, for WORM (write once, read many) storage. They weren’t talking ransomware, they were just talking about compliance.
"They’ve ended up being useful for customers looking to protect themselves from ransomware also. We also do things like cross-region replication. Now cross-region replication has rogue actor protection. If you turn this on, if someone deletes data in your source bucket, they can’t delete the data in the other region.”
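For a sense of what Object Lock looks like in practice, here is a minimal sketch of an upload under COMPLIANCE mode, which prevents anyone, including the root account, from deleting or shortening the retention before the date passes. The bucket and key names are placeholders, boto3 is assumed, and Object Lock must already have been enabled when the bucket was created.

```python
from datetime import datetime, timedelta, timezone

# Sketch: write an object under Object Lock COMPLIANCE mode so it cannot
# be deleted or overwritten until the retention date passes. Bucket and
# key are placeholders for illustration only.

retain_until = datetime.now(timezone.utc) + timedelta(days=365)

put_params = {
    "Bucket": "example-locked-bucket",  # placeholder bucket
    "Key": "backups/db-snapshot.bin",   # placeholder key
    "Body": b"backup bytes go here",
    "ObjectLockMode": "COMPLIANCE",     # no early deletion, even by root
    "ObjectLockRetainUntilDate": retain_until,
}

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("s3").put_object(**put_params)
print(put_params["ObjectLockMode"])
```

A ransomware actor who gains write access to such a bucket can add new (encrypted) objects but cannot destroy the locked versions, which is why a compliance feature built for finance turned out to double as ransomware protection.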
AWS makes it easy to get data into its cloud. The Snowmobile service “moves up to 100 PB of data in a 45-foot long ruggedized shipping container.” That is ideal for something like closing down a data centre, but the bigger advantage is for AWS. Once it has that data, it is hard to envisage ever going elsewhere for the cloud services which access it. ®