AWS has added persistent file systems to its FSx for Lustre storage service, which lets you use a high-performance clustered file system on demand.
Lustre (from "Linux Cluster") is a distributed file system designed for high performance and widely used in supercomputers. The AWS FSx for Lustre service was first announced in late 2018 as a scratch file system, the idea being that you copy data from the AWS Simple Storage Service (S3) to temporary Lustre storage, process it there, and then copy the results back to S3. The original specification was 200MB/s throughput per TB of storage provisioned, and IOPS (input/output operations per second) of "millions".
AWS has now announced enhancements including a second-generation scratch file system that can burst to 1,200MB/s throughput per TB, and encryption in transit to add to the existing encryption at rest, subject to some conditions about instance type and region.
The biggest news is that persistent Lustre file systems are now supported, with "automatic data replication and file server failover". Performance is not quite as good as with the second-generation scratch storage, but you still get IOPS of "millions" and a choice of 50, 100 or 200MB/s throughput per TB provisioned. According to the docs, the minimum provisioned amount is 1.2TB, with larger file systems sized in "increments of 2.4TB". Pricing is per GB per month, starting at $0.164 for the slowest option in London (other regions vary), which works out at about $200 per month for the minimum 1.2TB. That is more expensive than scratch storage, which has a single price of $0.162 per GB per month for the highest performance.
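The pricing and throughput sums are easy to check. The following is a rough sketch, assuming decimal units (1TB = 1,000GB) and the London prices quoted above; the function names are ours, purely for illustration, and actual AWS billing may round or measure differently.

```python
# Back-of-envelope sums for FSx for Lustre persistent storage.
# Assumption: decimal units (1TB = 1,000GB); real billing may differ.
GB_PER_TB = 1000


def monthly_cost_usd(capacity_tb, price_per_gb_month):
    """Monthly storage cost for a given capacity and per-GB price."""
    return capacity_tb * GB_PER_TB * price_per_gb_month


def aggregate_throughput_mb_s(capacity_tb, mb_s_per_tb):
    """Total throughput, which scales with the capacity provisioned."""
    return capacity_tb * mb_s_per_tb


# Minimum 1.2TB persistent file system at the slowest (50MB/s per TB) tier.
cost = monthly_cost_usd(1.2, 0.164)
throughput = aggregate_throughput_mb_s(1.2, 50)
print(f"${cost:.2f} per month, {throughput:.0f}MB/s aggregate")
```

On those figures the minimum file system costs a shade under $200 a month and delivers around 60MB/s in aggregate, so anyone needing serious throughput will be provisioning rather more than the minimum.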
To use FSx for Lustre, you will need a Linux VM with the Lustre client, or you can use integration with EKS (Elastic Kubernetes Service) or SageMaker, the AWS model training service. Use cases suggested by AWS include geospatial analysis, seismic processing, financial modelling, media rendering, electronic design automation, big data analytics and machine learning.
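On a Linux VM with the Lustre client installed, mounting the file system is a single command. The sketch below only builds that command rather than running it; the file system DNS name, mount name and mount point are placeholders of our own, and the real values come from the FSx console or API for your file system.

```python
import shlex


def lustre_mount_command(dns_name, mount_name, mount_point="/mnt/fsx"):
    """Build (but do not run) a mount command for an FSx for Lustre file system.

    All three arguments are placeholders here; look up the real DNS name
    and mount name for your own file system before using this.
    """
    source = f"{dns_name}@tcp:/{mount_name}"
    return ["sudo", "mount", "-t", "lustre", "-o", "noatime,flock",
            source, mount_point]


# Hypothetical file system ID in the London region (eu-west-2).
cmd = lustre_mount_command(
    "fs-0123456789abcdef0.fsx.eu-west-2.amazonaws.com", "fsx"
)
print(shlex.join(cmd))
```

The mount point directory needs to exist first, and the VM's security group must allow Lustre traffic to the file system.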
A strong feature of FSx for Lustre is that you can create a file system via a simple form in the console, or through the FSx API. Both Microsoft Azure and Google Cloud Platform also have Lustre, but you need to deploy your own cluster. Azure provides templates for Lustre file systems, and GCP similarly documents how to create a Lustre file system cluster. ®
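For the API route, a request for a minimum-sized persistent file system might look something like the following. This is a minimal sketch using the boto3 SDK's `create_file_system` call, not a production script: the subnet ID is a placeholder, the helper names are ours, and the capacity is given in GiB as the API expects. The function that actually calls AWS is defined but not invoked, since it needs credentials and would incur charges.

```python
VALID_THROUGHPUT = (50, 100, 200)  # MB/s per TB tiers for persistent storage


def build_lustre_request(subnet_id, capacity_gib=1200, throughput=50):
    """Assemble keyword arguments for the FSx CreateFileSystem call."""
    if throughput not in VALID_THROUGHPUT:
        raise ValueError(f"throughput must be one of {VALID_THROUGHPUT}")
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": capacity_gib,
        "SubnetIds": [subnet_id],
        "LustreConfiguration": {
            "DeploymentType": "PERSISTENT_1",
            "PerUnitStorageThroughput": throughput,
        },
    }


def create_file_system(params, region="eu-west-2"):
    """Send the request to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported here so the sketch loads without the SDK

    client = boto3.client("fsx", region_name=region)
    response = client.create_file_system(**params)
    return response["FileSystem"]["FileSystemId"]


# Placeholder subnet ID; substitute one from your own VPC.
params = build_lustre_request("subnet-0123456789abcdef0")
print(params)
```

Passing `params` to `create_file_system` would submit the request; the file system then takes a few minutes to become available.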