Startup VAST Data lifted the lid on its secret storage sauce today, revealing cheap, exabyte-level scale-out flash arrays sped by Optane SSDs – which it hopes will persuade users to load up their on-premises spinning rust in the 'barrow and wheel it to the tip.
The company, founded by a former XtremIO chief techie in February 2016, has been shipping product to customers including Ginkgo Bioworks, General Dynamics Information Technology and Zebra Medical Vision for some months already.
The firm claimed customers were cutting multi-million dollar cheques for petabyte scale systems – which will have helped trigger the latest VC round.
Along with the information on the storage architecture behind its extraordinary data reduction claims, VAST told world+dog it had just taken $40m to the wallet. The series B financing was backed by TPG Capital, along with existing investors Norwest Venture Partners, Dell Technologies Capital, 83 North, and Goldman Sachs. The A-round was in March last year, according to Crunchbase. The new cash injection brings total funding to $80m.
The firm said its scale-out system offered archive disk vault economics with flash speed, providing NFS v3 file or S3 object access (or both) to hosts.
The premise: To compete with existing on-premises storage bods
VAST's appeal centres on its claim that it can replace virtually all existing layers of primary, secondary and tertiary storage with a single online tier of storage. The theory, at least, is that customers can do away with dozens of storage system and software suppliers, and associated layers of storage management complexity and replace them with a data centre space and power-saving storage system that's easier to scale, manage and operate.
The startup will basically be taking on every on-premises storage supplier with a cluster employing a DASE (disaggregated shared everything) architecture. The product features separate x86 compute node and storage box layers linked across a switched NVMe fabric.
The stateless compute nodes handle the metadata and storage processing, providing a single global namespace, deduplication and data protection. The dumb storage nodes are JBOFs (just a bunch of flash drive boxes) holding a combination of QLC (4 bits/cell) flash and Intel Optane 3D XPoint drives. Every compute node can see every drive and there can be up to 1,000 storage nodes and 10,000 compute nodes.
The storage nodes are guaranteed for 10 years. The firm said the product would keep write amplification low and so preserve QLC flash write cycles.
Containers, containers, containers
The system software is delivered inside Docker containers, either software-only, or software plus storage nodes, or software in compute nodes plus storage nodes. Customers could take the middle way and run the VAST software in hyperconverged infrastructure.
Data is written in wide stripes across drives and storage nodes. The block size is the QLC flash block size, and writes are concatenated and organised in the Optane tier to ensure full block writes to the flash. Data is erasure-coded for protection with a low-overhead scheme that supports parallel recovery from a failed drive.
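For the curious, the write-coalescing idea can be sketched in a few lines of Python. This is an illustration only, not VAST's code: the `WriteBuffer` class, its in-memory `pending` buffer standing in for the Optane tier, and the 16-byte block size are all assumptions made for the example.

```python
class WriteBuffer:
    """Toy stand-in for a fast staging tier that coalesces small writes."""

    def __init__(self, block_size):
        self.block_size = block_size  # hypothetical flash block size in bytes
        self.pending = bytearray()    # data staged but not yet flushed
        self.flushed = []             # full blocks "written" to flash

    def write(self, data):
        """Stage data, then flush every complete block that has accumulated."""
        self.pending.extend(data)
        while len(self.pending) >= self.block_size:
            block = bytes(self.pending[:self.block_size])
            del self.pending[:self.block_size]
            self.flushed.append(block)


# Three small writes totalling 26 bytes, against a 16-byte block size.
buf = WriteBuffer(block_size=16)
for chunk in (b"abcde", b"fghijkl", b"mnopqrstuvwxyz"):
    buf.write(chunk)

print(len(buf.flushed), "full block(s) flushed;", len(buf.pending), "bytes still staged")
```

The flash only ever sees whole blocks, while the leftover bytes wait in the staging buffer for more writes to arrive, which is the property the article describes.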
The deduplication scheme involves a VAST-designed hashing function applied to blocks of data. The resulting number can be used to compare incoming data blocks with already stored blocks. The difference between two block hash numbers represents the degree of byte-level difference between the blocks.
Where the difference is small, the incoming block can be replaced with a reference to the already stored block plus a summary of the byte-level differences. Rehydration involves getting the original stored block and applying the byte-level difference summary.
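A rough sketch of the reference-plus-differences idea follows. VAST's hashing function has not been published, so this example swaps in a plain byte-by-byte comparison as the similarity test; the `Store` class, the helper names and the `MAX_DIFF` threshold are all hypothetical.

```python
MAX_DIFF = 4  # hypothetical threshold: max differing bytes to store as a delta


def make_delta(reference, block):
    """List (offset, byte) pairs where block differs from reference."""
    return [(i, b) for i, (a, b) in enumerate(zip(reference, block)) if a != b]


def apply_delta(reference, delta):
    """Rehydrate a block from its reference block and difference summary."""
    out = bytearray(reference)
    for offset, byte in delta:
        out[offset] = byte
    return bytes(out)


class Store:
    def __init__(self):
        self.references = []  # blocks stored in full
        self.entries = []     # per written block: (reference index, delta)

    def write(self, block):
        # Direct comparison stands in for the similarity hash described above.
        for idx, ref in enumerate(self.references):
            if len(ref) == len(block):
                delta = make_delta(ref, block)
                if len(delta) <= MAX_DIFF:
                    self.entries.append((idx, delta))
                    return len(self.entries) - 1
        # No similar block found: keep this one in full as a new reference.
        self.references.append(block)
        self.entries.append((len(self.references) - 1, []))
        return len(self.entries) - 1

    def read(self, entry_id):
        idx, delta = self.entries[entry_id]
        return apply_delta(self.references[idx], delta)


store = Store()
a = store.write(b"hello world, block A")
b = store.write(b"hello world, block B")  # differs from the first by one byte
c = store.write(b"completely different")  # too different, stored in full
```

Here the second block costs one (offset, byte) pair rather than 20 bytes, and `store.read(b)` rebuilds it from the first block. A real system would need a fast way to find candidate blocks without scanning every stored block, which is presumably the job of the hashing function.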
Renen Hallak, founder and CEO of VAST Data, said: "Storage has always been complicated. Organisations for decades have been dealing with a complex pyramid of technologies that force some tradeoff between performance and capacity." ®