Start-up MaxiScale has stepped from stealth mode out into the sunshine and unveiled a storage system that can serve billions of files from low-cost, commodity hardware.
The FLEX software platform is based on a so-called Peer Set architecture, which uses vanilla x86 servers and SATA disk drives. A Peer Set is a node in a cluster that can scale to thousands of nodes, all operating within a single file namespace, with distributed metadata and the ability to store hundreds of petabytes of file data.
MaxiScale documentation states: "A Peer Set instance contains multiple members, the number of which is user-selectable based on desired bandwidth and replication characteristics. Each member consists of a low-cost SATA hard drive on a separate physical server node. All file data and metadata within a Peer Set is replicated to, and load balanced across, all members." There can be up to 65,000 Peer Set instances in the namespace.
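As a rough illustration of the quoted description, the sketch below models a Peer Set as a group of members that each hold a full copy of the data, with reads rotated across them. The names used here (`PeerSet`, `write`, `read`) are hypothetical and are not taken from MaxiScale's documentation; a real system would involve networked nodes, metadata handling and failure recovery that this toy model ignores.

```python
from itertools import cycle


class PeerSet:
    """Toy model: every member holds a full replica; reads rotate across members."""

    def __init__(self, member_names):
        names = list(member_names)
        if not names:
            raise ValueError("a Peer Set needs at least one member")
        # One dict per member stands in for one SATA drive on its own server node.
        self.members = {name: {} for name in names}
        # Simple round-robin iterator used to spread reads across members.
        self._next_reader = cycle(names)

    def write(self, path, data):
        # Replicate the file to every member of the Peer Set.
        for store in self.members.values():
            store[path] = data

    def read(self, path):
        # Serve the read from the next member in round-robin order.
        member = next(self._next_reader)
        return member, self.members[member][path]


if __name__ == "__main__":
    peer_set = PeerSet(["node-a", "node-b", "node-c"])
    peer_set.write("/ads/banner-1.jpg", b"image bytes")
    for _ in range(4):
        print(peer_set.read("/ads/banner-1.jpg")[0])
```

Choosing more members in this model means more copies of each file and more drives available to serve reads, which is one way to read the "desired bandwidth and replication characteristics" trade-off in the quote.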
Adding nodes to a FLEX cluster increases the system's bandwidth, file I/O, processing power and capacity.
The product uses 1KB blocks and is optimised for small files, which make up the majority of files in its chosen market: advertisement deliverers, social networks, content delivery networks, video serving companies and hosted software-as-a-service suppliers.
This means that there are two logical file stores in FLEX, the small file store and what we might call a large file store. Each has its own key/value store associated with it. Serving a small file with a single disk I/O is claimed to be faster than competing approaches, which need multiple disk I/Os to serve a file and so take longer.
FLEX is described as self-tuning, with automatic load distribution and balancing, and self-healing, with the software automatically activating spare resources to recover from hardware failures. It can continue file read and write operations during a component failure and rebuild.
Client systems, such as Windows and Linux servers, web and application servers, access the FLEX platform across one or ten gigabit Ethernet interfaces using a common file systems approach and POSIX interfaces, "triggering a direct request to the Peer Sets containing the file's metadata and data."
Also: "applications access files with consistent pathnames, avoiding risky namespace updates as capacity increases." Could this be an object-like storage approach?
MaxiScale CEO and founder Gianluca Rattazzi set the scene for MaxiScale's approach: “The new wave of Internet-scale applications have completely changed the dynamics of file serving and data workloads in ways we could not have anticipated even five years ago. Our products and technology fully addresses this ‘era of billions’ to help our customers scale accordingly, while improving performance and reducing costs.”
It's not a unique theme. Isilon CEO Sujal Patel could just as well have said this, and the statement wouldn't look out of place in a BlueArc, HP ExDS9100, Ibrix or NetApp ONTAP 8 PowerPoint presentation.
MaxiScale claims that its system can store vast amounts of information using up to ten times fewer drives than existing file-serving systems and at a fraction of the cost. We imagine it has Isilon, BlueArc, HP, NetApp and EMC in its sights here.
A customer, AdMob, is cited by MaxiScale to support the product. It's described as the world's largest mobile advertising marketplace, with over 110 billion ad impressions served in three years. Kevin Scott, AdMob's VP engineering, says that MaxiScale's platform "supports the scale and performance levels we require while enabling system consolidation to improve our cost structure."
MaxiScale's key technology seems to be the distributed metadata. It has an overview whitepaper describing it and differentiating its approach from clustered NAS, shared-nothing clusters and other approaches to scaling file serving. The paper doesn't explain how a file request from an accessing server is accepted and sent to exactly the right node among the thousands that could make up a cluster. We're told that clients access files with consistent pathnames, but files can move in the cluster, and we don't know how the client's pathname is associated with an actual file location.
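For readers wondering what such a mapping could look like, here is a minimal sketch of one generic technique: hashing the pathname to pick a Peer Set ID. This is an assumption for illustration only. Nothing in MaxiScale's material confirms that FLEX works this way, and the function name `peer_set_for` is hypothetical. The only figure taken from the documentation is the ceiling of 65,000 Peer Set instances.

```python
import hashlib

MAX_PEER_SETS = 65_000  # upper bound on Peer Set instances quoted by MaxiScale


def peer_set_for(path, num_peer_sets):
    """Map a pathname to a Peer Set ID by hashing (illustrative assumption only)."""
    if not 1 <= num_peer_sets <= MAX_PEER_SETS:
        raise ValueError("num_peer_sets must be between 1 and 65,000")
    # A stable hash of the pathname: every client computes the same ID
    # without consulting a central metadata server.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_peer_sets


if __name__ == "__main__":
    for p in ["/ads/banner-1.jpg", "/ads/banner-2.jpg", "/video/clip.mp4"]:
        print(p, "->", peer_set_for(p, 1000))
```

With a scheme like this the client-visible pathname never changes; only the membership behind a Peer Set ID does. Plain modulo hashing would remap most files whenever the number of Peer Sets changes, so a production system would need something more careful, such as consistent hashing or a lookup table, to match the claim that capacity can grow without namespace updates.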
The company was founded in 2007 and received $12m A-round funding in March that year. MaxiScale is headquartered in Sunnyvale - not too far away from NetApp - and there are three venture capital investors: NEA, EDV and Silicon Valley Bank.
Its product seems geared to cloud storage or Google File System in scope and scale. With a supportive customer it seems to be a real product and not vapourware. If you're interested, the software is available now on a trial basis. ®