Is EMC's Maui another Invista?
Biting off more than it can chew
Comment It was different a year ago - full of confidence, EMC head honcho Joe Tucci blithely told analysts about a slew of oncoming EMC products with codenames. Among them were Hulk and Maui, hardware and software to produce a new kind of clusterable storage system, a global repository scaling up to multiple petabytes in size. The hint then was that six months should see them come out into the open.
Nearly a year has passed and Hulk hardware languishes as the Infiniflex 10,000, a kind of near-also-ran demo product lacking its true software and with all the marketing push behind it of a George W Bush re-election campaign.
Maui has disappeared from the EMC lexicon, and an internal EMC blogger's video revelations about what it could do were abruptly pulled from his blog site.
It's not that Hulk and Maui are busted flushes, just that preliminary expectations have been set and then... nothing. Staff such as Chuck Hollis, VP technical alliances, won't talk about Maui, but will discuss in general the need for software to run a global storage repository.
The role he envisages for this is mind-blowing. It gives an insight into possible development difficulties that have sprung from Maui seeming to be not just storage array controller software, but a whole new level of storage infrastructure software that front-ends and manages data access and storage for a network of inter-connected global data storage centres.
What follows is my interpretation of what Hollis and other EMC people have said and written over the past year and the questions raised.
Infrastructure system and clustered object/filer
Maui is a storage facility with data containers spread around the globe storing data that is ingested, protected and moved to provide localised access from wherever you are on the planet.
We have been told that Maui is more than a clustered file system and orders of magnitude bigger than anything else available today in terms of capacity. It is built on commodity system components including clusterable storage arrays with commodity hard drives inside them. These storage units hold objects along with what Hollis calls rich semantics. In his view, neither a file-level nor a block-level approach will scale far enough; the system has to be object-based.
So we should assume we're talking about billions, even trillions, of objects and their associated metadata, multi-petabytes of storage capacity and millions of users. Costs are a great concern because there are so many darn components - tens of thousands of disk drives, for example - that shaving pennies off their price or increasing utilisation by single digit percentages can save millions of dollars.
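To put rough numbers on the utilisation point, here is a back-of-envelope sketch in Python. Every figure in it (30PB of usable data, 1TB drives, $150 a drive, utilisation rising from 60 to 65 per cent) is an assumption made up for illustration, not anything EMC has disclosed, and the function names are hypothetical.

```python
# Back-of-envelope sketch: how a small utilisation gain changes drive count.
# All inputs are illustrative assumptions, not EMC figures.

def drives_needed(usable_tb: int, drive_tb: int, utilisation_pct: int) -> int:
    """Drives required to hold usable_tb of data at a given utilisation.

    utilisation_pct is a whole-number percentage (e.g. 60 for 60 per cent).
    Integer ceiling division avoids floating-point rounding surprises.
    """
    numerator = usable_tb * 100
    denominator = drive_tb * utilisation_pct
    return -(-numerator // denominator)  # ceiling of numerator / denominator


def saving_from_utilisation(usable_tb: int, drive_tb: int, price_per_drive: int,
                            pct_before: int, pct_after: int) -> int:
    """Purchase cost avoided by raising utilisation from pct_before to pct_after."""
    before = drives_needed(usable_tb, drive_tb, pct_before)
    after = drives_needed(usable_tb, drive_tb, pct_after)
    return (before - after) * price_per_drive


if __name__ == "__main__":
    # Assumed scenario: 30PB usable, 1TB drives, $150 per drive.
    print(drives_needed(30_000, 1, 60))
    print(drives_needed(30_000, 1, 65))
    print(saving_from_utilisation(30_000, 1, 150, 60, 65))
```

On those made-up inputs, five points of extra utilisation removes 3,846 drives from a 50,000-drive estate. That covers purchase price only; power, cooling, floor space and replacement cycles would push the figure higher, and the sums scale linearly with the size of the repository.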
We're talking here about creating a Google-class infrastructure from scratch. Not even Google did that and it's taken Sergey Brin's boys years to build out what we see today. It is the UK's National Health IT system but on a global scale with every aspect of it multiplied a million times - I'm guessing but Hollis has used the term 'uber-massive' - and built by one company.
This data is accessed from virtually any kind of internet client device, such as smart phones, netbooks, notebooks, games consoles, desktops and servers. Other mentioned devices are set-top boxes, mobile iTunes devices, RFID-like sensor devices sending in data, VOIP phones, security cameras and satellites. It is universal access. Let your imagination run riot.
The networking infrastructure over which all this runs has to be carrier class, simply 'there' like the phone (landline, not patchy mobile) or electricity.