EMC is getting into the data warehousing / business intelligence (DW/BI) area by buying Greenplum. This takes it into direct competition with Oracle and its Exadata product line.
Greenplum has about 100 customers, including the New York Stock Exchange and T-Mobile USA. It is to be the foundation of a new data computing product division within EMC’s Information Infrastructure business.
The company develops DW/BI software with a “shared-nothing” massively parallel processing (MPP) architecture that runs on virtualised x86 servers. EMC thinks Greenplum fits perfectly into its private cloud, virtualised server and federated storage worlds.
There's an open source angle too: Greenplum's software is based on the PostgreSQL database.
So why Greenplum?
There is an explosion in the use of digital sales channels and mobile internet devices. This creates a parallel explosion in data about online sales, and the faster that data can be accessed in a data warehouse and analysed by a business intelligence application, the faster a business can identify profitable and unprofitable sales trends and maximise the former and minimise the latter.
If businesses can run their DW/BI systems in near real-time, that gives them a better ability to optimise their product and pricing mix.
An EMC insider says that Greenplum reduced a 10-hour query in a traditional system to six minutes for O'Reilly Media. The customer used to make one BI run a day and now runs six an hour.
Using Greenplum software, data is automatically partitioned across multiple "segment" servers, and each "segment" owns and manages a distinct portion of the overall data. Its database is able to hold and access warehoused data more efficiently than standard relational databases, with the result that BI queries run faster.
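To make the shared-nothing idea concrete, here is a minimal Python sketch of rows being spread across segments by hashing a key, with each segment totalling only its own rows before the partial results are merged. This is not Greenplum's code: the function names, the CRC32 hash, the three-segment setup and the sample sales rows are all assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32


def segment_for(key: str, n_segments: int) -> int:
    """Pick a segment for a distribution key (hypothetical hash scheme)."""
    return crc32(key.encode("utf-8")) % n_segments


def distribute(rows, key_field, n_segments):
    """Place every row on exactly one segment, chosen by its key."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        segments[segment_for(row[key_field], n_segments)].append(row)
    return segments


def local_totals(segment_rows):
    """Work one segment does alone: total sales per product for its rows."""
    totals = defaultdict(float)
    for row in segment_rows:
        totals[row["product"]] += row["amount"]
    return dict(totals)


def merge_totals(partials):
    """Combine the per-segment partial totals into the final answer."""
    merged = defaultdict(float)
    for partial in partials:
        for product, amount in partial.items():
            merged[product] += amount
    return dict(merged)


# Made-up sample data, purely for illustration.
sales = [
    {"order_id": "A1", "product": "book", "amount": 30.0},
    {"order_id": "A2", "product": "ebook", "amount": 12.5},
    {"order_id": "A3", "product": "book", "amount": 20.0},
    {"order_id": "A4", "product": "ebook", "amount": 7.5},
]

segments = distribute(sales, "order_id", 3)
result = merge_totals(local_totals(rows) for rows in segments)
```

In a real MPP system the per-segment step would run at the same time on separate servers; the sketch runs it sequentially, but the split between local work and a final merge is the same.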
Greenplum has a product called Chorus with which IT and DBAs can establish "one or more pool of commodity servers and storage ahead of demand, and can then create new database instances and sandboxes in minutes with just a few clicks. These databases could be small sandboxes on just a couple of servers, or they could be multi-petabyte marts across hundreds of servers. Pools can be expanded by adding more servers, and the databases can themselves grow to span more servers and storage using Greenplum Database's online expansion capabilities."
Two years ago EMC set up a DW/BI competency centre. At that point HP had announced a tie-up with Oracle to produce the HP Oracle Database Machine, and Greenplum was working with Sun on the X4500 Thumper hardware. Since then Oracle has bought Sun and replaced the HP hardware with Sun's in its Exadata system.
EMC will continue to offer Greenplum’s full product portfolio to customers. It also plans to deliver "new EMC Proven reference architectures as well as an integrated hardware and software offering designed to improve performance and drive down implementation costs".
In other words, we can expect an EMC Exadata-type system. That system includes server hardware, which EMC does not make. There is a possibility, then, that we will see a Greenplum V-block, using VMware virtualisation, Cisco UCS servers and EMC storage.
It's a deal
Greenplum's CEO, Bill Cook, will run the new data computing product division and report to Pat Gelsinger, the president and COO for EMC's Information Infrastructure Products division.
The all-cash acquisition is expected to be completed by the end of September - the cash amount has not been revealed. Greenplum has reportedly raised around $61m of funding, so a price of $100m to $150m might be in the right ballpark.