Teradata gets partly cloudy to rein in big data
Forges SAS in-memory analytics beast
Capacity planning on data warehouses is every bit as much of a nightmare as it is for other kinds of servers and their workloads. That's why Teradata is taking a page out of the cloudy infrastructure sales manual and offering customers the ability to buy capacity on their data warehouse appliances on a utility basis.
Teradata is calling this the Active Data Warehouse Private Cloud, which is stretching the definition of a cloud a bit since the appliances are not virtualized and are certainly not sold entirely on a pay-per-use basis. Perhaps it would be more accurate to call the new ADW appliances partly cloudy.
As Chris Twogood, director of product and services marketing at Teradata, describes it, the ADW Private Cloud allows customers to get access to latent capacity in their data warehouses when they come to an end-of-week, end-of-month, or end-of-year peak for processing, or maybe just during a busy season.
If you want to be able to access this latent capacity, Teradata sets up a machine with a baseline performance that is 80 per cent of your expected peak and keeps the other 20 per cent in reserve. You access this latent capacity, which Teradata calls Elastic Performance on Demand (sounds like suspenders in either the United States or the United Kingdom) on a per-hour basis, and depending on the appliance configuration, it can cost as little as $50 per hour to access it.
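To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes the 80/20 split described above and uses the $50 per hour floor as a placeholder rate; the function names, the capacity units, and the 40-hour example are hypothetical and not Teradata's actual pricing model.

```python
def split_capacity(expected_peak, baseline_pct=80):
    """Split a machine sized for expected_peak into baseline and reserve.

    baseline_pct is the share of peak that is always on (80 per the article);
    the remainder is the latent capacity held in reserve.
    """
    baseline = expected_peak * baseline_pct / 100
    reserve = expected_peak - baseline
    return baseline, reserve


def elastic_cost(hours, rate_per_hour=50.0):
    """Cost of running on reserve capacity for a number of hours.

    The default rate is the article's "as little as" figure; real rates
    depend on the appliance configuration.
    """
    return hours * rate_per_hour


# Hypothetical example: a warehouse sized for 100 units of peak work
# that leans on its reserve for 40 hours during a month-end crunch.
baseline, reserve = split_capacity(100)
print(f"baseline={baseline}, reserve={reserve}")
print(f"elastic charge for 40 hours: ${elastic_cost(40):,.2f}")
```

At the floor rate, a 40-hour burst would come to $2,000, which is the kind of figure a shop would weigh against permanently activating the capacity.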
Teradata has already been giving customers the option of more traditional capacity on demand, or COD in the server lingo of the early 2000s, when this became commonplace with proprietary midrange and mainframe machines as well as on RISC/Unix boxes. With Capacity on Demand for the ADW appliances, you figure out how many x86 server nodes and what storage you need to run the parallelized Teradata database management system, and then Teradata puts an extra 30 per cent of capacity in the box. If you hit a performance ceiling, you activate this extra capacity in 12 per cent chunks (I don't know why 12 per cent) and pay to have the capacity activated at a pre-negotiated price.
The thing with COD is that it is one way: if you turn it on, you can't turn it off. The new ADW Private Cloud's elastic capacity can rise or fall, and like COD capacity, it can be permanently activated if need be, at which point Teradata is happy to roll in some more server nodes and give you more reserve capacity.
The ADW Private Cloud is more than just elastic pricing on capacity. It's also a server consolidation play. The ADW Private Cloud, says Twogood, is designed to consolidate various data warehousing and data mart workloads onto a single ADW instead of having these workloads scattered across various machines in the departments and data centers of the world. (These other machines often do not have the Teradata brand on them.) The idea is to consolidate onto the ADW and drive the utilization above 95 per cent to squeeze every last penny out of that iron. Teradata says that average CPU utilization on data marts is down in the 10 to 20 per cent range, which is just as unacceptable for BI workloads as it was for the generic infrastructure workloads that drove the server virtualization wave.
Twogood says that one healthcare company did such a consolidation, saving $4.5m and boosting query performance by a factor of 10, and that a mobile telecom operator consolidated 300 under-utilized data marts onto one of these cloudy appliances and cut costs by 33 per cent.
The ADW Private Cloud setup also has a new portal that is part of the Teradata Viewpoint management tool that adds self-service capability to the whole shebang. This portal allows data warehouse admins to give business managers and analysts access to particular slices of the warehouse and run algorithms against them and collaborate on the analysis from within the warehouse.
The advent of an ADW Private Cloud, which scales up to 92PB of user capacity, begs the question as to why there is not a public cloud service based on the same hardware and software and allowing customers to access it in a truly cloudy fashion.
Twogood says that the public clouds have not been optimized for shared-nothing parallel databases (like the Teradata database) and they are multi-tenant as well, which makes IT shops jumpy. At the moment, they like having their data inside their firewalls and their own iron. But, there's nothing stopping someone from buying a bunch of Teradata machines and setting up a data warehouse service.
Running SAS super fast
While Teradata and SAS Institute have been partners for a long time, and plenty of Teradata shops use the company's gear for data warehousing and SAS tools for analytics, the Teradata Appliance for SAS High-Performance Analytics Model 700 is designed to make SAS analytics sit up and bark by running the code in-memory – and to do so on iron that is compatible with the Teradata data warehouse.
Teradata's SAS HPA Model 700 appliance
The SAS HPA Model 700 appliance comes in three sizes. The small configuration is two cabinets of iron with 192 cores, 1.5TB of main memory, and 20.4TB of uncompressed customer data space. The medium-sized Model 700 appliance comes in three cabinets and has 288 cores, 2TB of memory, and 30.6TB of user data space. The large configuration has 384 cores, 3TB of memory, and 40.8TB of user space. The server nodes have two sockets each and use six-core Xeon X5667 processors running at 3.06GHz, and are lashed together with Teradata's BYNET 4 interconnect.
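Teradata's figures do not include node counts, but they can be backed out from the core counts. The Python sketch below does that, assuming every core in a configuration sits in one of the two-socket, six-core nodes described above; the dictionary layout and the `implied_nodes` helper are illustrative names, not anything from Teradata.

```python
# Two sockets per node, six cores per socket, per the article.
CORES_PER_NODE = 2 * 6

# Published figures for the three Model 700 sizes.
CONFIGS = {
    "small": {"cores": 192, "memory_tb": 1.5, "data_tb": 20.4},
    "medium": {"cores": 288, "memory_tb": 2.0, "data_tb": 30.6},
    "large": {"cores": 384, "memory_tb": 3.0, "data_tb": 40.8},
}


def implied_nodes(cores, cores_per_node=CORES_PER_NODE):
    """Return the node count implied by a core count.

    Raises ValueError if the cores do not divide evenly into nodes,
    which would mean the one-node-type assumption is wrong.
    """
    nodes, leftover = divmod(cores, cores_per_node)
    if leftover:
        raise ValueError(f"{cores} cores is not a whole number of nodes")
    return nodes


for name, cfg in CONFIGS.items():
    nodes = implied_nodes(cfg["cores"])
    data_per_node = cfg["data_tb"] / nodes
    print(f"{name}: {nodes} nodes, {data_per_node:.3f}TB user data per node")
```

Under that assumption the three sizes work out to 16, 24, and 32 nodes, with the same amount of user data space per node in each.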
SAS launched the High-Performance Analytics software back in October 2011 and started initial shipments in December for selected Teradata and Greenplum data warehousing appliances. (Greenplum is owned by EMC, and like Teradata does not make its own server nodes but OEMs the machines.) With the SAS HPA software, the idea is to run SAS analytical routines on parallel machines, to use analytics routines that are available natively in the database wherever possible, and to suck as much of the data as possible into main memory on the cluster to execute queries at memory speeds instead of disk speeds.
The SAS HPA Model 700 appliance is available now, with starting prices of $1.5m for the Teradata portion of the box. While the SAS HPA software is preloaded and preconfigured on the appliance, you have to pay separately for the SAS software on top of this. ®