Data warehouse firm hopes more will follow Yellowbrick road for real-time analytics, speedy cloud data transfer
But the competition is fierce
Yellowbrick, purveyor of an analytical data warehouse built on flash memory, has launched feature updates aimed at real-time analysis and cloud-to-on-prem data movement.
The firm made something of a splash in 2018 with the launch of its core turnkey, hyperconverged, all-flash box that it claims can replace up to seven disk-based data warehouse racks with less than half a rack.
But the last couple of years have seen some big changes in the data warehouse market and the 4.1 release reflects the pressure Yellowbrick now finds itself under.
First up is better support for continuous workloads. The company, founded in 2014, says queries on micro-batched or small real-time incremental updates, such as retail transactions, can now be addressed by "managing the database behaviour when the row store becomes full, and storing and reclaiming table data more efficiently in both the row store and column store". The changes help customers who rely on huge volumes of up-to-date data for reliable, real-time insights, the company said.
Real-time analytics creates competition from two markets: the transactional database application providers that have added analytics – SAP and Oracle – and data warehouse providers who claim to analyse data in real time, like Teradata and IBM's Netezza. Add to that a new family of database providers, such as Rockset, that claim to have cracked the problem with a different architecture.
Yellowbrick CEO Neil Carson told The Register: "Yellowbrick built a hybrid storage engine for their database – it's both a row store and a column store. Transactional semantics are consistent across both data stores, and every query consistently sees data in both places. Vertica, Redshift, Snowflake, the other guys, don't have this."
He said that the column store is used for the bulk of analytic processing and is where the vast majority of data lives. The row store is much smaller in size and is optimised for commit latency for real-time data ingest. "It uses journaling and mirroring instead of erasure coding, since that gives better latency for small operations. The row store automatically moves its data to the column store when it reaches a certain size."
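To make the row store and column store split concrete, here is a minimal sketch in Python. It is a toy illustration of the general pattern Carson describes, not Yellowbrick's implementation: the class name `HybridStore`, the `row_store_limit` threshold and the in-memory lists are all assumptions made for the example, and journaling, mirroring and transactions are left out entirely.

```python
from collections import defaultdict


class HybridStore:
    """Toy hybrid store: a small row store in front of a larger column store.

    Hypothetical illustration only; assumes every row has the same columns.
    """

    def __init__(self, row_store_limit=3):
        # row_store_limit is an invented threshold, not a Yellowbrick setting.
        self.row_store_limit = row_store_limit
        self.row_store = []                    # list of dicts, cheap to append
        self.column_store = defaultdict(list)  # column name -> list of values

    def insert(self, row):
        """Commit one row; flush to the column store once the limit is reached."""
        self.row_store.append(dict(row))
        if len(self.row_store) >= self.row_store_limit:
            self.flush()

    def flush(self):
        """Move every buffered row into the columnar layout."""
        for row in self.row_store:
            for name, value in row.items():
                self.column_store[name].append(value)
        self.row_store.clear()

    def scan(self, column):
        """Return all values of one column, covering both stores."""
        flushed = list(self.column_store.get(column, []))
        pending = [row[column] for row in self.row_store if column in row]
        return flushed + pending
```

Inserting four rows with a limit of three leaves three rows in the column store and one in the row store, yet `scan` still returns all four values, which is the "every query sees data in both places" behaviour in miniature.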
The other main plank of the Yellowbrick release is the speed at which data moves from the cloud to the company's appliance. The ybload client tool offers 1GB/sec load speed and now supports bulk loads from Azure Blob containers and object stores from other S3-compatible providers, as well as enhanced support for loading from AWS S3 buckets, which "makes it easier to migrate data into Yellowbrick from cloud-based repositories," the company said.
The future of data warehousing is certainly cloudy. Last year, Teradata launched its Vantage platform available on the three main cloud providers, supporting object storage like S3, and has since updated this with the promise of common cloud data tools and quicker compressed data migration times.
Meanwhile, Netezza, which was bought by IBM in 2010 only to be retired last year, has now been relaunched as a cloud data warehouse.
The problem is that most of the competition is going in the other direction from Yellowbrick, according to Philip Howard, a research director at Bloor Research.
"The new [Yellowbrick] features are fine but the trend is towards putting data into S3 and object storage not moving it out. The market focus is very much on data virtualisation so that you can query data in S3 without moving it," he said.
With intense competition comes a certain testiness. IBM has launched a social media campaign with the catchline "yellow brick roads are for fairy tales", to which Yellowbrick responded with: "Sometimes you gotta leave home to find real-time answers."
So far, so childish, but Howard said "IBM's renewal of Netezza must hurt" Yellowbrick.
While Netezza, Teradata, and Oracle have the heritage to assure concurrency of thousands of users for enterprise-scale business intelligence, cloud-native players like Snowflake, AWS's Redshift, Google's BigQuery, and Azure's Synapse will be chomping up novel analytics workloads, of which there are plenty to go around.
It leaves Yellowbrick somewhere in the middle, which it may not find too comfortable.
In which case the moniker may be unfortunate. The yellow-brick road leads to Oz, home of a wizard who seems so impressive until you peer behind the curtain. ®