Google has introduced a caching layer in BigQuery, its cloud data warehouse, designed to speed up responses as users explore and experiment with data.
Dubbed BI Engine, the layer is designed to bridge the gap between two kinds of BI tool: those such as Tableau, which extract data from the data warehouse into an in-memory database for greater interactivity, and those such as Google's Looker, which query the data directly.
“The cloud databases... like BigQuery weren't built to support interactive BI, they were built for scale event analytics and scale-type questions,” said Colin Zima, chief analytics officer with Looker, the cloud BI biz Google bought for $2.6bn in February 2020.
The point of Looker, Zima said, was to allow more interactive BI on data in the database rather than creating a second repository. “We're probably the BI tool that works the best with the database, essentially in building out a data model that writes SQL to the database, and the database passes back SQL,” he said. Looker supports 55 flavours of SQL, he added.
BI Engine's caching layer within BigQuery is designed to overcome the performance compromises that limit interactivity. Other BI tools, including Tableau, can take advantage of BI Engine through the BigQuery API, Zima said.
“It's a passive cache, which you allocate resource to: it intelligently attempts to figure out the types of things that you're going [to do] that need interactivity. It passively builds itself in the database so it can be queried interactively at much lower latency,” Zima said.
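To make the idea of a passive cache with an allocated budget concrete, here is a toy sketch in Python. It is not how BI Engine works internally, which Google has not detailed here: the `PassiveCache` class, its least-recently-used eviction policy, and the `slow_warehouse` function are all hypothetical stand-ins chosen for illustration.

```python
from collections import OrderedDict
from typing import Callable


class PassiveCache:
    """Toy read-through cache with a fixed capacity (hypothetical, not BI Engine)."""

    def __init__(self, backend: Callable[[str], object], capacity: int) -> None:
        self.backend = backend      # slow path: stands in for the warehouse
        self.capacity = capacity    # the "resource" allocated to the cache
        self.entries: "OrderedDict[str, object]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> object:
        if sql in self.entries:
            self.entries.move_to_end(sql)  # mark as most recently used
            self.hits += 1
            return self.entries[sql]
        # Cache miss: fall through to the backend and remember the answer.
        self.misses += 1
        result = self.backend(sql)
        self.entries[sql] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return result


def slow_warehouse(sql: str) -> str:
    # Assumed placeholder for a real warehouse round trip.
    return f"rows for: {sql}"


cache = PassiveCache(slow_warehouse, capacity=2)
cache.query("SELECT region, SUM(sales) FROM orders GROUP BY region")
cache.query("SELECT region, SUM(sales) FROM orders GROUP BY region")  # served from cache
print(cache.hits, cache.misses)
```

The cache is passive in the sense that callers never populate it explicitly: it fills itself from the queries that pass through, and the capacity setting plays the role of the resource the user allocates.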
Solar panel company Sunrun has been using the tool. Kiran Manne, the Google Cloud architect on its BI team, claimed he'd seen a 40 per cent performance increase across more than 1,000 interactive users.
However, Harvinder Atwal, chief data science officer at BigQuery user MoneySupermarket, told The Register that any real advantage would depend on how critical speed is to the use case.
"BigQuery is pretty fast anyway so you'd really need to value the extra speed if you're already a user. It's going to be more useful for organisations with separate ETL and marts for their BI. Putting the data into BigQuery and using BI Engine plus a BI tool would provide a much bigger benefit," he said.
He also cautioned users to consider their pockets before turning on the tool. "If you're not on BigQuery flat-rate pricing you have to reserve on-demand RAM for BI Engine, which could get pricey," he said.
Philip Howard, research director at analyst firm Bloor Research, said the approach to caching was not "exactly original" but could be useful. "Anything that improves performance and reduces latency is potentially useful," he said. ®