Databricks: Ugh, just look at that messy data lake environment. Squints. You know... we could sort that out with a sweet shot of SQL

Data-wrangler previews another lakehouse concept tool


Data management and machine learning framework biz Databricks is launching a tool it claims will bring SQL-style analytics to the messy world of data lakes.

SQL Analytics, the company claimed, expands the traditional scope of the data lake from data science and machine learning to cover all data workloads, including business intelligence and SQL. It is available for preview this week.

The tool is a manifestation of the company’s lakehouse concept, which, you guessed it, is an attempt to bring some of the governance, performance and order of the data warehouse world to the wild and messy world of data lakes. Data lakes, for their part, have the advantage of being able to ingest unstructured data quickly.

Speaking to The Register, Joel Minnick, Databricks product marketing veep, said: “Despite it being a little bit of a whimsical name for an architecture, lakehouse is probably the best way to articulate what the architecture is.”

SQL Analytics is built on Delta Lake, Databricks’ open-format data engine, which is supposed to bring order and performance to existing data lakes. It also uses Delta Engine, a “polymorphic query execution engine,” which rewrites Spark into C++ to take advantage of vectorisation, Minnick said. Apache Spark is written in Scala.

The idea, said Minnick, is that it allows users to auto-scale clusters configured for high-performance SQL analytics, which in turn is supposed to allow organisations to handle high user concurrency (many logged-in users) “behind the scenes”.

Databricks had also “done some engineering” to govern how queries were trafficked and executed to keep back and forth communication to a minimum, thereby reducing latency, he said.

Those familiar with SQL analytics or data engineering can explore the schema of their Delta Lake tables, “run SQL queries, and visualize the results,” Minnick said.

While the Databricks SQL Engine might help bring BI work to the data lake, and help users get value from that messy repository of data, it is unlikely to replace established enterprise data warehouses any time soon, opined Philip Carnelley, associate vice president of software research at IDC.

“The idea is to give you the best of both worlds, there is some merit to that. But this is a solution for companies with lots of technical resources. This will run alongside other enterprise data tools. It might be that people use data warehouse systems like Teradata a bit less, because they have these tools as well, but they are not going to switch off the data warehouse any time soon,” Carnelley said.

Databricks was one of the main vendors behind Spark, a data framework designed to help build queries for distributed file systems such as Hadoop. Matei Zaharia, Databricks' CTO and co-founder, was the initial author of Spark. ®

Biting the hand that feeds IT © 1998–2022