Google Cloud previews new BigLake data lakehouse service

Hoarding a bunch of data that 'may prove useful' someday? Data-wrangling product is aimed at you

Google has announced a preview on Google Cloud of BigLake, a data lake storage service that it claims can remove data limits by combining data lakes and data warehouses.

BigLake is designed to address the problems associated with the growing volumes of data and different types of data now being stored and retained by organizations of all sizes. The motivation for storing all of this data can often be summed up as "because it may prove useful", with the idea being that if it is analyzed using the right tools, it will yield valuable insights that will benefit the business.

Unveiled to coincide with Google's Data Cloud Summit, BigLake allows organizations to unify their data warehouses and data lakes to analyze data without worrying about the underlying storage layer. This eliminates the need to duplicate or move data around from its source to another location for processing and reduces cost and inefficiencies, Google claimed.

According to Google, traditional data architectures are unable to unlock the full potential of all the stored data, while managing it across disparate data lakes and data warehouses creates silos and increases risk and cost for organizations. A data lake is essentially just a vast collection of data that has been stored and may be a mix of structured and unstructured formats, while a data warehouse is generally regarded as a repository for structured, filtered data.

Google said BigLake builds on years of experience developing BigQuery, its tool used to access data lakes on Google Cloud Storage, to enable what it refers to as an "open lakehouse" architecture.

This concept of a data "lakehouse" was pioneered in the last few years by either Snowflake or Databricks, depending on whom you believe, and refers to a single platform that can support all of the data workloads in an organization.

BigLake offers users fine-grained access controls and support for open file formats such as Parquet, an open-source column-oriented storage format designed for analytical querying, as well as open-source processing engines such as Apache Spark.
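For readers wondering why a column-oriented layout suits analytical querying, here is a toy sketch in plain Python. It is not Parquet and does not use any BigLake API; the sample records and the `to_columns` helper are made up for illustration. It only shows that once values are grouped by column, an aggregate over one field touches a single list rather than every full record.

```python
# Illustrative only: a toy comparison of row and column layouts, not Parquet itself.
# The records and field names below are invented sample data.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 75.5},
    {"order_id": 3, "region": "EU", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column name."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# With a column layout, an aggregate over one field reads a single list...
total_from_columns = sum(columns["amount"])

# ...whereas the row layout has to walk every record to pick out that field.
total_from_rows = sum(record["amount"] for record in rows)
```

Real columnar formats add compression, encoding, and metadata on top of this basic idea, none of which is modelled here.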

Another new data-related feature announced by Google is Spanner change streams, which it said allows users to track changes within their Spanner database in real time in order to unlock new value. Spanner is Google's distributed SQL database management and storage service, and the new capability tracks Spanner inserts, updates, and deletes in real time across a customer's entire Spanner database.

This lets users ensure that the most recent data updates are available for replication from Spanner to BigQuery for real-time analytics, or for other purposes such as triggering downstream application behavior using Pub/Sub.
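The general idea of replicating from a stream of inserts, updates, and deletes can be sketched in a few lines of Python. This is a conceptual illustration only: the record shape (`op`, `key`, `values`) and the `apply_change` function are hypothetical, and it does not reflect the actual Spanner change stream format or any Google Cloud client library.

```python
# Hypothetical change records; the real Spanner change stream format differs.
def apply_change(replica, change):
    """Apply one insert, update, or delete record to an in-memory replica."""
    op, key = change["op"], change["key"]
    if op in ("insert", "update"):
        # Merge the new values over whatever the replica already holds for this key.
        replica[key] = {**replica.get(key, {}), **change["values"]}
    elif op == "delete":
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")
    return replica


# Invented sample stream, in the order the changes were made at the source.
changes = [
    {"op": "insert", "key": "u1", "values": {"name": "Ada", "plan": "free"}},
    {"op": "insert", "key": "u2", "values": {"name": "Lin", "plan": "pro"}},
    {"op": "update", "key": "u1", "values": {"plan": "pro"}},
    {"op": "delete", "key": "u2"},
]

replica = {}
for change in changes:
    apply_change(replica, change)
```

Replaying the changes in order leaves the replica matching the source, which is why a consumer such as an analytics store can stay current without re-copying whole tables.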

Google also announced that Vertex AI Workbench is now generally available for its Vertex AI machine learning platform. This brings data and machine learning tools into a single environment so that users have access to a common toolset across data analytics, data science, and machine learning.

Vertex AI Workbench is said by Google to enable teams to build, train and deploy machine learning models five times faster than with traditional AI notebooks. ®

Biting the hand that feeds IT © 1998–2022