Google proposes Logica data language for building more manageable SQL code
COBOL-inspired caps lock English just doesn't add up when dealing with large SQL codebases
Structured Query Language (SQL) at scale can lead to unstructured, unmaintainable database code - at least as far as Google is concerned - so boffins affiliated with the biz have devised an open source logic programming language to make SQL more amenable to maintenance.
"Good programming is about creating small, understandable, reusable pieces of logic that can be tested, given names, and organized into packages which can later be used to construct more useful pieces of logic," explain Google software engineers Konstantin Tretyakov and Evgeny Skvortsov in a post to Google's open source blog. "SQL resists this workflow."
Tretyakov and Skvortsov propose using a new open source logic programming language called, aptly enough, Logica, to craft database interactions using the syntax of mathematical propositional logic instead of the chains of English words used in SQL.
Logica, its creators say, "stands for Logic with aggregation." The project's description on GitHub makes its target audience a bit clearer: "Logica is for engineers, data scientists and other specialists who want to use logic programming syntax when writing queries and pipelines to run on BigQuery."
Logica code compiles to SQL, which Google hopes will be run on BigQuery, the ad giant's data platform-as-a-service. But Logica can be run locally and SQL is portable; the language also offers experimental support for targeting PostgreSQL and SQLite.
The language is "more concise and supports the clean and reusable abstraction mechanisms that SQL lacks," or so say Tretyakov and Skvortsov.
Here's how a basic query might look in Logica code:
MagicComment(comment_text:) :- `comments`(user_id:, comment_text:), user_id == 5;
And here's the equivalent in SQL:
SELECT comment_text FROM comments WHERE user_id = 5;
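For readers who want to see what the SQL side of that comparison returns, here is a minimal sketch that runs the statement above against an in-memory SQLite database using Python's standard sqlite3 module. The table layout and the sample rows are assumptions invented for illustration; the article only gives the table and column names. It does not invoke the Logica compiler, and only shows the kind of result the compiled query is meant to produce.

```python
import sqlite3

# Hypothetical schema and rows, made up for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (user_id INTEGER, comment_text TEXT)")
conn.executemany(
    "INSERT INTO comments (user_id, comment_text) VALUES (?, ?)",
    [(5, "first"), (7, "other"), (5, "second")],
)

# The SQL statement quoted in the article, unchanged.
query = "SELECT comment_text FROM comments WHERE user_id = 5;"

# Collect the single selected column from every matching row.
magic_comments = [row[0] for row in conn.execute(query)]
print(magic_comments)
```

Only the comments written by user 5 come back; the Logica rule expresses the same filter by naming the result MagicComment and constraining user_id in the rule body.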
Logica is the successor to Yedalog, a Datalog-like language that Google developed internally and discussed in 2015, which tried to provide a tool for querying large, semi-structured data sets.
Tretyakov and Skvortsov don't explain why Yedalog needed a successor, but a 2016 paper [PDF] by Yedalog's creators suggests that Google's Yedalog implementation automated control decisions normally left to those working with more general purpose languages.
Because that created uncertainty about who had operational responsibility for keeping things running, the Yedalog developers drafted documentation describing a suggested operational model. But they found that their users learned by experience, rather than reading, and that experience led them to make assumptions that the Yedalog team hadn't anticipated.
"The model they learned assumes some forms of automation beyond any we had planned to support," the paper explains.
Whatever the state of Yedalog at Google these days, Logica aims to move beyond verbose COBOL-style caps lock English to the language of formal logic, in the hope that database code can be broken up into more manageable pieces.
"This inherent resistance to decomposition of logic into bite-sized pieces is what leads into the contrived, lengthy queries, the copy-pasted chunks of code and, eventually, unmaintainable, unstructured (note the irony) SQL codebases," quip Tretyakov and Skvortsov. ®