SQL fights back against NoSQL's big data cred with SQL/MDA spec
The empire strikes back with multi-dimensional arrays
With the growing popularity of big-data tools like NoSQL databases and Hadoop, it might have looked as if SQL was in line to swap its “venerable” tag for “obsolete”. Last week, however, the ISO SQL working group agreed to start work on a SQL/MDA (multi-dimensional array) specification.
The people behind SQL have decided it's time to get serious (really, really serious) about big (as in really, really big) multi-dimensional datasets, and have kicked off an effort to extend the standard with the capabilities needed by spatial, scientific, engineering and medical users.
As spatial publication GIM International notes, SQL doesn't offer an elegant way to handle the kinds of arrays generated by scientific big data. For example, meteorologists might maintain four-dimensional data sets covering location (two horizontal dimensions), altitude and time. Arrays like these are held and processed in other environments, even if the data stores are then referenced from an SQL database.
Similarly, data collected by big sensor networks can easily become multi-dimensional – and even if the volume of data isn't out of this world, the multi-dimensional nature of the data sets puts them beyond SQL.
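To make the mismatch concrete, here is a minimal sketch in Python of the usual relational workaround: flatten the array into one row per cell, then rebuild a slice by hand. Everything in it is an assumption made for illustration, including the tiny grid size, the `temperature` table, the made-up cell values and the use of SQLite as a stand-in relational store. It is not SQL/MDA or Rasdaman syntax.

```python
import sqlite3
from itertools import product

# Hypothetical grid: 3 latitudes x 4 longitudes x 2 altitudes x 5 time steps.
SHAPE = (3, 4, 2, 5)


def cell_value(lat, lon, alt, t):
    """Made-up 'temperature' that encodes the cell's coordinates."""
    return 1000 * lat + 100 * lon + 10 * alt + t


def build_table(conn):
    """Flatten the 4-D grid into one row per cell and return the row count."""
    conn.execute(
        "CREATE TABLE temperature ("
        "lat INTEGER, lon INTEGER, alt INTEGER, t INTEGER, value REAL, "
        "PRIMARY KEY (lat, lon, alt, t))"
    )
    rows = [
        (lat, lon, alt, t, cell_value(lat, lon, alt, t))
        for lat, lon, alt, t in product(*(range(n) for n in SHAPE))
    ]
    conn.executemany("INSERT INTO temperature VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def horizontal_slice(conn, alt, t):
    """Rebuild the lat x lon map at one altitude and time as a nested list."""
    n_lat, n_lon = SHAPE[0], SHAPE[1]
    grid = [[None] * n_lon for _ in range(n_lat)]
    cursor = conn.execute(
        "SELECT lat, lon, value FROM temperature WHERE alt = ? AND t = ?",
        (alt, t),
    )
    # The database returns unordered rows; the array shape must be
    # reconstructed in application code.
    for lat, lon, value in cursor:
        grid[lat][lon] = value
    return grid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(build_table(conn), "rows for a grid of shape", SHAPE)
    for row in horizontal_slice(conn, alt=1, t=3):
        print(row)
```

Even this toy grid needs one row per cell, and the query knows nothing about the array's shape, so the slice has to be reassembled outside the database. That bookkeeping is the kind of work an array-aware query language would be expected to take over.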
A separate effort called Rasdaman (a scalable multi-dimensional array analytics server) has been working for some time to apply an SQL-like query language to array databases. Rasdaman's backer, Peter Baumann of Jacobs University Bremen in Germany, put forward the proposal now adopted by the ISO.
Rasdaman, GIM says, has shown impressive results: “In a recent technology demonstration, more than 1,000 computers collaborated in a cloud to jointly compute the result of a single database query. This ‘distributed query processing’ means a massive speed increase, and research challenges on multi-Petabyte data cubes can be answered that were previously unsolvable,” the outlet writes.
Rasdaman has also seen good uptake in the open source GIS world: it has been adopted by the ubiquitous GDAL (Geospatial Data Abstraction Library) as a library component, and MapServer integration is in beta. ®