Oracle will today release, in its words, "a free and open API and developer kit" for the hardware-accelerated SQL-crunching engines in its Sparc M7 processors. You can register to grab the goodies, here.
"We're opening up the interfaces to enable programmers using C/C++, Java and Python to effectively use these accelerators," Marshall Choy, Oracle's VP of Product Management, told The Register.
"It is open and free: there are no licensing charges. It just requires a simple and standard click-through agreement. I personally inspected the click-through agreement because I’m always suspicious of those. The licensing is fairly similar to the GNU GPL."
We previously described these engines – officially known as Data Analytics Accelerators or DAX – during October's Oracle OpenWorld conference. They work by slurping compressed data from memory at up to 160GB/s, analyzing and filtering the information as it is decompressed, and outputting the results into the CPU's L3 cache.
"They work really well for scan operations, when you’re comparing values and making range comparisons, performing SELECT type functions, filtering to reduce a column, searching, and extracting data," Choy told us.
"They can be used for real-time discovery of outliers in datasets. One of the classic uses for this in industry is fraud detection: for example, finding anomalies in spending on credit cards."
The M7 can process 32 streams through its DAX engines simultaneously; these work independently of the 32 Sparc cores in the processor. The accelerators can be used in parallel to rapidly slurp in-memory columnar databases and perform simple SQL queries – such as calculating the total number of articles written by this hack in 2014. As their name suggests, the DAX units are aimed at analytic operations: reading and grokking lots of data, not altering or writing lots of data.
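To make that kind of query concrete, here is a minimal software sketch in Python of a count over a columnar layout. It does not use Oracle's DAX API, which this article does not detail; the column names, the sample data, and the count_where helper are all hypothetical and exist only to illustrate the idea.

```python
from array import array

# Hypothetical columnar table: each column lives in its own array,
# so a scan touches only the columns the query actually needs.
author_id = array("H", [7, 3, 7, 7, 2, 7])
pub_year = array("H", [2013, 2014, 2014, 2014, 2014, 2015])


def count_where(author_col, year_col, author, year):
    """Count rows matching both predicates, roughly:
    SELECT COUNT(*) FROM articles WHERE author_id = ? AND pub_year = ?
    """
    return sum(
        1 for a, y in zip(author_col, year_col) if a == author and y == year
    )


# Articles written by (hypothetical) author 7 in 2014.
articles_2014 = count_where(author_id, pub_year, author=7, year=2014)
print(articles_2014)
```

The point of the layout is that the scan reads two tightly packed arrays rather than whole rows, which is the access pattern a streaming scan engine is built for.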
The engines are used by Oracle Database 12c to speed up its SQL queries – and now any software running on Solaris 11 (there's always a catch) can gain access to the technology via Big Red's APIs. These interfaces allow applications to tap into the in-hardware accelerators to scream through data; the libraries do the hard work of programming the DAX engines as required.
Artist's impression ... Basic layout of the M7's DAX pipelines
Oracle is keen to emphasize the compression and decompression aspect of the DAX units. Compressing the data in storage prior to processing means more information can be fed into the accelerators per second, and the decompression happens at the same time as the analysis, so you more or less get it for free, according to Oracle.
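The idea of filtering data while it is still being unpacked can be sketched in software. The Python below uses run-length encoding purely as a stand-in, since the article does not say which compression scheme the DAX hardware uses; rle_encode, count_matches_rle, and the sample column are hypothetical names invented for this illustration.

```python
def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def count_matches_rle(runs, predicate):
    """Count matching rows while walking the compressed runs.

    The predicate is evaluated once per run rather than once per row,
    and the column is never expanded back to full length in memory.
    """
    return sum(length for value, length in runs if predicate(value))


# Hypothetical status column with long runs of repeated values.
status = ["ok"] * 5 + ["fraud"] * 2 + ["ok"] * 3 + ["fraud"]
runs = rle_encode(status)
fraud_rows = count_matches_rle(runs, lambda v: v == "fraud")
print(len(runs), fraud_rows)
```

Here 11 rows shrink to four runs, so the filter does four comparisons instead of 11. That is the software analogue of getting more rows through per byte read.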
We're also told that range comparisons happen in a single step in hardware, which helps if you want to, say, pick out the number of transactions between two particular dates. And all this happens away from the CPU cores' caches, so the DAX units won't pollute the per-core caches with intermediate data.
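As a rough software analogue of that date-range example, the Python sketch below tests both bounds in one expression per row and emits one bit per row. It illustrates the operation only and makes no claim about how the hardware or Oracle's libraries implement it; range_bitmap and the sample dates are hypothetical.

```python
from datetime import date

# Hypothetical transaction-date column, stored as ordinal day numbers
# so that each value is a plain integer.
txn_dates = [
    date(2015, 1, 5),
    date(2015, 2, 14),
    date(2015, 3, 1),
    date(2015, 3, 31),
    date(2015, 4, 2),
]
txn_days = [d.toordinal() for d in txn_dates]


def range_bitmap(column, low, high):
    """Return one bit per row: 1 if low <= value <= high, else 0."""
    return [1 if low <= v <= high else 0 for v in column]


# Transactions between 1 February and 31 March 2015, inclusive.
bits = range_bitmap(
    txn_days,
    date(2015, 2, 1).toordinal(),
    date(2015, 3, 31).toordinal(),
)
in_range = sum(bits)
print(bits, in_range)
```

Returning a bit vector rather than the matching rows keeps the result small, and it can be combined with the output of other filters before any row data is fetched.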
Ultimately, you should be able to use the DAX units to burn through millions, if not billions, of rows of information a second to drive interactive analytics. Don't forget this information must be stored in a compressed, in-memory columnar database format; you gotta work for your acceleration.
And it's not just SQL: Oracle has been working on hot-wiring Apache Spark into its M7 CPU's hardware acceleration and providing patches upstream to support its "software-in-silicon." The computer science department at Brown University in the US is in the process of measuring the performance of the M7's DAX technology with large data sets, we're told.
The exact details of the API's license were not disclosed at time of writing – we'll update this article as soon as we find out. ®