In-memory NoSQL database Aerospike is launching connectors for Apache Spark and mainframes to bring the two environments closer together.
Following on from the release of Aerospike Database 5 in May, the idea is that IT teams can use Aerospike Connect to get data from existing transactional systems hosted on mainframes and exploit it using modern machine learning and analytics tools in Apache Spark. To that end, the Spark 2.4 connector supports the streaming APIs of Spark Structured Streaming, which promises low latency for both reads and writes, the company said.
Meanwhile, there is also a connector based on JMS 1.1, a preferred option when integrating and synchronising with mainframe applications, to stream data in and out of Aerospike Database 5.
Bryan Betts, principal analyst with Freeform Dynamics, described the move as "extremely interesting".
He added: "The mainframe world is not the old world: these systems are still fundamental to operations of a lot of organisations. The challenge is bringing in new technology from outside the mainframe world."
According to Aerospike, the speed and low latency of its distributed multi-site clustering database allow users to draw data from mainframe systems for near-real-time analytics without changing or re-platforming the mainframe system.
"Data has gravity: mainframe systems are fundamental to the core operations of many business, holding years and decades of data," Betts said. "If you can get to that without making changes, that could be hugely valuable."
He emphasised that relational databases are not going away. Although the number of databases organisations use is a bit "out of control", a variety of technologies will be necessary, the analyst told The Register.
"I'm not convinced you can standardise on a single database technology. You are going to end up using more than one. You have to be open to the fact that very rarely is there a single tool that will do everything for you. The big companies are trying to cover as many bases as possible. With connectors and the ability to access the same data using multiple tools, the issue of having to have single source, from a large database vendor, kind of goes away."
Srini Srinivasan, Aerospike's chief product officer, told us one of the advantages of using the Spark connector for machine learning and analytics was reducing the demand for memory with Spark.
He said Aerospike had built a "data frame" for Spark to avoid pulling so much data into the analytics environment.
"You don't have to store all the data in Spark: you leave the data in Aerospike, and fetch it as it's needed. This allows the Spark process to access a lot more data, and reduces the amount of memory that Spark is using. Otherwise, you would have to expand the Spark memory footprint by a couple of orders of magnitude." ®