Software

Love lambda, love Microsoft's Graph Engine. But you fly alone

Open source with a difference, from Redmond

Mon 20 Feb 2017 // 09:34 UTC

Analysis Much has changed at Microsoft since Steve Ballmer described Linux as "a cancer" in reaction to the open-source flag-flyer's threat to Redmond's money-spinning Windows business.

Three years after Redmond's researchers published their whitepaper on distributed graph engine Trinity, Microsoft has announced that it has released the technology – now named Graph Engine – on an open-source basis under the MIT licence.

Why? Graph Engine lands in an odd, open-sourcey kind of place. According to DB-Engines, the most popular graph DBMS by some margin is Neo4j, followed by OrientDB and Aurelius's TitanDB graph databases, each of which is roughly 15 per cent as popular.

OrientDB and TitanDB have both been dogged by complaints that they are not production-ready, yet they are both in use. Alongside Neo4j, they are essentially open-source NoSQL databases, which Graph Engine arguably is too, although it has not been designed to do the things that graph databases are known for doing.

Known for doing?

Graph databases are databases (duh) specialised to perform complex queries on highly interconnected data. In theory, they cater to queries multiple levels deep for which multi-way JOINs in relational databases would be prohibitively computationally expensive – although performance in terms of how many levels may actually be traversed varies between graph database offerings.

All graph databases are, by definition, made up of data items and the lines, or edges, connecting them. The data items are called nodes in graph theory (of which almost all graph database users are proponents) and are roughly equivalent to rows in relational databases. Both the edges and nodes contain properties, which function like key-value pairs for the purposes of querying data.
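To make that concrete, here is a minimal sketch in Python of nodes and edges that both carry properties, plus a query that goes several levels deep. The names (`nodes`, `edges`, `adjacency`, `within_hops`) and the sample data are invented for illustration and belong to no particular graph database.

```python
from collections import deque

# Nodes and edges both carry properties, stored here as plain dicts.
nodes = {
    "alice": {"kind": "person", "age": 34},
    "bob": {"kind": "person", "age": 29},
    "carol": {"kind": "person", "age": 41},
    "dave": {"kind": "person", "age": 52},
}

# Each edge is (source, target, properties).
edges = [
    ("alice", "bob", {"type": "knows", "since": 2010}),
    ("bob", "carol", {"type": "knows", "since": 2014}),
    ("carol", "dave", {"type": "knows", "since": 2016}),
]

# Adjacency index, so following an edge is a lookup rather than a join.
adjacency = {name: [] for name in nodes}
for source, target, props in edges:
    adjacency[source].append((target, props))


def within_hops(start, max_hops):
    """Return {node: hop count} for nodes reachable in 1..max_hops edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbour, _props in adjacency[current]:
            if neighbour not in seen:
                seen[neighbour] = seen[current] + 1
                queue.append(neighbour)
    del seen[start]
    return seen


print(within_hops("alice", 2))  # {'bob': 1, 'carol': 2}
```

In a relational schema the same two-hop question would typically mean joining a relationships table to itself once per level, which is the cost the graph model is trying to avoid.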

But while all graph databases are built from the relationships between data items, the graph databases currently available on the market are quite different in the workloads they're optimised for. Some, such as Neo4j, employ a single data model that's been optimised, while others like OrientDB support several data models.

This is where Graph Engine offers something different.

Late, but different

Microsoft's Graph Engine has not been tuned to query or store data. Nor is it ACID transaction compliant – like most other graph databases. Rather, Graph Engine lets you crunch analytical workloads, including online transaction processing, using the memory of a distributed system.

As one researcher from the Graph Engine team described it to The Register, Graph Engine is a "distributed in-memory data processing engine" with "a strongly typed distributed key-value store" as the storage backend.

In other words, the computation takes place across a cluster of machines – a cloud – where the storage infrastructure holds data in-memory.

As the data is being held in-memory, it cannot be located through a physical address on a networked hard disk. Instead, it is tracked using hashes and replicated index tables across the cloud.
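A rough sketch of the idea, in Python: hash the key to decide which machine's memory holds the value. This is an illustration under stated assumptions, not Graph Engine's actual addressing scheme. The machine names, the choice of SHA-256 and the `owner_of`, `put` and `get` helpers are all hypothetical, and the replicated index table is reduced to a single shared list.

```python
import hashlib

# Hypothetical cluster members; stands in for an index table that every
# machine would hold a copy of.
MACHINES = ["node-0", "node-1", "node-2"]


def owner_of(key: str) -> str:
    """Map a key to a machine using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(MACHINES)
    return MACHINES[bucket]


# Each machine keeps its share of the data in an in-memory dict.
memory = {machine: {} for machine in MACHINES}


def put(key, value):
    """Store the value in the memory of whichever machine owns the key."""
    memory[owner_of(key)][key] = value


def get(key):
    """Fetch the value from the owning machine, or None if it is absent."""
    return memory[owner_of(key)].get(key)


put("alice", {"age": 34})
print(owner_of("alice"), get("alice"))
```

Because the hash is deterministic, any machine can work out where a key lives without consulting a disk address, provided all of them agree on the machine list.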

This lets you perform those classic graph queries that delve multiple levels deep, only they are sped up through the use of a memory-based storage infrastructure. It also means that offline graph analytics can be improved with the parallel processing provided by the distributed architecture.

Queries

A key difference between graph databases is the query language each one uses, and most have a language of their own. Neo4j's language is called Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs visually using an ASCII-art syntax.

OrientDB's query language commits to being SQL-like, while TitanDB uses the Gremlin query language, which is vying to become the standard. Where Cypher describes the pattern to be matched, Gremlin chains traversal operators together to form path-like expressions of how the query should be executed throughout the graph.
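For a feel of what chaining traversal operators looks like, here is a small Python sketch. It is not Gremlin syntax: the `Traversal` class, its `out`, `dedup` and `to_list` steps and the sample `GRAPH` are hypothetical stand-ins that only mimic the shape of a path-like expression.

```python
# Sample graph: node -> {edge label -> list of target nodes}.
GRAPH = {
    "alice": {"knows": ["bob", "carol"]},
    "bob": {"knows": ["dave"]},
    "carol": {"knows": ["dave", "erin"]},
    "dave": {},
    "erin": {},
}


class Traversal:
    """Hypothetical fluent traversal; each step returns a new Traversal."""

    def __init__(self, current):
        self.current = list(current)

    def out(self, label):
        """Follow outgoing edges with the given label from every current node."""
        reached = []
        for node in self.current:
            reached.extend(GRAPH[node].get(label, []))
        return Traversal(reached)

    def dedup(self):
        """Drop repeated nodes while keeping first-seen order."""
        return Traversal(dict.fromkeys(self.current))

    def to_list(self):
        """Return the nodes the traversal currently sits on."""
        return list(self.current)


# Reads left to right as a path: start at alice, hop twice, tidy up.
friends_of_friends = Traversal(["alice"]).out("knows").out("knows").dedup().to_list()
print(friends_of_friends)  # ['dave', 'erin']
```

The expression spells out how to walk the graph step by step, rather than describing a pattern and leaving the walk to the engine.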

The real difference in Graph Engine is visible here, with its use of Language Integrated Knowledge Query (LIKQ). According to Microsoft this lets users express their query logic using lambda expressions. "It combines the capability of fast graph exploration and the flexibility of lambda expression: server-side computations can be expressed in lambda expressions, embedded in LIKQ, and executed on the server side during graph traversal," Microsoft said.

Translated: lambda expressions can be embedded in LIKQ queries and run on the server as the graph is traversed, for analysis of the data within the graph and more complicated analytics.
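The general idea of handing a lambda to a traversal can be sketched in a few lines of Python. This is not LIKQ syntax and makes no claim about how Graph Engine implements it: the `explore` function, its `keep` parameter and the sample `GRAPH` are assumptions made purely for illustration.

```python
# Sample graph: each node holds a property and its outgoing 'knows' edges.
GRAPH = {
    "alice": {"age": 34, "knows": ["bob", "carol"]},
    "bob": {"age": 29, "knows": ["dave"]},
    "carol": {"age": 41, "knows": ["dave"]},
    "dave": {"age": 52, "knows": []},
}


def explore(start, max_hops, keep):
    """Walk 'knows' edges up to max_hops, running `keep` at each node reached.

    `keep` is caller-supplied logic (typically a lambda) evaluated during the
    walk, so only matching nodes are handed back to the caller.
    """
    frontier, seen, matches = [start], {start}, []
    for _ in range(max_hops):
        next_frontier = []
        for name in frontier:
            for neighbour in GRAPH[name]["knows"]:
                if neighbour in seen:
                    continue
                seen.add(neighbour)
                next_frontier.append(neighbour)
                if keep(GRAPH[neighbour]):
                    matches.append(neighbour)
        frontier = next_frontier
    return matches


over_40 = explore("alice", 2, lambda node: node["age"] > 40)
print(over_40)  # ['carol', 'dave']
```

The point of running the lambda where the data lives is that filtering happens during the walk, so only the matches need to travel back to whoever asked.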

Other differences? Graph Engine is Windows-only but will "soon" arrive on non-Windows platforms, according to a post on Hacker News.

Also, unlike the open-source crowd, there's no paid support, so use Graph Engine and you're on your own if things go wrong. Evidently, Microsoft wants to encourage folk to play with Graph Engine, suggesting users ping it for "design consulting".

These are early days and "free" is the bait used in open source to encourage adoption. Payment comes later or in parallel for the "enterprise" edition.

If and when Graph Engine receives wide use, paid support will no doubt be forthcoming.

With Graph Engine being the first processing-focused technology of its type, Microsoft will await uptake before commercialising it, with paid support, as it has SQL Server or Azure SQL. ®
