British signals intelligence agency Government Communications Headquarters (GCHQ) has created a repository on GitHub and open-sourced one of its tools: a graph database called “Gaffer”.
Available on GitHub, Gaffer is billed as “a framework that makes it easy to store large-scale graphs in which the nodes and edges have statistics such as counts, histograms and sketches.”
Gaffer stores its data in Accumulo, though it can use other stores. It is built with Maven and released under an Apache 2 licence. The tool is said to:
- Allow the creation of graphs with summarised properties within Accumulo with a very minimal amount of coding.
- Allow flexibility of statistics that describe the entities and edges.
- Allow easy addition of new types of nodes and edges.
- Allow quick retrieval of data on nodes of interest.
- Deal with data of different security levels - all data has a visibility, and this is used to restrict who can see data based on their authorizations.
- Support automatic age-off of data.
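For readers who want a feel for what those bullet points mean in practice, here is a toy sketch in Python. It is emphatically not Gaffer's API (Gaffer is a Java project backed by Accumulo); every name in it, including `ToyGraphStore`, `EdgeSummary`, `add_edge`, `age_off` and `neighbours`, is hypothetical. It assumes a visibility is a single label that a reader either holds or does not, which is a simplification, and it keeps everything in memory. What it does show is the general idea: repeated edges are rolled up into summary statistics, queries hide anything the caller is not authorised to see, and stale data is dropped.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EdgeSummary:
    """Summarised properties for one (source, destination, visibility) edge."""
    count: int = 0                                     # total observations
    last_seen: int = 0                                 # day of latest observation
    per_day: Counter = field(default_factory=Counter)  # histogram by day


class ToyGraphStore:
    """Hypothetical in-memory stand-in; not how Gaffer is implemented."""

    def __init__(self, max_age_days):
        self.max_age_days = max_age_days
        self.edges = {}  # (src, dst, visibility) -> EdgeSummary

    def add_edge(self, src, dst, visibility, day):
        # Repeated observations are merged into one summary, not stored singly.
        summary = self.edges.setdefault((src, dst, visibility), EdgeSummary())
        summary.count += 1
        summary.per_day[day] += 1
        summary.last_seen = max(summary.last_seen, day)

    def age_off(self, today):
        # Drop every edge last seen more than max_age_days before today.
        cutoff = today - self.max_age_days
        self.edges = {
            key: summary
            for key, summary in self.edges.items()
            if summary.last_seen >= cutoff
        }

    def neighbours(self, node, authorizations):
        """Return {neighbour: total count} over edges the caller may see."""
        result = Counter()
        for (src, dst, visibility), summary in self.edges.items():
            if visibility not in authorizations:
                continue  # caller lacks the label, so the edge stays hidden
            if src == node:
                result[dst] += summary.count
            elif dst == node:
                result[src] += summary.count
        return dict(result)


store = ToyGraphStore(max_age_days=30)
store.add_edge("A", "B", "public", day=1)
store.add_edge("A", "B", "public", day=40)   # merged into the same summary
store.add_edge("A", "C", "secret", day=40)
store.add_edge("A", "D", "public", day=2)    # will be too old by day 45
store.age_off(today=45)
print(store.neighbours("A", {"public"}))            # {'B': 2}
print(store.neighbours("A", {"public", "secret"}))  # {'B': 2, 'C': 1}
```

The same node returns different answers depending on who is asking, and the A-to-D edge vanishes once it ages out. A real store would do this filtering and expiry on the server side at scale rather than in a Python dictionary.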
Why is GCHQ releasing Gaffer now? The agency's repo is silent on the reasons, but does say it's already started work on Gaffer2, a project it hopes will result in “a more general purpose graph database system.” So perhaps it has outgrown Gaffer. Or perhaps GCHQ has caught the UK Government Digital Service bug and become keen on releasing government-penned code.
Feel free to insert your conspiracy theory below. But before you do so, remember we can all read the source ... ®
“I was all upset about the surveillance state, but I guess with that github repo gchq have shown me I was wrong and they're just normal guys” - John Spray (@jcspray), December 14, 2015