From stock searches to map directions, any time a user queries a database, they tell the database owner something valuable.
A group of researchers from MIT's Computer Science and Artificial Intelligence Laboratory is offering a way that queries can be made private, by breaking them into pieces and having different (identical) databases handle each piece of the query.
The idea will be presented at next week's USENIX Symposium on Networked Systems Design and Implementation.
With the right design, says MIT student Frank Wang, lead author of this paper (PDF), only one of the data providers needs to be honest to protect a user's privacy.
It's based on what Wang's paper calls Function Secret Sharing (FSS), a cryptographic feature that “allows the client to split certain functions into shares that keep parameters of the function hidden unless all the providers collude”, without imposing too heavy a load on the CPUs in the system.
FSS was first described in 2015 by Israeli researchers Elette Boyle and Shafi Goldwasser (who partnered with Wang on the new paper).
The group's work takes advantage of modern multicore processors that implement AES-NI (advanced encryption standard, new instructions); and second, the paper presents protocols that “let Splinter support a subset of SQL that can capture many popular online applications.
In Splinter, each provider in a system hosts a copy of the same database. The client splits a query into a “share” that are submitted to different providers, and the responses are recombined to provide the answer.
In academic language, FSS which accomplishes this “lets a client divide a function f into function shares f1, f2,..., fk so that multiple parties can help evaluate f without learning certain of its parameters.”
COUNT query as an example, assume the user wants to keep the value of count secret (eg:
SELECT COUNT(*) FROM items WHERE ItemId = ?) – only the user knows the “?” value is 5.
FSS works out how to send different queries to the target database, so that the client can work out from their answers what
ItemId=5 would have returned.
As MIT explains: “a database query is converted into a set of complementary mathematical functions, each of which is sent to a different database server. On each server, the function must be applied to every record in the database; otherwise, a spy could determine what data interests the user.
“Every time the function is applied to a new record, it updates a value stored in memory. After it’s been applied to the last record, the final value is returned to the user. But that value is meaningless until it’s combined with the values reported by the other server.”
In the MIT article, Wang describes applications for Splinter: “When people were searching for certain kinds of patents, they gave away the research they were working on. Stock prices is another example: A lot of the time, when you search for stock quotes, it gives away information about what stocks you’re going to buy. Another example is maps: When you’re searching for where you are and where you’re going to go, it reveals a wealth of information about you.” ®