French software developer Applidium claims to have reverse-engineered the protocol by which the iPhone 4S's Siri voice assistant talks to Apple's voice recognition and analysis servers.
But don't expect a flood of superior Siri clones on other platforms, or even on other iPhones. Each communication is tied to the sending 4S' unique ID.
With a bit of digital certificate jiggery-pokery, a fake DNS server and the use of Zip decoding, the Applidium team was able to start analysing the binary data.
The upshot: Siri takes the voice recording, encodes it in the Ogg Speex format, Zips it, encrypts it and sends it to the server guzzoni.apple.com for decoding and analysis.
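The compress-and-decompress part of that pipeline can be sketched roughly as follows. This is a minimal illustration only, assuming the "Zip" step behaves like ordinary DEFLATE compression as provided by Python's standard zlib module. The function names and the sample bytes are hypothetical, the Speex encoding and encryption stages are left out, and nothing here reproduces Applidium's tools or Apple's actual wire format.

```python
import zlib


def pack_payload(encoded_audio: bytes) -> bytes:
    """Compress already-encoded audio bytes before sending (hypothetical helper)."""
    return zlib.compress(encoded_audio)


def unpack_payload(wire_bytes: bytes) -> bytes:
    """Reverse pack_payload: inflate the bytes captured from the wire."""
    return zlib.decompress(wire_bytes)


# Stand-in for Speex-encoded audio frames; real audio data would go here.
sample = b"\x00\x01speex-frame" * 50

packed = pack_payload(sample)
restored = unpack_payload(packed)
```

An eavesdropper who has already got past the encryption would only need the second function to get back to the encoded audio, which is roughly the position the Applidium team describes reaching.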
Says the Applidium team: "The protocol is actually very, very chatty. Your iPhone sends a ton of things to Apple’s servers. And those servers reply an incredible amount of informations. For example, when you’re using text-to-speech, Apple’s server even reply a confidence score and the timestamp of each word."
Applidium has even posted a sample: the speech it sent to Apple's Siri servers - not from an iPhone 4S, though - and the XML data returned by the speech-to-text operation.
Applidium has uploaded the tools it created and used to crack Siri, but - understandably - it's not providing the iPhone 4S ID it used. We'd expect Apple to be able to spot near-simultaneous Siri requests from the same device in many, many different locations and block the device ID.
If it hasn't implemented such a trick, it certainly will soon. ®
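As a rough illustration of the kind of server-side check suggested above, the sketch below flags any device ID seen in more than a handful of distinct locations within a short time window. It is an assumption-laden toy, not a description of anything Apple is known to run: the function name, the tuple layout, the 60-second window and the three-location limit are all hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical: how close together requests must be
MAX_LOCATIONS = 3    # hypothetical: distinct locations tolerated per window


def find_shared_ids(requests):
    """Return device IDs seen in too many locations at nearly the same time.

    requests: iterable of (device_id, timestamp_seconds, location) tuples.
    """
    by_device = defaultdict(list)
    for device_id, ts, location in requests:
        by_device[device_id].append((ts, location))

    flagged = set()
    for device_id, events in by_device.items():
        events.sort()  # order by timestamp
        start = 0
        for end in range(len(events)):
            # Shrink the window until it spans at most WINDOW_SECONDS.
            while events[end][0] - events[start][0] > WINDOW_SECONDS:
                start += 1
            locations = {loc for _, loc in events[start:end + 1]}
            if len(locations) > MAX_LOCATIONS:
                flagged.add(device_id)
                break
    return flagged


log = [
    ("A", 0, "Paris"), ("A", 5, "London"), ("A", 10, "Tokyo"), ("A", 15, "New York"),
    ("B", 0, "Paris"), ("B", 30, "Paris"),
    ("C", 0, "Paris"), ("C", 1000, "London"), ("C", 2000, "Tokyo"), ("C", 3000, "New York"),
]
suspects = find_shared_ids(log)
```

In this made-up log only device "A" is flagged: "B" never leaves one location, and "C" moves between locations slowly enough to look like a single travelling handset.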