Statistical translation techniques have been successfully applied to decode an 18th-century document written using an encryption scheme that has baffled scholars for decades.
The Copiale Cipher was found in a book housed in an East Berlin Academy after the Cold War. The book’s pages contained about 75,000 neatly hand-written characters featuring abstract symbols and doodles alongside Roman and Greek characters. The mysterious cryptogram, bound in gold and green brocade paper, was inscribed in a 105-page book thought to contain the rituals and writings of an 18th-century secret society in Germany. The manuscript can be dated back to between 1760 and 1780.
The cipher had withstood previous attempts to crack it. But computer scientists from Sweden and the United States found that decryption was possible using statistical translation techniques of the kind used by Google Translate, Wired reports.
University of Southern California Viterbi School of Engineering computer scientist Kevin Knight – and colleagues Beáta Megyesi and Christiane Schaefer of Uppsala University in Sweden – first transcribed a machine-readable version of the document before applying various approaches to cracking the code.
The team first tried isolating the Roman and Greek characters and attempted to uncover their meaning using translation project software and a library of 80 different languages. "It took quite a long time and resulted in complete failure," Knight said in a statement on the work.
The codebreakers hit on the idea that the recognisable characters might be there just as a smokescreen. They formed a theory that abstract symbols sharing similar shapes might represent the same letter, or a common letter sequence. Testing this theory against German, using frequency analysis allied to statistical translation techniques, yielded some meaningful words including "Ceremonies of Initiation" and "secret section". More on the code-breaking technique applied can be found here.
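For readers unfamiliar with frequency analysis, the basic idea can be sketched in a few lines of Python. This is a toy illustration only, not the researchers' method: it assumes a simple one-symbol-per-letter substitution, whereas the team suspected that several symbols might stand for the same letter. The function names and the German letter ranking below are illustrative assumptions, and the ranking is only approximate.

```python
from collections import Counter

# Approximate ranking of German letters, most common first.
# Assumption for illustration; real tables differ slightly by corpus.
GERMAN_BY_FREQUENCY = "enisratdhulcgmobwfkzvpjyxq"


def symbol_frequencies(ciphertext):
    """Return (symbol, count) pairs, most frequent symbol first."""
    counts = Counter(ciphertext)
    # Sort by descending count, then by symbol so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def guess_mapping(ciphertext, language_ranking=GERMAN_BY_FREQUENCY):
    """Pair the n-th most common cipher symbol with the n-th most common letter."""
    ranked_symbols = [symbol for symbol, _ in symbol_frequencies(ciphertext)]
    return dict(zip(ranked_symbols, language_ranking))


def apply_mapping(ciphertext, mapping):
    """Decode with the guessed mapping; unknown symbols become '?'."""
    return "".join(mapping.get(symbol, "?") for symbol in ciphertext)


if __name__ == "__main__":
    sample = "ABACAB"  # hypothetical transcription, one character per symbol
    mapping = guess_mapping(sample)
    print(mapping)
    print(apply_mapping(sample, mapping))
```

A first guess of this kind is rarely right on a short text. It serves as a starting hypothesis that a human, or a statistical language model, then refines by checking whether the output looks like real words.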
After this breakthrough, the researchers knew they were on the right track and they were subsequently able to decode the book, which has been revealed as the rituals and political thoughts of a German secret society, with a strange fascination for eye surgery and ophthalmology. Members of the secret society were not themselves eye doctors.
"When you get a new code and look at it, the possibilities are nearly infinite," Knight said. "Once you come up with a hypothesis based on your intuition as a human, you can turn over a lot of grunt work to the computer."
Flushed with their success, the group plans to apply their techniques to other documents that have baffled cryptanalysts, such as an unbroken message from the Zodiac Killer, a serial murderer who terrorised northern Californians in the '60s, as well as the medieval Voynich Manuscript.
Knight is an expert in machine translation – teaching computers to turn Chinese into English or Arabic into Korean – not cryptography. "Translation remains a tough challenge for artificial intelligence," said Knight.
With researcher Sujith Ravi, a PhD in computer science, Knight has been approaching translation as a cryptographic problem.
The team hopes the approach will not only improve human language translation but also prove useful in making sense of languages that are not currently spoken by humans, including ancient languages and communication between animals. ®