In a statement issued on Sunday, SpinVox admits it needs call centres staffed by human agents to transcribe voice messages and has begun to back away from its earlier claims that most of the translation is performed by AI-based machine translation software, without human intervention.
But thanks to company insiders and company filings, The Register has built up a picture of the company that makes believing even SpinVox's revised claims extremely difficult. Sources suggest SpinVox, a privately held company, is employing a far larger number of transcribers than it publicly states, even today. These sources also point to extreme difficulties in maintaining its operations as the company scaled, winning new carrier contracts in new markets. And an investigation into the company's much-vaunted intellectual property holdings indicates that it holds no machine translation patents.
The humans can make themselves felt. In one case, unpaid staff in Pakistan took over the centre and began broadcasting "distress" text messages to SpinVox subscribers in North America.
By insisting that its operation relies primarily on machines, rather than human manpower, SpinVox avoids security issues and can maintain a much higher corporate valuation. Mobile carriers are aware that 'Mechanical Turk' (named after the chess-playing 18th-century automaton that concealed a human operator) transcription has high costs, as Vodafone found out with its human-assisted service.
Santa's Little Helpers
SpinVox's success hinges on an apparent miracle, one made in defiance of the state of the art in machine translation. It claims to translate voicemail messages with little or no human intervention. This is SpinVox's singular claim to fame, and has made it the darling of the press and investors. SpinVox has won $200m of investment and grown rapidly.
SpinVox executives repeated the claim last week: "the ratio of humans to messages and humans to number of users is very, very low", CEO Christina Domecq insisted, adding that "the majority of calls are fully automated." Messages by UK subscribers are handled in the UK, she added. SpinVox director Matthew Hobbs told Sky:
"We don't actually need to send any messages to human agents... All messages in the first instance will go through our automated voice message conversion system. Only if the system itself is unsure of a particular word or a particular fragment of the message will either a whole or part of the message be sent to an agent for quality control purposes. This in turn is fed back into the system to train it in a live learning mode."
But SpinVox's Sunday post backtracks - pointing to "five world class call centres" and for the first time, the significance of "human agents" - as the transcribers are called.
Former staff in key positions at SpinVox, however, tell a very different story:
SpinVox insiders claim the company employs between 8,000 and 10,000 human agents around the world, and has more than the five transcription centres it says are in use.
"When you join they tell you that the technology server translates 92 per cent of the time. Then you're dragged into the HR room and made to sign an NDA. It's then that they tell you the true story. No more than two per cent of messages are not transcribed by humans. These are very simple messages such as 'Hello John, Call me back'".
SpinVox transcription centres span the globe, and as the company repositioned itself as a B2B rather than a B2C business - winning carrier contracts with Vodacom, Telstra and Rogers - it employed increasing numbers of agents. In its home country of the UK, SpinVox conspicuously failed to land a carrier contract - but this helped create an image of a plucky outsider relying on brilliant machine translation technology.
In a statement to The Register today, SpinVox conceded that five wasn't the full picture - there are indeed more.
"SpinVox has relationships with five major secure call centre suppliers around the world with some of these suppliers operating multiple call centres, all subject to the same security rules." (our emphasis) The company says 3,000 agents are used, and that "The ratio of agents to active users when SpinVox started was 5000 per million users. It is now 100 agents per million users".
The job of maintaining the rapidly growing business was entrusted to Trainers. SpinVox employed around a dozen, providing technical and culture-specific training to new agents. Several centres were in South Africa, with others in Mauritius, South America and, most famously in SpinVox folklore, Pakistan. For many agents, English wasn't their first language.
SpinVox maintains that "the majority of the messages are converted by machine alone", adding that, "the machine seeks assistance when required. This means that any message could require between 0 and 100 per cent assistance from a human agent depending to what extent the VMCS technology has learnt the voice of the person leaving the message. Typically this takes around eight calls from an individual to a SpinVox user to reach a steady state of automation."
SpinVox declined to give a figure. "It is our confidential business formula. It is literally the ratio that any competitor or company wanting to start a business in the potential multi-billion dollar marketplace that SpinVox actually created would love to know so that they could come after us. No business is going to give its competitors a helping hand and SpinVox is no different."
Security concerns raised last week are fully justified, sources say.
According to the company, "agents working in a Live environment have no knowledge of customer, individual, product, market or use."
But according to a SpinVox insider: "There's zero security on these messages. An 18 or 19 year old kid is listening to a voicemail from a husband and wife - exchanging personal and financial information. It's outrageous."
So if miraculous speech transcription isn't in SpinVox's arsenal - then what is?