There has been a brouhaha going on today, with allegations that SpinVox has been a bit economical with the truth about how it converts speech into text: that it's not so much an algorithmic system as a Mechanical Turk (or in this case, a South African or Filipino one), using call centres in South Africa and the Philippines to transcribe messages. The BBC story says "claims to the BBC suggest that the majority of messages have been heard and transcribed by call centre staff in South Africa and the Philippines".
Not so, says SpinVox, it's all in the technology:
Claims have been made to the BBC, suggesting that the majority of messages have been heard and transcribed by call centre staff in South Africa and the Philippines. These are incorrect.
SpinVox has delivered world-leading breakthroughs in speech recognition and related technologies, developed by its Cambridge–based Advanced Speech Group - a highly qualified team of speech scientists working together with the world’s leading speech academics. This team is considered to be one of the largest commercial speech R&D teams world-wide.
In the past two years, the Cambridge ASG team has applied the latest research to create state-of-the-art techniques that today deliver a system that outperforms any equivalent speech technologies on accuracy, speed, scale, reliability and language range.
Now I must admit I don't use SpinVox, but I have watched the speech-to-text industry over-promise and under-deliver for 20 years, so I would treat the above claims of technological superiority with just a bit of scepticism (I had always assumed they had human intervention on top of the speech-to-text algorithms). Accents and fast talking usually send these systems into a tailspin. But if their technology can really do what they claim (i.e., the majority of messages are not interpreted by call centre staff*), it is an order of magnitude better than anything else on the market today, and I would be very excited.
Commercially this will be essential, simply because a heavily human-based system is far less scalable, so the company's growth potential and economic envelope are far more limited. Also, the human method has implications for data protection and privacy in countries where the company has no call centre presence.
The best thing for them to do is give a public demonstration to clear this up.
* Their blog post is very carefully worded, which makes me wonder...