26 Dec 2011

Voice recognition and computers that answer back

You can already tell a computer what to do – but what if it could answer back? Benjamin Cohen investigates a new wave of voice recognition technology.

Technology companies say they’re now building systems which can hold a proper conversation with a human being. But the kit has a long way to go before it can create a truly thinking computer, writes Technology Correspondent Benjamin Cohen.

My film for Channel 4 News is all about talking to a piece of technology: principally mobile phones like the Apple iPhone 4S, but also the Microsoft Xbox, which allows you to talk to your television.

But it made me think back to the first time I remember seeing anyone talking to their computer.

Nowadays, my father is the chairman of a legal services technology company but back in 1993, he was the managing partner of a mid-sized law firm. I remember him coming home from work one day to announce that he was making most of the secretaries redundant and replacing them with computers.

The concept immediately made me think of Rosie, the robot in The Jetsons, partly because whenever I saw the secretaries (I was 11 at the time), they seemed to be photocopying documents, answering the telephones and making my dad and his clients cups of tea (I preferred hot chocolate).

Would he be buying a robot to do this for him? No, but he was intending to get new, state-of-the-art computers, running the now discontinued IBM OS/2 operating system, to do something else: the thing that his secretaries spent most of their time on, typing up dictation for complex legal documents.

I remember him telling me at dinner how many words he’d trained his computer to understand, and how the partners and other lawyers who worked for him would spend whole weekends speaking thousands of words to their computers.

Then the fateful day came. Most of the secretaries left, with just a few retained to correct the mistakes made by the computers. The lawyers tried to spend a week talking to their computers, while the lonely retained secretaries tried to spend a week correcting the mistakes made by the voice recognition system. Both the lawyers and the lonely secretaries, now occupying almost empty rooms, got more and more frustrated.

Then at Friday night dinner, Dad announced that he was hiring back all of the secretaries he’d made redundant. The technology didn’t work well enough to be of any use.

What is remarkable is that it didn’t work despite the amount of time that the lawyers spent training the computers to understand their voices. Even more remarkable is that the technology now works before you train it to understand your voice. Siri, on my iPhone, sort of understands what I’m saying, most of the time. In theory, it should learn how I pronounce words, but like Windows Phone, Google Android and the Microsoft Xbox, it works out of the box. That is so different from nearly 20 years ago, when I first saw this sort of technology.

Except, while it might understand what I’m saying, it still struggles to understand what I mean. Sometimes Siri thinks I’m asking it to phone the Israeli Ministry of Foreign Affairs (I have their number in my phone because of a series of reports I filmed there), and it keeps phoning them. Sometimes it sends an email to my boss saying that my Grandma has been arrested – seriously, it did! She wasn’t. What I actually said was: “I’m in traffic and I need to have a rest.”

But a lot of the time it works. The problem, as I found while filming my report, is that in the UK at least, Siri doesn’t have the answers to the questions I ask. The technology understands them; it just doesn’t have the data to answer them. The one exception is where I can dispose of a dead body. Try it: it has the answer to that. But I haven’t needed to use that yet! If I do, I’ll be sure to let you know on my Facebook page (facebook.com/benjamincohen), except that probably wouldn’t be a very good idea, would it?