Getting back into the saddle with voice recognition

March 27, 2011

It’s been nearly a year since I blogged and, despite many good intentions, I simply haven’t gotten round to writing a post. It seems to be part of the nature of blogging, or at least my blogging, to stop for long stretches and then start again. Today, I’m trying out a new way of blogging by using Dragon Dictate, a simple piece of software for the Mac that transcribes what I say. It seems funny that 18 years ago I was studying speech recognition in Edinburgh, never expecting that the technology would get so good that general speech recognition would work on a home computer. Yet Dragon Dictate is just about flawless and I can now get text into the computer as fast as I can speak (although clear thinking seems to be a little slower).

Eleven years ago I ran a project looking at user perception of multimodal interfaces. Despite the lack of deep research, it was prescient in recognising that these would soon become an important part of how we all relate to computers. I had no idea at the time that I would be using multimodal interfaces quite so quickly on my iPhone. Nor would I have believed at that time that a few years ahead I would be talking to a computer and seeing my thoughts written word for word on the screen.

Actually using voice as an interface makes the huge investment in it easier to understand. I have blogged about Spinvox (not Springboks, as Dragon thought) in the past, and I’m sure there are other cloud and local voice recognition systems, such as the technology that Nuance has built into Dragon Dictate, that work just as well. It seems inevitable that Apple must acquire one such technology and that this will become a core part of either iOS or Mac OS, as much as or more than touch has become.

