Tech Talk: Speak up and be recognized

Remember when businesses and professions had secretaries who could take dictation, type memos and write letters? Well, the good old days are back.

Now, you can find a secretary who will do all that for $100 or less. Your new secretary won’t make coffee for you, but the rest is within reach.

The best of the applicants to be your next secretary (or secretary’s assistant) is called Dragon NaturallySpeaking, from Nuance Communications. It is computer software that has the uncanny ability to turn your speech into text with 99 percent accuracy. The program has been around since 1997, but recent improvements finally make it practical for a small office.

Voice-recognition software is hot technology today. Computer engineers, like science fiction writers, have long dreamed of being able to converse with computers, like HAL in “2001: A Space Odyssey.” Microsoft’s new operating system, Vista, introduced earlier this year, includes a voice-recognition module. Newsweek, on May 28, said Google is working on a voice-recognition module for mobile devices such as smartphones to reduce the frenzied poking at tiny keys.

You can try voice-recognition for free if you have a recent version of Microsoft Office, Windows XP with Service Pack 1 and a good-quality microphone. In Microsoft Word, pull down the Tools command and see if you have an option for Speech. If not, you’ll need to find your program disk and install Speech as an add-in. For instructions, consult www.microsoft.com/windowsxp/using/setup/expert/moskowitz_02september23.mspx. Microsoft claims only 85 percent to 90 percent accuracy, which I consider inadequate.
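To see why a few percentage points matter, here is a rough sketch in Python. It assumes accuracy is a simple per-word rate and that a dictated page runs about 250 words; both are my assumptions for illustration, not figures from Microsoft or Nuance.

```python
# Rough estimate of how many words need fixing on a dictated page.
# Assumptions (not from either vendor): accuracy is a per-word rate,
# and a page is about 250 words.

PAGE_WORDS = 250  # assumed page length


def expected_errors(words: int, accuracy_percent: float) -> float:
    """Return the estimated number of misrecognized words."""
    return words * (100 - accuracy_percent) / 100


for accuracy in (85, 90, 99):
    errors = expected_errors(PAGE_WORDS, accuracy)
    print(f"{accuracy} percent accurate: about {errors:.1f} errors per page")
```

Under those assumptions, the difference is between fixing two or three words on a page and fixing a few dozen.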

Don’t confuse voice-recognition products for personal computers with the more limited voice-recognition services like those used in some telephone call centers.

Telephone-based services recognize only a few words and are designed to work with all kinds of voices. PC-based products are just the opposite. They are keyed to a single user and recognize the full vocabulary of that person, however heavily accented or inflected it might be, as long as it is clear and consistent.

The Wall Street Journal in its issue of May 24 said a “slew” of new voice-recognition services are coming to market. The article described two that can transcribe voice-mail messages into text and e-mail them to you. SimulScribe of New York charges $10 a month plus 25 cents for each message after 40. SpinVox, based in Atlanta, offers a free, one-year trial. Send an e-mail to gamma@spinvox.com. I can’t vouch for either one.
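If you want to estimate a SimulScribe bill, the pricing above works out as in this short Python sketch. It assumes the first 40 messages each month are covered by the $10 fee, which is how I read the pricing; the function name is my own.

```python
# Estimate a monthly SimulScribe bill from the pricing quoted above.
# Assumption: the $10 fee covers the first 40 messages each month.
# Amounts are kept in cents to avoid rounding surprises.

BASE_FEE_CENTS = 1000      # $10 a month
INCLUDED_MESSAGES = 40     # messages assumed covered by the base fee
EXTRA_MESSAGE_CENTS = 25   # 25 cents for each message after 40


def monthly_cost_cents(messages: int) -> int:
    """Return the estimated monthly charge in cents."""
    extra_messages = max(0, messages - INCLUDED_MESSAGES)
    return BASE_FEE_CENTS + extra_messages * EXTRA_MESSAGE_CENTS


for count in (25, 40, 60):
    dollars = monthly_cost_cents(count) / 100
    print(f"{count} messages: ${dollars:.2f}")
```

By that reading, an office that gets 60 voice-mail messages a month would pay $15.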

Dragon comes with a microphone built into a headset. Just load the program, plug the headset into the microphone jack on your computer, read a 15-minute training passage into the program and you’re ready. As you dictate, the words appear simultaneously in your word processor. The effect is stunning.

You don’t need to type. In early versions, Dragon was promoted as a way to issue voice commands to your computer. For example, you said “open Microsoft Word” and “open a document” and you got a blank page ready to use. That function is still available for those who don’t type or who have some kinds of disabilities.

But if you can dictate coherently, Dragon will produce acceptable text that requires only minor touch-up, such as occasionally replacing “are” with “our,” which is really my fault since I don’t pronounce them differently, as I should.

Even more amazing, the program includes a feature that enables it to learn from its mistakes. Correct it once and the program will remember not to make that mistake again.

Concerned that Dragon might not recognize construction-specific terms? No problem. Correct an unusual word and the program will add it to its dictionary.

Dragon has two more important advantages: a well-written instruction manual and technical support (first call free).

The only drawback is that you might have to tell Dragon where to put some punctuation marks. The program does rather well with periods and commas, but less common marks like dashes and colons must be spoken. Capitalization also might be an issue, but I was amazed at how often the software guessed right – probably more often than some secretaries I have had.

It gets even better.

Picture this: You’re out of the office, perhaps at a job site or maybe just getting in a little fishing, and you want to document an observation or an agreement, answer some e-mail or capture anything that you can say. You could bring along a microcassette recording device, but then someone would have to transcribe the tape manually, a long and tedious process. Dragon to the rescue.

Through a partnership between Dragon and three manufacturers of digital voice recorders — Sony, Panasonic and Olympus — anything you can say can be captured in the field and transcribed quickly and accurately when you get back in the office — and with less fuss than trying to type up field notes or transcribe them manually.

I paired Version 9 (Professional) of Dragon with the Olympus DS-40 digital voice recorder. The DS-40 is a little smaller than a cell phone and fits comfortably in the smallest pocket. The process couldn’t be simpler – or faster. Just press record on the DS-40 and talk away. The DS-40 has a jack for an external microphone, but I didn’t use it. The built-in stereo microphone was satisfactory.

Back in the office, connect the DS-40 to your computer with a USB 2.0 cable that is supplied and issue a download command from the Olympus software. Then tell Dragon to transcribe.

Since the whole process is digital, it takes only a couple of minutes, depending on the size of the file, to go from digital recorder to text file. Accuracy of the transcription from the DS-40 is almost as good as the 99 percent score that Nuance claims for voice recorded directly into the computer through its microphone.

I used the professional version of Dragon version 9. It sells for $800, although other versions with the same voice-recognition engine but with fewer enhancements are available for $200 and $100. The DS-40 sells for $200, which is about in the middle of the price range for Dragon-compatible units.

What Dragon does not do is recognize multiple voices. It hears and obeys only the master’s voice, which the software has been trained to recognize. I tried to get Dragon to transcribe someone else’s speech and got gibberish. Further, Dragon licenses its software for use by only one person.

Is there a workaround that would enable a job conference to be recorded on site and transcribed back in the office? Yes, but each member of the conference would have to buy a separate copy of the software and voice recorder. A more practical field solution might be for the builder to repeat into the voice recorder what was said by each member of the team and what was agreed. This is how some television closed-captioning works.

Voice recognition is a fast-moving area of technology that bears watching. When programmers solve the problem of multiple voices, you might be addressing your next computer as HAL.

For more information, consult www.nuance.com and www.olympusamerica.com/cpg_section/cpg_vr_digitalrecorders.asp.

Oliver Witte teaches journalism at Southern Illinois University. He was the founding editor of AIA’s Architecture Technology magazine and for several years managed the computer-aided architecture evaluation program for Architecture magazine. Contact him at owitte@siu.edu.
