O'Reilly From the Editors List



Vox Ridiculi

01/15/2002

A recent discussion of Natural Language Processing (NLP) on the O'Reilly editors' mailing list led to a side discussion of the pros and cons of voice interaction with devices. While one editor contemplated the possibilities, a developer who has worked in an office where speech recognition was in use brought him back down to Earth.

From: Brian Jepson bjepson@oreilly.com
Date: Fri, 4 Jan 2002 12:46:01 -0500 (EST)

Speaking as someone who also worked on a bachelor's degree in Linguistics, I think NLP and speech recognition are going to be huge areas in the next year or two. I was looking at a recent copy of Linux Journal, and they had two of those groovy IBM Linux wristwatches on the cover. If I have a little Linux machine that small, how can I interact with it? It would be nice if I could talk into it.

Handwriting recognition on a Palm or Pocket PC gives me cramps, and the tiny keyboards on some handheld PCs are almost unusable. I think little computers need to understand human speech. I can't see it happening any other way - as these things get smaller, it's going to be harder and harder to key information into them. Speech recognition seems like the way to go.

And what about things that don't have LCD screens? If all my kitchen appliances are going to become incredibly smart, it would be a lot nicer if I could tell them what to do.


From: Rael Dornfest rael@oreilly.com
Date: Fri, 04 Jan 2002 10:11:50 -0800

One small counterpoint -- more a real-world, once-it's-in-your-office yellowish-orange flag than a comment on the tech itself...

On 1/4/02 9:46 AM, "Brian Jepson" <bjepson@oreilly.com> sayeth:

> Handwriting recognition on a Palm or Pocket PC
> gives me cramps, and the tiny keyboards on some handheld PCs
> are almost unusable. I think little computers need to
> understand human speech. I can't see it happening any
> other way - as these things get smaller, it's
> going to be harder and harder to key information into
> them. Speech recognition seems like the way to go.

If you've never shared an office with someone who uses speech recognition, it's a treat, I tell you -- much in the same way it's a treat sharing an office with someone who listens incessantly to their voicemail messages via the speakerphone at high volume. At least the blaring voices on an answering machine may hold some nugget of passing interest; listening to someone using voice-recognition software is about as disturbing and fascinating as joining them on a voyage through their bank's automated telephone response system. "Open this." "Save that." "Deer... No, Dear... No, not Beer... Bloody hell... NO... Erase, erase..."

Now take this beyond simple computer control and file operations and you've an auditory nightmare of administrivia spoken to laptops, palmtops, and cellphones.

I drew a cartoon a few years ago called "What happened to IBM speech recognition research." A maze of cubicles. A voice rising above the walls: "format c:" followed by a chorus of "formatting c drive... 1% complete" echoing through the research halls.

Now I'm a decent office-mate, quiet (I rarely use the phone or get voicemail), and space-sharing. Yet I must admit there were times I wanted to smack the snot out of my VR-using roommate if I had to hear the command "Close window" one more time -- either he himself or his perpetually misunderstanding machine.

Again, the tech's cool, but I question whether a society that is already getting a little hot under the collar about public cellphone conversations will tolerate the din VR will bring to their homes, offices, restaurants, and cars.

My $0.0002,

Rael


From: Brian Jepson bjepson@oreilly.com
Date: Fri, 4 Jan 2002 13:26:58 -0500 (EST)

These are great points - I think speech recognition is going to have to suck less before it becomes practical, and yes, in many cases, it would be as annoying as public cell phone conversations. Most of it would be spoken into combo PDA/Cellphones, anyhow: storing phone numbers, appointments, etc. So, its introduction would not be terribly abrupt.

Headsets will help a little bit, as will speech recognition that isn't perpetually misunderstanding its users. Then there's the combination of NLP and speech recognition - the current technology available to consumers seems to be no more sophisticated than macros: associate a sound with an action.

But who knows? Maybe it will be something different. Sign language recognition? :-) I think that a smaller stylus isn't the answer, and neither is using a touchtone phone as a keyboard. Maybe speech is not the answer, but I think we need some solution to awkward user input devices.

Cheers,

Brian


From: Rael Dornfest rael@oreilly.com
Date: Fri, 04 Jan 2002 13:19:21 -0800

... I guess what I was getting at is that while the "sucks less" factor will significantly reduce the user's frustration level, it'll -- by ubiquity and verbosity -- significantly increase the frustration of everyone that person occupies any space with.

> But who knows? Maybe it will be something different.
> Sign language recognition? :-) I think that a smaller
> stylus isn't the answer, and neither is using a touchtone
> phone as a keyboard. Maybe speech is not the answer, but
> I think we need some solution to awkward user input devices.

I don't see PDAs getting smaller. If anything, they're sure to increase in size slightly for the screen resolution and readability -- either that or glasses and then retinal implants (almost, but scarily not so, kidding on that last bit).

I'm rather interested in the gestures thing. While currently rather basic, it has potential.

But you're right -- nothing like the voice.

Perhaps a throat-mike grokking whispers? There simply has to be a volume control for users.

-R

