Callers to FedEx (1-800-463-3339) are currently greeted with this:
Who thought this was a good idea?
Many phone menus have been converted to accept only voice commands. It makes sense: people dislike touch-tone menus, and it seems natural to offer them the ability to speak into the phone. They'll like that! It will give them the experience of speaking to a human without having to pay a real human in India to listen to them! It's no surprise that this appeals to the decision makers in these businesses.
Of course, if a company were doing exactly what it should be doing to please its customers, I probably wouldn't be writing about it. So what's the flaw? If you've used a voice-recognition phone menu, you already know:
Talking to a computer is strange.
Even if speech recognition becomes perfect (don't worry, it won't), people will never feel comfortable talking to their computers. It's unnatural. We're accustomed to interacting with computers by using computer-like means: discrete, unambiguous, deterministic interactions such as pressing buttons. That makes sense to us: we interact with inorganic objects by inorganic means.
Speech is our mechanism for fluid two-way conversations with other humans. In conversation, when we're given a choice between discrete options, we expect the ability to explain our answer or choose something else. ("Your honor, my clients didn't do it.")
Likewise, it's unnatural to interact directly with a human by pressing buttons with discrete actions. It would feel strange to walk up to your friend and press an "Ask About Lunch" button on the side of his head.
When presented with a touch-tone number menu, we know we're interacting with a computer, so it isn't uncomfortable to be given a set of discrete options with numbers. "To track a package, press 1." It doesn't even try to convince us that it's human, and our mental model adjusts accordingly to view it as a discrete menu with buttons.
In a voice menu, we're easily disoriented and less efficient. As with all phone menus, the item you want is rarely obvious the first time you hear the list. You might listen to all of the choices before selecting Customer Support because you know you're in for a 20-minute wait followed by a painful conversation with a scripted idiot if you choose it.
After you've heard all of the options, you've forgotten what you were supposed to say for the one you wanted: the third one. In a human conversation, you could just say "Uh, the third one." In a touch-tone menu, you could press 3. In a voice-only menu, you're stuck hearing the entire list again.
Then there's the workplace factor. If you're responding to a voice menu, you sound like a robot, and your coworkers give you strange looks. "Check the status of an existing order!... Track a package!... Yes!... 1-5-2-1-7!"
Do they think that this is improving the annoying experience of calling a company that will do anything to ensure that you talk to a human as little as possible?