Essay #129: Approaching Human Capabilities
Bruce Papazian writes "Masahiro Mori, a Japanese roboticist, described the phenomenon in 1970 as the Uncanny Valley. He discovered that, to a certain point, as a robot looks and behaves more human, a person's emotional response to it becomes increasingly positive and empathic. Then, at some point along the path toward realism, the reaction flip-flops, and a person interacting with the robot finds it repulsive. Make the robot even more realistic, and the typical response to the robot flip-flops again, once more becoming empathic. The reason, Mori wrote, is that in a robot that is mostly unlike a human, the human characteristics stand out and generate empathy. But if the robot appears almost human, the nonhuman characteristics stand out and create a feeling of strangeness."
Closer to human-like behavior is supposed to be better, even if it’s not real human behavior. We often hear that, as we get closer to human behavior, our user interfaces will be friendlier and more easily navigated. Is that really true? In the previous essay, we saw that truly human attributes in a conversational interface await technological discontinuities of unknown (but probably very large) proportion. In other words, a speech machine that passes the Turing test is not only in the future, it may be in the very distant future. It may, in fact, be unachievable for ethical and pragmatic reasons even if it is achievable technologically. And it may not even be achievable technologically. But we should be able to get close, shouldn’t we? Even if we can’t build a person, surely we can imbue our applications with some human attributes? And surely these attributes will engage, delight, and thrill our users with all the mystery and wonder of conversational feats beyond their ken? That last sentence should be read aloud in the voice of the midway barker at a circus or sideshow. It’s designed to engender raw, gullible enthusiasm. And stop calling me “surely.”
Well, let me chum the waters with a devil’s advocate position. Consider the possibility that users prefer devices that are very different from human-like ones because they cause less dissonance between the expectation of intelligence and the reality of system capabilities. Look at the figure above. Using the same dotted line as the one in the previous essay, the illustration shows the gap between what the system is able to do (lower line) and what kind of expectations the user may bring when comparing the system with human capabilities (upper line). Notice that the distance between the two is a feature, not a drawback. We see similar user reactions in the everyday world. Few people driving a car, for example, succumb even briefly to the illusion that the car is sentient. We don’t think, even for just a moment, that the car will remember where it’s going and need not be steered. Cars are not that bright. We are not misled into thinking that they might be.
Now look at the same user when the system is “improved”:
Here, we see that the user is a little discombobulated. The problem is that the two universes are too close to each other. The current system (the one that the user is struggling with) possesses many human attributes, but is still not a fully sentient conversational entity. Now, certain system behaviors lead the user astray, triggering speech acts that belong to human conversation. It’s harder to maintain the distinction between the rules for talking with this ersatz human and those that apply to a real human. The two can be thought of as dissonant: because they are similar, the context of talking to humans keeps infringing on the space that defines behaviors for talking with the system, causing interference and conflict.
I like the term dissonant because it jibes well with its musical counterpart. Two notes played together may be consonant or dissonant, depending on their distance from each other. Indeed, there are certain harmonic intervals (fifths, fourths, and, if we fudge a little bit to find the sweet spot, even thirds and sixths) whose frequencies stand in very simple and stable ratios. But take those tones and move them closer to each other, and the sound becomes increasingly dissonant, until, eventually, you can actually hear the beating caused by the alternating constructive and destructive interference between the two waves. Two adjacent tones beating against each other create tension, the apotheosis of dissonance, that in effect “pushes apart” the tones. Stable harmonics interspersed with unstable dissonance produce the so-called leading tones that make major scales want to rise and minor scales to descend.
The natural gravitation between tones that underpins melody and harmony can be ascribed in very large part to this psychological principle, restated here as a general axiom: “Things that are far apart stand in contrast to each other. Things that are close together stand in conflict. The natural tension associated with the ‘close together’ set creates a force. This force wants either to separate or to converge the two. The separation or convergence in turn resolves the conflict.”
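For readers who want to see the acoustic half of the analogy in numbers, here is a minimal sketch of the beating described above. It assumes two pure sine tones of equal loudness. The pitches (440, 444, and 660 Hz) and all function names are illustrative choices, not anything taken from the essay. The sketch relies on the standard sum-to-product identity: two tones at f1 and f2 add up to a fast tone at their average frequency, shaped by a slow envelope that swells and fades |f1 - f2| times per second.

```python
import math
from fractions import Fraction


def beat_frequency(f1_hz: float, f2_hz: float) -> float:
    """Beats per second heard when two pure tones sound together."""
    return abs(f1_hz - f2_hz)


def two_tone_sum(f1_hz: float, f2_hz: float, t_s: float) -> float:
    """Instantaneous sum of two unit-amplitude sine tones at time t_s (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t_s) + math.sin(2 * math.pi * f2_hz * t_s)


def envelope(f1_hz: float, f2_hz: float, t_s: float) -> float:
    """Slow loudness envelope from sin a + sin b = 2 sin((a+b)/2) cos((a-b)/2)."""
    return abs(2 * math.cos(math.pi * (f1_hz - f2_hz) * t_s))


def interval_ratio(upper_hz: int, lower_hz: int) -> Fraction:
    """Frequency ratio of two tones, reduced to lowest terms."""
    return Fraction(upper_hz, lower_hz)


if __name__ == "__main__":
    # Assumed example pitches: A440 against a slightly sharp neighbor.
    print("beats per second:", beat_frequency(440.0, 444.0))
    # A tone 1.5 times higher forms a perfect fifth with a simple ratio.
    print("fifth ratio:", interval_ratio(660, 440))
    # The envelope swings between full reinforcement and near cancellation.
    for t in (0.0, 0.0625, 0.125, 0.25):
        print(f"t={t:.4f}s envelope={envelope(440.0, 444.0, t):.3f}")
```

Widely spaced tones in a simple ratio, such as the fifth, produce no slow audible beating of this kind, while near neighbors pulse at their difference frequency. That pulse is the physical counterpart of the tension the axiom describes.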