LINGUIST List 12.1332

Tue May 15 2001

Sum: "Content"/Contributions of Different Modalities

Editor for this issue: Lydia Grebenyova <lydia@linguistlist.org>


Directory

  1. Richard Sproat, Contributions of Different Modalities to "Content"

Message 1: Contributions of Different Modalities to "Content"

Date: Sun, 13 May 2001 11:03:18 -0400
From: Richard Sproat <rws@research.att.com>
Subject: Contributions of Different Modalities to "Content"

For Query: 12.1170

0. Introduction

I posted a message on April 26 concerning a claim about the
contribution of various modes of communication --- in particular
facial expressions ("visual"), "tone of voice" ("auditory") and the
actual words used ("language") --- to the content of a message. The
substance of this claim is that we get 55% of the content from the
"visual" component, 38% from the "auditory" component and 7% from
"language". The claim is apparently quite widespread in various
communities of people who teach communications skills. I myself heard
this in a management course in a section on effective communication. 

Where do these numbers come from? What did they originally mean?

1. The Source of the 7%-38%-55% Myth.

The prize for first pointing me to the source of the numbers goes to
Suzette Haden Elgin, who gave me the following three references:

 Mehrabian, Albert and Morton Wiener, 1967, "Decoding of inconsistent
 communications," Journal of Personality and Social Psychology
 6:109-114.

 Mehrabian, Albert and Susan R. Ferris, 1967, "Inference of attitudes
 from nonverbal communication in two channels," Journal of Consulting
 Psychology 31:248-252.

 Druckman, Daniel, Richard M. Rozelle, and James C. Baxter, 1982,
 "Nonverbal Communication: Survey, Theory, and Research," Sage
 Publications, pages 84-85.

The original source of the numbers is the pair of studies by Mehrabian
and colleagues. The first study, Mehrabian and Wiener, looked at the
perception, by subjects, of the attitude of a speaker towards her
listener. The experiment used stimuli recorded from two female
speakers, who read each of nine words under three different
intonational conditions. The nine words were divided into three
groups, three conveying positive affect -- "honey", "thanks" and
"dear"; three conveying neutral affect -- "maybe", "really" and "oh";
and three conveying negative affect -- "don't", "brute" and
"terrible". These three groups had been selected (based on written
stimuli) by a prior group of subjects as being good instances of each
of the three categories. The two speakers were then asked to imagine
that they were speaking to someone, using each of the nine words in
turn, and then "to say the words, irrespective of contents, in such a
way as to convey an attitude of liking, high evaluation, or
preference; a neutral attitude, that is neither liking nor disliking;
and an attitude of disliking, low evaluation, or lack of preference,
respectively, towards the target person" (page 10). There was no
control of how the two speakers implemented these attitudes
prosodically.

Three groups of subjects then listened to the stimuli, and were asked
to rate the degree of positive attitude of the speaker towards her
listener on a scale from -3 (most negative) to +3 (most positive). The three groups
were given different instructions, either: attend to only the content
(of the words); attend to only the tone; or attend to all information
available. Mehrabian and Wiener found that in the "content only"
condition there was a significant effect of content; in the "tone
only" condition there was a significant effect of tone; and in the
"use all information" condition there was a significant effect of
tone, but no significant effect of content. (There was a slight
difference in these results for the two speakers.) In other words,
when tone and (lexical) content were both considered, subjects relied
on tone more robustly than on content.

The Mehrabian and Ferris article was a similar study, which this time
focused on the interaction of facial expressions and tone. Here a
single neutral word was selected by a group of subjects:
"maybe". Three speakers were then instructed to say this word with
three different intended attitudes towards their listener, as in the
previous experiment. Photographs of the faces of three female models
were taken as they attempted to convey like, neutrality or dislike
towards a hypothetical addressee. Subjects then listened to the
various renditions of the word maybe, crossed with the various
pictures, and were asked to rate the attitude of the (hypothetical)
speaker towards her addressee, again on a scale from -3 to
+3. Significant effects of facial expression and tone were found.
Mehrabian and Ferris then present a regression equation that
summarizes the empirical relative contributions of the two components
to the total measured attitude (page 251):

 A_T = 1.50 A_F + 1.03 A_V

where A_T represents the inferred attitude on the -3 to +3 scale, A_F
represents the attitude conveyed by the facial component and A_V the
attitude conveyed by the tone. Note that 1.50/1.03 approx= 55/38.
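
The closeness of those two ratios can be checked with a few lines of
arithmetic. The following is only an illustrative sketch in Python;
the variable names are invented here and do not come from Mehrabian
and Ferris.

```python
# Coefficients from the regression equation quoted above (page 251).
facial_coef = 1.50
vocal_coef = 1.03

# Ratio of facial to vocal weight in the regression equation.
regression_ratio = facial_coef / vocal_coef

# Ratio of the popularly quoted percentages (55% visual, 38% auditory).
popular_ratio = 55 / 38

print(f"{regression_ratio:.3f}")  # prints 1.456
print(f"{popular_ratio:.3f}")     # prints 1.447
```

The two ratios differ by less than 0.01, which is consistent with the
55 and 38 figures being a rescaled version of the regression
coefficients.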

In the final discussion, Mehrabian and Ferris propose to combine the
results of this study with the results of the Mehrabian and Wiener
study. In particular, they propose (without formally deriving):

 It is suggested that the combined effect of simultaneous verbal,
 vocal and facial attitude communications is a weighted sum of their
 independent effects -- with the coefficients of .07, .38, and .55,
 respectively. (page 252) 

This, then, is the original source of those numbers, though it is
possible that the later summary of these results in Druckman et al
(1982, pages 84--85) is the more direct source for many people.
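
To make the proposed weighted sum concrete, here is a minimal sketch
in Python. It assumes, purely for illustration, that all three
attitude components are scored on the same -3 to +3 scale used in the
experiments, and that the facial and vocal weights share whatever
remains after the .07 verbal weight in proportion to the regression
coefficients 1.50 and 1.03. That rescaling is only a guess at how the
figures might be reconciled, since the paper gives no derivation, and
the function and variable names are hypothetical.

```python
# Weights quoted by Mehrabian and Ferris (page 252).
VERBAL_W, VOCAL_W, FACIAL_W = 0.07, 0.38, 0.55


def combined_attitude(verbal, vocal, facial):
    """Weighted sum of three attitude scores, each assumed to lie in [-3, +3]."""
    return VERBAL_W * verbal + VOCAL_W * vocal + FACIAL_W * facial


# Assumption: facial and vocal weights split the non-verbal remainder
# in proportion to the regression coefficients 1.50 and 1.03.
remaining = 1.0 - VERBAL_W
facial_share = remaining * 1.50 / (1.50 + 1.03)
vocal_share = remaining * 1.03 / (1.50 + 1.03)
print(round(facial_share, 2), round(vocal_share, 2))  # prints 0.55 0.38

# A maximally positive word paired with maximally negative tone and face.
mixed = combined_attitude(verbal=3, vocal=-3, facial=-3)
print(round(mixed, 2))  # prints -2.58
```

Under these assumptions the inferred attitude for a positive word
delivered with negative tone and face is strongly negative, which is
the narrow kind of conflicting-signal case the studies addressed.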

But what do these numbers mean? Clearly the goals of Mehrabian and his
colleagues were very modest: basically they were interested in where
people get information about a speaker's general attitude (positive,
negative, neutral) towards their addressee in situations where the
facial expression, the tone and the words used might send conflicting
signals. Indeed, they are quick to point out the limitations of their
results. For instance, they make the following rather uncontroversial
points:

 These findings regarding the relative contribution of the tonal
 component of a verbal message can be safely extended only to
 communication situations in which no additional information about the
 communicator-addressee relationship is available (as was the case in
 the above experiment). For example, it seems that communications
 accompanied by the communicator's commitment to action will be judged
 according to that action. "I hate you," said in a positive tone of
 voice, may be interpreted as a positive-attitude communication in the
 absence of additional information. However, if the communicator says,
 "I hate you", in a positive tone while striking his addressee, the
 attitude communicated is a negative one. In such instances, the
 positive tone indicates the communicator's pleasure at what he is
 doing. Information about commitments of a communicator can also be
 given in the verbal portion of a communication. For example, a boss
 can tell his assistant that he is fired in a pleasant tone of
 voice. Here again, the tone of voice indicates the boss' pleasure
 about his communication, rather than a positive attitude towards the
 assistant. (Mehrabian and Wiener, pages 113-114).

This latter example would have been relevant in the management course
I was taking (not that the particular topic of firing people, thank
God, came up).

What can safely be concluded, I think, is that when people
communicate, listeners derive information about the speaker's positive
or negative attitudes towards the listener from visual (well,
actually, in the Mehrabian experiments, facial), tonal and verbal
cues, and that if there is an apparent mismatch between those cues,
people may, under certain conditions, derive more information from the
visual or tonal cues, than from the verbal cues (especially if only
one word is being spoken). A person interested in effective
communication might derive from this the maxim that one should be
consistent in their communication. This much I think we mostly knew
already. What the Mehrabian studies further seem to show is that under
very precise conditions, one can actually measure the contributions of
these different components to inferred attitude. But, again, as
Mehrabian and colleagues clearly state, other factors (such as
knowledge about the speaker-addressee relationship) are very important
in real-world communications, and readily override other
considerations.

Of course, we have been speaking here only of attitudes, not of
"content" in the large. There is not a shred of evidence in these
studies that should lead one to conclude that, for example, 38% of the
information conveyed in a spoken message is carried in the tone. That
claim is completely unfounded.

2. Other Resources.

Besides the articles above, there is a fair amount of other
information out there that is germane to this discussion. A number of
web pages relate to the 7%-38%-55% myth. David Smith and Ignasi
Adiego pointed me to:

 http://www.d26toastmasters.org/sage/page3.htm 

which gives a useful summary of the issues. I also found

 http://www.neurosemantics.com/Articles/Non-Verbal_Communication.htm

which similarly gives a useful summary. The authors of both of these
pages are trying to debunk the myth.

Some other web pages that I found or others pointed me to are given
below. In most cases these web pages are good examples of what NOT to
conclude from the Mehrabian data:

 http://uts.cc.utexas.edu/~adgrad/importance.html
 http://www.mackido.com/Thought/Communication.html
 http://www.district52.org/functionary/7vocal.pdf
 http://64.77.1.96/articulos/110.htm
 http://www.lifeprint.com/asl101/pages-layout/whystudyasl.htm
 http://www.chiroweb.com/archives/09/09/01.html

Albert Mehrabian also has a web page: http://www.kaaj.com/psych/ . He
briefly mentions the studies (as well as his book "Silent Messages",
which apparently discusses them). Notably, he makes the point that the
7%-38%-55% result has been widely misinterpreted.

3. Summary.

So the 7%-38%-55% Myth is widespread, and there also seems to have
been a widespread attempt to debunk it, though apparently with at best
mixed results. Interestingly it seems that many people who use these
numbers do not know their source. Various of my respondents said that
they had heard the 7%-38%-55% claim, but that they were not able to
find out the source from the person they heard it from. It has rather
the character of an urban legend.

One piece of all of this that has to my mind not been satisfactorily
answered is how a very careful study on the effects of multimodal
communication under very narrowly defined conditions was transmuted
into a broad claim about the nature of human communication. Somebody
must have popularized this idea. Was there, for example, a Time
magazine article in the late 60's on this new finding about
communication that was immediately taken up into the communications
folklore? 

4. Acknowledgments. 

I was surprised by the number of people who responded to this posting:
evidently this is a topic of some interest. Thanks to the following
for responding and providing useful information:

 Ignasi Adiego
 Dan Moonhawk Alford
 Steve Anderson
 Robert Belvin
 Peter Daniels 
 Suzette Haden Elgin
 Nigel Fabb
 Susan Fischer
 Nancy Frishberg
 Daniel Loehr
 Zouhair Maalej
 Alec Marantz
 Mike Maxwell
 Todd O'Bryan
 Jerry Packard
 And Rosta
 David Smith
 Arnold Zwicky

I have a nasty feeling I've left a couple of people off the list here:
my sincere apologies if it was you.

5. Other References.

Here I list some other references that people suggested. I haven't
checked any of these with a view to the particular issues being
discussed here:
 
Bandler & Grinder. NeuroLinguistic Programming. 

Beattie, Geoffrey and Heather Shovelton (1999) Mapping the range of
information contained in the iconic hand gestures that accompany spontaneous
speech. Journal of Language and Social Psychology 18-4:438-462.

Bolinger, Dwight (1983) Intonation and Gesture. American Speech
58-2:156-174.

Bolinger, Dwight (1986) Intonation and Its Parts: Melody in Spoken English. 
Stanford University Press.

Cameron, Deborah (2000) Good to Talk. Sage Publications.

Cassell, Justine, David McNeill, and Karl-Erik McCullough (1999)
Speech-gesture mismatches: evidence for one underlying representation of
linguistic and nonlinguistic information. Pragmatics and Cognition 7:1.

Hobbs, Jerry (1990) The Pierrehumbert-Hirschberg Theory of Intonational
Meaning Made Simple: Comments on Pierrehumbert and Hirschberg. In
Intentions in Communication, Philip R. Cohen, Jerry Morgan, and Martha E.
Pollack (eds.), 313-323.

Kelly, Spencer and Dale Barr (1999) Offering a hand to pragmatic
understanding: The role of speech and gesture in comprehension and memory.
Journal of Memory and Language 40:577-592.

Ladd, D. Robert (1996) Intonational Phonology. Cambridge University Press.

Loehr, Daniel (2001) Intonation, Gesture, and Discourse. Georgetown
University Round Table on Languages and Linguistics, 2001.

McClave, Evelyn (1991) Intonation and Gesture. Ph.D. Dissertation,
Georgetown University.

McNeill, David (1985) So You Think Gestures are Nonverbal? Psychological
Review 92-3:350-371.

McNeill, David (1992) Hand and Mind: What Gestures Reveal About Thought. 
University of Chicago Press.

McNeill, David (1997) Growth Points Cross-Linguistically. In Nuyts, Jan and
Eric Pederson (eds.) Language and Conceptualization. Cambridge University
Press.

McNeill, David (2000) Language and Gesture. Cambridge University Press.

Pierrehumbert, Janet, and Hirschberg, Julia (1990) The Meaning of
Intonational Contours in the Interpretation of Discourse. In Intentions in
Communication, Philip R. Cohen, Jerry Morgan, and Martha E. Pollack (eds.),
271-311.

--
Richard Sproat               Human/Computer Interaction Research
rws@research.att.com         AT&T Labs -- Research, Shannon Laboratory
Tel: +1-973-360-8490         180 Park Avenue, Room B207, P.O.Box 971
Fax: +1-973-360-8809         Florham Park, NJ 07932-0000
http://www.research.att.com/~rws/