LINGUIST List 12.1332

Tue May 15 2001

Sum: "Content"/Contributions of Different Modalities

Editor for this issue: Lydia Grebenyova


  • Richard Sproat, Contributions of Different Modalities to "Content"

    Message 1: Contributions of Different Modalities to "Content"

    Date: Sun, 13 May 2001 11:03:18 -0400
    From: Richard Sproat
    Subject: Contributions of Different Modalities to "Content"

    For Query: 12.1170

    0. Introduction

    I posted a message on April 26 concerning a claim about the contribution of various modes of communication --- in particular facial expressions ("visual"), "tone of voice" ("auditory") and the actual words used ("language") --- to the content of a message. The substance of this claim is that we get 55% of the content from the "visual" component, 38% from the "auditory" component and 7% from "language". The claim is apparently quite widespread in various communities of people who teach communications skills. I myself heard this in a management course in a section on effective communication.

    Where do these numbers come from? What did they originally mean?

    1. The Source of the 7%-38%-55% Myth.

    The prize for first pointing me to the source of the numbers goes to Suzette Haden Elgin, who gave me the following three references:

    Mehrabian, Albert and Morton Wiener, 1967, "Decoding of inconsistent communications," Journal of Personality and Social Psychology 6:109-114.

    Mehrabian, Albert and Susan R. Ferris, 1967, "Inference of attitudes from nonverbal communication in two channels," Journal of Consulting Psychology 31:248-252.

    Druckman, Daniel, Richard M. Rozelle, and James C. Baxter, 1982, "Nonverbal Communication: Survey, Theory, and Research," Sage Publications, pages 84-85.

    The original source of the numbers is the pair of studies by Mehrabian and colleagues. The first study, Mehrabian and Wiener, looked at the perception, by subjects, of the attitude of a speaker towards her listener. The experiment used stimuli recorded from two female speakers, who read each of nine words under three different intonational conditions. The nine words were divided into three groups, three conveying positive affect -- "honey", "thanks" and "dear"; three conveying neutral affect -- "maybe", "really" and "oh"; and three conveying negative affect -- "don't", "brute" and "terrible". These three groups had been selected (based on written stimuli) by a prior group of subjects as being good instances of each of the three categories. The two speakers were then asked to imagine that they were speaking to someone, using each of the nine words in turn, and then "to say the words, irrespective of contents, in such a way as to convey an attitude of liking, high evaluation, or preference; a neutral attitude, that is neither liking nor disliking; and an attitude of disliking, low evaluation, or lack of preference, respectively, towards the target person" (page 10). There was no control of how the two speakers implemented these attitudes prosodically.

    Three groups of subjects then listened to the stimuli, and were asked to rate the degree of positive attitude of the speaker towards her listener on a scale from -3 (most negative) to +3. The three groups were given different instructions, either: attend to only the content (of the words); attend to only the tone; or attend to all information available. Mehrabian and Wiener found that in the "content only" condition there was a significant effect of content, in the "tone only" condition, there was a significant effect of tone, and in the "use all information" condition, there was a significant effect of tone, but that the effect of content was not significant. (There was a slight difference in these results for the two speakers.) In other words, when tone and (lexical) content were both considered, subjects used tone more robustly than they used content.

    The Mehrabian and Ferris article was a similar study, which this time focused on the interaction of facial expressions and tone. Here a single neutral word was selected by a group of subjects: "maybe". Three speakers were then instructed to say this word with three different intended attitudes towards their listener, as in the previous experiment. Photographs of the faces of three female models were taken as they attempted to convey like, neutrality or dislike towards a hypothetical addressee. Subjects then listened to the various renditions of the word "maybe", crossed with the various pictures, and were asked to rate the attitude of the (hypothetical) speaker towards her addressee, again on a scale from -3 to +3. Significant effects of facial expression and tone were found. Mehrabian and Ferris then present a regression equation that summarizes the empirical relative contributions of the two components to the total measured attitude (page 251):

    A_T = 1.50 A_F + 1.03 A_V

    where A_T represents the inferred attitude on the -3 to +3 scale, A_F represents the attitude of the facial component and A_V the attitude of the tone. Note that 1.50/1.03 is approximately equal to 55/38 (about 1.46 versus 1.45).
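    The ratio is easy to check numerically. The following is only a sketch: the variable names are mine, and the rescaling step is an assumption about how the two regression coefficients could be turned into the popular percentages (the paper itself does not derive them). It assumes that the facial and tonal weights share the 0.93 that is left over once .07 is given to the words.

```python
# Coefficients from the regression equation in Mehrabian and Ferris (page 251).
FACIAL_COEF = 1.50
VOCAL_COEF = 1.03

# Ratio of the regression coefficients versus ratio of the popular percentages.
regression_ratio = FACIAL_COEF / VOCAL_COEF
percent_ratio = 55 / 38

# ASSUMPTION (not stated in the paper): rescale the two coefficients so that
# together they sum to 0.93, leaving 0.07 for the verbal component.
REMAINDER = 1.0 - 0.07
total = FACIAL_COEF + VOCAL_COEF
facial_weight = REMAINDER * FACIAL_COEF / total
vocal_weight = REMAINDER * VOCAL_COEF / total

print(round(regression_ratio, 3), round(percent_ratio, 3))
print(round(facial_weight, 2), round(vocal_weight, 2))
```

    Under that assumption the rescaled weights round to .55 and .38, which is at least consistent with the published figures.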

    In the final discussion, Mehrabian and Ferris propose to combine the results of this study with the results of the Mehrabian and Wiener study. In particular, they propose (without formally deriving):

    It is suggested that the combined effect of simultaneous verbal, vocal and facial attitude communications is a weighted sum of their independent effects -- with the coefficients of .07, .38, and .55, respectively. (page 252)

    This, then, is the original source of those numbers, though it is possible that the later summary of these results in Druckman et al (1982, pages 84--85) is the more direct source for many people.
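    To make the proposed weighted sum concrete, here is a minimal sketch. The function name and the example ratings are hypothetical and chosen only for illustration; the three coefficients are the ones quoted from page 252, and the inputs are assumed to be attitude ratings on the same -3 to +3 scale used in the experiments.

```python
def combined_attitude(verbal, vocal, facial):
    """Weighted sum of attitude ratings, using the coefficients .07, .38, .55."""
    for value in (verbal, vocal, facial):
        # Ratings are assumed to lie on the -3..+3 scale used in the studies.
        if not -3 <= value <= 3:
            raise ValueError("ratings are expected on the -3..+3 scale")
    return 0.07 * verbal + 0.38 * vocal + 0.55 * facial


# Hypothetical mismatch: a positive word said with a negative tone and face.
mixed = combined_attitude(verbal=3, vocal=-3, facial=-3)
print(round(mixed, 2))
```

    Note that both the inputs and the output here are ratings of the speaker's inferred attitude towards the addressee. Nothing in the formula measures how much of a message's content each channel carries.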

    But what do these numbers mean? Clearly the goals of Mehrabian and his colleagues were very modest: basically they were interested in where people get information about a speaker's general attitude (positive, negative, neutral) towards their addressee in situations where the facial expression, the tone and the words used might send conflicting signals. Indeed, they are quick to point out the limitations of their results. For instance, they make the following rather uncontroversial points:

    These findings regarding the relative contribution of the tonal component of a verbal message can be safely extended only to communication situations in which no additional information about the communicator-addressee relationship is available (as was the case in the above experiment). For example, it seems that communications accompanied by the communicator's commitment to action will be judged according to that action. "I hate you," said in a positive tone of voice, may be interpreted as a positive-attitude communication in the absence of additional information. However, if the communicator says, "I hate you", in a positive tone while striking his addressee, the attitude communicated is a negative one. In such instances, the positive tone indicates the communicator's pleasure at what he is doing. Information about commitments of a communicator can also be given in the verbal portion of a communication. For example, a boss can tell his assistant that he is fired in a pleasant tone of voice. Here again, the tone of voice indicates the boss' pleasure about his communication, rather than a positive attitude towards the assistant. (Mehrabian and Wiener, pages 113-114).

    This latter example would have been relevant in the management course I was taking (not that the particular topic of firing people, thank God, came up).

    What can safely be concluded, I think, is that when people communicate, listeners derive information about the speaker's positive or negative attitudes towards the listener from visual (well, actually, in the Mehrabian experiments, facial), tonal and verbal cues, and that if there is an apparent mismatch between those cues, people may, under certain conditions, derive more information from the visual or tonal cues, than from the verbal cues (especially if only one word is being spoken). A person interested in effective communication might derive from this the maxim that one should be consistent in their communication. This much I think we mostly knew already. What the Mehrabian studies further seem to show is that under very precise conditions, one can actually measure the contributions of these different components to inferred attitude. But, again, as Mehrabian and colleagues clearly state, other factors (such as knowledge about the speaker-addressee relationship) are very important in real-world communications, and readily override other considerations.

    Of course, we have been speaking here only of attitudes, not of "content" in the large. There is not a shred of evidence in these studies that should lead one to conclude that, for example, 38% of the information conveyed in a spoken message is carried in the tone. That claim is completely unfounded.

    2. Other Resources.

    Besides the articles above, there is a fair amount of other information out there that is germane to this discussion. A number of web pages relate to the 7%-38%-55% myth. David Smith and Ignasi Adiego pointed me to one that gives a useful summary of the issues. I also found another that similarly gives a useful summary. The authors of both of these pages are trying to debunk the myth.

    Other web pages that I found, or that others pointed me to, are in most cases good examples of what NOT to conclude from the Mehrabian data.

    Albert Mehrabian also has a web page. He briefly mentions the studies (as well as his book "Silent Messages", which apparently discusses them). Notably, he makes the point that the 7%-38%-55% result has been widely misinterpreted.

    3. Summary.

    So the 7%-38%-55% Myth is widespread, and there also seems to have been a widespread attempt to debunk it, though apparently with at best mixed results. Interestingly, many people who use these numbers seem not to know their source. Several of my respondents said that they had heard the 7%-38%-55% claim but were unable to learn its source from the person they heard it from. It has rather the character of an urban legend.

    One piece of all of this that has to my mind not been satisfactorily answered is how a very careful study on the effects of multimodal communication under very narrowly defined conditions was transmuted into a broad claim about the nature of human communication. Somebody must have popularized this idea. Was there, for example, a Time magazine article in the late 60's on this new finding about communication that was immediately taken up into the communications folklore?

    4. Acknowledgments.

    I was surprised by the number of people who responded to this posting: evidently this is a topic of some interest. Thanks to the following for responding and providing useful information:

    Ignasi Adiego, Dan Moonhawk Alford, Steve Anderson, Robert Belvin, Peter Daniels, Suzette Haden Elgin, Nigel Fabb, Susan Fischer, Nancy Frishberg, Daniel Loehr, Zouhair Maalej, Alec Marantz, Mike Maxwell, Todd O'Bryan, Jerry Packard, And Rosta, David Smith, Arnold Zwicky

    I have a nasty feeling I've left a couple of people off the list here: my sincere apologies if it was you.

    5. Other References.

    Here I list some other references that people suggested. I haven't checked any of these with a view to the particular issues being discussed here:

    Bandler & Grinder. NeuroLinguistic Programming.

    Beattie, Geoffrey and Heather Shovelton (1999) Mapping the range of information contained in the iconic hand gestures that accompany spontaneous speech. Journal of Language and Social Psychology 18-4:438-462.

    Bolinger, Dwight (1983) Intonation and Gesture. American Speech 58-2:156-174.

    Bolinger, Dwight (1986) Intonation and Its Parts: Melody in Spoken English. Stanford University Press.

    Cameron, Deborah (2000) Good to Talk. Sage Publications.

    Cassell, Justine, David McNeill, and Karl-Erik McCullough (1999) Speech-gesture mismatches: evidence for one underlying representation of linguistic and nonlinguistic information. Pragmatics and Cognition 7:1.

    Hobbs, Jerry (1990) The Pierrehumbert-Hirschberg Theory of Intonational Meaning Made Simple: Comments on Pierrehumbert and Hirschberg. In Intentions in Communication, Philip R. Cohen, Jerry Morgan, and Martha E. Pollack (eds.), 313-323.

    Kelly, Spencer and Dale Barr (1999) Offering a hand to pragmatic understanding: The role of speech and gesture in comprehension and memory. Journal of Memory and Language 40:577-592.

    Ladd, D. Robert (1996) Intonational Phonology. Cambridge University Press.

    Loehr, Daniel (2001) Intonation, Gesture, and Discourse. Georgetown University Round Table on Languages and Linguistics, 2001.

    McClave, Evelyn (1991) Intonation and Gesture. Ph.D. Dissertation, Georgetown University.

    McNeill, David (1985) So You Think Gestures are Nonverbal? Psychological Review 92-3:350-371.

    McNeill, David (1992) Hand and Mind: What Gestures Reveal About Thought. University of Chicago Press.

    McNeill, David (1997) Growth Points Cross-Linguistically. In Nuyts, Jan and Eric Pederson (eds.) Language and Conceptualization. Cambridge University Press.

    McNeill, David (2000) Language and Gesture. Cambridge University Press.

    Pierrehumbert, Janet, and Hirschberg, Julia (1990) The Meaning of Intonational Contours in the Interpretation of Discourse. In Intentions in Communication, Philip R. Cohen, Jerry Morgan, and Martha E. Pollack (eds.), 271-311.

    - Richard Sproat
    Human/Computer Interaction Research
    AT&T Labs -- Research, Shannon Laboratory
    180 Park Avenue, Room B207, P.O.Box 971
    Florham Park, NJ 07932-0000
    Tel: +1-973-360-8490
    Fax: +1-973-360-8809