LINGUIST List 28.291

Fri Jan 13 2017

Review: General Ling; Ling Theories; Psycholing: Knoeferle, Crocker, Pyykkönen-Klauck (2016)

Editor for this issue: Clare Harshey

Date: 16-Aug-2016
From: Andrea Lypka
Subject: Visually Situated Language Comprehension

EDITOR: Pia Knoeferle
EDITOR: Pirita Pyykkönen-Klauck
EDITOR: Matthew W. Crocker
TITLE: Visually Situated Language Comprehension
SERIES TITLE: Advances in Consciousness Research 93
PUBLISHER: John Benjamins
YEAR: 2016

REVIEWER: Andrea Eniko Lypka, University of South Florida

Reviews Editor: Helen Aristar-Dry


The interaction among visual context, cognition, motor processes, and language processing has become an intriguing research terrain for cognitive neuroscientists and psycholinguists. The visual world paradigm makes it feasible to monitor this complex interplay, for example, by tracking participants’ eye movements or mouse clicks on visual displays during spoken language tasks and instructions. The paradigm is thus instrumental for understanding real-time thinking processes and in-situ language comprehension and production through linguistic, visual, cognitive, and motoric processes (for an overview, see Knoeferle & Guerra, 2016).

As an edited volume in the Advances in Consciousness Research (AiCR) series, Visually Situated Language Comprehension, coedited by Pia Knoeferle (Humboldt University Berlin), Pirita Pyykkönen-Klauck (Saarland University, Norwegian University of Science and Technology), and Matthew W. Crocker (Saarland University), is a rewarding resource for researchers, students, and practitioners interested in exploring the relationship among visual context, cognition, and language comprehension through the visual world paradigm. In addition to the preface and index, the 12 chapters in this volume, written by established researchers and scholars, offer state-of-the-art overviews of the methodological and theoretical applications of this paradigm. The first three chapters introduce the visual world paradigm, a psycholinguistic method designed to study questions related to linguistic processing and attention in visually situated contexts. Following the historical presentation of the field of visually situated language and key concepts, such as “visual world paradigm”, “context”, and “real-time measure”, in Chapter 1, Michael J. Spivey and Stephanie Huette conceptualize language comprehension as a dynamic, interactive process embedded in visual context. The authors describe visual world methodologies, such as tracking natural eye movements, computer-mouse movements, and postural sways, to index real-time contextual spoken language processing.

Building on the visual world paradigm, the following two chapters provide an in-depth methodological discussion. In Chapter 2, Benjamin W. Tatler reports on how information is gathered from the visual environment and encoded into memory. The author distinguishes among photographs, motion pictures, and three-dimensional contexts, arguing that characteristics such as motion cues, composition, luminance, and dynamic range influence the cognitive processes while viewing scenes and retaining information. In Chapter 3, Pirita Pyykkönen-Klauck and Matthew W. Crocker focus on overt visual attention in active and passive tasks. The authors provide evidence from various studies to highlight the importance of methodological properties of the visual world paradigm, such as the nature of the task, the linking hypotheses, and statistical decisions in eye movement analyses. The chapter concludes with a discussion of the challenges of monitoring and interpreting eye-movement data to explain the language-attention-visual scene relationship.

The remaining nine chapters review visual world studies on various topics in language processing and representation in visual contexts. Using visual world eye-tracking research, the chapters investigate topics ranging from referential processing (Chapters 4 and 5), discourse-level processing (Chapter 6), figurative language processing (Chapter 7), sentence processing (Chapters 8 and 9), and perspective-taking (Chapter 10) to natural conversation (Chapters 11-12).

In more detail, the role of syntax in sentence and referential processing is the focus of Chapter 4. The authors, Roger P. G. van Gompel and Juhani Järvikivi, review experiments that investigate how adults and children process various sentence structures, and how this visual world method influences referential processing. The findings of these experiments reveal that, in contrast to children, who appear to rely more on verb bias, adults rely more on contextual information, such as the visual context, action-based affordances, lexical biases, and prosody, in processing structurally ambiguous sentences.

Chapter 5 centers on semantic processing in sentence comprehension and reference. In this chapter, Paul E. Engelhardt and Fernanda L. Ferreira provide evidence from studies that focus on how people construct the meaning of sentences, establish semantic-conceptual knowledge, and predict references to promote comprehension. Overall, the authors conclude that, due to the complex interface of linguistic and visual inputs, comprehension in sentence processing and predictive reference tasks is difficult to interpret. In light of these findings, the authors propose further theoretical refinement of the visual world paradigm.

To complement language processing on the syntactic and lexical levels (Chapters 4 and 5), Elsi Kaiser investigates discourse-level processing in Chapter 6. In particular, the author discusses theoretical and methodological approaches to discourse-level information. Drawing on existing research, Kaiser demonstrates the importance of discourse-level processing, indexed in pitch accents, word order, and referring expressions, suggesting that discourse-level processing interacts with syntactic and lexical processing in real-time language comprehension.

In the next chapter, Stephanie Huette and Teenie Matlock review figurative language processing. A central issue is how people interpret figurative language encoded in abstract, non-literal sentences, metaphors, similes, idioms, ironic statements, and other figurative descriptions. Drawing on the notions of a dynamical system, fictive motion, and the visual world paradigm, the authors suggest that mental states are embodied in language comprehension.

In Chapter 8, Craig Chambers, influenced by the embodied framework of language processing, continues the theoretical discussion with a focus on affordances, that is, interaction with objects, participants, or events, in visually situated language comprehension. The studies reviewed in this chapter highlight the importance of affordances and other sensory-perceptual information in language comprehension; nevertheless, the evidence is inconclusive about the specific role affordances play in language comprehension.

The relationship among language comprehension, attention, and visual context is the focus of Chapter 9. Pia Knoeferle reports findings from eye-tracking and event-related brain potential experiments on visually situated language comprehension to explain that different types of scenes, such as clip art depictions and photographs, as well as the speaker’s visual cues, such as gaze, head movements, emotional mimics, and gestures, interact with visual attention during reading and spoken comprehension. The chapter concludes with a call for developing a comprehensive theory of situated language comprehension.

The next two chapters draw on visual world studies of interactive dialogue in spoken language comprehension. In Chapter 10, Dale J. Barr addresses debates in conversational perspective-taking research, such as discrepancies about the effects of common ground, or the shared knowledge between interlocutors. To reconcile issues with data analysis and interpretation, the author proposes expanding investigations of contextual influences on linguistic comprehension to the process level as opposed to the individual level.

In the next chapter, Sarah Brown-Schmidt explores how interlocutors collaborate to make meaning in face-to-face communication. In contrast to laboratory speech, everyday natural conversation is context dependent, disfluent, unscripted, interactional, and facilitated through alignment of gesture, attention, gaze, action, and through perspective-taking, common ground, and discourse history. Findings in this chapter support the claim that language processing in interactive settings differs from scripted laboratory communication. Theoretical and methodological paradigms need to account for these characteristics in language processing.

The final chapter expands the dialogue on the engagement of visual and cognitive processes during language comprehension to include the motor systems. Drawing on the domain of embodied cognition, Thomas A. Farmer, Sarah E. Anderson, Jonathan B. Freeman, and Rick Dale provide evidence from the literature to emphasize the role of the motor system, such as gestures, eye movements, and postural sways, during language comprehension. The authors propose that computer mouse movements around a visual display can complement the eye-tracking record to explore the relationship between action and language.


What distinguishes this edited volume from other publications is that it reviews more than forty years of research in the field of visually situated language comprehension. The main contribution of this volume is to provide a systematic review of literature in the fields of cognitive sciences and psycholinguistics, assess theoretical and methodological developments and common principles, and suggest research avenues in the domain of real-time language processing in visual, non-linguistic contexts.

Written with an audience interested in real-time language processing in mind, the studies in the first three foundational chapters ground knowledge on the visual world eye-tracking method in language comprehension inquiry. For example, the first chapter provides a background on the field of language comprehension through a historical overview of the visually situated language field and a discussion of key concepts, such as visual world paradigm, context, and real-time measure. Newcomers to the field will find these chapters useful in familiarizing them with the research field and methodology.

The visual world studies described in the nine chapters that follow the three introductory chapters address diverse topics pertaining to the relationship between linguistic comprehension and cognition, including visually situated language comprehension, scene perception, dynamic scenes, discourse comprehension, affordances, figurative language processing, fictive motion, sentence and referential processing, conversational perspective-taking, and the motor system. In addition to research synthesis, these chapters offer in-depth discussions of theoretical and methodological insights into the language-vision interaction. For example, the technologies adopted in mouse-tracking studies to index manual action during language processing and the various software packages discussed in Chapter 12 are useful for researchers, practitioners, and students interested in learning in depth about the intricacies of conducting mouse-tracking experiments.

Methodological constraints and advantages of the visual world paradigm are explored in this book. Specifically, Chapter 6 notes that the advantages of eye-tracking methods include the reliance on the auditory nature, as opposed to reliance on the prosodic features. Eye-tracking methods also allow for fine-grained analysis of the moment-by-moment processing of auditory stimuli in naturalistic tasks. Some of the challenges include the nature of the display, potential biases arising from the location currently being fixated, and threats to validity. A more in-depth discussion of methodological considerations and techniques, such as passive listening tasks, story continuations, and picture verification, and of the adoption of this method for investigating real-time language processing in participants with limited or no literacy skills, second language learners, bilingual and multilingual speakers, and students with learning difficulties, among other groups, would enrich the conversation about the methodological constraints and advantages and about ethical concerns.

The chapters reflect the purpose of this volume: the contributors review seminal studies as opposed to reporting unpublished research. To complement the eye tracking method, they present other methods used to measure language processing. For example, Chapter 7 reviews studies on the processing of fictive motion sentences that used various methods, such as narrative studies, drawing experiments, and time and motion surveys to provide alternative insights on the processing of language and visual information. Although it is regrettable that various methods are underrepresented in this volume, the readers may access further sources for reading in the references. Perhaps an examination of other methods would not be congruent with the overall focus on the visual world paradigm.

Another shortcoming is the absence of a concluding chapter. Though the chapters discuss implications of the studies reviewed and suggest further research topics, a concluding chapter by the editors would have complemented the authors’ perspectives. This chapter could have recapitulated the broad range of topics and methods covered in the book, such as eye-tracking, mouse-clicking, interviews, think-aloud protocols, and event-related brain potential experiments. Finally, it could have discussed the adoption of the visual world paradigm with various participants, ethical concerns, and the relevance of the visual world paradigm in various areas, such as education, language teaching and learning, naturalistic second language acquisition, and everyday communication.

In spite of these minor drawbacks, this book reports cutting-edge research that contributes to our understanding of the complexity of language processing. The chapters, written by experts in the fields of cognitive science, computational linguistics, developmental psychology, experimental psychology, neurolinguistics, and psycholinguistics, provide a comprehensive review of literature and critical discussions about various topics and theoretical and methodological developments. Chapters centered on specific topics will be an excellent reference guide for researchers, students, curriculum designers, newcomers to the field, and those interested in understanding real-time comprehension and production through linguistic, visual, cognitive, and motoric processes. Due to its introductory chapters, comprehensive literature overviews, and detailed presentations of a broad range of topics, this edited volume will be an excellent course book.


Knoeferle, P., & Guerra, E. (2016). Visually situated language comprehension. Language and Linguistics Compass, 10(2), 66-82. doi:10.1111/lnc3.12177


Andrea Lypka is a PhD Candidate in the Second Language Acquisition and Instructional Technology (SLA/IT) program at the University of South Florida (USF). Her research interests include learner identity, agency, and visual methods.

Page Updated: 13-Jan-2017