LINGUIST List 31.1582

Tue May 12 2020

Review: Discourse Analysis; Pragmatics; Text/Corpus Linguistics: Loureda, Recio Fernández, Nadal, Cruz (2019)

Editor for this issue: Jeremy Coburn

Date: 03-Feb-2020
From: Juan Bueno Holle
Subject: Empirical Studies of the Construction of Discourse

EDITOR: Óscar Loureda
EDITOR: Inés Recio Fernández
EDITOR: Laura Nadal
EDITOR: Adriana Cruz
TITLE: Empirical Studies of the Construction of Discourse
SERIES TITLE: Pragmatics & Beyond New Series 305
PUBLISHER: John Benjamins
YEAR: 2019

REVIEWER: Juan José Bueno Holle, Independent Researcher


The volume “Empirical Studies of the Construction of Discourse”, edited by Óscar Loureda, Inés Recio Fernández, Laura Nadal, and Adriana Cruz, collects eleven chapters, each dedicated to exploring methodological and theoretical issues in the analysis of discourse. The discourse data presented in the volume come primarily from Indo-European languages, including Spanish, English, German, Swedish, French, and Dutch, as well as from Chinese. The studies emphasize the various ways in which work in this area can and should be carried out, with a strong focus on the potential of corpus-based and experimental methods to yield insightful analyses and raise novel questions. Each of the chapters investigates and reflects on one of three methodological perspectives: 1) corpus analyses, 2) experimental methods, and 3) a combination of both.

The corpus analyses, on the one hand, study the relationships between the language system and language use by providing corpus evidence of the relationship between linguistic material and its function in discourse. The experimental studies, on the other hand, study the potential correlations between expressions at the discourse level and the processing and production patterns of the users. When used in combination, these methods can bring together information about the context in which expressions are used and information about the competence of users. Taken as a whole, the volume aims to demonstrate the benefits of empirical approaches to discourse phenomena by showing that empirical methodologies not only complement the theoretical study of discourse and expressions, but are a central component that cannot be separated from it. Moreover, these empirical methodologies illustrate and strengthen important links to other, related disciplines, such as psychology, computer science, sociology, and statistics.

The chapters in the volume are grouped according to the three methodological perspectives. The first section contains six chapters dedicated to corpus-based studies of discourse phenomena. The central issue in this section is the importance of discourse segmentation for isolating the scope of discourse markers and other discourse-structuring cues and, subsequently, for outlining and delimiting the uses of these markers and their values in discourse. The second section includes three studies devoted to experimental analyses of discourse markers. These studies focus on the complex relationship that contextual enrichment, procedural semantics, and discourse relations between segments bear to cognitive effects, processing patterns, and processing strategies. The third section presents two contributions that highlight the advantage of addressing discourse phenomena through the combination of the two methodological approaches represented in the first two sections.

Part 1: Corpus-based studies

The first part of the book contains six studies. In the first chapter, “Challenges in the contrastive study of discourse markers: The case of THEN”, Karin Aijmer draws on a comparative analysis of English THEN, German DENN/DANN, and Swedish DÅ to show that a positional analysis and a cross-linguistic approach are key to the comprehensive description of these discourse markers. Aijmer asks: What are the similarities and differences between the languages? What functions are associated with the initial and final positions of the utterance? How should the differences between languages be explained? To answer these questions, the author compares the uses of the three markers in the left and the right periphery, i.e. contexts in which they seem interchangeable, and then compares the frequency with which they are used as equivalents. As part of the analysis, Aijmer highlights the usefulness of translator intuitions as an additional methodological tool. The author concludes that the three discourse markers vary considerably in function, especially in the right periphery, where significant differences can be observed with respect to their uses at the content level, the discourse level, and the illocutionary level.

In “Local vs. global scope of discourse markers: Corpus-based evidence from syntax and pauses”, Ludivine Crible makes systematic annotations of corpora to explore correlations between the syntactic and semantic-pragmatic features of discourse markers and their degree of scope. In a corpus of spoken English, Crible annotates three types of indirect and independent cues: 1) degree of syntactic integration, 2) position, and 3) co-occurrence with pauses. The author then explores the correlations between these cues and the degree of scope, local or global, that each discourse marker demonstrates. Crible argues that the indirect and independent cues offer a reliable window into the scope of discourse markers across three contrasts: 1) topic-shifting vs. topic-resuming markers, 2) coordinating vs. subordinating conjunctions, and 3) the objective (or consequential) vs. subjective (or conclusive) uses of SO. Specifically, a high degree of syntactic integration and an absence of co-occurring pauses are shown to be associated with local scope, while discourse markers with global scope tend to occur outside of syntactic dependency structures, co-occur with pauses, and introduce hierarchically larger and/or distinct units. In addition, the conclusions suggest that a more fine-grained analysis of global scope is necessary, since there may be more than one type. This points to fruitful ways to critically reconsider existing approaches to the scope of discourse markers and, more generally, to the interdependence between annotation variables.

In the third chapter, “Prosodic versatility, hierarchical rank and pragmatic function in conversational markers”, Antonio Hidalgo Navarro and Diana Martínez Hernández present a series of acoustic analyses of discourse markers in a corpus of conversational Spanish. The analyses support the idea that the degree of prosodic realization, as defined by the F0 curve, accentual realization, phonic dependence, and position of certain markers, helps determine the hierarchical rank of a given discourse marker within the discourse structure and, as a result, its pragmatic function in specific contexts. The study focuses on corpus-based data for two specific discourse markers in Spanish, BUENO and HOMBRE. The evidence suggests that the prosodic realization of these markers is relevant in determining 1) the uses of BUENO as a procedural marker with textual functions of continuity and rupture, and 2) the uses of HOMBRE as a procedural marker with the modal function of attenuation of disagreement. More generally, then, Hidalgo Navarro and Martínez Hernández demonstrate that there is a relationship between the prosodic realization of a discourse marker and the frequency with which that marker occupies a particular position in the structural hierarchy or carries out a particular discourse function.

In “A preliminary typology of interactional figures based on a tool for visualizing conversational structure”, Guadalupe Espinosa-Guerri and Amparo García-Ramón apply a visualization tool to a corpus of conversational Spanish with the goal of showing the types of interaction patterns that arise from hierarchical relationships in dialogic interactions. Using the distinction between reactive and initiative relations between speaker turns, Espinosa-Guerri and García-Ramón propose a novel typology of nine interactional figures which are described using a visual illustration of the formal connections between turns in dialogue. The authors argue that their approach allows for the detection and classification of all possible interactional structures. To the extent that their classification is valid, they add, their work opens several possibilities for future research, namely, 1) the possibility of creating a relatively exhaustive typology of interactions in dialogue, 2) the potential for the analyst to more easily detect objects of interaction that would otherwise go unnoticed, 3) the possibility of discovering correlations between the type of interventions produced and certain speakers, and 4) the possibility of measuring the rigidity or dynamism specific to particular dialogic genres.

The fifth chapter, “Causal relations between discourse and grammar: Because in spoken French and Dutch” by Liesbeth Degand, explores the use of argumentative connectives as discourse markers through an analysis of their syntactic and semantic features in corpora. Degand focuses on French PARCE QUE and Dutch OMDAT. Both are causal connectives meaning ‘because’, can be coordinating or subordinating, and operate at both the sentential and supra-segmental level. As Degand makes clear, subordinating conjunctions generally do not contribute systematically to the construction of discourse relations, since they do not always link independent utterances or independent speech acts (due to a higher syntactic dependency). However, based on the annotation and analysis of turn management, co-reference between the linked segments, co-occurring discourse markers, filled pauses, and prosodic integration, Degand demonstrates that while both connectives do, in fact, function as subordinating conjunctions, users sometimes employ them as syntactically independent items, thus conferring on them a discourse value. More generally, then, this study provides deeper insight into the discursive consequences of the grammatical options of coordination and subordination by showing that an isomorphic mapping does not hold between subordinating conjunctions and objective causal relations on the one hand, and coordinating conjunctions and subjective relations on the other.

In the sixth chapter, “A corpus-based comparative study of concessive connectives in English, German and Spanish: The distribution of ALTHOUGH, OBWOHL, and AUNQUE in the Europarl corpus”, Volker Gast examines the subordinating concessive conjunction ALTHOUGH in English and its rough equivalents OBWOHL in German and AUNQUE in Spanish with annotated data from the Europarl corpus. Gast identifies several features of concessives and their interactions with distributional facts: 1) the structural properties of the linked clause (length and position), 2) the semantic relations between clauses, 3) the level at which the connection exists (propositional, textual, and illocutionary), and 4) the information structure patterns generated by the conjunction of the main and subordinate clauses. Gast finds that ordering asymmetries arise, as OBWOHL clauses rarely precede the main clause in comparison with AUNQUE and ALTHOUGH clauses. Further, OBWOHL exhibits a strong bias towards ‘canonical’ concessivity and is linked to clausal and textual connecting functions, while ALTHOUGH and AUNQUE are commonly used in non-canonical, ‘relativizing’ concessives and display a wider range of discursive uses. The author concludes that this difference is due to higher positional restrictions for OBWOHL and the existence of further specialized concessive connectives in German. Finally, Gast concludes that there are distributional differences with respect to the level of linking, givenness status, and topic-comment structure of the concessive, but these are largely consequences of the asymmetries in the ‘basic’ type of semantic relation (canonical, relativizing, adversative).

Part 2: Experiment-based studies

The second part of the volume contains three studies. The first of these, entitled “Processing patterns of focusing in Spanish” by Adriana Cruz and Óscar Loureda, presents evidence from an eye-tracking experiment and a comprehension test to analyze users’ processing patterns of three types of focusing constructions: 1) unmarked identificational foci, 2) unmarked restrictive foci, and 3) structures with contrastive foci marked by the Spanish focus operator INCLUSO. Cruz and Loureda find that the different focusing constructions carry different pragmatic scales and that these, in turn, give rise to different processing patterns. While both the unmarked and the marked focus constructions are found to have similar total processing costs, processing in unmarked utterances is found to be guided by conceptual input, whereas in marked utterances it is the rigidity of the procedural instruction of the focus operator that determines processing and interpretation. The experimental evidence and analysis presented therefore demonstrate that marked and unmarked foci have semantic and syntactic properties that establish different processing patterns.

In “Expectation changes over time: How long it takes to process focus”, Johannes Gerwien and Martha Rudka conceptualize focus-sensitive particles as processing and comprehension-guiding devices. Specifically, the authors explore how and why the German focus particle SOGAR (‘even’) modulates comprehenders’ expectations about subsequent discourse. They report the results of a two-alternative choice task that allows them to then observe viewing behavior in a Visual World Paradigm experiment. The experiment addresses four conditions by crossing two factors: the presence vs. absence of a focus operator and the high vs. low magnitude of expectation change. The findings demonstrate the immediate effect of the focus particle by showing that when the focus particle induces a high degree of expectation change during online comprehension, visual attention to focalized targets is delayed. More generally, Gerwien and Rudka argue that the approach presented allows for the identification not only of the factors involved in how well people can construct predictions about focus alternatives, but also of the precise moment when a focus particle exerts its effect in online comprehension.

In the last chapter of this section, “Processing implicit and explicit causality in Spanish” by Laura Nadal and Inés Recio Fernández, the authors examine the Spanish expression POR TANTO (‘therefore’, ‘so’) to address the role of connectives as interpretive guides in the construction of discourse. Nadal and Recio Fernández ask whether the explicit causal relations in segments linked by POR TANTO give rise to different processing patterns than segments connected by implicit causality. Their theoretical claim is that since the human mind is geared toward optimizing relevance in the face of ostensive stimuli and seeks causal processing of information, implicit causal relations should be predictable and inferable. In the case of the use of POR TANTO, causality is explicitly marked by a procedural-meaning guide, which therefore raises the question of what the actual contribution of discourse markers is to understanding utterances and, ultimately, to the construction of discourse relations. The authors report the results of an eye-tracking study and suggest that making the connective explicit in a consecutive relation that is already inferable from the meaning of the lexical expressions in the utterances slows down processing (measured in terms of longer total, first-pass, and second-pass reading times). Therefore, the view of connectives as procedural guides may need to be nuanced, since the extent to which a connective determines processing varies depending on the type of discourse relation at issue.

Part 3: Combined approaches

The final section of the book comprises two studies. The first is titled “Subjectivity and Causality in discourse and cognition: Evidence from corpus analyses, acquisition and processing” by Ted J.M. Sanders and Jacqueline Evers-Vermeul. In it, Sanders and Evers-Vermeul examine how causality and subjectivity condition discourse processing and motivate language use. The authors incorporate data from Dutch, English, French, German, and Chinese gathered using three complementary methodological approaches: 1) corpus studies on language use, 2) experimental studies on discourse processing and representation, and 3) corpus-based and experimental studies on language acquisition. The authors discuss how these approaches help explain the system and use of causal relations and their linguistic expressions in everyday language use, as well as the acquisition order of connectives. The findings show that people indeed distinguish between several types of causality and support the claim that causality and subjectivity are two basic cognitive notions that organize knowledge of coherence relations. Furthermore, Sanders and Evers-Vermeul argue that evidence from all three approaches is needed to clarify not only whether subjectivity is involved but also whose perspective is being represented and what type of subjectivity is invoked.

The final chapter, entitled “Subjectivity of English connectives: A corpus and experimental investigation of result forward causality signals in written language” by Marta Andersson, combines results collected from a corpus study and from experimentation involving a sentence completion task and a paraphrasing experiment. Andersson asks two main questions: 1) whether the English connectives AS A RESULT and FOR THIS REASON show clear tendencies toward certain discourse environments, and 2) which intuitions language users share about the functions of each connective. The findings demonstrate that despite their functional flexibility across different causal categories, English resultative connectives show significant tendencies to mark specific coherence relations. While AS A RESULT shows a strong preference for marking non-volitional results with no subject of consciousness, FOR THIS REASON is more dispersed across discourse domains and is clearly preferred in result relations with a subject of consciousness.


As a whole, the volume brings together a unique combination of chapters that address the study of discourse by relying on various types of empirical evidence. They address a range of discourse phenomena including discourse markers, connectives, focus operators, causality, subjectivity, and interaction patterns. In many cases, the authors consider more than one of the above phenomena at once with the aim of discovering relationships between them. In other cases, the authors consider a range of related linguistic cues, such as the syntactic, semantic, or phonetic features of a particular discourse item, the position of the item, or the presence or absence of co-occurring pauses. Some of the chapters, in particular the experimental studies, consider the processing costs of particular constructions using eye-tracking data or alternative choice methods.

While the relatively narrow range of topics in the study of the construction of discourse that are covered in the volume could be seen as a drawback, the fact that many of the chapters tackle similar, related topics offers the reader a rounded view of specific phenomena. This is particularly the case, for example, with the study of connectives. A total of five chapters are dedicated to the analysis of connectives: two are corpus-based studies, one is experimental, and two use mixed methods. The data presented and the methodologies employed in these studies allow for considerable insight into the many factors influencing the use and interpretation of connectives. Similarly, nearly all of the chapters present analyses based on data from Indo-European languages; only one study incorporates data from outside this language family, namely Chinese. Yet the focus on one language family, including a range of meticulously annotated written and spoken corpora, gives the reader a unique sense of the analytical depth that is increasingly possible in the study of discourse in these languages, a depth to which researchers working with other, less studied language families could aspire in various ways.

Finally, in addition to what the volume contributes to the study of the construction of discourse, it is important to consider also the contribution of the volume to the understanding and definition of empirical work in linguistics. In this light, the studies present a somewhat narrow view of empirical work as consisting simply of corpus-based approaches, experimental approaches, or a combination of the two. Because linguistic data of any kind must be acknowledged to have been produced by specific speakers in specific contexts, a detailed exploration of how data is collected, where, and by whom is critical for assessing how corpora are assembled, how experiments are carried out and, by extension, the validity of the empirical data they provide and how that data is to be analyzed and interpreted. If research on the construction of discourse is increasingly to consider empirical work as central, as the contributors to the volume argue it should, then the question of what constitutes well-collected, well-described, and well-analyzed empirical data in this area needs to be meaningfully assessed and should be viewed as a critical avenue for future work.

Overall, the studies collected here engage with important and current issues in the field and contribute a range of novel data and perspectives through which to consider both the nature of specific types of discourse constructions and the way in which they can potentially be studied. The volume contributes significant insights and the chapters can serve as valuable models for empirical work for scholars at the graduate level and beyond.


Juan José Bueno Holle holds an MA in Applied Linguistics from the National Autonomous University of Mexico (UNAM) and a PhD in Linguistics from the University of Chicago. His research interests include language documentation, Mesoamerican languages, and discourse pragmatics. His work has received support from the Endangered Languages Development Programme (ELDP), the National Science Foundation's Documenting Endangered Languages program (NSF-DEL), and the Smithsonian Institution. He is the author of Information structure in Isthmus Zapotec narrative and conversation, published by Language Science Press.

Page Updated: 12-May-2020