Sanders, Ted, Joost Schilperoord, and Wilbert Spooren, eds. (2001) Text Representation: Linguistic and Psycholinguistic Aspects. John Benjamins Publishing Company, 363pp, hardback ISBN 1-58811-077-X, USD 86.00 / 90-272-2360-2, EUR 95.00, Human Cognitive Processing 8. Announced in http://linguistlist.org/issues/13/13-217.html
Laura Alonso Alemany and Irene Castellón
In what follows, we give an overview of the structure and contents of the book, with a brief summary of each chapter. Then, each section is presented in turn, with a more extensive review of each of the articles. We finish with a critical overview of the volume as a whole.
SYNOPSIS The theme of this book is the linguistic and psycholinguistic aspects of text representation, from a cross-disciplinary perspective. It is based on papers presented at the International workshop with the same title held at Utrecht University in July 1997.
The book is divided into four sections. Section 1 deals with referential coherence, focussing on accessibility. Section 2, the longest, discusses relational coherence by way of diverse theoretical perspectives and empirical methods. Section 3 introduces questions of knowledge representation related to text representation, and Section 4 presents aspects of text segmentation. Each section is preceded by an overview from the editors, in which they briefly summarise the following chapters, contextualise them and establish relations between them.
The introduction (Chapter 1) presents the field of research and the contents of the book.
Section 1 focuses on referential coherence. Chapter 2 gives an overview of accessibility theory, which tries to explain the choice of one among all possible referential forms in terms of accessibility of the referent. Chapter 3 provides an account of how text cues guide the reader's attention in reading and how they consequently contribute to shaping the cognitive representation of the text. Finally, in Chapter 4, a modular hypothesis of lexical access in text production is supported via experiments on familiarity of metaphors.
Section 2 deals with coherence relations. In Chapter 5, an intention-based definition of pragmatic relations is proposed to adequately address the distinction between the so-called 'semantic' and 'pragmatic' coherence relations. Chapter 6 presents empirical evidence that supports a typology of concessive relations in terms of underlying causality patterns and thematic continuity in the surrounding discourse. Chapter 7 presents some problems with the RST relation 'Elaboration' and proposes an alternative model of discourse coherence where local dependencies are explained by RST relations other than Elaboration, and global coherence is based on focus-driven non-local dependencies. Chapter 8 provides an account of the conjunction 'and', claiming that it does not link segments but makes them jointly relevant to the surrounding discourse. In Chapter 9 the separation of propositional and illocutionary levels is questioned, and the pragma-dialectical approach of Argumentation Theory is proposed as an alternative analysis for linguistic cues in explanation and argumentation.
Section 3 addresses knowledge representation. In Chapter 10 an investigation of coherence relations and inferences is presented. Chapter 11 proposes a quantitative model that represents how people think about technical or scientific information. This model exploits the relation between text and knowledge.
Section 4 focuses on text segmentation. In Chapter 12, a modular view of language production processes is supported by an analysis of pauses in dictations. The last chapter presents some discrepancies between syntactical structure and discursive segmentation, and proposes the distinction between two levels in discourse interpretation: cognitive coordination and informational content.
INTRODUCTION The first chapter of the book serves as an introduction. Ted Sanders and Wilbert Spooren characterise the field of study of text representation, signal major research themes in the area and present basic concepts that are used throughout the rest of the book. They put special emphasis on the concept of coherence. Coherence is considered as the mental correlate of textual connectedness, which is in turn considered a text-constituting characteristic. A distinction is made between referential and relational coherence: while the first accounts for repeated reference to the same object in a discourse, the second explains how coherence relations like 'cause' connect text segments.
SECTION 1. Referential coherence: accessibility and text processing. In the first chapter of this section, Mira Ariel presents an overview of her accessibility theory. The central claim of accessibility theory is that language users choose among all the possible referential forms in the language depending on the accessibility of the referent, so that more elaborate referential markers correspond to less accessible referents. In other words, these markers signal the degree of accessibility with which the mental representation to be retrieved is held, so they can be ordered in an accessibility marking scale, going from low accessibility markers (full names, definite descriptions) to high accessibility markers (pronouns, verbal person inflections, zeroes). It is the combination of accessibility factors (head complexity, distance, grammatical role of the relativized position, restrictiveness) that determines the referential form, and not any single factor. However, accessibility considerations do not account for the whole of the selection process: contextual assumptions, such as relevance-based considerations, also play a role in determining referential form.
Further research is presented that corroborates general accessibility predictions and enriches the original theory. The author compares accessibility theory to other theories of reference, such as Chafe (1976), Givón (1983), Levinson (1987, 1991), Gundel et al. (1993) and Centering (Grosz et al. 1986, 1995). She notes a common core, namely that all theories offer some scale of referring expressions, that they all agree that pragmatic factors can override the principles they propose and that they all converge on predictions about zeroes, pronouns and lexical NPs. She discusses each of the mentioned theories and comes to the conclusion that none can account for the full range of distributional patterns of referring expressions as well as accessibility theory. To finish, Mira Ariel sketches some directions for further research, contextualising and justifying them.
In Chapter 3, Michelle L. Gaddy, Paul van den Broek and Yung-Chi Sung explain how heterogeneous text cues (linguistic cues, typographical cues and text structural ones) guide the reader's attention in reading, and how they consequently affect her mental representation of discourse. They provide an account of the on-line reading process in the framework of the Landscape Model, where the reading process and the resulting mental representation are explained in terms of varying activation degrees of the concepts throughout discourse. The studied text cues are found to have definite activation functions in this model. Within linguistic cues, function and relevance indicators (for example, 'in summary') increase activation of the concepts they co-occur with, favouring their subsequent retrieval in further discourse, while anaphors and cataphors re-activate (or pre-activate, respectively) a concept that was in the background. Typographical cues (italics, boldface) have an effect comparable to that of linguistic relevance indicators. Titles and headings are considered text-structure cues that direct the reader's attention to particular content, thus biasing their processing of the text. In all cases, activation of concepts translates into higher attention from the reader, as shown in the various experiments the authors refer to, mostly memory tests or reading time experiments.
In Chapter 4, Rachel Giora and Noga Balaban address lexical access in text production. Assuming comparability between text comprehension and text production, they carry out an experiment on how literal and metaphorical lexical meaning is accessed. In this experiment, subjects rated metaphors in newspaper text according to their familiarity. It was checked whether each metaphor was followed by a mention of its literal meaning, which was taken as a signal that the coded meaning of the word was activated. Results show that the coded meaning of a word is usually activated, regardless of the familiarity of the metaphor, that is to say, even if the context strongly evokes a 'figurative' meaning. This contradicts an interactionist hypothesis which assumes that context directs lexical access so that only the appropriate meaning of words is made available for comprehension. Consequently, a modular view of lexical access, the 'graded saliency hypothesis', is supported. This hypothesis assumes that all the meanings of a word that are coded in a language are always activated. However, as the authors themselves state, this experiment cannot be taken as conclusive evidence for a modular view of lexical access, since the measure used was not on-line.
SECTION 2. Relational Coherence. In Chapter 5, Alistair Knott discusses the distinction between two kinds of coherence relations, the so-called 'pragmatic' and 'semantic' ones. He presents problems with some previous proposals. He argues that Sanders, Spooren and Noordman's (1992) pragmatic-semantic distinction cannot satisfactorily account for the data. Sweetser's proposal of dividing Sanders et al.'s pragmatic relations into epistemic and speech-act relations is found to solve some of the problems raised, yet others arise. Finally, Knott proposes an intention-based definition of pragmatic relations that seems to satisfactorily account for the data, while keeping Sanders et al.'s simpler binary distinction of pragmatic and semantic relations. However, the author himself spots some explanatory inadequacies that seem to favour Sweetser's tripartite distinction. Future work, the author points out, should try to generalise over the intentions of the protagonists of the discourse and the participants in the speech act.
In Chapter 6, Leo Noordman presents a corpus study on concessive (although) relations. He makes a distinction between asymmetric and symmetric although-sentences, namely, those expressing a denial of expectation and those expressing concessive opposition. In a denial of expectation, the clause containing the expectation is syntactically and discursively subordinate to the main proposition, which denies it. In contrast, concessive oppositions establish a relation between two clauses which are equally central to the discourse, even though one of them is syntactically subordinate to the other. In addition, he takes concession to be a complex thought, the combination of causation and negation, and distinguishes two different concessives based on the different causalities underlying although-sentences: 'default order', if the cause precedes the consequence, and 'reversed order' if the consequence is first. This second division is only applicable within sentences expressing a denial of expectation.
The distinction between these different types of although-sentences is supported by empirical evidence. In the first place, reading time studies show that default order causal relations are processed faster than reversed order ones, implying that the cause-consequence order is more natural to human reasoning processes. This preference is also supported by a corpus study showing that default order causal relations are more frequent than reversed order ones. It was also found that sentences expressing a denial of expectation tend towards the subordinate clause - main clause order, indicating another reasoning preference, namely, to mention first the cause for the expectation and subsequently the negation of that expectation.
Moreover, the different relations were found to behave differently in relation to the thematic development of discourse. To describe how the different relations are embedded in their context, four factors were taken into account: the main-subordinate order, whether the text following and preceding an although-sentence was a continuation of the main clause, whether the second clause in a complex sentence is thematically related to the subsequent context and whether the first clause is thematically related to the preceding context, regardless of syntactical status. These factors constitute a model for the thematic continuity of each of the three kinds of although-sentences (default order and reversed order denials of expectation, and concessions) with its context, with a fit between data and model that ranges from 92% to 99%. Ultimately, this serves to establish a clear link between the mental representation of discourse and its surface characteristics.
In Chapter 7, Alistair Knott, Jon Oberlander, Michael O'Donnell and Chris Mellish argue that the Elaboration relation proposed by RST (Rhetorical Structure Theory, Mann and Thompson 1988) is usually used as a waste-paper basket, as the default relation in text analysis when no other relation fits. They note that this relation is quantitatively different from the other relations proposed by RST. Structurally, elaboration can hold between non-adjacent spans, which is not the case for other relations, yet it is especially resistant to even the simplest embeddings. Unlike the other relations in RST, elaboration is not really a relation between propositions, but between components of a proposition. Moreover, there are no linguistic signals that can be consistently related to this relation, as there are for all the rest (Knott and Mellish 1996).
The authors propose a new model of coherence, in which a coherent text is considered as a 'sequence of focus spaces which succeed each other in a legal manner'. Each focus space is constituted by a so-called 'entity-chain', a sequence of RS trees in which the top nucleus of each tree is a fact about a common entity. A sequence of entity-chains is coherent if the focused entity in each chain is mentioned in a preceding chain which is not too far back in the line of discourse.
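The coherence condition on entity-chains can be illustrated with a minimal sketch. It is not the authors' implementation: the EntityChain structure, the is_coherent function and the max_distance parameter (a stand-in for 'not too far') are all hypothetical names chosen for illustration, and the RS trees inside each chain are reduced to the set of entities they mention.

```python
from dataclasses import dataclass, field


@dataclass
class EntityChain:
    """One focus space (hypothetical structure): facts about one focused entity."""
    focus: str                                    # entity the chain is about
    mentioned: set = field(default_factory=set)   # other entities named in its facts


def is_coherent(chains, max_distance=2):
    """Check that each chain's focus was mentioned in a recent preceding chain.

    `max_distance` is an assumed stand-in for 'not too far': only the
    `max_distance` chains immediately before the current one are searched.
    """
    for i, chain in enumerate(chains):
        if i == 0:
            continue  # the opening chain has nothing to link back to
        window = chains[max(0, i - max_distance):i]
        # A preceding chain mentions an entity if it is its focus or appears in its facts.
        if not any(chain.focus in (prev.mentioned | {prev.focus}) for prev in window):
            return False
    return True


# Invented example: a chain about a necklace, then its designer, then gold.
example = [
    EntityChain("necklace", {"designer", "gold"}),
    EntityChain("designer", {"London"}),
    EntityChain("gold"),
]
```

With these invented chains, is_coherent(example, max_distance=2) holds because 'gold' was mentioned two chains back, while max_distance=1 rejects the sequence, which shows how the distance bound constrains legal focus shifts.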
In Chapter 8, Henk Pander Maat claims that the conjunction 'and' does not link two segments directly, but makes them jointly relevant to the surrounding context. His starting point is the relevance theoretic work of Carston (1993) and Blakemore (1987), who were the first to propose joint relevance as the adequate account for 'and'. He strengthens this notion by considering 'and' as a topic continuity marker. Taking topic to be the explicit or implicit question that is being answered by a segment of discourse (Van Kuppevelt 1995), 'and' combines two segments of discourse as a single topic. A corpus study of interclausal conjunctions shows that the majority do present joint relevance. Two kinds of joint relevance environments are distinguished: supporting or elaborating an assumption or answering a single question, this being the most frequent.
Some theoretical implications of his account of the conjunction are discussed. First, the meaning of 'and' is claimed to be procedural, because it constrains the possible implicatures between the joined elements. Second, joint relevance relations are placed in an expansion of the coherence relation classification of Sanders, Spooren and Noordman (1992), as a subtype of non-causal relations, namely additive and comparative. Finally, this account is related to recent work on the role of connectives in the construction of discourse representations, suggesting that the differences between juxtaposed and coordinated sentences should be investigated.
In Chapter 9, Francisca Snoeck Henkemans provides a different perspective on coherence relations and connectives from Argumentation Theory. She argues that the separation of propositional (content) and illocutionary (means-end) levels in theories of coherence relations is often inadequate. As an example, the author states that explanations and argumentations can be confused, because both can be based on a causal relationship, despite their different illocutionary aims: while explanations intend to facilitate comprehension, argumentations try to increase the acceptability of a certain standpoint. Linguistic cues, such as connectives, are of little help, since most of them may signal either argumentation or explanation.
The author argues that the pragma-dialectical approach of argumentation provides a good basis for interpreting the linguistic cues in a well-founded and systematic way. To support her claim, she gives an overview of the main conditions that are taken into account in a pragma-dialectical account of argumentation and explanation. These conditions suggest that systematically linking the propositional and illocutionary levels provides crucial information for the analysis of argumentative discourse.
SECTION 3. From text representation to knowledge representation. Chapter 10 deals with the construction of inferences during text comprehension. The aim of the authors, A.C. Graesser, P. Wiemer-Hastings and K. Wiemer-Hastings, is to show that knowledge plays a central role in inference mechanisms. First, the constructionist theory of inference generation is presented (Graesser et al. 1994). This theory offers predictions about which knowledge-based inferences are generated when readers construct a situational model. Secondly, a three-pronged method for investigating inferences is explained; the three prongs are theoretical predictions, verbal protocols and on-line behavioral measures. To finish, we find a catalogue of relations that are used to relate text constituents and conceptual entities in world knowledge structures. The authors' assumption is that World Knowledge is sufficiently constrained and that it can be integrated into theories of language processing based on lexicon, syntax and semantics.
This last part of the chapter deals with coherence relations, distinguishing local and global coherence. The authors present Zwaan's 'Event indexing' method, which assumes that the reader accesses five conceptual dimensions (protagonist, temporality, spatiality, causality and intentionality). They also present a catalogue (appendix 1) of relations used in a World Knowledge representation, which can be a worthwhile resource. Each relation is defined by a name, a definition, a composition rule and an example. All these relations connect nodes of five categories (Concept, State, Event, Goal, Style).
In Chapter 11 a model for thinking about bodies of knowledge is presented. B. K. Britton, P. Schaefer, M. Bryan, S. Silverman and R. Sorrells report an investigation of the conceptual representation of text. First, they introduce a way of representing knowledge as a network of concepts. Based on expert knowledge on the subject of the text, two structures are created: an Expert Structure (by qualified human experts) and a Predicted Thought Structure (by an algebraic model). This model is based on 'spreading activation and relaxation', which allows related concepts to be activated recursively. Both structures are composed of concepts and relations between concepts.
Two experiments are carried out in order to compare the human thinking process with the Expert Structure and with the Predicted Thought Structure. There are two groups of subjects, a think-condition group and a no-think-condition group: the former were asked to think about the text for a few minutes a day for a week, and the latter were not. The authors' hypothesis is that knowledge structures change as a result of being thought about. Results of the experiment support the model underlying the Predicted Thought Structure. The discussion focuses on three issues: the meaning of the products of thought, their implications for memory storage and retrieval, and the implications of the results for the validity of the representation.
SECTION 4. Segmentation. In Chapter 12, Joost Schilperoord investigates the soundness of analysing pauses in dictations as empirical evidence for discourse planning processes. Ample evidence is provided from previous studies showing that length and location of pauses in discourse are good correlates of the underlying production processes, and that they succeed in distinguishing two levels in planning processing: conceptual and linguistic. Long pauses at paragraph or sentence boundaries signal conceptual planning, while short pauses at word or phrase level signal linguistic planning. Clause level seems to be an intermediate between the two.
Schilperoord analyses how variation in texts influences variation in these correlates of production processes. The underlying hypothesis was that, if the conceptual and linguistic processes interact, variation in texts should cause the pausal correlates to co-vary. Three aspects of variation in text were explored: differences between texts (length, complexity), variation in pause lengths for different location types in texts and 'real production time'. Results show that it was impossible to predict either of the two pause patterns on the basis of the other, for any of the three variables. Therefore, no empirical evidence was gathered supporting the interaction hypothesis, which favours a modular view of language production processes, where conceptual and linguistic processing are independent of each other. Despite that, it was found that processing at clause level is affected by paragraph and sentence level processes, and that it in turn affects the linguistic level, thus partially supporting the interaction hypothesis for this level. The author points out that further research should be carried out to support the empirical results obtained so far, for example by exploring writing processes that require more planning activity than the ones in this study, which were lawyers' dictations of informative letters.
In Chapter 13, Arie Verhagen addresses the delimitation of discursive units, the so-called segments. He focuses on constructions where there is a conflict between syntactical and discursive subordination, namely complex sentences with embedded subject or complement clauses. The RST account of these constructions states that the subordinate clause is to be considered as "part of its host clause" (Mann and Thompson 1988).
Verhagen claims that two different message dimensions of discourse interpretation are necessary to adequately analyse these structures: content and (intersubjective) coordination, similar to the distinction between semantic and ideational relations in RST. Therefore, discourse segments are not to be distinguished linearly, in one dimension, but in two, so that main clauses provide the subject of consciousness for the content of the subordinate clause. This is what the author calls the 'embedding construction'. Assuming this twofold account of discourse, it is the main clause that is actually conceptually dependent on the subordinate one. This conception provides a more adequate explanation of some problematic examples, and succeeds in distinguishing subject and complement clauses from restrictive relatives, which can never constitute a separate discourse segment. In addition, evidence from thematic continuity patterns supports this claim, as the distribution of discourse anaphors is shown to be sensitive to the coordination and content dimensions.
CRITICAL OVERVIEW This book constitutes a good reference to a field which is quite chaotic, because it is young and multidisciplinary. It is not a mere collection of independent works, but provides a joint overview of the field. The editors emphasize the relations between the chapters, so that it is easier to get a global picture of the area.
In an area in constant evolution, where there are no established methodologies, the works presented in this book provide a valuable reference: the analysis of the problems is clear, and the proposed methods and solutions are well argued. In addition, in most cases, the solutions proposed are novel and sound. We would like to highlight a general effort to turn the implications of theoretical models into empirically testable issues that can be objectively supported with psycholinguistic experiments or corpus studies. Moreover, the results of these empirical tests are not given as self-sufficient evidence; instead, authors resort to statistical modelling to provide unbiased interpretations of the data and avoid the fallacy of resting theories entirely on raw data.
In each of the four subtopics of the book, different theoretical perspectives are represented. The authors and editors have been able to interrelate these diverse points of view so that diversity does not lead to useless criticism, but results in mutual enrichment.
REFERENCES Blakemore, D. (1987). Semantic constraints on relevance. London: Basil Blackwell.
Carston, R. (1993). Conjunction, explanation and relevance. Lingua, 90, 27-48.
Chafe, W. L. (1976). Givenness, contrastiveness, definiteness, subjects, topics and point of view. In C. N. Li (ed.), Subject and topic (pp. 25-55). New York: Academic Press.
Givón, T. (1983). Topic continuity in discourse: An introduction. In T. Givón (ed.), Topic Continuity in Discourse: A Quantitative Cross-Language Study (pp. 1-42). Amsterdam: John Benjamins.
Grosz, B. J., Joshi, A., and Weinstein, S. (1995). Centering: A framework for modeling the local coherence of discourse. IRCS Report 95-01, The Institute for research in cognitive science, University of Pennsylvania.
Gundel, J. K., and Mulkern, A. E. (1993). Quantity implicatures in reference understanding. Pragmatics and cognition, 6, 21-45.
Knott, A., and Mellish, C. (1996). A feature-based account of the relations signalled by sentence and clause connectives. Language and Speech, 39, 143-183.
Kuppevelt, J. van (1995). Discourse structure, topicality and questioning. Journal of Linguistics, 31, 109-147.
Levinson, S. C. (1991). Pragmatic reduction of the binding conditions revisited. Journal of Linguistics, 27, 107-161.
Mann, W. C. and Thompson, S. A. (1988). Rhetorical structure theory: A theory of text organization. Text, 8, 243-281.
Sanders, T. J. M., Spooren, W. P. M. and Noordman, L. G. M. (1992). Towards a taxonomy of coherence relations. Discourse Processes, 15, 1-35.
ABOUT THE REVIEWERS Irene Castellón is a professor at the University of Barcelona, in the Linguistics Department. Her main research area is Natural Language Processing, in particular computational grammars and computational lexicography.
Laura Alonso Alemany is a doctoral student at the CLiC, Centre for Language and Computation at the University of Barcelona. Her main research area is Discourse Processing for Automated Text Summarisation in Spanish. She is currently working on a shallow rhetorical parser for Spanish unrestricted text. Her areas of interest are discourse and rhetoric, and natural language processing.