LINGUIST List 13.34

Tue Jan 8 2002

Review: Gómez-González, The Theme-Topic Interface

Editor for this issue: Simin Karimi <>

What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in.

If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Simin Karimi at or Terry Langendoen at



  • Laura Alonso i Alemany, Review of M. Ángeles Gómez-González, The Theme-Topic Interface

    Message 1: Review of M. Ángeles Gómez-González, The Theme-Topic Interface

    Date: Tue, 8 Jan 2002 11:54:40 +0100 (MET)
    From: Laura Alonso i Alemany <>
    Subject: Review of M. Ángeles Gómez-González, The Theme-Topic Interface

    Gómez-González, María Ángeles, (2001), The Theme-Topic Interface: Evidence from English, Amsterdam/Philadelphia: John Benjamins. Pragmatics & Beyond NS 71. 434 pages.

    announced in the linguistlist at

    Reviewed by Laura Alonso i Alemany, Ph.D. student of Linguistics.

    SYNOPSIS. This book has two main objectives: to shed light on the nebulous studies of Theme and Topic and to demonstrate the functional relevance of clause-initial position as a Theme zone in present-day English. The underlying hypothesis of the whole investigation is that the Theme zone acts as a discourse-building device of cross-linguistic validity. The author makes an extensive and intensive critique of approaches to the subject in order to identify significant evidence and relevant parameters for the study of Theme/Topic. These are then systematised in a theoretical apparatus that tries to reconcile very heterogeneous accounts and at the same time serves as a background for the author's own work, clearly defining concepts and spotting points of conflict. The notion of syntactic Theme is described in the light of the parameters established and clarified in three main ways:

    - by giving formal clues that distinguish it from the concept of Discourse Topic
    - by differentiating two kinds of givenness to which Theme establishes contrasting relations
    - by establishing a taxonomy of classes of syntactic Themes, each described with a set of 27 features

    Moreover, an extensive investigation of syntactic Theme is conducted by applying quantitative techniques to a corpus of spoken English, which provides empirical evidence that supports the author's theoretical claims and grounds the taxonomy of classes of syntactic Theme. The book is well written and has a clear and well-planned structure. The abundance of references throughout the whole text is supported by an extensive bibliography, an index with helpfully descriptive section titles, a list of Figures and Tables, an inventory of Abbreviations and Conventions and a name index.

    The book is divided into three parts. In the first, a critical overview is presented of a number of approaches to theme or topic as discourse-pragmatic categories, above all when associated with the thematic patterning of the clause, resulting in a classification of approaches that is in accordance with previous classifications (Gundel 1994). The second part focuses on the contributions and shortcomings of three major functionalist schools in dealing with this aspect of language. In the last part of the book, the author presents her own work, which applies corpus techniques to obtain statistically significant evidence to systematise the formal features and discourse functions of sentence-initial position in a corpus of contemporary spoken English. I am going to present a summary, together with an evaluation, of each of these three parts in turn.

    THE FIRST PART OF THE BOOK is a remarkable catalogue of previous work. It can be read as a self-contained study because of its wide scope, the insight of the critique and the systematicity with which a significant number of heterogeneous approaches have been organised, with very useful charts and tables. The author spots three main reasons for confusion in the field:

    - the labels "Theme" and "Topic" have received different interpretations
    - indeterminacy of functional categories
    - variety of functionalist frameworks, differing in the degree of functionalism and in the perspective (form-to-function vs. function-to-form)

    To give an accurate analysis of the inadequacies and contradictions of each approach, each of these three factors is addressed in a different way. The terminological confusion and conceptual vagueness derived from the first problem are dealt with by means of a reasoned classification of previous work into three major lines (chapter 2). At the same time, this classification serves to ground the theoretical concepts that will later be applied in the author's own study. The second and third problems are not given a dedicated part of the book, but they provide the grounds for the analysis of previous work that is carried out. Moreover, both the general aim of the author's work and a good part of her methodology are motivated by the recognition of unsatisfactorily solved issues in the area.

    Chapter 2 evaluates the contributions and shortcomings of three major approaches to the study of Theme/Topic, resorting to examples often taken from the original works:

    1. In semantic approaches, Theme/Topic is considered to express "what the message is about", the aboutness of the message. One of the main problems in adequately characterising this approach lies in the confusion around the concept of "aboutness" itself. Much of the fuzziness about this term is due to the fact that it has often been used to account for heterogeneous phenomena, which would be more adequately dealt with within informational or syntactic approaches. But even when a strict standpoint is adopted, the concept is hard to define in an objective, operative way. Within this approach, three different directions are distinguished:

    1.1. In semantic-relational accounts, the Theme/Topic establishes a relationship of aboutness with the clausal predication. The aim of these accounts is to identify the sentence element that the speaker announces in order to then say something about it, and that plays an anchoring role with respect to the previous discourse. Two main unresolved issues are spotted in this direction. In the first place, the assumption that individual messages are dual, consisting of 'something that is talked about' (Theme or Topic) and 'something that is said about something else' (Rheme or Comment), is called into question. In the second place, the need is put forward to find objective markers that identify the communicative categories postulated in this approach. The markers used so far, syntactic or referential-informational, are problematic because of their lack of homogeneity.

    1.2. In semantic-referential accounts, the relationship of aboutness is established with the overall discourse, in contrast to the clausal/message scope of relational accounts. The main reason for this wider range is that human discourse is considered to be multipropositional and thematically coherent. Within this approach, Theme/Topic can be identified as both grammatically and cognitively salient entities that establish anaphoric and cataphoric relationships with their co(n)text. In relation to this, various scales are proposed that account for the relative topicality or continuity of entity Topics at the various linguistic levels, so that subjects, agents or items with the semantic feature +human rank higher than objects, accusatives or +inanimates, respectively (Givón 1993:206).

    1.3. In semantic-interactive accounts, aboutness is not defined beforehand, but continuously negotiated by speakers throughout discourse. Due to this dynamic perspective, Topics/Themes are unlikely to be identified with a part of a sentence, so most of the work in this direction focuses on the formal markers of Topic shifts. The main explanatory inadequacy of this approach is the fact that no objective definition is provided for basic operative concepts such as speakers' Topics or discourse Topics.

    2. In informational approaches, Theme/Topic is considered as given information. Just like 'aboutness', 'givenness' does not seem to provide a solid base for defining the category of Topic/Theme unequivocally. To better analyse the problem, two kinds of givenness are distinguished:

    2.1. relational givenness, if the Given-New contrast is established within the scope of individual clauses.

    2.2. referential givenness, if the cognitive or discursive saliency of utterance referents is determined by their relation to the discourse co(n)text (referential-contextual givenness) or to a model of the speakers' minds (referential-activated givenness), appealing to concepts such as recoverability, predictability, shared knowledge or assumed familiarity. Some of these concepts have been highly formalised, thus providing an adequate toolset for a satisfactory study of the phenomena.

    Informational approaches are also concerned with the thematic patterning of the clause. One of the current hypotheses is that information tends towards a (Given-)towards-New movement, whereas New-towards(-Given) is the marked option, reserved for special communicative functions. One problem with informational approaches is that many of them mix up two different dimensions, namely shared knowledge and theme, consequently creating more confusion. Besides, no operative definition of the notion of Topic/Theme is provided, since it is described indirectly, in relation to other communicative categories. Moreover, the explanatory power of these accounts is apparently restricted to nominal expressions, so the question arises whether only nominal expressions qualify for Theme/Topical status.

    3. In syntactic approaches, Theme/Topic is identified as (clause) initial position. In contrast with the other two, syntactic accounts of theme are rather homogeneous. Their underlying axiom is that clause/message initial position is a universal category fulfilling a semantico-pragmatic function, that of Theme/Topic. Nevertheless, an operational criterion that systematically identifies the initial constituent of a message is still missing. What is more, the assumption that clause-initial positions have some grammatical relevance still has to be empirically demonstrated. In addition, there is a lack of empirical evidence that can provide an adequate delimitation of the category of syntactic theme. The author suggests that statistically significant data from natural language should provide the basis for such a delimitation, thus contextualising her own work and supplying a well-argued motivation for it.

    Within the structure of the book, this critical overview constitutes a solid motivation for the author's own work and framework. In the first place, it serves to place her approach in the complex field of Theme and Topic, thus making it possible to evaluate it in the adequate context. Secondly, it establishes a set of reference concepts which prove very valuable in further discussion of theoretical claims and concrete phenomena. Last but not least, it identifies some of the questions that still have to be solved, describes some of them in depth and sketches out some of the possible work that should be done to address them.

    Although this overview can be read as an independent catalogue of the field, one should not forget that it is aimed at motivating the author's work. Thus, those aspects that are more closely related to syntactic theme and the formal account of phenomena in natural language are given stronger emphasis. One sometimes has the impression that approaches are evaluated mainly in relation to the solutions they provide for the problems that the author has encountered in her own work. A good example of this is the fact that one of the main issues in judging syntactic accounts is their adequacy in determining clause-initial position as a Theme zone, with little or no mention of significant contributions such as general clause-patterning principles, interactions of grammatical structure with conversational implicatures and focus, etc. Also, semantic-interactive accounts are said to "avoid, instead of providing answers to, the difficulties inherent in the notion of Theme/Topic" (p. 30) because they are concerned with identifying the formal markers of Topic shifts, and not with identifying Topics as parts of sentences, which is the objective of the author. A similar critique is made of informational approaches, arguing that "the notion of Theme/Topic is not defined directly but rather is described [...] in relation to such elusive concepts as "recoverability", "predictability", "shared knowledge" and "saliency"" (p. 44). Surprisingly, the author herself cites at length some notable attempts at formalising some of these 'elusive concepts', such as Grosz, Joshi and Weinstein (1995) or Vallduví and Engdahl (1996).

    Another aspect that has to be taken into account is the functionalist perspective of this overview, which could be considered a source of bias in the evaluation of the various approaches. However, given that very heterogeneous accounts are dealt with, that the analysis of each of them is grounded on sound arguments and that this theoretical position is made explicit from the very beginning, one should consider the (moderate) functionalist perspective as a characteristic of this evaluation rather than as a limitation.

    THE SECOND PART OF THE BOOK is a sympathetic critique of previous accounts of pragmatic functions within the frameworks of three major functionalist schools.

    In the Prague School (Chapter 3), aboutness is considered as co(n)textually recoverable information. A general lack of consistency is noted, which shows in the interchangeable use of theoretically separate notions such as Given and Theme and in the fusion of relational (what the clause is about) and referential-semantic (what the text is about) perspectives to achieve illusory solutions to problematic issues. Moreover, little data is provided to support theoretical claims, and most of it is clause-centred, with no co(n)textual evidence.

    In Systemic Functional Grammar (SFG) (Chapter 4) the subject is addressed from a relational-semantic perspective, identifying it with clause-initial position in English, and therefore from a perspective close to the author's. Many contributions to the study of Theme/Topic are spotted in this critique: firstly, the identification of a double-sided nature of topical Theme, separating relational-semantic features (what the clause is about) from syntactic ones (point of departure), which provides a better explanation for many problematic phenomena. Other interesting issues raised by SFG are the notion of 'displaced Theme' and the attempts to delimit clause-initial position. However, the author points out that a cross-linguistic study of Theme would supply quantitative and qualitative evidence to settle some of the points that remain open.

    In contrast to SFG, Functional Grammar (Chapter 5) addresses the subject from a referential-semantic perspective, with Topic designating the entity about which the predication predicates something in the whole discourse, and Theme representing an initial predication-external entity that the predication is about. A third concept is introduced, Tail, a right-most predication-external element modifying the predication. A generalised merging of syntactic, semantic and informational criteria is found to result in inconsistent conclusions and controversy, mostly about the criterion of aboutness/relevance, the criterion of initial position, the treatment of givenness and the assignment of Topic and Focus.

    THE THIRD PART OF THE BOOK is devoted to the author's own account of syntactic Theme. In Chapter 6, the theoretical foundations and methodology of the study are presented, whereas Chapter 7 discusses the results obtained from the analysis of the corpus.

    The main aim of the author is to demonstrate the functional relevance of the Theme zone, or clause-initial position. As required in functionalist models, Theme is described not only in relation to features at the same linguistic level, namely the syntactic level, but also in relation to the morphological, cognitive or socio-pragmatic levels. Thus, issues such as the cognitive salience of clause-initial position, subjectivity, themes as discourse markers and others are discussed in order to describe Theme adequately. This network of interrelationships constitutes the basis for an empirically grounded and systematic account of features at any of the levels, which results in a taxonomy of classes of Theme that tries to reconcile conflicting accounts of Theme/Topic and to overcome some of the problems spotted in the preceding sections. In the proposed classification (p. 181), major clauses are defined as (+Process, +Predicator, +Theme), and one can distinguish the following features:

    1. Transitivity
    2. Mood
    3. Theme
       3.1. Theme selection
            3.1.1. Theme unmarked -- +subject, finite, wh-word, etc.
            3.1.2. Theme marked -- adjunct, complement, process, etc.
       3.2. Theme special
            3.2.1. identification -- pseudo-cleft clauses
            3.2.2. predication -- cleft-clauses
            3.2.3. substitution -- right detachment
            3.2.4. reference -- left detachment
            3.2.5. inversion
            3.2.6. it-extraposition
            3.2.7. there-existential
            3.2.8. non-special theme

    Although this classification is mainly inspired by the SFG model, discrepancies between the two arise from a new approach to marked and unmarked Theme. The author defines her own taxonomy as a 'survey of thematic options', where distinctions are established between default thematic options and those which the speaker chooses to perform a noteworthy communicative function. This motivated choice constitutes the principal source of the semantics of Theme, and is defined by parameters such as internal structure, non-special vs. special thematic constructions (including a novel account of extended multiple themes) and unmarked vs. marked Theme-Rheme patterns, the latter taking into account mood, voice, canonical word order and relative ordering of topical, textual and/or interpersonal elements in the Theme zone. In a conciliatory spirit, the characterisation of non-special thematic constructions is compatible with previous accounts. As for special thematic constructions, the ones studied in detail are existential-there constructions, it-extrapositions, inversions, left detachments, right detachments, clefting and pseudo-clefting.

    An extensive corpus study is presented that provides empirical evidence to support the theoretical claims made by the author. The corpus used is the Lancaster IBM Spoken English Corpus (LIBMSEC), which consists of 49285 words of radio broadcast, divided into ten textual categories that provide a certain delimitation of the socio-pragmatic parameters, invaluable to account for speakers' roles, intentions, etc. The main disadvantage of this corpus is that it does not reflect spontaneous oral language. Its relatively small size constitutes both an advantage and a disadvantage: on the one hand, since automatic searching for Themes as they are characterised by the author is impossible, a manual analysis is the only option left, and in that respect the LIBMSEC is a human-sized corpus: 4097 tokens of syntactic Themes in major clauses are obtained. On the other hand, it does not contain instances of all the subclasses postulated in the taxonomy, and some are represented by a very small number of instances. To keep statistical significance from being reduced for rarely occurring classes, the notion of peripheral member is resorted to.

    The description of each of the syntactic theme classes and instances is translated into 27 features (Appendix), which are filled in for every one of the 4097 tokens, so that they can be treated from a quantitative point of view. This translation illustrates the possibility and productivity of the interaction between complex grammatical description and quantitative methods by applying corpus linguistics techniques in an area traditionally devoted to non-quantitative description. The analysis of these variables was done by means of three statistical tests: the Chi Square association test, Fisher's Exact test and the Stepwise Logistic Regression procedure. These tests are appropriate for exploiting raw frequencies of nominal variables, by classifying different tokens into categories based upon some of their defining features. In this case, the tests provided a classification of the tokens of syntactic theme which constitutes strong empirical evidence for establishing a classification of syntactic Theme.
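
    As a rough illustration of how two of the tests named above operate on raw frequencies of nominal variables, the following sketch computes a Pearson Chi Square statistic and a two-sided Fisher's Exact p-value for a 2x2 contingency table using only the Python standard library. The function names and the example counts are assumptions made for illustration; they are not taken from the book, and the Stepwise Logistic Regression procedure is not covered here.

```python
from math import comb


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two nominal variables.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = unmarked / marked Theme, columns = two text categories.
example = [[10, 20], [30, 40]]
print(chi_square_2x2(example), fisher_exact_2x2(example))
```

    Fisher's Exact test is the usual fallback when some cells hold very few tokens, which matches the reviewer's remark that some classes are represented by very few instances.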

    Some significant corpus-based conclusions are:

    - Unmarked, non-special themes are characterised by being the initial Transitivity/Mood constituent demanded by each active mood pattern, expressed by a noun group which is the agent/subject of a declarative clause and which occupies the initial predication-internal position. The criterion of markedness is supported by the fact that unmarked, non-special themes are more frequent than marked ones (with a result of p=0.144% in the Chi Square test), thus constituting the speaker's default choice within the available thematic options. They introduce informative messages and convey co(n)textually recoverable information.

    - Within special Theme constructions, the most frequent are There-existentials, followed by Subject It-Extrapositions, Inversions, It-Clefts, left detachments and right detachments. All of them tend to have a high number of preposings and tend to be realized clause-externally. They generally occur in subjective texts because they convey conventional implicatures that re-orient the typical discourse flow. A detailed characterisation of each of the seven classes of special Theme construction is given in section 7.4.

    - In instances of multiple themes (Extended Multiple Themes), topical themes tend to be triggered by the presence of structural rather than interpersonal elements in the Theme zone. Since they code experiential meaning intervening in choices of mood, topical Themes are congruently within the scope of both interpersonal and logico-conjunctive Themes, which occupy outer slots within the Theme zone, in conformity with their increasing scope potential, in a pattern as follows: (logico-conjunctive)^(interpersonal)^topical^(interpersonal)^(logico-conjunctive).

    - Preposings are typically realized by prepositional, adverbial or clausal circumstantial Adjuncts expressing condition, place or time, and are typical of formal and planned discourse.

    - The passive serves to place unmarked Focus on a final constituent which receives thematic highlighting as expressing the speaker's point of view.
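
    The ordering pattern quoted above for Extended Multiple Themes can be read as a small sequence grammar. The sketch below checks whether a sequence of element types fits it. It assumes that each parenthesised slot holds at most one element and that exactly one topical Theme is present; both are simplifying assumptions of this sketch, and the names CODES, THEME_ZONE and fits_theme_zone are hypothetical.

```python
import re

# One-letter codes for the three element types named in the pattern.
CODES = {"logico-conjunctive": "L", "interpersonal": "I", "topical": "T"}

# (logico-conjunctive)^(interpersonal)^topical^(interpersonal)^(logico-conjunctive):
# optional outer slots around one obligatory topical Theme.
THEME_ZONE = re.compile(r"L?I?TI?L?")


def fits_theme_zone(elements):
    """Return True if the sequence of element types matches the pattern."""
    try:
        encoded = "".join(CODES[element] for element in elements)
    except KeyError:
        # Unknown element type: it cannot fit the pattern.
        return False
    return THEME_ZONE.fullmatch(encoded) is not None


print(fits_theme_zone(["logico-conjunctive", "interpersonal", "topical"]))
print(fits_theme_zone(["interpersonal", "logico-conjunctive", "topical"]))
```

    The first call follows the outer-to-inner ordering of the pattern, while the second places the interpersonal element outside the logico-conjunctive one and is therefore rejected.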

    Besides, in an effort to give a satisfactory explanation of some of the issues regarded as inconsistent in previous approaches, the notion of syntactic Theme is precisely delimited. It is first differentiated from Topic, a term that is used for the reference points that a speaker has at hand at a given point of discourse. The relation between this newly defined Topic and syntactic Theme is accounted for in terms of Thematic Progression, taking Theme as a discourse-structuring device. Consequently, one of the main criticisms made of semantic-interactive accounts is resolved. Secondly, to avoid making the same mistake attributed to the Prague School, a clear distinction is made between referential givenness (what is recoverable at a certain point of discourse) and relational givenness (the addressee's current focus of attention).

    To conclude, Chapter 8 is an excellent synthesis, in which one can get a global picture of the whole book together with a critical evaluation of the main contributions and shortcomings of the work discussed, including the author's. Owing to the huge amount of information that is condensed, there is a high concentration of concepts, specific terms and references to the theoretical apparatus built throughout the book, but the clarity of the exposition facilitates the reading. A number of directions for further research are proposed, namely the extension of the thematic constructions studied and of the linguistic levels considered for their analysis, as well as research on Themes at levels other than the clause and on the category of Rheme. Much stress is placed on the explanatory power of cross-linguistic and cross-textual evidence for the validation of clause-initial position as a linguistic universal. However, no remark is made as to the quantitative and qualitative shortcomings of the corpus. Enhancing the presented study by applying it to a larger corpus or to a corpus that is more representative of spontaneous oral language might yield significant improvements in the results.

    Throughout the whole book there is a moderate but persistent vindication of the use of statistics, which can constitute a qualitative improvement in a field that has traditionally relied on introspective data. An explanation of the possible contributions of statistics to the field is provided in section 6.4.3, and some of the related shortcomings are pointed out in section 6.4.2, above all in relation to the lack of corpus exploitation tools that can provide data of linguistic quality in statistically significant quantity. However, some of the proposals for the use of statistics in Chapter 2 seem not to take account of these drawbacks, as for example the suggestion that the scales for entity Topics/Themes adduced by semantic-referential approaches to Theme could strengthen their position by means of statistically significant empirical evidence. It is not clear whether this strengthening would be provided after clearly defining these scales or as a way to define them, and the difficulties of applying statistics to such complex and fuzzy phenomena are not even mentioned. Nevertheless, the author has sufficiently proved that quantitative methods can satisfactorily account for highly complex linguistic phenomena, so there is no reason to doubt that they could be successfully applied to other spheres of analysis.

    The use of statistical methods for describing non-paradigmatic phenomena, such as syntactic ones, is certainly a very significant contribution. However, the collaboration between the two kinds of knowledge is not symmetrical. Since statistics is clearly subordinate to grammatical needs, the features used are motivated by grammatical theoretical claims only, while their relative statistical relevance, significance and productivity are left unexploited. A deeper exploration of this aspect might yield surprisingly good results in further research.

    References: Givón, Talmy, (1993), English Grammar (vol. 2), Amsterdam/Philadelphia: John Benjamins.

    Grosz, Barbara J., Joshi, Aravind K., and Weinstein, Scott, (1995), "Centering: A framework for modelling local coherence in discourse", Computational Linguistics 21 (2): 203-26.

    Gundel, Janette, (1994), "On the different kinds of Focus", Focus and Natural Language Processing, vol. 3, P. Bosch and R. A. van der Sandt (eds.), 457-466. Heidelberg. IBM Deutschland: IBM Working Papers of the Institute for Logic and Linguistics 8.

    Vallduví, Enric, and Engdahl, E., (1996), "The linguistic realization of information packaging", Linguistics 34: 459-519.

    About the reviewer: Laura Alonso Alemany is a doctoral student. She is currently affiliated with the CLiC (Centre for Language and Computation), in the Department of General Linguistics of the University of Barcelona.