Gómez-González, María Ángeles, (2001), The Theme-Topic Interface: Evidence
from English, Amsterdam/Philadelphia: John Benjamins. Pragmatics & Beyond
NS 71. 434 pages.
announced in the linguistlist at
Reviewed by Laura Alonso i Alemany, Ph.D. student of Linguistics.
This book has two main objectives: to shed light on the nebulous studies
of Theme and Topic and to demonstrate the functional relevance of
clause-initial position as a Theme zone in present-day English. The
underlying hypothesis of the whole investigation is that the Theme zone
acts as a discourse-building device of cross-linguistic validity.
The author makes an extensive and intensive critique of approaches to the
subject to identify significant evidence and relevant parameters for the
study of Theme/Topic. These are then systematised in a theoretical
apparatus that tries to reconcile very heterogeneous accounts and at the
same time serves as a background for the author's own work, clearly
defining concepts and spotting points of conflict. The notion of syntactic
Theme is described in the light of the parameters established and
unambiguously clarified in three main ways:
- by giving formal clues that distinguish it from the concept of Discourse
- by differentiating two kinds of givenness to which Theme establishes
- by establishing a taxonomy of classes of syntactic Themes, each
described with a set of 27 features
Moreover, an extensive investigation of syntactic Theme is conducted by
applying quantitative techniques to a corpus of spoken English, which
provides empirical evidence that supports the author's theoretical
claims and grounds the taxonomy of classes of syntactic Theme.
The book is well-written and has a clear and well-planned structure. The
abundance of references throughout the whole text is supported by an
extensive bibliography, an index with helpfully descriptive section
titles, a list of Figures and Tables, an inventory of Abbreviations and
Conventions and a name index.
The book is divided into three parts. The first presents a critical
overview of a number of approaches to theme or topic as
discourse-pragmatic categories, above all when associated with the
thematic patterning of the clause, resulting in a classification of
approaches that is in accordance with previous classifications (Gundel
1994). The second part focuses on the contributions and shortcomings of
three major functionalist schools in dealing with this aspect of language.
In the last part of the book, the author presents her own work, which
applies corpus techniques to obtain statistically significant evidence to
systematise the formal features and discourse functions of
sentence-initial position in a corpus of contemporary spoken English. I am
going to present a summary, together with an evaluation, of each of these
three parts in turn.
THE FIRST PART OF THE BOOK is a remarkable catalogue of previous work. It
can be read as a self-contained study because of its wide scope, the insight
of the critique and the systematicity with which a significant number of
heterogeneous approaches have been organised, with very useful charts and
tables. The author spots three main reasons for confusion in the field:
- the labels "Theme" and "Topic" have received different
- indeterminacy of functional categories
- variety of functionalist frameworks, differing in the degree of
functionalism and on the perspective (form-to-function vs.
To give an accurate analysis of the inadequacies and contradictions of
each approach, each of these three factors is addressed in a different
way. The terminological confusion and conceptual vagueness derived from the
first problem are dealt with through a reasoned classification of previous
work into three major lines (chapter 2). At the same time, this
classification grounds the theoretical concepts that will later be applied
in the author's own study. The second and third problems are not given a
dedicated part of the book, but they provide the grounds for the
analysis of previous work that is carried out. Moreover, both the general
aim of the author's work and a good part of her methodology are motivated
by the recognition of unsatisfactorily solved issues in the area.
Chapter 2 evaluates the contributions and shortcomings of three major
approaches to the study of Theme/Topic, resorting to examples often taken
from the original works:
1. In semantic approaches, Theme/Topic is considered to express "what the
message is about", the aboutness of the message.
One of the main problems in adequately characterising this approach lies in
the confusion around the concept of "aboutness" itself. Much of the
fuzziness about this term is due to the fact that it has often been used
to account for heterogeneous phenomena, which would be more adequately
dealt with within informational or syntactic approaches. But even when a strict
standpoint is adopted, the concept is hard to define in an objective,
Within this approach, three different directions are distinguished:
1.1. In semantic-relational accounts, the Theme/Topic establishes a
relationship of aboutness with the clausal predication. The aim of these
accounts is to identify the sentence element that the speaker announces in
order to then say something about it, and that anchors the message to the
previous discourse. Two main unresolved issues are spotted in this
direction. In the first place, the assumption is questioned that
individual messages are dual, consisting of 'something that is talked
about' (Theme or Topic) and 'something that is said about something else'
(Rheme or Comment). In the second place, the need is put forward to
find objective markers that identify the communicative categories postulated
in this approach. The markers used so far, syntactic or
referential-informational, are problematic because of their lack of
1.2. In semantic-referential accounts, the relationship of aboutness is
established with the overall discourse, in contrast to the clausal/message
scope of relational accounts. The main reason for this wider range is that
human discourse is considered to be multipropositional and thematically
coherent. Within this approach, Theme/Topic can be identified as both
grammatically and cognitively salient entities that establish anaphoric
and cataphoric relationships with their co(n)text. In relation to this,
various scales are proposed that account for the relative topicality or
continuity of entity Topics in the diverse linguistic levels, so that
subjects, agents or items with the semantic feature +human rank higher
than objects, accusatives or +inanimates, respectively (Givón 1993:206).
1.3. In semantic-interactive accounts, aboutness is not defined
beforehand, but continuously negotiated by speakers throughout discourse.
Due to this dynamic perspective, Topics/Themes are unlikely to be
identified with a part of a sentence, so most of the work in this
direction focuses on the formal markers of Topic shifts. The main
explanatory inadequacy of this approach is the fact that no objective
definition is provided for basic operative concepts such as speakers'
Topics or discourse Topics.
2. In informational approaches, Theme/Topic is considered as given
information. Just like 'aboutness', 'givenness' does not seem to provide a
solid base for defining the category of Topic/Theme unequivocally. To
better analyse the problem, two kinds of givenness are distinguished:
2.1. relational givenness, if the Given-New contrast is established
within the scope of individual clauses.
2.2. referential givenness, if the cognitive or discursive saliency of
utterance referents is determined by their relation to the discourse
co(n)text (referential-contextual givenness) or a model of the speakers'
minds (referential-activated givenness), appealing to concepts such as
recoverability, predictability, shared knowledge or assumed familiarity.
Some of these concepts have been highly formalised, thus providing an
adequate toolset for a satisfactory study of the phenomena.
Informational approaches are also concerned with the thematic patterning
of the clause. One current hypothesis is that information tends to follow
a (Given-)towards-New movement, whereas New-towards(-Given) is the
marked option, reserved for special communicative functions.
Some problems with informational approaches are that many of them mix up
two different dimensions, namely shared knowledge and theme, consequently
creating more confusion. Besides, no operative definition of the notion of
Topic/Theme is provided, since it is described indirectly, in relation to
other communicative categories. Moreover, the explanatory power of these
accounts is apparently restricted to nominal expressions, so the question
arises whether only nominal expressions qualify for Theme/Topical status.
3. In syntactic approaches, Theme/Topic is identified as (clause) initial
In contrast with the other two, syntactic accounts of theme are rather
homogeneous. Their underlying axiom is that clause/message initial
position is a universal category fulfilling a semantico-pragmatic
function, that of Theme/Topic. Nevertheless, an operational criterion that
systematically identifies the initial constituent of a message is still
missing. What is more, the assumption that clause initial positions have
some grammatical relevance should be empirically demonstrated. In
addition, there is a lack of empirical evidence that can provide an
adequate delimitation of the category of syntactic theme. The author
suggests that statistically significant data from natural language should
provide the basis for such a delimitation, thus contextualising her own
work and supplying a well-argued motivation for it.
Within the structure of the book, this critical overview constitutes a
solid motivation for the author's own work and framework. In the first
place, it serves to place her approach in the complex field of Theme and
Topic, thus making it possible to evaluate it in the adequate context.
Secondly, it establishes a set of reference concepts which prove very
valuable in further discussion of theoretical claims and concrete
phenomena. Last but not least, it identifies some of the questions that
still have to be solved, describes some of them in depth and sketches out
some of the possible work that should be done to address them.
Although this overview can be read as an independent catalogue of the
field, one should not forget that it is aimed at motivating the author's
work. Thus, those aspects that are more closely related to syntactic theme
and the formal account of phenomena in natural language are given stronger
emphasis. One sometimes has the impression that approaches are evaluated
mainly in relation to the solutions they provide for the problems that the
author has encountered in her own work. A good example of this is the fact
that one of the main issues in judging syntactic accounts is their
adequacy to determine clause-initial position as a Theme zone, making
little or no mention of significant contributions such as general
clause-patterning principles, interactions of grammatical structure with
conversational implicatures and focus, etc. Also, semantic-interactive
accounts are said to "avoid, instead of providing answers to, the
difficulties inherent in the notion of Theme/Topic" (p. 30) because they
are concerned with identifying the formal markers of Topic shifts, and not
with identifying Topics as parts of sentences, which is the objective of
the author. A similar critique is levelled at informational approaches,
arguing that "the notion of Theme/Topic is not defined directly but rather
is described [...] in relation to such elusive concepts as
'recoverability', 'predictability', 'shared knowledge' and 'saliency'"
(p. 44). Surprisingly, the author herself cites at length some notable
attempts at formalising some of these 'elusive concepts', such as Grosz,
Joshi and Weinstein (1995) or Vallduví and Engdahl (1996).
Another aspect that has to be taken into account is the functionalist
perspective of this overview, which could be considered as a source of
bias to the evaluation of the various approaches. However, given that very
heterogeneous accounts are dealt with, that the analysis of each of them
is grounded on sound arguments and that this theoretical position is made
explicit from the very beginning, one should consider the (moderate)
functionalist perspective as a characteristic of this evaluation rather
than as a limitation.
THE SECOND PART OF THE BOOK is a sympathetic critique of previous accounts
of pragmatic functions within the frameworks of three major functionalist
In the Prague School (Chapter 3), aboutness is considered as
co(n)textually recoverable information. A general lack of consistency is
remarked, which shows in the interchangeable use of theoretically
separate notions such as Given and Theme and in the fusion of relational
(what the clause is about) and referential-semantic (what the text is about)
perspectives to achieve illusory solutions to problematic issues.
Moreover, little data is provided to support theoretical claims, and most
of it is clause-centred, with no co(n)textual evidence.
In Systemic Functional Grammar (SFG) (Chapter 4) the subject is addressed
from a relational-semantic perspective that identifies it with clause-initial
position in English, and is therefore close to the author's own perspective.
Several contributions to the study of Theme/Topic are singled out in this
critique. The first is the identification of a double-sided nature in topical
Theme, separating relational-semantic features (what the clause is about)
from syntactic ones (point of departure), which provides a better
explanation for many problematic phenomena. Other interesting issues
raised by SFG are the notion of 'displaced Theme' and the attempts to
delimit clause-initial position. However, the author points out that a
cross-linguistic study of Theme would supply quantitative and qualitative
evidence to settle some of the questions that remain open.
In contrast to SFG, Functional Grammar (Chapter 5) addresses the subject
from a referential-semantic perspective, with Topic designating the entity
about which the predication predicates something in the whole discourse,
and Theme representing an initial predication-external entity which
the predication is about. A third concept is introduced, Tail, a
rightmost predication-external element modifying the predication. A
generalised merging of syntactic, semantic and informational criteria is
found to result in inconsistent conclusions and controversy, mostly about
the criterion of aboutness/relevance, the criterion of initial position
and the treatment of givenness and the assignment of Topic and Focus.
THE THIRD PART OF THE BOOK is devoted to the author's own account of
syntactic Theme. In Chapter 6, the theoretical foundations and methodology
of the study are presented, whereas Chapter 7 discusses the results
obtained from the analysis of the corpus.
The main aim of the author is to demonstrate the functional relevance of
the Theme-zone, or clause-initial position. As required in functionalist
models, Theme is described not only in relation to features at the same
linguistic level, namely the syntactic level, but also in relation to the
morphological, cognitive or socio-pragmatic levels. Thus, issues such as the
cognitive salience of clause-initial position, subjectivity, themes as
discourse markers and others are discussed to adequately describe Theme.
This network of interrelationships constitutes the basis for an
empirically-grounded and systematic account of features in any of the
levels, which results in a taxonomy of classes of Theme that tries to
reconcile conflicting accounts of Theme/Topic and to overcome some of the
problems spotted in the preceding sections. In the proposed classification
(p. 181), major clauses are defined as (+Process, +Predicator, +Theme),
and one can distinguish the following features:
3.1. Theme selection
3.1.1. Theme unmarked -- +subject, finite, wh-word, etc.
3.1.2. Theme marked -- adjunct, complement, process, etc.
3.2. Theme special
3.2.1. identification -- pseudo-cleft clauses
3.2.2. predication -- cleft-clauses
3.2.3. substitution -- right detachment
3.2.4. reference -- left detachment
3.2.8. non-special theme
Although this classification is mainly inspired by the SFG model,
discrepancies between the two arise from a new approach to marked and
unmarked Theme. The author defines her own taxonomy as a 'survey of
thematic options', where distinctions are established between default
thematic options and those which the speaker chooses to perform a
noteworthy communicative function. This motivated choice constitutes the
principal source of the semantics of Theme, and is defined by parameters
such as internal structure, non-special vs. special thematic constructions
(including a novel account of extended multiple themes) and unmarked vs.
marked Theme-Rheme patterns, the latter taking into account mood, voice,
canonical word order and relative ordering of topical, textual and/or
interpersonal elements in the Theme zone. In a conciliatory spirit, the
characterisation of non-special thematic constructions is compatible with
previous accounts. As for special thematic constructions, the ones studied
in detail are existential-there constructions, it-extrapositions,
inversions, left detachments, right detachments, clefting and
An extensive corpus study is presented that provides empirical evidence to
support the theoretical claims made by the author. The corpus used is the
Lancaster IBM Spoken English Corpus (LIBMSEC), which consists of 49285
words of radio broadcast, divided into ten textual categories that provide a
certain delimitation of the socio-pragmatic parameters, invaluable to
account for speakers' roles, intentions, etc. The main disadvantage of
this corpus is that it does not reflect spontaneous oral language. Its
relatively small size constitutes both an advantage and a disadvantage: on
the one hand, since automatic searching of Themes as they are
characterised by the author is impossible, a manual analysis is the only
option left, and in that respect the LIBMSEC is a human-sized corpus: 4097
tokens of syntactic Themes in major clauses are obtained. On the other
hand, it does not contain instances of all the subclasses postulated in
the taxonomy, and some are represented by very few tokens. To avoid a
loss of statistical significance for rarely occurring classes, the
notion of peripheral member is resorted to.
The description of each of the syntactic Theme classes and instances is
translated into 27 features (Appendix), which are filled in for every one of
the 4097 tokens, so that they can be treated quantitatively. This
translation illustrates how productively complex grammatical description
can interact with quantitative methods when corpus linguistics techniques
are applied in an area traditionally devoted to non-quantitative
description. The variables were analysed by means of three statistical
procedures: the Chi-Square test of association, Fisher's Exact test and
Stepwise Logistic Regression. These tests are appropriate for exploiting
raw frequencies of nominal variables, by classifying different tokens into
categories based upon some of their defining features. In this case, the
tests provided a classification of the tokens of syntactic Theme which
constitutes strong empirical evidence for the proposed taxonomy.
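To make the first of these tests concrete for readers less familiar with it,
here is a minimal sketch of a Pearson chi-square test of association between
two nominal features of hand-annotated tokens. It is not taken from the book:
the feature names (markedness, text_type), the category labels and all the
counts are invented for illustration, and the p-value helper only covers the
case of one degree of freedom.

```python
import math
from collections import Counter


def chi_square_association(tokens, row_feature, col_feature):
    """Pearson chi-square statistic and degrees of freedom for two
    nominal features recorded on each token (a dict of feature values)."""
    rows = sorted({t[row_feature] for t in tokens})
    cols = sorted({t[col_feature] for t in tokens})
    observed = Counter((t[row_feature], t[col_feature]) for t in tokens)
    row_totals = Counter(t[row_feature] for t in tokens)
    col_totals = Counter(t[col_feature] for t in tokens)
    n = len(tokens)

    statistic = 0.0
    for r in rows:
        for c in cols:
            # Count expected in this cell if the two features were independent.
            expected = row_totals[r] * col_totals[c] / n
            statistic += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree
    of freedom (the case of a 2x2 table)."""
    return math.erfc(math.sqrt(statistic / 2.0))


# HYPOTHETICAL counts, not the LIBMSEC figures reported in the book.
hypothetical_counts = {
    ("unmarked", "news"): 30,
    ("marked", "news"): 10,
    ("unmarked", "commentary"): 20,
    ("marked", "commentary"): 20,
}

# Expand the counts into one record per token, as a manual annotation would.
tokens = [
    {"markedness": markedness, "text_type": text_type}
    for (markedness, text_type), count in hypothetical_counts.items()
    for _ in range(count)
]

statistic, dof = chi_square_association(tokens, "markedness", "text_type")
p_value = p_value_one_dof(statistic)
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.4f}")
```

With cells as sparse as those of the rarely occurring Theme classes, the
chi-square approximation becomes unreliable, which is presumably why an exact
test is used alongside it.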
Some significant corpus-based conclusions are:
- Unmarked, non-special themes are characterised by being the initial
Transitivity/Mood constituent demanded by each active mood pattern,
expressed by a noun group which is the agent/subject of a declarative
clause which occupies the initial predication-internal position. The
criterion of markedness is supported by the fact that unmarked,
non-special themes are more frequent than marked ones (with a result of
p=0.144% in the Chi Square test), thus constituting the speaker's default
choice within the available thematic options. They introduce informative
messages and convey co(n)textually recoverable information.
- within special Theme constructions, the most frequent are
There-existential, followed by Subject It-Extrapositions, Inversions,
It-Clefts, left detachments and right detachments. All of them tend to
have a high amount of preposings and tend to be realized
clause-externally. They generally occur in subjective texts because they
convey conventional implicatures that re-orient the typical discourse
flow. A detailed characterisation of each of the seven classes of special
Theme construction is given in section 7.4.
- in instances of multiple themes (Extended Multiple Themes), topical
themes tend to be triggered by the presence of structural rather than
interpersonal elements in the Theme zone. Since they code experiential
meaning intervening in choices of mood, topical Themes are congruently
within the scope of both interpersonal and logico-conjunctive Themes,
which occupy outer slots within the Theme zone, in conformity with their
increasing scope potential, in a pattern as follows:
- preposings are typically realized by prepositional, adverbial or
clausal circumstantial Adjuncts expressing condition, place or time, and
are typical of formal and planned discourse
- passive serves to place unmarked Focus on a final constituent which
receives thematic highlighting as expressing the speaker's point of view.
Besides, in an effort to give a satisfactory explanation for some
of the issues regarded as inconsistent in previous approaches, the notion
of syntactic Theme is precisely delimited. It is first differentiated from
Topic, a term that is used for the reference points that a speaker has at
hand at a given point of discourse. The relation between this newly
defined Topic and syntactic Theme is accounted for in terms of Thematic
Progression, taking Theme as a discourse structuring device. Consequently,
one of the main criticisms levelled at semantic-interactive accounts is
addressed. Secondly, to avoid making the same mistake attributed to the
Prague School, a clear distinction is made between referential givenness
(what is recoverable at a certain point of discourse) and relational
givenness (the addressee's current focus of attention).
To conclude, Chapter 8 is an excellent synthesis, in which one can get a
global picture of the whole book together with a critical evaluation of
the main contributions and shortcomings of the work discussed, including
the author's. Owing to the huge amount of information that is condensed,
there is a high concentration of concepts, specific terms and references
to the theoretical apparatus built throughout the book, but the clarity of
the exposition facilitates the reading. A number of directions for further
research are proposed, namely extending the range of thematic constructions
studied and the linguistic levels considered for their analysis, and
research on Themes at levels other than the clause or on the category of
Rheme. Much stress is given to the explanatory power of cross-linguistic
and cross-textual evidence for the validation of clause-initial position
as a linguistic universal. However, no remark is made as to the
quantitative and qualitative shortcomings of the corpus. Enhancing the
presented study by applying it to a larger corpus or to a corpus that is
more representative of spontaneous oral language might yield significant
improvements on the results.
Throughout the whole book there is a moderate but persistent vindication
of the use of statistics, which can constitute a qualitative improvement
in a field that has traditionally relied on introspective data. An
explanation of the possible contributions of statistics to the field is
provided in section 6.4.3, and some of the related shortcomings are
pointed out in section 6.4.2, most of all in relation to the lack of
corpus exploitation tools that can provide data of linguistic quality in
statistically significant quantity. However, some of the proposals for the
use of statistics in Chapter 2 seem not to take account of these
drawbacks, as for example the suggestion that the scales for entity
Topics/Themes adduced by semantic referential approaches to Theme could
strengthen their position by means of statistically significant empirical
evidence. It is not clear whether this strengthening would be provided
after clearly defining these scales or as a way to define them, and the
difficulties of applying statistics to such complex and fuzzy phenomena
are not even mentioned. Nevertheless, the author has sufficiently proved
that quantitative methods can satisfactorily account for highly complex
linguistic phenomena, so there is no reason to doubt that they could be
successfully applied to other spheres of analysis.
The use of statistical methods for describing non-paradigmatic phenomena,
such as syntactic ones, is certainly a very significant contribution.
However, collaboration between the two kinds of knowledge is not
symmetrical. Since statistics is clearly subordinate to grammatical needs,
the features used are motivated by grammatical theoretical claims only,
while their relative statistical relevance, significance and productivity
are left unexploited. A deeper exploration of this aspect might yield
surprisingly good results in further research.
Givón, Talmy, (1993), English Grammar (vol. 2), Amsterdam/Philadelphia:
Grosz, Barbara J., Joshi, Aravind K., and Weinstein, Scott, (1995),
"Centering: A framework for modelling local coherence in discourse",
Computational Linguistics 21 (2): 203-26.
Gundel, Janette, (1994), "On the different kinds of Focus", Focus and
Natural Language Processing, vol. 3, P. Bosch and R. A. van der Sandt
(eds.), 457-466. Heidelberg. IBM Deutschland: IBM Working Papers of the
Institute for Logic and Linguistics 8.
Vallduví, Enric, and Engdahl, E., (1996), "The linguistic realization of
information packaging", Linguistics 34: 459-519.
About the reviewer:
Laura Alonso Alemany is a doctoral student.
She is currently affiliated to the CLiC (Centre for Language and
Computation), in the Department of General Linguistics of the University