Review of Discourse Patterns in Spoken and Written Corpora

Reviewer: Claudia Sassen
Book Title: Discourse Patterns in Spoken and Written Corpora
Book Author: Karin Aijmer; Anna-Brita Stenström
Publisher: John Benjamins
Linguistic Field(s): Pragmatics; Text/Corpus Linguistics
Date: Mon, 27 Sep 2004 15:44:24 +0200
From: Claudia Sassen
Subject: Discourse Patterns in Spoken and Written Corpora

EDITOR: Aijmer, Karin; Stenström, Anna-Brita
TITLE: Discourse Patterns in Spoken and Written Corpora
SERIES: Pragmatics & Beyond New Series 120
PUBLISHER: John Benjamins
YEAR: 2004

Claudia Sassen, Institut für Deutsche Sprache und Literatur,
Universität Dortmund, Germany

This book is an edited collection of 12 papers which seeks to bring
together corpus-based empirical studies on discourse patterns in speech
and writing. It is a selection of papers presented at the 5th ESSE
Conference in Helsinki, 25-29 August 2000, with some additional
contributions. The papers represent new trends in the study of text and
discourse, mirrored in the alliance of text linguistics with fields
such as corpus linguistics, genre analysis, literary stylistics and
cross-linguistic studies. They also look at the status and meaning of
the terms "text", "discourse" and "function" in modern linguistic
theory. The book is divided into four parts: Part I is about
cohesion and coherence, Part II covers metadiscourse and discourse
markers, Part III offers a discussion on text and information structure
while Part IV treats metaphor and text. An index of names and an index
of terms are appended.

Aijmer & Stenström discuss in their introductory paper whether the use
of different terminology, e.g. "text" for the printed record of
communication and "discourse" for spoken texts, reflects different
perspectives on the same area of research. The discussion begins with
brief reviews ranging from text linguistics to linguistic theory and
function to discourse analysis. Aijmer & Stenström go on to sum up
recent trends in the linguistic study of text and discourse, which
include the use of corpora for text-linguistic purposes, the interface
between speech and writing, and contrastive studies, and conclude with
future prospects. Within this framework, Aijmer & Stenström describe
how the papers of the collection fit together.

Part I. Cohesion and coherence

Baicchi reports on a topic that constitutes one part of an Italian
research project. She explores the interplay of indexical efficiency,
complexity and markedness, which she considers three faces of the
same object. With a corpus based on the Online Books Page and the
English Server, she concentrates on the problem of the marked status of
cataphoric titles. Baicchi analyses titles as textual phenomena and
proposes a taxonomy based upon parameters that belong to semiotics.
With this, Baicchi seeks to supply an alternative classification of
titles and a foundation for their detailed analysis on a larger scale.
She assigns titles to a hierarchical scale of complexity, evaluating
three basic criteria: (i) the quantity of indexical cataphors contained
in the title, (ii) the quality of the cataphors and (iii) the distance
between the cataphors in the title and their co-referents in the text
base. As types of transparency she identifies titles of total
transparency, which clearly identify the referent and thus score a high
value of iconicity; titles of partial transparency, which display some
degree of indeterminacy to be clarified through the reading process;
titles which are symbolically related and thus have a metaphoric link
to their co-referents; and opaque or unrelated titles with the lowest
degree of transparency and

Bruti explores cataphoric relations and complexity in a markedness
framework. Her research is related to the same project as Baicchi's
with the difference that Bruti focusses on spoken discourse. She sets
up an inventory of cataphoric modalities to demonstrate a scope of
cataphoric indexicality which ranges from more empty signs, mainly the
demonstrative pronoun "this", to various degrees of indeterminacy, most
notably the general noun "thing". The London-Lund corpus and the
British National Corpus are the databases for her concordance-based
analysis of the reduced clausal element "you know what?", which
functions as an attention-getter. Bruti opts for a broader definition
of cataphora, describes differences between the cataphoric devices she
identified and proposes and applies a calculating grid to determine
cataphoric complexity of three different types. With this she seeks to
shed light on the variation of cataphoric instances and to propose a
method to predict how cataphoric devices contribute to text complexity.

Hasselgard's goal is to investigate how two quite distinct aspects of
texture, viz. cohesion and thematic structure, interact in sentences
with "multiple themes". Unfortunately, it remains unclear what she
means by this term. Hasselgard makes a cross-linguistic comparison by
means of translations from English into the verb-second (V2) languages
Norwegian and German, since in V2 languages restrictions on the number
of constituents that can appear in the thematic field are stricter than
in English. Norwegian and German tend to rely less on conjuncts than
English and use either conjunctions or no overt conjunctive relation to
mark certain cohesive relations. One of Hasselgard's major findings is
that over 90 per cent of multiple themes in English contain at least
one cohesive tie, which strongly suggests
that an essential function of multiple themes is to tie a sentence
explicitly to the preceding context. She finds the bulk of multiple
themes consisting of a cohesive element coupled with at least one
element that is not cohesive. She thus infers that the use of multiple
themes can bring non-cohesive elements into thematic position without
making the sentence seem unconnected with the preceding context. She
also finds that a multiple theme can mark two or more cohesive
relations at the same time. Future studies might benefit from taking
all three aspects of texture into account: cohesion, thematic structure
and information structure.

Tanskanen discusses cohesion patterns in spoken and written dialogue,
specifically in face-to-face conversation and email mailing list
messages. She explores the use of explicit cohesive markers
(reiteration and collocation relations) and their effects on
collaboration. Comparing the number of cohesive pairs in two-party
conversations with that in three-party conversations, Tanskanen reports
a higher number of relations for the former, particularly simple
repetition pairs produced by the same speaker. Tanskanen concludes that
a dominance of same-speaker devices does not necessarily undermine
collaboration, since communicators can use longer turns in which
creating cohesive relations is possible. In three-party conversations,
collaboration is evident from the negotiating communicants. Owing to
the higher number of speakers, a single speaker has fewer opportunities
to produce cohesive relations; however, it is easier for the speakers
to produce cohesive relations jointly. In terms of cohesion, mailing
lists display a profile that comes close to that of dyadic
conversations, although there is an increase in the number of
collocation pairs, a difference that may be due to the less strict
temporal constraints of email communication. In terms of
collaboration, mailing lists show the highest degree of monologic
properties of the entire corpus data. Despite the differences in number
of participants and context of dyadic conversations and mailing lists,
the communicators' interaction with their interlocutors and the context
results in a similar outcome.

Part II. Metadiscourse and discourse markers

Bamford looks at the interplay of the visual and verbal in
communication, with emphasis on patterns of gestural and symbolic uses
of the deictic "here" in lectures. She finds that gestural deixis is
almost invariably associated with the use of visuals. Bamford confirms
that both gesture and prosody are often associated with deictics,
although how closely they are attached to them varies. Gestural deixis
has a precise referent which is interpretable
when the visual context is available. Less precise are the referents of
symbolic deixis as they form part of the common cognitive space of the
speakers and their student audiences. The referent of symbolic "here"
is often abstract and belongs to the realm of concepts and ideas.
Bamford furthermore claims that since the referent of symbolic "here"
is vague, it allows a variety of meanings and associations to be
attached to it. For this reason, lecturers can use symbolic "here" to
create rapport with their student listeners. Based on her tentative
hypothesis that gestural reference is more common in lectures than in
ordinary conversation, Bamford presumes that deictic use of lexical
items is a promising field for genre-specific further research.

Bondi's paper elucidates the relationship between metadiscourse and
specific disciplinary cultures in the use of connectors, highlighting
the contrastive connector "however" in historical abstracts. Bondi
takes the view that contrastive connectors do not only enable monologic
discourse to be interactive, but also imply evaluation by assuming a
common ground between reader and writer in terms of what is expected or
unexpected at any given point in the discourse. Her quantitative and
qualitative analysis was carried out on small corpora designed for the
study of abstracts and consisted of the following steps: first,
frequency lists and keywords were set up and explored with WordSmith
Tools; then a concordance-based study of "however" served to isolate
contrastive connectors and to explore patterns and meanings from a
comparative point of view. Finally, an in-depth textual analysis of the
use of "however" ensued to identify the core meanings of the connector
and the textual/positional patterns in which these meanings were found.
The elements that precede "however" are identified as having a text-
structuring function, while "however" itself contributes to claiming
significance and credibility, e.g. by problematising or signalling
stance. On the evidence of her findings, Bondi argues for considering
multiple dimensions of language variation in the analysis of discourse
patterns and their markers.
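
The first two steps of the procedure described above (a frequency list
followed by a concordance of "however") can be illustrated with a
minimal sketch. It is not Bondi's implementation, which relied on
WordSmith Tools; the function names, the tokenisation rule, the context
width and the two sample abstracts are all assumptions made up for
illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(texts):
    """Count word tokens across all texts, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common()


def kwic(texts, node, width=3):
    """Return (left context, node, right context) for each hit of `node`."""
    hits = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, tok, right))
    return hits


# Invented sample abstracts, for illustration only.
abstracts = [
    "Earlier studies stress continuity. However, the archival record "
    "suggests a sharp break.",
    "The reform was popular; its effects, however, remained limited.",
]

# Print one concordance line per occurrence of the node word.
for left, node, right in kwic(abstracts, "however"):
    print(f"{left:>30} | {node} | {right}")
```

A sketch like this only retrieves every occurrence of the word form;
separating contrastive from non-contrastive uses of "however" and
interpreting the patterns would still require manual inspection of the
concordance lines.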

Starting out from the "general expectation that speakers cooperate and
use language to facilitate the conveyance of information", Diani claims
that "I don't know" does not live up to this expectation when
considered as a discourse marker. What makes "I don't know" interesting
is the observation that speakers tend to use the phrase even when they
are able to supply the information asked for. Diani describes the
various pragmatic functions of "I don't know" in three respects: (i)
its use within the framework of politeness and saving one's face, (ii)
its meaning and pragmatic functions, (iii) how its pragmatic function
is influenced when it combines with the discourse markers "well", "oh",
"I mean" and "you know". Diani backs her analysis with instances from
the spoken language corpus of the Collins Birmingham University
International Language Database. Diani concludes that although "I don't
know" has significantly different functions, they are unified by the
central meaning of declaring insufficient knowledge. She does not
consider her analysis exhaustive. It could indeed be extended, e.g. by
focusing on positional constraints within the turn and a concomitant
functional change.

Mauranen takes a micro-level approach to hedging. She selects some
typical hedges such as "sort of", "or something", "somewhat" and "just"
and analyses their profiles of use in the Michigan Corpus of Academic
Spoken English, which she compares with data from the British National
Corpus and the Bank of English. She explores individual expressions to
find out to what extent a functional distinction between "epistemic"
and "strategic" is relevant to their usage and to what extent their
primary use falls into one or the other category. Mauranen reports that
she could maintain her distinction with sufficient ease to warrant its
application, although overlaps and bifunctional cases also occur. As
for preferences of use, her results are as follows: "or so", "or
something" and "somewhat" are highly epistemic in use, while the most
strategic hedge is "a little bit". What had initially been classified
as vagueness indicators tends to display epistemic uses.
Mauranen argues for a genre distinction, since the more dialogic genres
tend to have more strategic hedges and the lectures more epistemic
ones.
Bearing in mind that academic writing has long been required
to be impersonal and objective, Samson explores how academic economic
writers convey their knowledge of economics and construct their written
lectures by adopting a personal stance and projecting themselves in
their texts. Thereby they challenge "what according to many should be
written, detached, decontextualised, and autonomous academic language".
Samson's reflections are based on results of a qualitative and
quantitative analysis of 10 written economics lectures on various
topics of macroeconomics which were all constructed in the same way:
they contain an introduction to announce the direction the lecture will
take, a middle to develop hypotheses, theses and model-worlds and a
conclusion. Originally, the lectures existed in the spoken form and
have been expanded by their authors for the written medium. Samson
seeks to show that personal markers (e.g. "I", inclusive and exclusive
"we") in the written economics lectures, which she compares to planned
monologues, are highly frequent and serve to express authorial and
authoritative prominence. She also intends to show that they take on
different meta-discursive roles in order to aid the less expert reader
with comprehension, reinforce the interactional relationship with the
addressee and create a sense of solidarity. The choice of personal
markers mirrors the degree to which an author wants to involve the
reader in what is conveyed, which to some extent requires shared
knowledge.

Part III. Text and information structure

Kaltenboeck focusses on the functional properties and use of
non-extraposition, whose communicative function has largely been
disregarded. Non-extraposition is statistically a marked construction,
as it is far outnumbered by its counterpart, extraposition,
particularly in spoken language. The distribution of the two
constructions is tied to specific contexts which do not normally allow
interchangeability. Non-extraposed subject clauses are generally not
shifted into construction-final position, with the result that the
matrix predicate is in focus and that a more balanced distribution of
information within the construction is attained. This creates a strong
cohesive link with the foregoing context. Sometimes,
non-extraposition may serve to introduce a new topic while presenting
it as if it were generally known, and hence fulfils a rhetorical
purpose.

Part IV. Metaphor and text

On the evidence that translators occasionally fail to translate English
metaphors, Wikberg pursues the question of how qualitative corpus-based
research helps throw light on metaphor in translation. For his
analysis, Wikberg takes instances from the Oslo Multilingual Corpus and
distinguishes three uses of the term "metaphor": as linguistic
expression, cognitive concept and discourse element. He argues that a
linguistic approach to metaphor has to pay due attention to the textual
and communicative aspects of metaphor and aims to include metaphor in
an overall discourse model, which runs counter to earlier approaches
that limit the study of metaphor to the sentence or clause level.
Wikberg, who doubts that it is always possible to envisage the
existence of underlying propositions for all sorts of metaphors,
ignores the propositional level in his paper. What is crucial for
correct translation into the target language is an understanding of the
original metaphorical expression and its pragmatic function, whereby
the respective semantic fields and their interpretation play an
important role. Wikberg proposes an ideal case in which translators and
researchers have access to a word list for the original expression and
a thesaurus and a collocation dictionary for each language.


The book makes a good and interesting read. Its structure as given by
the editors is helpful as a guide through the interrelations of the
different topics. However, the two final sections, "Text and
information structure" and "Metaphor and text", subsume only one paper
each and thus appear a bit odd and artificial. It is debatable whether
it would have been wiser to place these papers in the other sections.
This would probably have resulted in alternative section titles and a
different emphasis of research. Proofreading was nearly flawless, with
only a few negligible typos. These are, however, minor issues, and what
we should definitely remember about the book is that it offers an
informative insight into recent trends and topics in present-day
linguistics. It is ideally suited to everybody interested in corpus
linguistics and allied fields. A wealth of data from many different
genres has been used for investigation. Particular delight arises from
the fact that the analyses seek to bridge gaps between different
linguistic disciplines, most notably text linguistics and corpus
research. Each area of study will undoubtedly benefit from an approach
like this.


Claudia Sassen is a researcher in linguistics at Universität Dortmund.
She holds a doctorate in computational linguistics. In her doctoral
dissertation she explored and formalised constraints in a controlled
language, i.e. cockpit voice recordings of airplane accidents. Her
research interests are computational linguistics, corpus linguistics
and in particular constraints in discourse.