LINGUIST List 19.3454
Wed Nov 12 2008
Review: Text/Corpus Linguistics: Hoey et al (2007)
Editor for this issue: Randall Eggert
This LINGUIST List issue is a review of a book published by one of our
supporting publishers, commissioned by our book review editorial staff. We
welcome discussion of this book review on the list, and particularly invite
the author(s) or editor(s) of this book to join in. If you are interested in reviewing
a book for LINGUIST, look for the most recent posting with the subject "Reviews: AVAILABLE FOR REVIEW", and
follow the instructions at the top of the message. You can also contact the
book review staff directly.
Text, Discourse, and Corpora
Message 1: Text, Discourse, and Corpora
From: Stacia Levy <callmesalmsn.com>
Subject: Text, Discourse, and Corpora
Announced at http://linguistlist.org/issues/18/18-2575.html
AUTHORS: Hoey, Michael; Mahlberg, Michaela; Stubbs, Michael; Teubert, Wolfgang
TITLE: Text, Discourse, and Corpora
SUBTITLE: Theory and Analysis
SERIES TITLE: Studies in Corpus and Discourse
Stacia Levy, Ed.D., the University of the Pacific
Arising out of a course on corpus linguistics, the papers in this book bring
together the writings of leading experts in the field, addressing the
methodology from a variety of perspectives. Lexical priming theory,
parole-linguistics (the study of actual language use, which is open to
analysis), and local textual functions are among the topics addressed. The
corpora used range from the BNC (British National Corpus) to smaller
purpose-built corpora and Google searches.
Written by leading linguist John Sinclair, the introduction addresses a number
of issues of corpus linguistics covered in the book: empirical linguistics;
lexical ''priming,'' or the expectations set by each encounter with a word; the
way words are used within social groups; methodology for examining literary
texts; and key issues of corpus linguistics, such as collocation, the
co-occurrence of words in a local context. Both large corpora, such as the BNC,
and small ones, such as the corpus of church writings designed by Teubert, are
investigated. Methodological issues such as corpus design and concordance
analysis of words in their context are also examined.
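For readers new to the term, collocation can be illustrated with a minimal
sketch that counts the words occurring near a chosen node word. This is not a
method taken from the book: the function names, the window size of two words,
and the sample sentences are all invented here for illustration.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def collocates(tokens, node, span=2):
    """Count words found within `span` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left = tokens[max(0, i - span):i]
        right = tokens[i + 1:i + 1 + span]
        counts.update(left + right)
    return counts

# Invented sample sentences (an assumption, not corpus data).
sample = (
    "The war changed the world. She is the best singer in the world. "
    "It was the most natural thing in the world."
)

counts = collocates(tokenize(sample), "world", span=2)
for word, freq in counts.most_common(3):
    print(word, freq)
```

A real study would run the same count over a full corpus and would normally
compare the raw frequencies against each word's overall frequency, which this
sketch does not attempt.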
Chapter 1 Lexical Priming and Literary Creativity
Here Michael Hoey uses sequences from Lewis Carroll's _Through the Looking
Glass and What Alice Found There_, a novel by Michael Moorcock, and a poem by
Philip Larkin to discuss how lexical priming can account for the creativity
found in literary texts. That is, a writer creates something new by first
setting up and then overriding reader expectations about how words will be
used, through both common and novel combinations of words.
Chapter 2 Grammatical Creativity: a Corpus Perspective
In this chapter Hoey continues the examination of lexical priming theory by
addressing the relationship of lexis and grammar. According to Hoey, grammar
is the result of priming: the collective, community expectation of the
collocation and colligation of words (Sinclair, 1991).
Hoey looks particularly at the numeral system and the expected lexical and
grammatical patterns of specific numbers, focusing on the priming of children
and suggesting that children's grammars may be primed by the stories and rhymes
they are read.
Chapter 3 Parole-linguistics and the Diachronic Dimension of the Discourse
Here Wolfgang Teubert addresses discourse as one of the important concepts in
corpus linguistics, suggesting it should be approached from a sociological
perspective, as discourse is created and used within a community. Examined in
this section are the relationship of language and society, language and
meaning, and the ''diachronic dimension'' of discourse, or the view of discourse
over time, as well as hermeneutics, or ''the discipline of text interpretation as
it was developed over the last centuries on the Continent'' (p. 57), and what
linguistics can contribute to text interpretation. The author sees text
interpretation as an ultimately democratic process involving the negotiation of
meaning among a variety of participants, in which linguists have no privileged
status, and argues that ''only a relativist perspective can be the basis of a
pluralistic, democratic society'' (p. 57).
Chapter 4 Natural and Human Rights, Work and Property in the Discourse of
Catholic Social Doctrine
In this chapter, Teubert presents a diachronic corpus-linguistic methodology, or
corpus research over time: specifically, a study of the use of the words ''work''
and ''property'' in a corpus of Church texts (the social encyclicals, which set
out official Church policy on a number of fronts) and of how those terms evolve
over time in relation to the concepts of ''natural law'' and ''human rights.''
Teubert states that diachronic corpus linguistics is interested in
''investigating the change that socially constructed discourse objects undergo
over time'' (p. 89). In this research, the author examines how the meaning of
specific words changes over time and what that reveals about social change. For
example, ''work and property were not always seen as related concepts'' (p. 90):
being a wealthy landowner in the past actually precluded a need to work, and
only in modern times have those terms become linked, as property began to be
seen as ''the result of work and is legitimated thereby'' (p. 94). Similarly,
rights and responsibilities as dictated by the Church went through a change from
being explained and justified in terms of ''natural law'' to being explained in
terms of ''human rights.''
Chapter 5 On Texts, Corpora, and Models of Language
In this chapter, Stubbs addresses using corpora to develop a model of language
use and to resolve differences between the theoretical language system and
actual use. One of the major contributions of corpora is their ability to
reveal large quantities of data on actual language use, rather than relying on
native speaker intuition and introspection, which are often unreliable, chiefly
because native speakers are often unaware, even on reflection, of what is
''expected, predictable, usual, normal, and typical'' (p. 155) in language. It is
these typical patterns, however, that corpus studies, especially through the use
of concordancers, can reveal. Stubbs uses the example of the word ''tolerate'' to
show what corpora can reveal about language use, showing the different usage
patterns of ''tolerates'' and the contexts in which it typically occurs. Corpus
study, in its concern with such patterns of use by actual speakers of the
language, is therefore inherently ''empirical'' and ''sociolinguistic,''
concerned with the actual behavior of speakers from specific language
communities.
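What a concordancer does can be shown in a few lines: it finds every occurrence
of a search word and lines each one up with its surrounding context (a
keyword-in-context, or KWIC, display). The sketch below is only an illustration
and is not the software Stubbs uses; the function name, the context width, and
the three sample sentences are invented.

```python
import re

def kwic(text, stem, width=30):
    """Return (left context, match, right context) for words starting with `stem`."""
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

# Invented sample sentences (an assumption, not corpus data).
sample = (
    "The school will not tolerate bullying. He tolerates no dissent. "
    "Such behaviour cannot be tolerated."
)

hits = kwic(sample, "tolerat")
for left, match, right in hits:
    # Right-align the left context so the matched words form a column.
    print(f"{left:>30} [{match}] {right}")
```

Reading down the aligned column is what lets an analyst see recurring
neighbours of the search word, such as negation, that introspection tends to
miss.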
Chapter 6 Quantitative Data on Multi-Word Sequences in English: the Case of the
In chapter 6, Stubbs continues the discussion of the contributions of corpus
studies toward revealing language patterns with a study of the specific word
''world'' and its usage. This is done by first forming the corpus, then using
software to search for patterns, and finally making generalizations about those
patterns. Stubbs uses the BNC, a large and well-known corpus, to study one of
its top ten nouns, ''world,'' and shows how this word occurs in many fixed and
semi-fixed phrases, such as ''World War'' and ''in the world,'' which accounts
for its being one of the most common nouns. Such phrases are a combination of
lexis and grammar: that is, they have both semantic and syntactic functions.
They function together, as in ''most natural thing in the world,'' containing
''obligatory grammar lexis... and grammar'' (p. 165). These phrases also have a
pragmatic function, Stubbs points out: in this case, positive evaluation.
Stubbs also discusses at length the use of the PIE (Phrases in English)
database (Fletcher, 2003-2006) to find and analyze such frequently used
phrases, again using the example of phrases containing ''world.'' He addresses
problems in such analyses, such as determining a cut-off point for frequent use
and the similarity of many phrases, as well as what he believes to be one of
the most important contributions of corpus linguistics: Sinclair's model of
extended language units, in which meaning lies in the patterns in which words
occur more than in the individual words themselves. He ends by discussing
important previous research in this area of case studies of
Chapter 7 Lexical Items in Discourse: Identifying Local Textual Functions of
In this chapter, the first of the final two chapters, both written by her,
Mahlberg studies the use of a specific phrase, ''sustainable development,'' in
a corpus of news articles, a phrase ''increasingly important in our society''
(p. 197). She looks at the patterns in which it typically occurs.
Chapter 8 Corpus Stylistics: Bridging the Gap between Linguistic and Literary
In this last chapter, Mahlberg presents corpus studies as a ''way of bringing
the study of language and literature closer together'' (p. 219). Using a corpus
of Dickens texts, she looks specifically at long clusters of language:
eight-word phrases such as ''not to put too fine a point upon'' (p. 227) that
recur in Dickens, the characters the phrases are associated with, and the
phrases' functions within the text.
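Finding such clusters amounts to counting every run of n consecutive words and
keeping the runs that repeat. The sketch below shows the idea under stated
assumptions: it is not Mahlberg's procedure or software, the function name and
the minimum frequency of two are arbitrary choices, and the two sample sentences
are invented around the phrase quoted in the review rather than taken from
Dickens.

```python
from collections import Counter
import re

def clusters(text, n=8, min_freq=2):
    """Return every n-word sequence that occurs at least `min_freq` times."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {gram: freq for gram, freq in grams.items() if freq >= min_freq}

# Invented sentences built around the quoted phrase (not Dickens's text).
sample = (
    "He said, not to put too fine a point upon it, that he was tired. "
    "She replied, not to put too fine a point upon it, that she was not."
)

repeated = clusters(sample, n=8, min_freq=2)
for gram, freq in sorted(repeated.items()):
    print(freq, gram)
```

Note that overlapping clusters are reported separately, so one repeated nine- or
ten-word stretch yields several eight-word hits. This echoes the problem of
similar phrases that the review mentions in connection with chapter 6.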
This is valuable reading for anyone planning to design a research study based on
corpus methodology; it is very detailed, with concrete examples of the types of
research that can be done through corpus methodology and of the concerns
researchers typically encounter. The editor does a good job of including
different perspectives and a variety of methodologies; the book is
comprehensive. In some chapters, such as Stubbs' chapter on the word ''world,''
the reader is walked through designing a small study using corpora. Most
chapters include both an introduction and a conclusion, making the main ideas
of each chapter more accessible.
However, this text is not, for the most part, written for the layperson but for
scholars with some familiarity with corpus linguistics: it assumes the reader
knows terms like ''discourse,'' ''priming,'' and ''collocation,'' which are
familiar to linguists, particularly corpus linguists, but not to a general
audience. The book would benefit from a glossary defining such terms.
Indeed, the book seems at times to plunge into specialized fields outside of
linguistics, albeit related to it, such as philosophy. For example, in chapter
3 on parole-linguistics, the author notes, ''We can understand society as a
structure of and for the interactions between people, human beings with
autonomous minds, with a sense of self-awareness and intentionality. This
implies that the people themselves and their consciousness are not a part of the
society as defined here'' (p. 58). This observation is eventually connected to
the topic, but this is dense, specialized text. For specialists, however,
especially those planning to develop a research project using corpus
methodology, this is a useful book.
Fletcher, (2003-2006). _PIE: Phrases in English_ [Database]. http://pie.usna.edu.
Sinclair, J. (1991). _Corpus, concordance, collocation_. Oxford: Oxford
ABOUT THE REVIEWER
Stacia Levy, Ed.D., teaches writing at the University of the Pacific, where she
earned her doctorate. Her areas of expertise and research are academic writing
and corpus studies.