LINGUIST List 15.2427

Tue Aug 31 2004

Review: Psycholinguistics: Schmitt (2004)

Editor for this issue: Naomi Ogasawara

What follows is a review or discussion note contributed to our Book Discussion Forum. We expect discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for review." Then contact Sheila Dooley Collberg at


  1. TSCHICHOLD Cornelia, Formulaic Sequences

Message 1: Formulaic Sequences

Date: Tue, 31 Aug 2004 11:27:31 -0400 (EDT)
From: TSCHICHOLD Cornelia
Subject: Formulaic Sequences

EDITOR: Schmitt, Norbert
TITLE: Formulaic Sequences
SUBTITLE: Acquisition, processing and use
SERIES: Language Learning & Language Teaching 9
PUBLISHER: John Benjamins
YEAR: 2004
Announced at

Cornelia Tschichold, Department of English, Université de Neuchâtel

''Formulaic Sequences'' is an edited volume of twelve papers, with a
focus on the acquisition of multi-word lexemes by non-native, adult
learners. All the contributions to the volume broadly assume
Sinclair's (1991) notion of language as functioning under a
combination of the open-choice principle and the idiom principle, and
most of them also draw on Wray (2002) for the consequences of this
assumption for learners of a foreign language. The term ''formulaic
sequence'' is adopted by most of the authors as a term that covers
other more specific terms (such as ''phrasal lexeme'', ''lexical
chunk'', ''collocation'', ''prefabricated language'', etc.) and is
much more wide-ranging than the traditional phraseological units,
i.e. phrasal verbs, idioms and metaphors.


In the introduction, Norbert Schmitt and Ronald Carter give some
background to the volume, outline the main problems in relation to
formulaic sequences (definition, formal adaptability, psycholinguistic
reality, functions in the discourse, learning burden) and provide an
overview of the chapters. They also point out the numerous questions
that remain open to research.

In a paper on measurement methodology, John Read and Paul Nation
describe the various difficulties linguists and lexicographers face
when trying to decide what to include in their inventory of formulaic
sequences. To ensure reliability in the process of deciding what is to
be included in the phraseological lexicon, several trained raters need
to arrive at the same conclusion for specific word groups. One of the
main problems here is the amount of variation phraseological lexemes
are subject to and the challenge this poses for both purely
computational, corpus-based approaches and the definition of what to
include within one's phraseological lexicon. (Wray's (2002) definition
of formulaic sequence does not include sequences that have undergone
transformations or substitutions of individual words.)
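
The reliability requirement can be made concrete with a small
agreement calculation. The sketch below is only an illustration: the
review does not say which statistic Read and Nation use, and Cohen's
kappa is chosen here as one common option. It covers just two raters,
whereas the chapter speaks of several. The function name and the
example judgements are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items.

    Each argument is a list of labels (here True/False for "is this word
    group a formulaic sequence?"), one label per candidate word group.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item list")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical judgements on four candidate word groups.
rater_a = [True, True, False, False]
rater_b = [True, False, False, False]
print(cohens_kappa(rater_a, rater_b))
```

A value near 1 would indicate that the raters largely agree on what
belongs in the inventory; a value near 0 would indicate agreement no
better than chance.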

Koenraad Kuiper, in a chapter on conventionalized varieties of speech,
investigates the language used in professional fields where highly
conventionalized phrases are an integral part of the speech people
produce. He compares the language learnt and used by auctioneers and
(certain) sports reporters to the linguistic apprenticeship that
traditional story tellers and oral poets need to go through. Talking
constitutes a significant part of their work, and in order to produce
fluent speech they rely on a number of highly formulaic sequences and
other conventions. These cases and examples from other professions
(supermarket checkout operators, weather forecasters, script writers)
show that newcomers must be initiated into the formulaic tradition
before they can use it and introduce their own variations. Kuiper also
argues that many groups and subgroups of human societies have their
own smaller or larger oral tradition. By looking at these rather
extreme cases of formulaic sequences in use, it is hoped that some
light can be shed on the more everyday varieties of conventionalized

The next paper, by Norbert Schmitt, Zoltán Dörnyei, Svenja Adolphs,
and Valerie Durow, is the first in a series of studies from the
University of Nottingham. The authors report on the acquisition of a
set of formulaic sequences by international students learning English
as a foreign language and preparing for their studies at a British
university. The subjects in the study made progress during the period
tested, but the results did not correlate with standard measurements
of motivation.

This somewhat surprising result gave rise to the next study, described
here by Zoltán Dörnyei, Valerie Durow, and Khawla Zahran. In a
qualitative study, seven of the international students whose progress
was followed in the initial study were interviewed in order to find
out more about their motivation and degree of acculturation. Given
that language aptitude on its own could not explain the degree of
progress the learners made, the authors conclude that sociocultural
adaptation and contact with the local native speakers were central to
the learners' success, and only very high degrees of both motivation
and language aptitude can make up for lack of acculturation.

Going a step further, Svenja Adolphs and Valerie Durow then look in
more detail at the progress in the use of formulaic sequences by two
students with widely diverging degrees of sociocultural
integration. They quantitatively investigate the students' use of
three-word sequences over a period of seven months. Their results show
that the student who integrated well into the host society made much
better progress in her use of the most frequent type of formulaic
sequences than the other student, who had relatively little social
contact with native speakers.
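
A count of three-word sequences of the kind described above can be
sketched in a few lines. This is not Adolphs and Durow's procedure,
which the review does not detail. The tokenisation rule, the function
name and the sample text are assumptions made for illustration, and
the sketch lets sequences run across sentence boundaries for
simplicity.

```python
from collections import Counter
import re


def three_word_sequences(text):
    """Count every run of three consecutive words in a transcript."""
    # Lower-case the text and keep letters and apostrophes only, so that
    # "don't" stays one token and punctuation is dropped.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each token with its two successors to form overlapping triples.
    return Counter(zip(tokens, tokens[1:], tokens[2:]))


# Hypothetical learner utterances.
sample = "I don't know what you mean. I don't know why."
counts = three_word_sequences(sample)
print(counts.most_common(1))
```

Comparing such counts for the same speaker at different points in time
would give a rough picture of whether frequent sequences are being
used more often.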

Following this group of studies on acquisition, Norbert Schmitt, Sarah
Grandage, and Svenja Adolphs introduce the next group of Nottingham
studies, directed at the processing of formulaic sequences. The
authors report on a study that aimed to test the psycholinguistic
validity of frequent word strings (derived from a corpus) for both
native and non-native speakers. They selected 25 so-called recurrent
clusters, based on several published sources of frequent clusters, and
inserted these into a story that was then used in a dictation
task. The results from the native speaker group suggest that the
clusters differ in their psycholinguistic coherence, possibly due to
differing degrees of transparency. As could be expected, the
non-native speakers scored lower on the dictation task, producing fewer
wholly correct clusters, and more variation or hesitation, a result
which can be interpreted as pointing to a non-holistic storage of the

Geoffrey Underwood, Norbert Schmitt and Adam Galpin then used the
method of tracking eye movements during a reading task as the basis
for their study. Their assumption was that the last word of a
formulaic sequence would get less eye fixation time than the same word
outside a formulaic sequence. The authors show that the last word in a
formulaic sequence does indeed get less fixation time, thus confirming
the hypothesis that the last word was expected by the reader. But the
hypothesis was borne out only for the case of the native speaker
readers. The results from the experiments with the non-native speakers
were less conclusive and difficult to compare to the results of native
speakers. Non-native readers obviously have considerably more
fixations on the text as a whole, but the last words of formulaic
sequences received fewer, not shorter, fixations. Theories on reading
and eye-movement do not seem to offer an explanation of this

In the follow-up experiment, Norbert Schmitt and Geoffrey Underwood
used self-paced reading (by clicking a key to see the next word on the
screen) to find out whether formulaic sequences were processed faster
than non-formulaic sequences. While the native speakers read faster
than the non-natives, the terminal words in the lexical chunks did not
show a difference. Given these inconclusive results, the authors point
out that it is doubtful whether the methodology is a useful approach
to their research question.

In the next contribution, Carol Spöttl and Michael McCarthy compared
subjects' knowledge of formulaic sequences across several
languages. They report on the results of think-aloud protocols by
multilingual participants who were asked to translate formulaic
sequences from English into their L1 (German) and then into their L3
(and L4). The authors show that only some well-known and frequent
expressions were translated holistically and without hesitation. Most
expressions gave rise to some analysis and evaluation on the part of
the participants.

Typographic salience provides the background for Hugh Bishop's study
on look-up behaviour and comprehension of formulaic sequences by
language learners. While studies on single word salience and ensuing
look-up behaviour do not show a clear advantage for marked texts, the
author set up an experiment for formulaic sequences based on the
assumption that learners do not necessarily recognize such unknown
lexemes in a running text and therefore miss out on the noticing stage
generally assumed to be essential to learning. Results show that there
is indeed a clear difference in students' look-up behaviour if
formulaic sequences are made salient. One reason for this is probably
the fact that single words within a printed text are set off by
blanks, but multi-word units do not have this identifying feature and
thus go unnoticed much more easily. Typographic salience could thus
make a much more marked difference for multi-word units than for
single words.

Alison Wray's contribution is based on data from a beginning Welsh
learner, who spent a few intensive days memorizing Welsh in order to
appear on television. In the programme, ''Welsh in a Week'',
individual learners are taught enough Welsh phrases to get them
through a specific situation. The phrases were taught (and presumably
learnt) as holistic units, and at the end of the week, the learner
succeeded in giving her cookery demonstration in largely correct,
fluent Welsh. Five months later, she still remembered most of her
text, but despite the strongly holistic teaching approach, she
introduced a small number of errors, a phenomenon which must be due to
linguistic analysis. This suggests that learners, adults at least, do
analyse chunks of language, even if it would serve their immediate
goals better to just learn the text by heart.

In the last chapter of the volume, Martha Jones and Sandra Haywood
report on a study which tried to raise their students' awareness of
formulaic sequences in academic texts. After evaluating some widely
used textbooks for English for Academic Purposes (EAP), the authors
chose a corpus of lexical chunks typical of this genre and worked on
this set with their students, learners of English in a presessional
course at Nottingham. While they clearly reached their goal of raising
students' awareness, students' use of formulaic sequences in the
posttest hardly improved.


The volume as a whole is a very accessible collection of papers that
show a good range of empirical studies on the acquisition and
processing of formulaic sequences. In contrast to many other books on
multi-word lexemes, this volume does not concentrate on the selection
of the appropriate set of multi-word items, but focuses on
second-language learners and the possible processes that facilitate the
learning of formulaic sequences. While this is certainly one of the
strengths of this volume, this focus might also have led to a less
detailed consideration of the lexemes used in the various
studies. Multi-word lexemes obviously come in many different types and
sizes, with widely varying syntactic structures and vast differences
in semantic opaqueness. Some contributions are clearly inspired by
research on vocabulary and tend not to focus on the considerable
differences between single words and multi-word units. Schmitt and
Underwood's self-paced reading study, for example, does not seem to
take into account the syntactic structure of the formulaic sequences
used in the task, a factor that could certainly be expected to have an
impact on the speed of reading. Other factors, such as the fact that
translating is a rather specialized skill and does not come to even
very fluent bilinguals in a natural way, could help to explain some of
the results in the Spöttl and McCarthy study. Being forced to
activate several languages in one's brain might impede easy access to
long phrases. Given Sinclair's idiom principle and open-choice
principle, language users typically have both routes open to them and
choose the idiom principle for speed and ease of processing. For adult
second-language learners, the situation could well be
different. Analysing strings of language rather than learning them by
heart might facilitate long-term retention or provide an alternate
route to a formulaic sequence if holistic memory fails.

A number of the studies in the volume give rather inconclusive
results, an aspect which could be somewhat frustrating to the authors,
but also tends to highlight the fact that we still have much to find
out about formulaic sequences. We probably need an even more strongly
interdisciplinary approach to such lexemes in order to reach more
solid findings. But ending up with more questions after reading a book
than one started out with is not necessarily a bad thing after all and
should be seen as doing credit to this book.


Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford: OUP.

Wray, A. (2002). Formulaic language and the lexicon. Cambridge: CUP.


Cornelia Tschichold teaches English linguistics at the University of
Neuchâtel, Switzerland. Her research interests focus on English
phraseology, computational lexicography and intelligent
computer-assisted language learning.