LINGUIST List 13.99

Thu Jan 17 2002

Disc: Phonetic Frequencies & "Corpus Phonetics"

Editor for this issue: Karen Milligan


  1. Mark Jones, Re: LINGUIST 13.88, Phonetic Frequencies & "Corpus Phonetics"

Message 1: Re: LINGUIST 13.88, Phonetic Frequencies & "Corpus Phonetics"

Date: Wed, 16 Jan 2002 09:16:33 +0000
From: Mark Jones
Subject: Re: LINGUIST 13.88, Phonetic Frequencies & "Corpus Phonetics"

Greg Kochanski makes a number of very valuable points and suggestions about 
the corpus-based approach.

However, practical experience with non-university-educated speakers in the 
field suggests to me that some of his suggestions may not work in practice, 
specifically asking a speaker to mark a text for potentially contrastive 
features like prosody. Although non-linguists are capable of doing such 
things, it often takes a great deal of time to explain this approach to 
language, and an ad hoc explanation in response to a particular speaker 
comment may itself, of course, prejudice different speakers to do things in 
different ways.

Secondly, his objection to laboratory speech is of course a very valid one, 
but it applies equally to reading a large text "seeded" with test words. 
Actual real-world speech not only differs in phonetic terms from lab speech, 
but also differs greatly from writing in terms of the words and constructions used. 
Anyone who has attempted to use something as uncontrolled as free 
conversation to elicit complex grammatical forms will testify to the amount 
of material needed to gain one subjunctive, for example. Using subjunctives 
in a reading text may therefore bias a speaker towards using a particular 
style, and style shifting may potentially take place within one text.

The same is true of vocabulary items, which may have a regional and social 
distribution not immediately apparent. As inflectional endings may also vary 
in their distribution in dialect (I'm thinking here of the use of present 
tense endings or past tense forms in some English dialects), the text may 
produce something which is a hybrid, and it may be a hybrid for different 
speakers at different points. This is uncontrolled and cannot be assessed 
without a knowledge of what counts as a representative token of one style as 
a yardstick.

In the end what one may gain from studying such a text may be less reliable 
as an indicator of any one style of speech, and it may be simpler and 
quicker to do a lab type elicitation task and some free conversation from 
the same speaker.

Obviously lab speech needs to be very controlled too in terms of the above 
as far as elicitation texts are concerned. However, as such elicitation 
sentences usually focus on one particular aspect of pronunciation, and are 
therefore shorter, this is a much more manageable task.

Of course it is true that one gets different answers to different questions 
using different approaches. However, I feel that to regard lab speech as 
unnatural is incorrect. It is unspontaneous, but may be more representative 
of natural speech than a long text containing some of the non-local or 
socially and stylistically marked features I have referred to above.

In my own research, writing a dialect feature in one elicitation sentence 
(the reduced definite article t' of some northern English English dialects) 
prompted one speaker to say 'again' as [agen] in that sentence. In the 
identical sentence without the written dialect feature the same speaker 
consistently said [agein]. This suggests that the speaker was using more 
'natural' speech in the sentence with the dialect feature than in the 
sentence which lacked it, and indicates the degree of influence one 
social/regional feature can have on elicited data on a sentence by sentence 
and word by word basis.
