LINGUIST List 13.50

Thu Jan 10 2002

Disc: New: Phonetic Frequencies & "Corpus Phonetics"

Editor for this issue: Karen Milligan


  1. Yuri Tambovtsev, Phonetic frequencies and corpus phonetics as a part of corpus linguistics

Message 1: Phonetic frequencies and corpus phonetics as a part of corpus linguistics

Date: Sat, 5 Jan 2002 22:13:11 +0600
From: Yuri Tambovtsev
Subject: Phonetic frequencies and corpus phonetics as a part of corpus linguistics

Re: Linguist 12.1634

Dear colleagues, 

Thank you to all who answered me. I'd like to answer the question that
appeared in all of your messages: why is it important to compute the
phonemic frequencies of occurrence in a language? Every language has
its own unique sound picture. One can intuitively feel that language A
is different from language B simply by hearing it. The phonemic
frequencies of occurrence create the particular sound mosaic of a
language. Once we obtain the sound picture of every world language, we
can compare the languages with each other. Linguists now believe that
there are about 4000 to 5000 languages in the world. Unfortunately, I
have been able to collect only 120 data sets on phonemic frequency of
occurrence for world languages.
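
As a minimal sketch of the idea above, the following Python fragment
computes the relative frequency of each phoneme in a transcribed text
and compares two such "sound pictures". The function names, the toy
samples, and the choice of total variation distance as the comparison
measure are assumptions made for illustration only; the message itself
does not specify a comparison method.

```python
from collections import Counter


def phoneme_frequencies(phonemes):
    """Return the relative frequency of each phoneme symbol in a sequence."""
    counts = Counter(phonemes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {symbol: count / total for symbol, count in counts.items()}


def profile_distance(freq_a, freq_b):
    """Total variation distance between two frequency profiles.

    This measure is an illustrative assumption, not a method taken
    from the message. It ranges from 0 (identical) to 1 (disjoint).
    """
    symbols = set(freq_a) | set(freq_b)
    return 0.5 * sum(
        abs(freq_a.get(s, 0.0) - freq_b.get(s, 0.0)) for s in symbols
    )


# Hypothetical toy transcriptions, one phoneme symbol per token.
sample_a = "a t a k a n a".split()
sample_b = "o t o k i n e".split()

freq_a = phoneme_frequencies(sample_a)
freq_b = phoneme_frequencies(sample_b)
distance = profile_distance(freq_a, freq_b)
```

Real use would need a phonemic transcription of a long text in each
language; samples this short give unstable frequencies.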

Recently there was a discussion of what corpus linguistics is. I
noticed that many linguists understand it narrowly, as corpus
lexicology only rather than as corpus linguistics as a whole. I propose
to consider corpus phonetics a part of corpus linguistics. By corpus
phonetics I mean the part of corpus linguistics which studies phonetic
features that become apparent only when the text in some language is
long enough. I have processed many long texts in different languages.
This allowed me to obtain some interesting typological results on the
one hand and corpus results on the other. If the text is not long
enough, one cannot obtain corpus phonetics results. I'd like colleagues
in the field of corpus linguistics to share their views on whether
corpus phonetics should be included in corpus linguistics, or whether
corpus linguistics should include only corpus lexicology and corpus
syntax. If corpus phonetics is not included, what should long
transcribed texts be called?

Looking forward to your answers.

Yours sincerely,
Yuri Tambovtsev

Novosibirsk Ped.University, Russia
