Summary Details

Query:   Vocabulary Statistics
Author:  Richard Hudson
Linguistic Field(s):   Text/Corpus Linguistics

Summary:   A few weeks ago I broadcast a double query about the statistics of English
vocabulary. My first question was about the number of morphemes compared
with the number of lemmas, but nobody offered an answer.

My second question was more successful. This was about the proportion of
lemmas in each of the main word classes, and how this proportion varied
with token frequency; I was particularly keen to check a guess that the
proportion of nouns was greater among rare lemmas than among common ones. I
received data from Gwillim Law and Jasper Holmes. It turns out that my
guess was right. I've presented and summarised the data at [URL missing]. If anyone has
comments or further data (including data on other languages), I should of
course be most interested to hear from them.

LL Issue: 20.413
Date Posted: 09-Feb-2009