LINGUIST List 9.258

Sun Feb 22 1998

Sum: English Word Frequency

Editor for this issue: Julie Wilson <julielinguistlist.org>


Directory

  1. Alex Zheltuhin, Summary: English word frequency

Message 1: Summary: English word frequency

Date: Thu, 19 Feb 1998 14:29:39 -0500 (EST)
From: Alex Zheltuhin <alexzamber.biology.gatech.edu>
Subject: Summary: English word frequency


10 days ago I posted a query about recent English word frequency lists.
Contrary to my expectations, I received very few references to the relevant
on-line resources.
I would like to thank the following Linguist subscribers for their 
kind responses:
Julie Vonwiller
Lynn Santelmann
Timothy Jay
Barbara Pearson
Marie C. Egan

Suggestions that I received are given below in no particular order.

Julie Vonwiller:

Most of the major newspapers have their papers on line. Word
frequencies would be available that way. Otherwise check the
comp.speech site for references. I think they list that kind of
thing. Also most dictionary publishing firms have web sites now.



Lynn Santelmann:

If you go to the Website for the linguist list, they have links
to several on-line sources for word frequency. The LDC at UPenn
is the first that comes to mind, but there are others too.




Timothy Jay:

I have a chapter on word frequency in CURSING IN AMERICA (1992, John
Benjamins Pub Co - 1-800-562-5666). My research indicates how
frequency estimates exclude the usage of offensive words, along with
general problems of estimating word usage.

Barbara Pearson and Marie C. Egan referred to 

Francis, W. N. & Kucera, H. (1982). Frequency analysis of English
usage: Lexicon and grammar. Boston, MA: Houghton Mifflin.


I would like to extend this list of suggestions with additional 
references of interest:

Kucera, H. & Francis, W. N. (1967). Computational analysis of
present-day American English. Providence, RI: Brown University Press.

Bloom, P. A., & Fischler, I. (1980). Completion norms for
329 sentence contexts. Memory and Cognition, 8, 631-642.

On letter/bigram/trigram frequency see: Solso, R. L., & King,
J. F. (1976). Frequency and versatility of letters in the English
language. Behavior Research Methods and Instrumentation, 8, 283-286.

Solso, R. L., Barbuto, P. F. & Juel, C. L. (1979). Bigram and trigram
frequency and versatility in the English language. Behavior Research
Methods and Instrumentation, 11, 475-484.

Web sites:

http://rreck.sealsoft.com/landtools.html
There is a link to this site on the Linguist's web page.

HCRC Map Task Corpus (150,000 tokens)
http://www.cogsci.ed.ac.uk/elsnet/Resources/Map-Task/mt_corpus.html

ARTFL Project Word Frequency Search Form at
http://humanities.uchicago.edu/forms_unrest/ARTFL.wl.html

Statistics gathered for the most frequent words found on Usenet in
1992: www.sc.pdx.edu/~kenrick/cs350/assignments/program1/html

A follow-up on the Kucera & Francis study:
http://hotspur.psych.yorku.ca/SCS/Online/paivio/density.html

I am very much looking forward to further references and suggestions
on the subject. If I receive additional information, I will certainly
update the summary. And once again, many thanks to those who
responded!

Regards,

Alexander Zheltukhin, Ph.D.
alexzamber.gatech.edu