LINGUIST List 9.258

Sun Feb 22 1998

Sum: English Word Frequency

Editor for this issue: Julie Wilson <julie@linguistlist.org>


Directory

  • Alex Zheltuhin, Summary: English word frequency

    Message 1: Summary: English word frequency

    Date: Thu, 19 Feb 1998 14:29:39 -0500 (EST)
    From: Alex Zheltuhin <alexzamber.biology.gatech.edu>
    Subject: Summary: English word frequency


    Ten days ago I posted a query about recent English word frequency lists. Contrary to my expectations, I received very few references to relevant on-line resources. I would like to thank the following Linguist subscribers for their kind responses: Julie Vonwiller, Lynn Santelmann, Timothy Jay, Barbara Pearson, and Marie C. Egan.

    The suggestions I received are given below in no particular order.

    Julie Vonwiller:

    Most of the major newspapers have their papers on line. Word frequencies would be available that way. Otherwise check the comp.speech site for references. I think they list that kind of thing. Also most dictionary publishing firms have web sites now.



    Lynn Santelmann:

    If you go to the Website for the linguist list, they have links to several on-line sources for word frequency. The LDC at UPenn is the first that comes to mind, but there are others too.



    Timothy Jay:

    I have a chapter on word frequency in CURSING IN AMERICA (1992, John Benjamins Pub Co - 1-800-562-5666). My research indicates how frequency estimates exclude the usage of offensive words, along with general problems of estimating word usage.

    Barbara Pearson and Marie C. Egan referred to

    Francis, W. N. & Kucera, H. (1982). Frequency analysis of English usage: Lexicon and grammar. Boston, MA: Houghton Mifflin.

    I would like to extend this list of suggestions with additional references of interest:

    Kucera, H. & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.

    Bloom, P. A., & Fischler, I. (1980). Completion norms for 329 sentence contexts. Memory & Cognition, 8, 631-642.

    On letter/bigram/trigram frequency see:

    Solso, R. L., & King, J. F. (1976). Frequency and versatility of letters in the English language. Behavior Research Methods and Instrumentation, 8, 283-286.

    Solso, R. L., Barbuto, P. F. & Juel, C. L. (1979). Bigram and trigram frequency and versatility in the English language. Behavior Research Methods and Instrumentation, 11, 475-484.
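
    For readers who would rather tabulate counts from their own on-line text than rely on published norms, the following is a minimal Python sketch of word and letter n-gram counting. It is not the procedure used in any of the studies cited above: the tokenization rule (lowercased alphabetic strings, with an optional internal apostrophe) and the function names word_frequencies and letter_ngram_frequencies are assumptions made only for illustration, and it computes raw frequency only, not versatility.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count word tokens after lowercasing.

    Assumption: a word is a run of letters a-z, optionally followed by
    an apostrophe and more letters (so "don't" is one token).
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)


def letter_ngram_frequencies(text, n):
    """Count letter n-grams (n=1 letters, n=2 bigrams, n=3 trigrams).

    N-grams are taken inside words only and never span a word boundary.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


sample = "The cat sat on the mat. The hat sat there."
words = word_frequencies(sample)
bigrams = letter_ngram_frequencies(sample, 2)

# Print the three most frequent words and bigrams in the sample text.
print(words.most_common(3))
print(bigrams.most_common(3))
```

    Counts obtained this way depend heavily on the text sampled and on how words are delimited, so they are not directly comparable with the published lists.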

    Web sites:

    http://rreck.sealsoft.com/landtools.html There is a link to this site on the Linguist's web page.

    HCRC Map Task Corpus (150,000 tokens) http://www.cogsci.ed.ac.uk/elsnet/Resources/Map-Task/mt_corpus.html

    ARTFL Project Word Frequency Search Form at http://humanities.uchicago.edu/forms_unrest/ARTFL.wl.html

    Statistics gathered for the most frequent words found on Usenet in 1992: www.sc.pdx.edu/~kenrick/cs350/assignments/program1/html

    A follow-up on the Kucera & Francis study: http://hotspur.psych.yorku.ca/SCS/Online/paivio/density.html

    I am very much looking forward to further references and suggestions on the subject. If I receive additional information, I will certainly update the summary. And once again, many thanks to those who responded!

    Regards,

    Alexander Zheltukhin, Ph.D. alexzamber.gatech.edu