Editor for this issue: Marie Klopfenstein <marie@linguistlist.org>
Cronfa Electroneg o Gymraeg (CEG): A 1 million word lexical database and frequency count for Welsh

Please circulate to those interested.

This is a word frequency analysis of 1,079,032 words of written Welsh prose, based on 500 samples of approximately 2000 words each, selected from a representative range of text types to illustrate modern (mainly post-1970) Welsh prose writing. It was conceived as a Welsh parallel to the Kucera and Francis analysis for American English and the LOB corpus for British English, in the expectation that such an analysed corpus would provide research tools for a number of academic disciplines: psychology and psycholinguistics, child and second language acquisition, general linguistics, and the linguistics of Modern Welsh, including literary analysis.

The sample included materials from the following fields: novels and short stories; religious writing; children's literature, both factual and fictional; non-fiction materials in the fields of education, science, business, leisure activities, etc.; public lectures; newspapers and magazines, both national and local; reminiscences; academic writing; and general administrative materials (letters, reports, minutes of meetings).

The resultant corpus was analysed to produce frequency counts of words both in their raw form and as counts of lemmas, where each token is demutated and tagged to its root. This analysis also derives basic information concerning the frequencies of different word classes, inflections, mutations, and other grammatical features.

Available on-line:

Ellis, N. C., O'Dochartaigh, C., Hicks, W., Morgan, M., & Laporte, N. (2001). Cronfa Electroneg o Gymraeg (CEG): A 1 million word lexical database and frequency count for Welsh. [On-line], Available:
http://www.bangor.ac.uk/ar/cb/ceg/ceg_eng.html
http://www.bangor.ac.uk/ar/cb/ceg/ceg_cym.html

-----------------------------------------------------------------------
Nick Ellis, e-mail: N.Ellis@bangor.ac.uk
Professor of Psychology,
University of Wales, Bangor, Tel: +44 (0)1248 382207
Gwynedd, Fax: +44 (0)1248 382599
Wales, LL57 2DG, U.K.
http://www.psych.bangor.ac.uk/index.html
-----------------------------------------------------------------------
Our North Frisian language course is now available on the Internet. A preliminary version can be found at: http://www.fa.knaw.nl > fakgroepen > taalkunde > noardfrysk.

The site is currently bilingual (German/West Frisian). We are working on interfaces for German, West Frisian, English and Dutch. A printed version and a CD-ROM will appear by the end of 2002.

Comments are welcome!

Eric Hoekstra, Ingo Laabs, Henk Wolf

---------------------------------------------------------
Henk Wolf
hwolf@fa.knaw.nl (Fryske Akademy)
FryskeRie@fa.knaw.nl (Fryske Rie)
henkwolf@altavista.com (private)
Fryske Akademy, Postbus 54, NL-8900 AB Ljouwert
tel. 058-2336918 / 058-2131414, fax 058-2131409