Editor for this issue: Lydia Grebenyova <lydia@linguistlist.org>
For Query: Linguist 11.1459

Dear colleagues,

Recently I posted a query about where to download German word lists. I would like to thank the following people (in alphabetical order) for their kind assistance:

  Anna Braasch
  Damon Allen Davison
  Pius ten Hacken
  Agnes Muehlmeyer-Mentzel
  Noemi Preissner
  Markus Schulze

In what follows I provide a list of the sites that were pointed out to me, with some additional comments:

http://www.linguistik.uni-erlangen.de/LAPTDA/laptda.html
These wordlists were taken from seven corpora of the domains electronic data processing, geography, law, medicine, sports, linguistics, economics and a representative German corpus (LIMAS corpus). Each of these corpora contains roughly 1,000,000 wordforms. Downloadable are:

o Frequency lists of morphemes, allomorphs and wordforms of the single corpora.
o So-called "n-domain-lists" of morphemes, allomorphs and wordforms. An n-domain-list is a list of items that occurred in n of the domain-specific corpora mentioned above. E.g., the 2-domain-list of medicine and law contains all morphemes / allomorphs / wordforms that occurred in both corpora, together with their respective frequency information.

http://www.loria.fr/~bonhomme/sw/
A useful collection of lists for French, English and German (large word lists and smaller stop lists).

http://services.canoo.com/MorphologyBrowser.html
http://www.unibas.ch/LIlab/projects/wordmanager/wordmanager.html
They offer not only a list of word forms, but also a morphological analysis module. In addition, word formation rules can be applied to recognise newly coined compounds and derivations, which is not a trivial advantage in German.

Finally, Agnes Muehlmeyer was so kind as to let me have a 360,000-word word list (generated on the basis of the German weekly newspaper Die Zeit, 1986).

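For readers who want to build something like the n-domain-lists mentioned above from their own frequency lists, the idea can be sketched in a few lines of Python. This is only an illustration and not the procedure used by the Erlangen site: the function name n_domain_list, the toy sentences and the output layout are all invented here, and the sketch assumes that an n-domain-list is computed for a chosen set of n domains, as in the medicine/law example.

```python
from collections import Counter

def n_domain_list(domain_counts, domains):
    """Return the items attested in every named domain,
    together with their frequency in each of those domains.
    (Hypothetical helper, not part of any existing tool.)"""
    selected = [domain_counts[d] for d in domains]
    # Start from the items of the first domain and keep only
    # those that also occur in all the other selected domains.
    shared = set(selected[0])
    for counts in selected[1:]:
        shared &= set(counts)
    return {
        item: {d: domain_counts[d][item] for d in domains}
        for item in sorted(shared)
    }

# Toy frequency lists, invented for illustration only.
corpora = {
    "medicine": Counter("der arzt und der patient".split()),
    "law": Counter("der richter und das urteil".split()),
}

# The 2-domain-list of medicine and law: items found in both.
two_domain = n_domain_list(corpora, ["medicine", "law"])
for item, freqs in two_domain.items():
    print(item, freqs)
```

With real data, the Counter objects would be filled from the downloaded frequency lists rather than from toy sentences.
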
Apart from the above-mentioned sites directly concerned with word lists, I was also directed to some sites with slightly different though related contents:

http://www.kun.nl/celex/
http://www.ldc.upenn.edu/Catalog/LDC96L14.html
http://www.cis.uni-muenchen.de/projects/CISLEX.html

Once again, thanks to all contributors.

Stefan Th. Gries
--------------------------------------------------------------------------
Buero/Office:
Syddansk Universitet
Institut for Erhvervssproglig Informatik og Kommunikation
Grundtvigs Allé 150
6400 Sonderborg
Daenemark/Denmark