LINGUIST List 3.508

Thu 18 Jun 1992

Misc: Serbo-Croat, German

Editor for this issue: <>


  1. Henning Mørk, YU-CORPUS
  2. , German Corpus Sources, Summary

Message 1: YU-CORPUS

Date: Thu, 18 Jun 92 13:29:56 +0200
From: Henning Mørk <>
Subject: YU-CORPUS

[From Humanist Discussion Group, Vol. 6, No. 0088. Thursday, 18 Jun 1992.]

Aarhus, Denmark, June 1992

Dear colleagues,

 This message is to announce the first part of my YU-CORPUS (Yugoslav text
corpus) consisting of (mainly) contemporary fiction (prose) in Serbo-Croatian
with the main areas represented: Serbia, Croatia, Montenegro, and Bosnia-Herzegovina.
 The corpus consists of 15 files containing together approximately 700 000
 These files are available by

 ftp at ( in the directory /home/ftp/pub/slav

 First get the text files yu-corp.txt, which among other things describes
the chosen ASCII standard, and yu-index.txt, which lists the available
texts by author(s) and size.
 The corpus files are zipped and must thus be transferred in binary mode.
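
 The retrieval steps above can be sketched with Python's standard ftplib
module. This is only an illustration under stated assumptions, not the
author's procedure: the server address is not given here, so the caller must
supply an already connected FTP object, and the zip file name in the usage
comment is hypothetical (the real names are listed in yu-index.txt). Only the
directory and the two text file names are taken from the announcement.

```python
from ftplib import FTP  # needed only for the real connection shown in the usage comment
from pathlib import Path

SLAV_DIR = "/home/ftp/pub/slav"               # directory named in the announcement
TEXT_FILES = ["yu-corp.txt", "yu-index.txt"]  # text files named in the announcement


def fetch_text(ftp, name, dest):
    """Fetch a plain-text file in ASCII mode, collecting it line by line."""
    lines = []
    ftp.retrlines(f"RETR {name}", lines.append)
    target = Path(dest) / name
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


def fetch_binary(ftp, name, dest):
    """Fetch a zipped file in binary mode so the archive bytes arrive unchanged."""
    target = Path(dest) / name
    with open(target, "wb") as fh:
        ftp.retrbinary(f"RETR {name}", fh.write)
    return target


def fetch_corpus(ftp, zip_names, dest="."):
    """Change to the corpus directory, then fetch the text files and the zips."""
    ftp.cwd(SLAV_DIR)
    fetched = [fetch_text(ftp, name, dest) for name in TEXT_FILES]
    fetched += [fetch_binary(ftp, name, dest) for name in zip_names]
    return fetched


# Usage (assumption: HOST is the server address, which is not given here,
# and "example.zip" stands in for a real name taken from yu-index.txt):
#
#     with FTP(HOST) as ftp:
#         ftp.login()                      # anonymous login
#         fetch_corpus(ftp, ["example.zip"])
```

 Transferring a zip archive in ASCII mode may rewrite line-ending bytes and
leave the archive unreadable, which is why the zipped files go through
retrbinary while the two text files can use retrlines.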

 All comments are welcome.

 Henning Moerk
 Slavisk Institut
 Aarhus Universitet
 Ny Munkegade 116
 8000 Aarhus C

 tel: +45 86 13 65 55
 fax: +45 86 19 21 55


Message 2: German Corpus Sources, Summary

Date: Thu, 18 Jun 1992 15:44:00 PDT
From: <>
Subject: German Corpus Sources, Summary

Thanks to those who responded to a request for German corpus sources. Here's a
summary.

For a copy of the 6-million-word Mannheim Corpus (which contains samples of
literature, novelettes, scientific texts, autobiographies, magazines and
newspapers from the Sixties and early Seventies) you can contact Ms. S.
Dickgiesser at the:

Institut für Deutsche Sprache
Abteilung WD/LDV
Friedrich-Karl-Strasse 12
6800 Mannheim 1

Prof. Wolfgang Lenders of the University of Bonn is reported to have
significant corpora, but no addresses are currently available.

LIMAS-corpus of written modern German, which has been
constructed following the same rules as have been used for the BROWN-corpus.
It consists of 1.1 million running words and is available on HD floppies,
either preindexed with WordCruncher (with the ASCII file included) for
1,250.00 DM or as a plain ASCII file only for 1,000.00 DM (both prices include
mailing). Moreover, we're offering vols. I-IX of the works of Immanuel Kant
on floppies and CD-ROM, a word databank of German (300,000 entries), and a
morpheme dictionary of German.
If you send me your snail mail address I'll send you a list of the software
and data we're offering to the scientific community.

Gerd Willee UPK000ibm.rhrz.uni-bonn:de:Xerox