LINGUIST List 12.2405

Thu Sep 27 2001

FYI: Czech Lang/Old Turkish Texts/Phonetics Freeware

Editor for this issue: Marie Klopfenstein


  1. LDC Office, New Release from the LDC
  2. Roger Billerey-Mosier, Free formant-plotting software
  3. Irina Nevskaya, Publication of VATEC CD 1.0 containing Old Turkic texts

Message 1: New Release from the LDC

Date: Wed, 26 Sep 2001 16:26:18 -0400
From: LDC Office
Subject: New Release from the LDC

 *** Prague Dependency Treebank 1.0. ***

The Linguistic Data Consortium (LDC) is pleased to announce the
availability of the Prague Dependency Treebank 1.0.

This single CD-ROM Czech language release contains the following:

* Morphologically and syntactically annotated Czech data, 1.8MW 
* Czech-English parallel Corpus, aligned, 0.9MW/1MW 
* Czech raw texts (newspaper and journals), over 30MW 
* Czech NLP tools (morphology, tagging) 
* General annotation tools (tree editors, tree viewer) 

The Prague Dependency Treebank (PDT) is currently being developed
by Jan Hajic, Eva Hajicova, Petr Pajas, Jarmila Panevova, Petr Sgall,
and Barbora Vidova Hladka at Charles University, Prague, in the Czech
Republic. This long-term project consists of two major phases. During
the first phase (1996-2000), the morphological and syntactic analytic
layers of annotation were completed. This annotation, together with a
preview of tectogrammatical layer annotation, is available as PDT 1.0. 

PDT 0.5 ('half through') was released online by Charles University in
1998 and contains 456,705 tokens (words and punctuation) in 26,610
sentences. This release has been downloaded by 90 researchers and/or
sites from 19 countries. The current release, PDT 1.0, contains about
three times more tokens and sentences than PDT 0.5. 

During the second phase (2000-2004), annotation at the tectogrammatical
layer will be carried out; at the conclusion of this phase,
PDT 2.0 will be available.

Institutions that have membership in the LDC during the 2001 
Membership Year will be able to receive this corpus free of charge. 
Nonmembers may purchase this publication for $100. Please note 
that an online user agreement form must be completed for both member
requests and nonmember purchases. 

If you need additional information before placing your order, or 
would like to inquire about membership in the LDC, please send email to
<> or call (215) 573-1275.

Message 2: Free formant-plotting software

Date: Wed, 26 Sep 2001 17:25:48 -0700
From: Roger Billerey-Mosier
Subject: Free formant-plotting software

Dear Phoneticians,

Free software to plot your formants, print your plots, save them in JPEG 
format to include in your papers, etc. is now available at:

Since this is a Java program, it should run on Windows, Mac (OS X),
Solaris, and Linux/Unix machines.

Send feedback to the author, Roger Billerey-Mosier, at

- Roger

--
Roger Billerey-Mosier


Message 3: Publication of VATEC CD 1.0 containing Old Turkic texts

Date: Thu, 27 Sep 2001 09:13:20 +0200
From: Irina Nevskaya
Subject: Publication of VATEC CD 1.0 containing Old Turkic texts

Dear colleagues,
The members of the VATEC project are glad to announce the publication of
VATEC CD 1.0 presenting the results of the pilot phase of the VATEC
project ("Vorislamische Alttuerkische Texte: Elektronisches Corpus" -
Pre-Islamic Old Turkic Texts: Electronic Corpus).

The VATEC project is being carried out at the Frankfurt and Goettingen
Universities and at the Berlin-Brandenburg Academy of Sciences. The
heads of the project are Prof. Marcel Erdal, Prof. Jost Gippert
(Frankfurt), Prof. Klaus Roehrborn (Goettingen), and Prof. Peter Zieme
(Berlin). The participants of the project are Dr. habil. Irina
Nevskaya, Dr. Ralf Gehrke (Frankfurt), Dr. Michael Knueppel
(Goettingen), and Dr. Jakob Taube (Berlin). The project is financed by
the German Research Foundation (DFG - "Deutsche
Forschungsgemeinschaft").

The VATEC project is connected with the project of digitalization of Old
Turkic manuscripts that are stored in the Turfan Collection of the
Brandenburg Academy. The CD presents a number of texts with links
to digitalized manuscripts on the Internet sites of the Berlin Academy.

The CD presents a reedition or a first edition of the following Old
Turkic texts in the runiform, Uighur, Manichaean, Sogdian and Syriac
scripts. The text edition contains transliteration, rough
transcription (with emendations, conjectures, etc.), normalized
transcription (which was the basis for further morphological analysis),
morphological parsing and glossing, and translation into German or
English. All the texts are presented in the Shoebox, Word-Cruncher and
HTML formats providing various information retrieval opportunities
(concordances, word lists, sorting, filtering, etc.).

1. Several manuscripts of the Chuastuanift (Manichaean Confession of
Sins) text in the Uighur and Manichaean scripts: the London scroll (in
full), the Berlin and the Saint-Petersburg manuscripts (in part), and
a compiled text
2. Irk Bitig (A Book of Omens) in the runiform script
3. A Nestorian Old Turkic text: Wedding blessings (Syriac script)
4. Manichaean Old Turkic texts (51 fragments, Manichaean and Uighur
scripts)
5. A fragment of a cosmogonic lapidary (runiform script)
6. Buddhist texts in Sogdian script (34 fragments)
7. Panchatantra fragments (Fables, Uighur script)
8. The third book of the Xuanzang biography in Old Turkic
9. A number of manuscripts of Altun Yarok (an Uighur translation
of the Suvarnaprabhasottamasutra) and a compiled text
10. A number of manuscripts first edited by Peter Zieme in Berliner
Turfantexte 13.

The texts presented under Nos. 1-7 were prepared by Irina Nevskaya
under the supervision of Marcel Erdal, the text under No. 8 by
Michael Knueppel (supervised by Klaus Roehrborn), and the texts under
Nos. 9-10 by Jakob Taube (supervised by Peter Zieme). Ralf Gehrke
(supervised by Jost Gippert) was in charge of the necessary software
and hardware, the conversion of the texts between formats, and the
production of this CD.

Dear colleagues, if you are interested in obtaining this CD, please
contact Marcel Erdal.

Prof. Dr. Marcel Erdal
Dept. of Turcology,
FB 11, J.W.Goethe University,
P.O.B. 11 19 32
D-60054 Frankfurt a.M.
Tlf.: +49-69-79 82 28 58
Fax: +49-69-79 82 49 7
Marcel Erdal <>

Please forward this information to any colleagues who might be
interested in this CD.

Sincerely Yours,

The coordinator of the VATEC project
Irina Nevskaya