LINGUIST List 15.1753

Wed Jun 9 2004

FYI: ELRA News

Editor for this issue: Anne Clarke <anne@linguistlist.org>


Directory

  1. Magali Jeanmaire, ELRA - Language Resources Catalogue - Update

Message 1: ELRA - Language Resources Catalogue - Update

Date: Tue, 08 Jun 2004 16:50:20 +0200
From: Magali Jeanmaire <duclaux@elda.fr>
Subject: ELRA - Language Resources Catalogue - Update


ELRA - Language Resources Catalogue - Update

We are happy to announce that new Language Resources are
now available in our catalogue.

Short descriptions of these resources are given below.
More detailed descriptions are available on our web sites,
at http://www.elda.fr or http://www.elra.info .

Written Language Resources

*** W0015 Le Monde Text Corpus - Update ***
Electronic archiving of "Le Monde" articles started on 1 January 1987.
The entire corpus is available in an ASCII text format.
The year 2003 is also available in XML format.

*** W0036/04 Le Monde Diplomatique Text corpus in Arabic ***
Electronic archiving of "Le Monde Diplomatique" articles in Arabic from 1998.
The corpus is available in an ASCII text format.
French and English versions are also available.


Spoken Language Resources

*** S0158 Turkish OrienTel database ***
This speech database contains the recordings of 1,700 Turkish speakers
recorded over the Turkish fixed and mobile telephone network.
Each speaker uttered around 45 read and spontaneous items.

*** S0159 German spoken by Turkish OrienTel database ***
This speech database contains the recordings of 332 Turkish speakers
of German recorded over the German fixed and mobile telephone network.
Each speaker uttered around 53 read and spontaneous items.

*** S0160 Spanish Speecon database ***
The Spanish Speecon database comprises the recordings of 561 adult
Spanish speakers and 55 child Spanish speakers, who uttered over 290
items and 210 items respectively (read and spontaneous).

*** S0161 Russian Speecon database ***
The Russian Speecon database comprises the recordings of 550 adult
Russian speakers and 50 child Russian speakers, who uttered over 290
items and 210 items respectively (read and spontaneous).

*** S0162 Hempel ***
This corpus contains 25.5 hours of recordings by 3,909 German speakers
with a total of 184,240 spoken words, made via public phone lines (fixed
network only). The contents are free monologues answering the question:
"Was haben Sie in der letzten Stunde gemacht?" (What did you do within
the last hour?). The database is conformant with the SpeechDat Exchange
Format.