LINGUIST List 13.3120

Wed Nov 27 2002

Qs: Computer Framework, Phonetic Corpora

Editor for this issue: Renee Galvis

We'd like to remind readers that the responses to queries are usually best posted to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST; so we would appreciate your cooperating with it whenever it seems appropriate. In addition to posting a summary, we'd like to remind people that it is usually a good idea to personally thank those individuals who have taken the trouble to respond to the query.


  1. Ahmed.I.S, Computer Framework
  2. Matthew Johnson, Phonetic corpora

Message 1: Computer Framework

Date: Tue, 26 Nov 2002 17:13:57 -0800 (PST)
From: Ahmed.I.S
Subject: Computer Framework

Dear colleagues,

I am doing research on "Computer in Education". I am in need of the
"Computer Framework" for both Multimedia and the Internet. I do hope
you may refer me to any site on the web. Looking forward to hearing
from you.

Thank you and Regards

Message 2: Phonetic corpora

Date: Wed, 27 Nov 2002 02:11:43 -0500
From: Matthew Johnson <matthew.r.johnsonYALE.EDU>
Subject: Phonetic corpora

I'm trying to do a project that involves using artificial neural
networks to process a stream of phonetic information, and I'm having a
hard time finding data. Ideally I'm looking for large bodies of
phonetically transcribed spoken language... and since the project is
by nature cross-linguistic, I need data for as many different
languages as possible.

I'm aware that I can get a database called TIMIT from the LDC for
English, but I can't find any other easily available phonetic corpora
for any other language. If anyone has phonetic transcriptions of
speech for any language (including English), or knows of a good place
to obtain them, I would be very grateful. The larger the corpus, the
better, but anything at all would help.

In addition, if I'm not able to scrape up the data this way, I'm also
looking at using a software phonetizer to artificially generate a
stream of phonemes from a written corpus. If you have any information
about the accuracy of these phonetizers in various languages or which
ones might be worth taking a look at, I would greatly appreciate it.

Many thanks,
Matt Johnson