LINGUIST List 13.3120

Wed Nov 27 2002

Qs: Computer Framework, Phonetic Corpora

Editor for this issue: Renee Galvis <renee@linguistlist.org>

We'd like to remind readers that the responses to queries are usually best posted to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST; so we would appreciate your cooperating with it whenever it seems appropriate.

In addition to posting a summary, we'd like to remind people that it is usually a good idea to personally thank those individuals who have taken the trouble to respond to the query.

Directory

Ahmed.I.S, Computer Framework

Matthew Johnson, Phonetic corpora

Message 1: Computer Framework

Date: Tue, 26 Nov 2002 17:13:57 -0800 (PST)
From: Ahmed.I.S <ibhims20002yahoo.com>
Subject: Computer Framework

Dear colleagues,

I am doing research on "Computer in Education". I am in need of the "Computer Framework" for both Multimedia and the Internet. I hope you can refer me to any relevant site on the web. Looking forward to hearing from you.

Thank you and Regards

Ahmed.I.S

Message 2: Phonetic corpora

Date: Wed, 27 Nov 2002 02:11:43 -0500
From: Matthew Johnson <matthew.r.johnson@YALE.EDU>
Subject: Phonetic corpora

I'm trying to do a project that involves using artificial neural networks to process a stream of phonetic information, and I'm having a hard time finding data. Ideally I'm looking for large bodies of phonetically transcribed spoken language... and since the project is by nature cross-linguistic, I need data for as many different languages as possible.

I'm aware that I can get a database called TIMIT from the LDC for English, but I can't find any other easily available phonetic corpora for any other language. If anyone has phonetic transcriptions of speech for any language (including English), or knows of a good place to obtain them, I would be very grateful. The larger the corpus, the better, but anything at all would help.

In addition, if I'm not able to scrape up the data this way, I'm also looking at using a software phonetizer to artificially generate a stream of phonemes from a written corpus. If you have any information about the accuracy of these phonetizers in various languages or which ones might be worth taking a look at, I would greatly appreciate it.

Many thanks,

Matt Johnson