Editor for this issue: Steve Moran <steve@linguistlist.org>
Dear Linguist Listers,

I have two queries concerning English speech corpora.

1. I am looking for an English speech corpus that is part-of-speech tagged and in which the sound files, transcriptions, and part-of-speech tags are aligned. It also needs to be of considerable size (> 100,000 word tokens, if possible). Can anyone point me towards pertinent corpora? So far I have found only one corpus that meets all of these criteria, the Boston University Radio News Corpus.

2. Despite hours of effort and the help of experienced colleagues, I have not managed to open the example files of the BU Radio News Corpus properly, whether with PRAAT, Wavesurfer, or Transcriber. All three programs open the sound file (.sph) without problems, but none of them can access the files containing the transcription or the part-of-speech tags and align this information with the sound wave. Can anyone help? Which program(s) can do the job?

Any help will be greatly appreciated. Many thanks in advance!

Best regards,
Ingo Plag

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prof. Dr. Ingo Plag
English Linguistics
Fachbereich 3
Universitaet-Gesamthochschule Siegen
Adolf-Reichwein-Str. 2
D-57068 Siegen
http://www.uni-siegen.de/~engspra/
tel. 0271-740-2560
tel. 0271-740-2349 (secretary)
fax 0271-740-3246
e-mail: plag@anglistik.uni-siegen.de
tel.: 06422-2817 (home)
office: room AR-K 103
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~