Editor for this issue: Ann Dizdar <dizdar@tam2000.tamu.edu>
From: LDC Office, Fri 31 May 1996 6:57 pm
Subject: New Release from the LDC
To: ldc-publicity@unagi.cis.upenn.edu
Cc: ldc@unagi.cis.upenn.edu

Announcing a NEW RELEASE from the LINGUISTIC DATA CONSORTIUM

Acoustic-Phonetic Continuous Speech Corpus
Far Field Microphone Recordings
FFMTIMIT

The FFMTIMIT corpus contains the previously unreleased secondary microphone waveforms for the TIMIT Acoustic-Phonetic Continuous Speech corpus. The primary microphone waveforms, which were recorded using a close-talking, noise-cancelling, head-mounted Sennheiser microphone (model HMD-414), are available from the LDC on NIST Speech Disc 1-1.1 (LDC93S1). The secondary microphone used in the recording of the TIMIT corpus was a Bruel & Kjaer 1/2" free-field microphone (model 4165).

While the Sennheiser microphone recordings are relatively "clean" with respect to non-speech noise, the FFMTIMIT recordings include significant low-frequency noise, which was due to the HVAC system and to mechanical vibration transmitted through the floor of the double-walled sound booth used in recording. Because it is noisier than its TIMIT counterpart, the FFMTIMIT data may be used in the development of more noise-robust speech recognition systems. In addition, this data may be of value to researchers involved in vocal tract modeling, because the B&K microphone has an extremely flat free-field frequency response and calibration tones are provided.

Note that the B&K TIMIT data contained in this release has not been processed through any highpass filter (e.g., the 1581-point filter described in the paper "The DARPA Speech Recognition Research Database" by Fisher, Doddington and Goudie-Marshall in "DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM," NISTIR 4930 / NTIS Order No. PB93-173938).

Institutions that have membership in the LDC during the 1996 Membership Year will be able to receive FFMTIMIT at no additional charge, in the same manner as all other text and speech corpora published by the LDC. Nonmembers can receive a copy of FFMTIMIT, for research purposes only, for a fee of $100.
If you would like to order a copy of this corpus, please email your request to ldc@unagi.cis.upenn.edu. If you need additional information before placing your order, or would like to inquire about membership in the LDC, please send email or call (215) 898-0464.

Further information about the LDC and its available corpora can be accessed on the Linguistic Data Consortium WWW Home Page at URL http://www.cis.upenn.edu/~ldc. Information is also available via ftp at ftp.cis.upenn.edu under pub/ldc; for ftp access, please use "anonymous" as your login name, and give your email address when asked for a password.
As part of a two-semester software project at the Department of Computational Linguistics, University of Heidelberg, Germany, we intend to design and implement a general access tool for large text corpora. To investigate users' needs and wishes concerning such a tool, we have prepared a questionnaire. It is addressed to anyone doing linguistic work or research using text corpora. Your future work may benefit from our development, so we kindly ask you to help us design this tool by filling out the questionnaire. Feel free to add any comments you regard as useful or important to the subject (including the questionnaire itself).

Our questionnaire is located at http://www.gs.uni-heidelberg.de/~ebert/quest.html

If you have any further questions, don't hesitate to send us mail: swp@novell1.gs.uni-heidelberg.de

Thank you in advance for your cooperation!

Department of Computational Linguistics
University of Heidelberg, Germany
Karlstr. 2
69125 Heidelberg
KEY WORDS: Corpus linguistics, Corpus tools, Grammar, Grammar development

Ph.D. Thesis Announcement

A LOGICAL APPROACH TO COMPUTATIONAL CORPUS LINGUISTICS

Torbjörn Lager

This is to announce the availability of my Ph.D. thesis: "A Logical Approach to Computational Corpus Linguistics". I have prepared a WWW page dedicated to the approach described in the thesis, from which machine readable versions of the thesis may be downloaded, and hard copies ordered. The relevant URL is: http://www.ling.gu.se/~lager/taglog.html

You may also send mail directly to me: lager@ling.gu.se

ABSTRACT

The purpose of this thesis is to build a *corpus theory development environment* -- to discuss its design, use, and implementation. The proposed system is based on a logical approach to computational corpus linguistics where sentences of logic are used to express statements about texts and logical inference is used to manipulate these sentences in order to analyse the texts. The thesis demonstrates the remarkable ease with which the functionalities needed in a corpus system can be implemented when based upon adequate means of representing, querying, and reasoning. The proposed system implements hand coding, searching, concordancing, parsing, counting, tabling, collocating, automatic part-of-speech tagging, lemmatizing, excerpting, interpreting, treebanking, explanation, and various kinds of learning.
By linking all this functionality into a common representational framework characterised by high expressive power, declarativity, and explicit reasoning strategies, and by embedding the whole concept in a particular philosophical and methodological context, including an ontology of text, an analysis of the notion of theory, an explication of the notion of truth, and other foundational issues, we arrive at an interactive system which is multi-functional and general, yet simple, consistent, and highly usable.

Apart from being interesting from a practical point of view, the development of such a system raises intriguing philosophical and methodological questions: What is a corpus text? What is a corpus theory? What does it mean to develop a corpus theory? What does it mean for a corpus theory to be true about a corpus text? What is the link between the truth of such a theory and its usefulness for natural language processing purposes? These and related questions are discussed in the thesis.

The system exists in a prototype implementation and the thesis contains numerous examples from this implementation in action.

KEY WORDS: Corpus linguistics, Corpus tools, Grammar, Grammar development

Torbjoern Lager              E-mail: lager@ling.gu.se
Department of Linguistics    Phone: +46 31 7731175
University of Gothenburg     Fax: +46 31 7734853
Renstroemsparken
412 98 Gothenburg
Sweden
Lecture Richard Kittredge