LINGUIST List 14.33

Tue Jan 7 2003

FYI: Field Ling, NASSLLI-2003, New Corpora

  1. Peter Cole, Field Linguistics at the U of Delaware
  2. NASSLLI'03 Bloomington, Indiana, NASSLLI-2003 ANNOUNCEMENT
  3. LDC Office, New Corpora from the LDC

Message 1: Field Linguistics at the U of Delaware

Date: Thu, 26 Dec 2002 16:41:01 +0000
From: Peter Cole <>
Subject: Field Linguistics at the U of Delaware

The Department of Linguistics of the University of Delaware is pleased
to announce a new Doctoral concentration in Field Linguistics and
Language Documentation. Students in the program will take courses in
linguistic field work and language typology, and will engage in field
work on a (relatively) underdescribed language. It is expected that
the results of their research will contribute both to the description
of the language in question and to some area of linguistic theory.

For further information, please contact Prof. Satoshi Tomioka,
Director of Graduate Study. We hope to have information about the
concentration on our website in the near future.

Message 2: NASSLLI-2003 ANNOUNCEMENT

Date: Sat, 4 Jan 2003 18:34:04 -0500 (EST)
From: NASSLLI'03 Bloomington, Indiana <>
Subject: NASSLLI-2003 ANNOUNCEMENT

 Second North American Summer School
 Logic, Language and Information
 June 17-21, 2003, Bloomington, Indiana


The NASSLLI Steering Committee is pleased to announce the Second North
American Summer School in Logic, Language and Information, to be held
in Bloomington, Indiana, June 17-21, 2003. The event follows on from
the successful first school at Stanford in June, 2002. The school is
focussed on the interfaces among linguistics, logic, and computation,
broadly conceived, and on related fields. Our sister school, the
European Summer School in Logic, Language, and Information, has been
highly successful, becoming an important meeting place and forum for
discussion for students and researchers interested in the
interdisciplinary study of Logic, Language and Information. We hope
that the North American schools will follow in this tradition.


 Marco Aiello, Guram Bezhanishvili, and Darko Sarenac
 Reasoning about Space (Workshop)

 Alexandru Baltag
 Logics for Communication: reasoning about information
 flow in dialogue games.

 Roman Bartak
 Foundations of Constraint Satisfaction

 Patrick Blackburn and Johan Bos
 Computational semantics for natural language

 Gerhard Jaeger and Reinhard Blutner
 Linguistic and computational issues in
 Optimality Theory

 Edward Keenan and Edward Stabler
 A Mathematical Theory of Grammatical Categories

 Daniel Leivant
 Logic of Programs

 Dov Monderer
 Games in Informational Form

 Yiannis Moschovakis
 Referential intensions: a logical calculus for synonymy

 John C. Paolillo
 Statistical models for language: structure and computation

 Dirk Pattinson
 An Introduction to the Theory of Coalgebras

 Ron van der Meyden
 Algorithmic Verification for Epistemic Logic

 Courses consist of five sessions of 90 minutes each.
 NASSLLI courses are aimed at graduate students or advanced 
 undergraduates in computer science, linguistics, logic, philosophy, and
 related areas.
 Course abstracts are available from

 In addition, there will be evening lectures and a session of student
 papers. A Call for Papers for the Student Session will be distributed

 RELATED EVENTS: NASSLLI'03 will be co-located with TARK'03, the 9th
 Conference on Theoretical Aspects of Rationality and Knowledge.
 In addition, NASSLLI'03 will be co-located with MoL'03, the 8th
 Meeting on the Mathematics of Language. Both of these conferences
 will take place June 20-22, 2003.

 should be available from our web site in January, 2003.

 WEB SITE FOR NASSLLI'03, to be held at Indiana University in June 2003:


 David Beaver
 Barbara Grosz
 Phokion Kolaitis
 Larry Moss
 Stuart Shieber
 Moshe Vardi


Message 3: New Corpora from the LDC

Date: Mon, 06 Jan 2003 12:09:42 -0500
From: LDC Office <>
Subject: New Corpora from the LDC

The Linguistic Data Consortium (LDC) is pleased to announce the
availability of three new corpora.

	 ** 1997 HUB5 Spanish Evaluation **

	 ** 2000 Communicator Evaluation **

	 ** Grassfields Bantu Fieldwork: Ngomba Tone Paradigms **

1. The 1997 Hub-5 Spanish evaluation is part of an ongoing series
of periodic evaluations conducted by NIST. This evaluation focused
on the task of transcribing conversational speech into text. Each 
conversation is represented as a "4-wire" recording, that is, with
two distinct sides, one from each end of the telephone circuit. Each
side is recorded and stored as a standard telephone codec signal 
(8 kHz sampling, 8-bit mu-law encoding). The 1997 HUB5 Spanish
Evaluation contains 426 Mbytes of SPHERE data.
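
For readers who want to work with audio in the encoding described above,
the following is a minimal sketch of decoding 8-bit mu-law samples to
linear PCM and splitting a two-sided recording into its two sides. It
assumes standard G.711 mu-law and assumes that the two channels are
stored as interleaved bytes once the file header has been removed. The
function names are illustrative and are not part of any LDC or NIST tool.

```python
from typing import List, Tuple

BIAS = 0x84  # bias constant of the G.711 mu-law companding scheme


def mulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed linear PCM sample."""
    u = ~byte & 0xFF            # mu-law bytes are stored bit-inverted
    sign = u & 0x80             # top bit: set for negative samples
    exponent = (u >> 4) & 0x07  # 3-bit segment number
    mantissa = u & 0x0F         # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def split_sides(payload: bytes) -> Tuple[List[int], List[int]]:
    """Split interleaved two-channel mu-law bytes into two decoded sides.

    Assumption: `payload` holds only sample data (header already
    stripped), with even-indexed bytes from one end of the telephone
    circuit and odd-indexed bytes from the other.
    """
    side_a = [mulaw_to_linear(b) for b in payload[0::2]]
    side_b = [mulaw_to_linear(b) for b in payload[1::2]]
    return side_a, side_b
```

In practice the header of each file should be read first to confirm the
channel count and sample encoding before any such decoding is applied.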

For further information, including a link to additional documentation on
the NIST web site, please visit:

Institutions that have membership in the LDC during the 2002 
Membership Year will be able to receive this corpus free of charge. 
Nonmembers may purchase this publication for $1000. 

2. The original goals of the Communicator program were to support the
creation of speech-enabled interfaces that scale gracefully across 
modalities, from speech-only to interfaces that include graphics, 
maps, pointing and gesture. The original vision of the Communicator
systems included the ability of a user, during one ten-minute session,
to plan a three-leg trip, with the three flights/legs on three different
days, with rental car and hotel in each of the two "away" cities, plus
dictating/sending a voice-mail message. 

The actual research that led to the data collections in 2000 and 2001
explored ways to construct better spoken-dialogue systems, with which
users interact via speech alone to perform relatively complex tasks such
as travel planning. During 2000 and 2001 two large data sets were
collected, in which users used the Communicator systems built by the
research groups to do travel planning. The 2000 Communicator Evaluation
publication consists of all the data from the 2000 collection. 

For the 2000 evaluation, each user called the nine different automated
travel-planning systems to make simulated flight reservations. All audio
files are in SPHERE format, recorded in 8-bit mu-law and PCM, at 8 kHz.
The two-channel SPHERE files total ~62 hours of audio (3415 MB),
representing ~317K words in transcription. 
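
The size and duration quoted above can be cross-checked with a little
arithmetic. The sketch below is only a rough consistency check under
stated assumptions: every sample is taken to occupy one byte (8-bit
mu-law), "MB" is taken to mean 2**20 bytes, and file headers are ignored.
The function names are illustrative.

```python
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling, as stated in the announcement
BYTES_PER_SAMPLE = 1    # assumption: all audio stored as 8-bit samples
CHANNELS = 2            # two-channel files

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def audio_bytes(hours: float) -> int:
    """Approximate payload size in bytes for a given duration."""
    return round(hours * 3600 * BYTES_PER_SECOND)


def hours_from_bytes(num_bytes: int) -> float:
    """Approximate duration in hours for a given payload size."""
    return num_bytes / BYTES_PER_SECOND / 3600


if __name__ == "__main__":
    # Size implied by 62 hours of two-channel audio.
    print(audio_bytes(62) / 2**20, "MB (2**20 bytes)")
    # Duration implied by 3415 MB, assuming MB = 2**20 bytes.
    print(hours_from_bytes(3415 * 2**20), "hours")
```

Under these assumptions, 3415 MB corresponds to a little over 62 hours,
which agrees with the approximate duration given in the announcement.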

Institutions that have membership in the LDC during the 2002 
Membership Year will be able to receive this corpus free of charge. 
Nonmembers may purchase this publication for $900. 

3. Grassfields Bantu Fieldwork: Ngomba Tone Paradigms contains tone
paradigms of the language Ngomba, a Bamileke (Grassfields Bantu)
language spoken by some 63,000 people in the Western Province of
Cameroon. Ngomba's tone system is undescribed, but it has many
similarities with the closely related Yemba language (also known as
Bamileke Dschang). 

This publication contains 755 audio files. The files in rawdata are 21
extended audio and laryngograph recordings with ESPS xlabel files; each
one of the raw sound files contains the complete recording of one of the
tenses. Transcriptions are provided for the audio clips using the
IPA-based orthography, and using phonetic and tonological transcription
systems. The verbal tone paradigms are also accessible over the
internet, along with an interface for browsing and editing
transcriptions, at 

For further information, please visit:

This publication is free of charge to 2001 and 2002 members. The cost
of the first 100 copies of this publication (not counting the copies
distributed to LDC members) is covered by NSF Grant Number 9983258.
These copies are, therefore, free of charge to qualified researchers;
a $30 shipping and handling fee applies. After these first 100 copies
are distributed, additional copies will be available for the production
cost of $150 per CD-ROM.


If you need additional information before placing your order, or 
would like to inquire about membership in the LDC, please send email to
<> or call (215) 573-1275.

---------------------------------------------------------------------
Linguistic Data Consortium        Phone: (215) 573-1275
3600 Market Street                Fax:   (215) 573-2175
Suite 810                         email:
Philadelphia, PA 19104-2653       www:
