It was about one and a half years ago that I finally arrived where I had always wanted to be, doing what I had always wanted to do: teach students, support small language communities and conduct research on African languages on my doorstep. The University of Cape Town and my new colleagues welcomed my efforts to establish the Centre for African Language Diversity (CALDi) as well as The African Language Archive (TALA), and I was recently appointed the Mellon Research Chair: African Language Diversity for this initiative. The main aim of CALDi is to train young African scholars in descriptive linguistics and open up space for research into African languages at UCT, with the hope of countering the dominance of African linguistics outside the continent. It has been a great challenge for which my whole career has been a form of preparation...
I would like to announce the formation of a list concerned with the teaching of linguistics. This list, "teach-ling," is a forum for the exchange of ideas, materials, solutions to common problems, syllabi, activities, etc. A particular emphasis of this list will be supporting the use of active learning methods in teaching. Members are welcome to post and respond to queries about problems and concerns in teaching their linguistics courses, setting up syllabi, and selecting texts and other materials.
Teach-ling is an unmoderated open list. To subscribe, just send a message
The Center for Spoken Language Understanding at the Oregon Graduate Institute of Science and Technology is releasing two new telephone speech corpora: 22 Language and Alphadigit. As always, the CSLU corpora are available at no cost to universities and other not-for-profit organizations. Companies may obtain speech corpora and other benefits through membership in CSLU's industrial affiliates program (see
Release 1.0 of the 22 Language Speech Corpus is a collection of telephone-quality speech from over 2000 speakers in 22 different languages. Collection, annotation and distribution of this corpus are supported in part by a grant from NSF and DARPA. The languages are: Arabic, Portuguese, Cantonese, Czech, English, Farsi, French, German, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Mandarin, Polish, Russian, Spanish, Swahili, Swedish, Tamil, and Vietnamese. The speech includes utterances of varying lengths, from three seconds to one minute long, produced in response to prompts recorded in each language. Each utterance in this corpus has been verified by two native speakers, with differences among the transcribers resolved, to determine (among other things) the gender, dialect, accent, and responsiveness of the caller. In addition, callers in each language were asked to speak in English for 20 seconds. The current release contains speech from 100 callers in each language. More information is available at:
The Alphadigit Corpus is a collection of about 78,000 examples from 3,031 talkers saying 6-digit strings of letters and digits over the telephone. A total of about 75 hours (2.3 GB) of speech is included in Release 1.0. Each file has an orthographic transcription. More information is available at:
To order either of these corpora, or any other CSLU corpora, you can fill out the online order form: