LINGUIST List 12.1826

Mon Jul 16 2001

FYI: IViE Corpus On-line, Website: Endangered Lang

Editor for this issue: Lydia Grebenyova


  1. Esther Grabe, On-line Version of the IViE Corpus
  2. Peter Wittenburg, New Web-Site - Endangered Languages Project

Message 1: On-line Version of the IViE Corpus

Date: Thu, 12 Jul 2001 11:53:31 +0100 (GMT)
From: Esther Grabe
Subject: On-line Version of the IViE Corpus

Dear colleagues,

Thank you for your response to the release of the IViE Corpus (for information
about the corpus, please read the IViE summary below). We have had almost 60 
requests for the CD-ROM version of the corpus, and we will send out the CDs 
and the documentation to those of you who have written to us within the next 
two weeks. Unfortunately, our research budget does not allow us to make any 
further sets. Therefore, we have set up two on-line versions of the corpus:

1: Download page

The corpus has been divided into 45 packages, one for each speaking style from
each variety. Please use this page to download as little or as much from the 
corpus as you would like (but please note the disclaimer below).

2: On-line audio page

This page allows you to listen to the data in the corpus and to download
individual files. The files can be sorted by variety, speaking style and
speaker gender. As on the download page, the complete set of data from the
corpus is available here.

IViE homepage

ESRC grant R000237145
Department of Linguistics, University of Cambridge
Esther Grabe, Brechtje Post and Francis Nolan

--------------------------------------------------------------------------


The IViE corpus and the associated documentation are copyrighted. The speech 
data and the texts cannot be copied or distributed in any format unless this 
paragraph is included. The speech data are available to any interested user, 
but only for non-commercial use. The ESRC and the Universities of Oxford and 
Cambridge make no warranty and accept no liability associated with the use of
these materials. 

--------------------------------------------------------------------------

About the IViE Corpus

The IViE corpus contains data from nine modern or mainstream dialects of
English spoken in the British Isles in five speaking styles. The data
allow for investigations of cross-varietal and stylistic variation in
English intonation (IViE = Intonational Variation in English).

Varieties of English: Belfast English, Bradford Punjabi English,
Cambridge, Cardiff (Welsh-English bilingual speakers), Dublin, London
(speakers of West Indian descent), Leeds, Liverpool, Newcastle.

Speaking styles: Conversations, map task, read text & retold version of
the same text, controlled sentences.

Speakers: 12 speakers from each variety, 6 male, 6 female. 16 years of
age. Data recorded in local secondary schools.

Total duration: 36 hours of speech

Format: .wav

The corpus is available free of charge.

NB: A subsection of the corpus will be published with prosodic annotations
later this year, also on CD-ROM (6.5 hours of speech).

Dr. Esther Grabe,
Department of Phonetics, University of Oxford and Linguistics, Cambridge
Phonetics Laboratory, 41 Wellington Square, University of Oxford, OX1 2JF
Tel. +44 1865 270446

Message 2: New Web-Site - Endangered Languages Project

Date: Fri, 13 Jul 2001 16:20:41 +0200
From: Peter Wittenburg
Subject: New Web-Site - Endangered Languages Project

 Documentation of Endangered Languages

 DOBES Project - New Web Site

We would like to draw your attention to a new web-site which gives
information about the DOBES project (DOkumentation BEdrohter Sprachen).
This project was set up to document a number of endangered languages,
to establish guidelines for language documentation, and to archive
all material. The archive will cover photos, sound and video recordings
and, of course, texts of various sorts. For details we refer you to the
content of the web-site.

The DOBES project started as a one-year pilot project in 2000
with 8 linguistic teams and one "archiving" team. It is now entering its
main phase with the intention to ultimately include about 20 linguistic
teams.

The DOBES web-site presents the following information: 

- the teams involved in the project, 
- the languages they are documenting, 
- application guidelines for new projects, 
- the chosen linguistic and technological frameworks which 
 are the results of the project's internal discussions, 
- basic statements about legal and ethical aspects, 
- and many useful links to related sites. 

The web-site also includes information about tools being developed
and/or used within the DOBES project. At the moment it does not yet
contain material about the documented languages, except for some video
and audio samples, photos and explanatory texts. Many video and audio
recordings have already been digitized, and additional material is
expected when other teams return from their field trips. The
linguistic teams are currently annotating and analyzing a substantial
part of the material. It is expected that the archive will be extended 

The web-site will be updated continuously, depending on the state of the
project. We would therefore like to encourage you to bookmark the site
and have a look from time to time. We would also like to encourage you
to send us your comments on all matters raised on the web-site. The
DOBES group is aware that there are other comparable initiatives
and is aiming at an open exchange of ideas, methods and tools.

The DOBES project is funded by the VolkswagenStiftung and the archive
is housed at the Max-Planck-Institute for Psycholinguistics.

For comments and questions, please use the DOBES email address:

Peter Wittenburg
Max-Planck-Institute for Psycholinguistics