LINGUIST List 11.1539

Thu Jul 13 2000

FYI: New Corpus - Spoken American English

Editor for this issue: Lydia Grebenyova <lydia@linguistlist.org>


Directory

  • LDC Office, New Corpus - Spoken American English/ from LDC

    Message 1: New Corpus - Spoken American English/ from LDC

    Date: Thu, 06 Jul 2000 10:53:17 EDT
    From: LDC Office <ldc@unagi.cis.upenn.edu>
    Subject: New Corpus - Spoken American English/ from LDC


    ********************************************************
    Santa Barbara Corpus of Spoken American English - Part I
    ********************************************************

    LDC is pleased to announce the availability of the Santa Barbara Corpus of Spoken American English - Part I. This CD-ROM release contains 14 speech files from the Santa Barbara Corpus of Spoken American English, which was collected by the University of California, Santa Barbara Center for the Study of Discourse under the direction of John W. Du Bois. Associate Editors were Wallace L. Chafe (UCSB), Charles Meyer (UMass, Boston), and Sandra A. Thompson (UCSB). The Santa Barbara Corpus of Spoken American English is part of the International Corpus of English (Charles W. Meyer, Director), representing the American Component.

    The Santa Barbara Corpus of Spoken American English is based on hundreds of recordings of natural speech from all over the United States, representing a wide variety of people of different regional origins, ages, occupations, and ethnic and social backgrounds. It reflects many ways that people use language in their lives: conversation, gossip, arguments, on-the-job talk, card games, city council meetings, sales pitches, classroom lectures, political speeches, bedtime stories, sermons, weddings, and more.

    Each speech file is accompanied by a transcript in which phrases are time-stamped with respect to the audio recording. Personal names, place names, phone numbers, etc., in the transcripts have been altered to preserve the anonymity of the speakers and their acquaintances, and the audio files have been filtered to make these portions of the recordings unrecognizable.

    For the latest information on this corpus, please refer to the UCSB and Linguistic Data Consortium (LDC) web sites devoted to it:

    http://linguistics.ucsb.edu/research/sbcorpus/default.htm
    http://www.ldc.upenn.edu/Publications/SBCSAE/

    These sites may also offer software or revised versions of the data for download.

    Institutions that have membership in the LDC during the 2000 Membership Year will be able to receive this corpus free of charge. Nonmembers may purchase the Santa Barbara Corpus of Spoken American English - Part I for $75.

    If you would like to order a copy of this corpus, please email your request to <ldc@ldc.upenn.edu>. If you need additional information before placing your order, or would like to inquire about membership in the LDC, please send email to the same address or call (215) 573-1275.