LINGUIST List 8.1209

Thu Aug 21 1997

FYI: LDC Collection, American Dialect Soc.

Editor for this issue: Martin Jacobsen <martylinguistlist.org>


Directory

  • LDC Office, New Collection from the Linguistic Data Consortium
  • Andrew & Diane Lillie, New American Dialect Society URL

    Message 1: New Collection from the Linguistic Data Consortium

    Date: Wed, 20 Aug 1997 17:55:55 EDT
    From: LDC Office <ldcunagi.cis.upenn.edu>
    Subject: New Collection from the Linguistic Data Consortium


    Announcing a NEW RELEASE from the LINGUISTIC DATA CONSORTIUM

    CALLHOME Collection in Six Languages

    The objective of the CALLHOME project is the creation of a multi-lingual speech corpus that will support the development of Large Vocabulary Conversational Speech Recognition (LVCSR) technology. The collection covers six languages: American English, Egyptian Arabic, German, Japanese, Mandarin Chinese, and Spanish.

    Each CALLHOME language includes telephone speech, transcripts and tables, and a lexicon. Each language can be distributed as a complete set of speech, transcripts, and lexicon (lexicons to be released in the near future) or the components can be ordered separately.

    The telephone speech consists of either 100 or 120 unscripted telephone conversations between native speakers of the specific language. All calls, which lasted up to 30 minutes, originated in North America. Participants typically called family members or close friends. Most calls were placed to various locations overseas, but some participants placed calls within North America.

    Each transcript covers a contiguous 5- or 10-minute segment taken from a recorded conversation. The transcripts are timestamped by speaker turn for alignment with the speech signal, and are provided in standard orthography.

    The lexicons, which are not yet available, contain tab-separated fields giving orthographic, morphological, phonological, stress, source, and frequency information for each word. The lexicons will be covered by a special license agreement.
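    As an illustration of the tab-separated layout described above, the following is a minimal Python sketch of reading such a file. It is not the LDC's format specification: the field order, the LexiconEntry and parse_lexicon names, and the sample line are all assumptions made for the example, since the lexicons have not yet been released.

```python
import csv
import io
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    # Field names follow the announcement; their ORDER is an assumption.
    orthography: str
    morphology: str
    phonology: str
    stress: str
    source: str
    frequency: str


def parse_lexicon(text: str) -> List[LexiconEntry]:
    """Parse tab-separated lexicon text into a list of entries."""
    # QUOTE_NONE keeps any quote characters in a field as literal text.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        # Skip blank lines and rows without exactly six fields.
        if len(row) != len(LexiconEntry._fields):
            continue
        entries.append(LexiconEntry(*row))
    return entries


# Invented sample line, for illustration only (not taken from a real lexicon).
SAMPLE = "casa\tcasa+N\tk a s a\t1 0\ttelephone\t12\n"

entries = parse_lexicon(SAMPLE)
```

    Every field, including frequency, is kept as a string here because the announcement does not say how each field is encoded.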

    Institutions that have membership in the LDC during the 1997 Membership Year will be able to receive the CALLHOME materials at no additional charge, in the same manner as all other text and speech corpora published by the LDC. Due to a delayed release, 1996 members are entitled to CALLHOME Japanese, Mandarin Chinese, and Spanish.

    Nonmembers can purchase CALLHOME materials for research purposes only. The cost of the CALLHOME collection is $3000 per language. The components of the collection can also be purchased separately: speech databases are $1000, transcripts are $500, and lexicons are $1500 each. If you would like to order a copy of this corpus, please email your request to ldcunagi.cis.upenn.edu. If you need additional information before placing your order, or would like to inquire about membership in the LDC, please send email or call (215) 898-0464.

    Further information about the LDC and its available corpora can be accessed on the Linguistic Data Consortium WWW Home Page at URL http://www.ldc.upenn.edu/. Information is also available via ftp at ftp.cis.upenn.edu under pub/ldc; for ftp access, please use "anonymous" as your login name, and give your email address when asked for a password.

    Language          Speech    Transcripts  Lexicon             Membership
                      $1000     $500         $1500               year
    ------------------------------------------------------------------------
    American English  LDC97S42  LDC97T14     LDC97L20 (PRONLEX)  97
    Egyptian Arabic   LDC97S45  LDC97T19     LDC97L19            97
    German            LDC97S43  LDC97T15     LDC97L18            97
    Japanese          LDC96S37  LDC96T18     LDC96L17            96/97
    Mandarin Chinese  LDC96S34  LDC96T16     LDC96L15            96/97
    Spanish           LDC96S35  LDC96T17     LDC96L16            96/97
    ------------------------------------------------------------------------

    Message 2: New American Dialect Society URL

    Date: Wed, 20 Aug 1997 09:24:56 -0600
    From: Andrew & Diane Lillie <andrewlbyu.edu>
    Subject: New American Dialect Society URL


    Dear colleagues,

    Because of continuing server problems, the American Dialect Society webpage has moved. You can now find it at: http://www.et.byu.edu/~lilliek/ads/index.htm

    I apologize to those who have contacted me about the web site problems and thank you for your patience.

    Diane Lillie
    ADS Webmaster