LINGUIST List 31.3932
Fri Dec 18 2020
FYI: December 2020 Newsletter - LDC
Editor for this issue: Everett Green <everett@linguistlist.org>
Date: 15-Dec-2020
From: Membership Coordinator <ldc@ldc.upenn.edu>
Subject: December 2020 Newsletter - LDC
In this newsletter:
LDC 2021 Membership Discounts Now Available
Approaching Deadline for Spring 2021 Data Scholarship Applications
LDC Closed for Winter Break Dec. 24-Jan. 5
New Publications:
BOLT English Co-reference – Discussion Forum, SMS/Chat, and Conversational Telephone Speech
Phonemes of Arabic
Global TIMIT Mandarin Chinese – Guanzhong Dialect
________________________________________
LDC Closed for Winter Break Dec. 24-Jan. 5
LDC will be closed from Thursday, December 24, 2020 through Tuesday, January 5, 2021 in accordance with the University of Pennsylvania Winter Break Policy. Our offices will reopen on Wednesday, January 6, 2021. Requests received by the Membership Office during Winter Break will be processed when the office reopens.
________________________________________
New publications:
(1) BOLT English Co-reference – Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by Raytheon BBN Technologies for the BOLT co-reference task and consists of co-reference annotation on English discussion forum, SMS/Chat, and conversational telephone speech.
BOLT English Co-reference – Discussion Forum, SMS/Chat, and Conversational Telephone Speech is distributed via web download.
2020 Subscription Members will automatically receive copies of this corpus. 2020 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
*
(2) Phonemes of Arabic was developed at the Florida Institute of Technology. It contains approximately one hour of speech from native Arabic speakers that includes all Arabic sounds (consonants and vowels) and 24 words with specific consonant-vowel patterns.
Phonemes of Arabic is distributed via web download.
2020 Subscription Members will automatically receive copies of this corpus. 2020 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
*
(3) Global TIMIT Mandarin Chinese – Guanzhong Dialect was developed by LDC and Xi’an Jiaotong University and consists of approximately five hours of read speech and transcripts in the Guanzhong dialect of Mandarin Chinese as spoken in Shaanxi province. It comprises 50 speakers, each reading 120 sentences from Chinese Gigaword Fifth Edition (LDC2011T13). Of the 120 sentences read by each speaker, 20 were read by all speakers, 40 were read by 10 speakers, and 60 were read by that speaker alone, for a total of 3220 sentence types.
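The sentence-type total follows from the reading design, on the assumption that each of the 50 speakers read 120 sentences and that each group of 10 speakers shared its own set of 40 sentences. A minimal Python sketch of that arithmetic (the function and variable names are illustrative, and the token count is derived here rather than stated in the newsletter):

```python
# Assumed reading of the design: every speaker reads 120 sentences,
# split into common, group-shared, and speaker-unique sentences.
SPEAKERS = 50
COMMON = 20       # sentences read by all speakers
SHARED = 40       # sentences read by one group of speakers
GROUP_SIZE = 10   # speakers per group sharing the same 40 sentences
UNIQUE = 60       # sentences read by a single speaker


def sentence_types(speakers, common, shared, group_size, unique):
    """Count distinct sentences across the whole corpus."""
    groups = speakers // group_size
    return common + shared * groups + unique * speakers


def sentence_tokens(speakers, common, shared, unique):
    """Count recorded utterances (one per speaker per sentence read)."""
    return speakers * (common + shared + unique)


total_types = sentence_types(SPEAKERS, COMMON, SHARED, GROUP_SIZE, UNIQUE)
total_tokens = sentence_tokens(SPEAKERS, COMMON, SHARED, UNIQUE)

print(total_types)   # 3220, matching the figure in the description
print(total_tokens)  # 6000 recorded sentences under these assumptions
```

Under this reading, 20 + 40 x 5 + 60 x 50 = 3220 sentence types, which agrees with the stated total.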
Global TIMIT Mandarin Chinese – Guanzhong Dialect is distributed via web download.
2020 Subscription Members will automatically receive copies of this corpus. 2020 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc@ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Linguistic Field(s): Computational Linguistics
Page Updated: 18-Dec-2020