LINGUIST List 33.2850
Wed Sep 21 2022
FYI: September 2022 Newsletter - LDC
Editor for this issue: Everett Green <everettlinguistlist.org>
Date: 15-Sep-2022
From: Membership Coordinator <ldc@ldc.upenn.edu>
Subject: September 2022 Newsletter - LDC
In this newsletter:
Upcoming Policy Change to LDC’s Open Memberships
LDC at Interspeech 2022
LanguageARC: Citizen Science for Language
30th Anniversary Highlight: Switchboard
New publications:
Xi’an Guanzhong Object Naming
MASRI Synthetic
Upcoming Policy Change to LDC’s Open Memberships
LDC is changing its open membership year policy beginning January 1, 2023. Only one membership year will be open for joining: the current membership year. The 2022 membership year will close for joining on December 31, 2022. We expect this change to have a minimal impact on members, while allowing us to streamline our processes to serve members better. LDC’s many membership benefits will remain the same, and organizations choosing to join membership years in advance will still be able to do so. If you have any questions about this change, please don’t hesitate to contact our membership office.
LDC at Interspeech 2022
LDC is proud to sponsor the Workshop for Young Female Researchers in Speech (YFRSW), to be held in person as an Interspeech 2022 pre-conference satellite event on September 17. Also, be sure to check out the collaborative work of LDC’s Mark Liberman, “The mapping between syntactic and prosodic phrasing in English and Mandarin”, presented during the On-Site Oral Session: Phonetics and Phonology on Wednesday, September 21, 13:30-15:30 KST.
LanguageARC: Citizen Science for Language
LanguageARC is a citizen science web portal for language research developed by LDC with the support of the National Science Foundation (grant #1730377).
LanguageARC brings together researchers and participants from the general public interested in language to form a community dedicated to supporting and advancing language-related research and development. Contributors to this online community can participate in a variety of language-related tasks and activities such as reading text, answering questions, describing images or video, creating or evaluating transcriptions for audio clips, or developing translations into their native languages. LanguageARC includes projects in languages other than English, such as French, Sesotho, and Swedish. Xi’an Guanzhong Object Naming (LDC2022S09), released this month in LDC’s Catalog and described below, is an example of a data set developed using LanguageARC. New projects will be added on an ongoing basis.
New publications:
Xi’an Guanzhong Object Naming
Xi’an Guanzhong Object Naming consists of 15 hours of audio recordings from speakers of the Guanzhong dialect of Mandarin Chinese living in or near Xi’an in Shaanxi Province (China) naming objects that appeared in colored line drawings. The corpus was developed to support traditional and computer-aided language documentation.
Xi’an Guanzhong Object Naming is distributed via web download.
2022 Subscription Members will automatically receive copies of this corpus. 2022 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
MASRI Synthetic
MASRI (Maltese Automatic Speech Recognition I) Synthetic was developed by the MASRI team at the University of Malta and contains 99 hours of synthesized Maltese speech.
Source sentences were extracted from the Maltese Language Resource Server (MLRS) corpus, which comprises written or transcribed Maltese covering various genres, including parliamentary debates, news, law, opinion, sports, culture, academic, literature, and religious texts. The text was processed through the CrimsonWing text-to-speech system to generate speech files. Synthesized speech was created with 210 voices.
MASRI Synthetic is distributed via web download.
2022 Subscription Members will automatically receive copies of this corpus provided they have submitted a completed copy of the special license agreement. 2022 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.
Linguistic Field(s): Computational Linguistics
Page Updated: 21-Sep-2022