LINGUIST List 32.575

Tue Feb 16 2021

FYI: February 2021 Newsletter - LDC

Editor for this issue: Everett Green <everett@linguistlist.org>



Date: 15-Feb-2021
From: Membership Coordinator <ldc@ldc.upenn.edu>
Subject: February 2021 Newsletter - LDC

In this newsletter:
2021 Membership Discounts Expire March 1

New Publications:
Althingi Parliamentary Speech
Penn Discourse Treebank 2.0 – German Translation
TAC-KBP English Surprise Slot Filling – Comprehensive Training and Evaluation Data 2010


2021 Membership Discounts Expire March 1
Time is running out to save on 2021 membership fees. Renew your LDC membership, rejoin the Consortium, or become a new member by March 1 to receive a discount of up to 10%. For more information on membership benefits and options, visit the Join LDC page.


New Publications:

(1) Althingi Parliamentary Speech consists of approximately 540 hours of recorded speech from Althingi, the Icelandic Parliament, along with corresponding transcripts, a pronunciation dictionary, and language models. The speeches date from 2005 to 2016. The data set was collected in 2016 by the ASR for Althingi project at Reykjavik University in collaboration with the Althingi speech department. The purpose of that project was to develop an automatic speech recognition (ASR) system for Icelandic parliamentary speech that would replace the manual transcription of delivered speeches.

Althingi Parliamentary Speech is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus provided they have submitted a completed copy of the special license agreement. 2021 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.

*

(2) Penn Discourse Treebank 2.0 – German Translation was developed at the University of Potsdam’s Applied Computational Linguistics group and consists of approximately one million tokens derived from Penn Discourse Treebank Version 2.0 (LDC2008T05) translated into German and annotated for shallow discourse relations. The aim of the Penn Discourse Treebank project is to annotate the Wall Street Journal section in Treebank-2 (LDC95T7) with discourse relations. PDTB-German is based on a subset of PDTB2.0 used in the 2016 CoNLL Shared Task on Multilingual Shallow Discourse Parsing.

Penn Discourse Treebank 2.0 – German Translation is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus. 2021 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.

*

(3) TAC-KBP English Surprise Slot Filling – Comprehensive Training and Evaluation Data 2010 contains the training and evaluation data (queries, manual runs, final assessment results) produced by LDC to support the Surprise Slot Filling Track in 2010, the only year in which the track was run.

TAC-KBP English Surprise Slot Filling – Comprehensive Training and Evaluation Data 2010 is distributed via web download.

2021 Subscription Members will automatically receive copies of this corpus. 2021 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for a fee.


Membership Coordinator
Linguistic Data Consortium
University of Pennsylvania
T: +1-215-573-1275
E: ldc@ldc.upenn.edu
M: 3600 Market St. Suite 810
Philadelphia, PA 19104

Linguistic Field(s): Computational Linguistics


Page Updated: 16-Feb-2021