LINGUIST List 32.917

Fri Mar 12 2021

FYI: SIGMORPHON 2021 Shared Task 0 on Morphological Inflection - Call for Participation

Editor for this issue: Everett Green <everett@linguistlist.org>



Date: 10-Mar-2021
From: Edoardo Maria Ponti <edoardomaria.ponti@gmail.com>
Subject: SIGMORPHON 2021 Shared Task 0 on Morphological Inflection - Call for Participation

We invite you to participate in the 6th installment of SIGMORPHON’s inflection generation shared task, which will be divided into two parts:

Part 1: Generalization Across Typologically Diverse Languages
Part 2: Are We There Yet? A Shared Task on Cognitively Plausible Morphological Inflection

Please join our Google Group to stay up to date: https://groups.google.com/forum/#!forum/sigmorphon2021-sharedtask0/join
Register for the task: https://forms.gle/tu4tX648F9kA9eps7
Consult our website for additional information: https://github.com/sigmorphon/2021Task0
Contact the organizers at the following email address: sigmorphon+workshop2021@gmail.com

The shared task will be part of the SIGMORPHON workshop, co-located with ACL-IJCNLP 2021 in Bangkok, Thailand, on either August 5 or 6, 2021.

Part 1: Generalization Across Typologically Diverse Languages

For the first part of the shared task, participants will design a model that learns to generate morphological inflections from a lemma and a set of morphosyntactic features of the target form. Each language has its own training, development, and test splits. Training and development splits contain triples, each consisting of a lemma, a target form, and a set of morphological features, provided in the UniMorph format. Test splits provide only lemmas and morphological tags; participants' models must predict the missing target forms.
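
As a rough illustration of the data layout described above, the following Python sketch parses one line into a lemma, a feature set, and an optional target form. It assumes tab-separated columns in the order lemma, form, features for training and development lines, lemma, features for test lines, and semicolon-separated feature tags. The names Example and parse_line and the English sample lines are hypothetical; consult the shared task website for the actual file layout.

```python
from typing import NamedTuple, Optional


class Example(NamedTuple):
    lemma: str
    features: frozenset
    form: Optional[str]  # None for test lines, where the form must be predicted


def parse_line(line: str, has_form: bool = True) -> Example:
    """Parse one tab-separated line.

    Assumed column order (not confirmed by the announcement):
    lemma, form, features for train/dev; lemma, features for test.
    """
    fields = line.rstrip("\n").split("\t")
    if has_form:
        lemma, form, tags = fields
    else:
        lemma, tags = fields
        form = None
    # Feature tags are assumed to be joined with semicolons, e.g. "V;PST".
    return Example(lemma, frozenset(tags.split(";")), form)


# Made-up sample lines for illustration only.
train_example = parse_line("walk\twalked\tV;PST")
test_example = parse_line("walk\tV;PST", has_form=False)
print(train_example)
print(test_example)
```

Storing the tags as a set reflects the description of the features as a set rather than an ordered sequence; a model that depends on tag order would need a list instead.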

The model should be general enough to work for natural languages of any typological patterning. For example, Tagalog verbs exhibit circumfixation; thus, a model with a strong inductive bias towards suffixing will likely not work well for Tagalog.
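
To make the point about inductive bias concrete, here is a minimal Python sketch of a deliberately suffix-only rule learner. It is not one of the task baselines. The helper names are hypothetical, and the German participles serve only as a familiar circumfix pattern (ge-...-t), not as task data.

```python
import os
from typing import Optional, Tuple


def learn_suffix_rule(lemma: str, form: str) -> Tuple[str, str]:
    """Strip the shared prefix and return (lemma_suffix, form_suffix)."""
    shared = os.path.commonprefix([lemma, form])
    return lemma[len(shared):], form[len(shared):]


def apply_suffix_rule(lemma: str, rule: Tuple[str, str]) -> Optional[str]:
    """Apply a suffix rewrite rule, or return None if it does not match."""
    old, new = rule
    if not lemma.endswith(old):
        return None
    return lemma[: len(lemma) - len(old)] + new


# Suffixing pattern: the learned rule generalizes to a new lemma.
english_rule = learn_suffix_rule("walk", "walked")
english_prediction = apply_suffix_rule("jump", english_rule)

# Circumfixing pattern: lemma and form share no prefix, so the "rule"
# memorizes the whole word and cannot apply to a new lemma.
german_rule = learn_suffix_rule("spielen", "gespielt")
german_prediction = apply_suffix_rule("machen", german_rule)

print(english_rule, english_prediction)
print(german_rule, german_prediction)
```

Because the learner can only rewrite material at the right edge of the word, any change at the left edge or inside the stem collapses into whole-word memorization, which is the failure mode the paragraph above warns about.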

As part of the task, we will release data for 50 new languages annotated in the UniMorph schema. The data for the 35 development languages are already available on the shared task website. These include a number of languages indigenous to Russia, such as Itelmen and Chukchi, as well as many languages from the Americas, such as Aymara and Seneca.

Important Dates:

- February 28, 2021: Training and development splits for development languages, baselines released.
- March 7, 2021: Development language data are frozen.
- April 20, 2021: Training and development splits for surprise languages released.
- April 27, 2021: Test splits for all languages (both development and surprise) released.
- May 4, 2021: Participants submit test predictions on all languages.
- June 1, 2021: Participants’ system description papers due.
- June 7, 2021: Camera-ready versions of participants’ system description papers due.

Part 2: Are We There Yet? A Shared Task on Cognitively Plausible Morphological Inflection

An open question in the use of neural networks for the study of language is to what degree they resemble humans in how they generate language.

This shared task adopts the experimental paradigm introduced by Albright and Hayes (2003). We have created a large number of new nonce words in four languages: English, German, Portuguese and Russian. To the best of our knowledge, this will be the largest and most multilingual collection of nonce words in existence. The goal of the participants in the shared task is to design a model that morphologically inflects the nonce words according to the grammar of the given languages.

Important Dates:

- February 25, 2021: Training data for English, German, Portuguese and Russian are released.
- March 8, 2021: Neural and non-neural baselines for development languages released.
- May 1, 2021: Development data for nonce inflections are released. (This includes human judgments.)
- May 23, 2021: Test data for the nonce inflections are released. (This includes human judgments.)
- June 1, 2021: Participants submit their system output.
- June 7, 2021: Participants submit their system description papers.

Linguistic Field(s): Cognitive Science; Computational Linguistics; Morphology


Page Updated: 12-Mar-2021