LINGUIST List 25.4260

Mon Oct 27 2014

Software: Universal Dependencies, Version 1

Editor for this issue: Damir Cavar <damir@linguistlist.org>


Date: 05-Oct-2014
From: Joakim Nivre <joakim.nivre@lingfil.uu.se>
Subject: Universal Dependencies, Version 1

We are happy to announce the release of the annotation guidelines for Universal Dependencies at http://universaldependencies.github.io/docs/.
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008). The general philosophy is to provide a universal inventory of categories and guidelines to facilitate consistent annotation of similar constructions across languages, while allowing language-specific extensions when necessary.

We intend to treat version 1 as stable for at least the next year, but we may subsequently make further revisions based on experience using it to annotate treebanks for a range of languages. Our goal is to make a first release of data sets with language-specific documentation by January 1, 2015. If you are interested in contributing to this effort, please get in touch.

Jinho Choi, Marie-Catherine de Marneffe, Tim Dozat, Filip Ginter, Yoav Goldberg, Jan Hajic, Christopher Manning, Ryan McDonald, Joakim Nivre, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, Dan Zeman


Linguistic Field(s): Computational Linguistics

Page Updated: 27-Oct-2014