LINGUIST List 25.1640

Tue Apr 08 2014

Calls: Computational Linguistics, Text/Corpus Linguistics, Syntax/Ireland

Editor for this issue: Bryn Hauk <bryn@linguistlist.org>

Date: 08-Apr-2014
From: Ines Rehbein <irehbein@uni-potsdam.de>
Subject: Special Track on the Syntactic Analysis of Non-Canonical Language
Full Title: Special Track on the Syntactic Analysis of Non-Canonical Language
Short Title: SPMRL-SANCL 2014

Date: 24-Aug-2014 - 24-Aug-2014
Location: Dublin, Ireland
Contact Person: Ines Rehbein
Web Site: http://www.spmrl.org/spmrl-sancl2014.html

Linguistic Field(s): Computational Linguistics; Syntax; Text/Corpus Linguistics

Call Deadline: 06-Jun-2014

Meeting Description:

The SANCL 2014 Special Track aims to provide a forum for all researchers interested in the syntactic analysis and parsing of non-canonical language. By that term we mean structures whose characteristics deviate from the standard written form of the language. Examples include spoken language, the language of social media and computer-mediated communication (CMC) in general, the interlanguage produced by language learners, and historical data. Each of these has its own specific properties, and all of them pose challenges both for parsing models trained on edited newspaper text and for the theoretical analysis of these structures.

The SANCL Special Track is part of the Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Analysis of Non-Canonical Languages (SPMRL-SANCL 2014), co-located with COLING 2014.

2nd Call for Papers:

Special Track on the Syntactic Analysis of Non-Canonical Language
Endorsed by SIGPARSE

Submission deadline: June 06, 2014

Main workshop: http://www.spmrl.org/spmrl-sancl2014.html
SANCL Special Track: http://www.spmrl.org/sancl-posters2014.html

SANCL Poster Submissions:

In addition to regular paper submissions, we solicit poster submissions addressing the syntactic analysis of frequent phenomena of non-canonical languages which are difficult to annotate and parse using conventional annotation schemes. Examples include the representation of verbless utterances in a dependency scheme, the pros and cons of different representations of disfluencies for statistical parsing, and the analysis of complex hashtags which incorporate and merge different syntactic arguments into one token.

Poster submissions should focus on one or more of the topics listed below. They should either be submitted as a short paper (up to 7 single-column pages + references, to be included in the proceedings and presented as a poster at the workshop) or be submitted as an abstract (max. 500 words excluding examples/references, to be presented as a poster at the workshop). Abstract submissions should sketch an analysis for a given problem while short paper submissions should also present at least preliminary experimental results showing the feasibility of the approach.

Topics for Poster Submissions:

Unit of Analysis:

We ask for contributions on the optimal unit of analysis for non-canonical languages which do not come already separated into sentence-like units (e.g. spoken language, tweets, historical data), and for contributions on best practices for tokenizing spoken language and CMC.

Elliptical Structures and Missing Elements:

Non-canonical languages often include sentences where syntactic arguments are not expressed at the surface level. This raises the question of how we can provide a meaningful analysis for these structures, especially in a dependency grammar framework. We ask for contributions discussing the optimal representation for elliptical structures.

Hashtags & Friends:

We are interested in approaches to the syntactic analysis of hashtags and related phenomena that allow us to make use of the information encoded in hashtags.

Disfluencies:

Disfluencies (e.g. fillers, repairs) are a common phenomenon in spoken language and also occur in written but conceptually spoken language, such as CMC. We ask for contributions discussing the best way of representing disfluencies in the syntax tree.

Code Mixing:

In informal spoken language as well as in CMC, a considerable amount of the data includes code mixing. We ask for contributions discussing best practices for the syntactic analysis of code mixing.

For more detailed information, please visit: http://www.spmrl.org/sancl-posters2014.html

SANCL Special Track Organizers:

Özlem Cetinoglu (IMS, Germany)
Ines Rehbein (Potsdam University, Germany)
Djamé Seddah (Université Paris Sorbonne & Inria's Alpage project)
Joel Tetreault (Yahoo! Labs, US)

