LINGUIST List 28.1288

Wed Mar 15 2017

FYI: Build It Break It: a New Shared Task for CL

Editor for this issue: Kenneth Steimel

Date: 14-Mar-2017
From: Damir Cavar
Subject: Build It Break It: a New Shared Task for CL

Emily Bender pointed this out to us:

Build It Break It is a new type of shared task for AI problems that pits AI system "builders" against human "breakers" in an attempt to learn more about the generalizability of current technology and the brittleness of current NLP systems.

The goals are multi-fold: we want to...

- Build more reliable NLP technology
- Learn what linguistic phenomena our systems are capable of handling
- Encourage researchers to think about model assumptions
- Build an interesting test collection of examples
- Increase cross-talk between linguists and NLPers

Our shared task will run in three rounds:

1. Building Round: We will release training data (which you can choose to use or ignore as you like) and Builder Teams will build systems for solving that task.

2. Breaking Round: Breakers must construct minimal pairs that they think will fool the Builders' systems.

3. Judgment Round: All minimal pair sentences will be collected, shuffled, and sent back to the builder teams. They must run their system as is and upload the predictions on all the Breaker data.
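
The logistics of the Judgment Round can be sketched roughly as follows. This is only an illustration under stated assumptions, not the organizers' actual pipeline: the `MinimalPair` type, the `pool_and_shuffle` and `regroup` helpers, the example sentences, and the keyword classifier standing in for a Builder system are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """Two sentences that differ minimally (hypothetical container)."""
    pair_id: int
    first: str
    second: str


def pool_and_shuffle(pairs, seed=0):
    """Flatten all pairs into (pair_id, slot, sentence) items and shuffle them."""
    items = []
    for p in pairs:
        items.append((p.pair_id, 0, p.first))
        items.append((p.pair_id, 1, p.second))
    random.Random(seed).shuffle(items)
    return items


def regroup(items, predictions):
    """Map per-sentence predictions back to the pairs they came from."""
    grouped = {}
    for (pair_id, slot, _), pred in zip(items, predictions):
        grouped.setdefault(pair_id, [None, None])[slot] = pred
    return grouped


# Invented example pairs, for illustration only.
pairs = [
    MinimalPair(0, "The film is good.", "The film is not good."),
    MinimalPair(1, "I loved the acting.", "I loathed the acting."),
]
items = pool_and_shuffle(pairs)

# Stand-in for a Builder system: a naive keyword classifier (assumption).
predictions = [
    "pos" if ("good" in sentence or "loved" in sentence) else "neg"
    for _, _, sentence in items
]
by_pair = regroup(items, predictions)
```

In this toy run the keyword classifier gives both sentences of pair 0 the same label, because it ignores the negation. That is the kind of brittleness a Breaker's minimal pair is meant to expose.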

We will run two tasks in parallel:

1. Sentence-level sentiment analysis, based on movie reviews originally collected by Pang+Lee+Vaithyanathan.

2. Semantic role labelling as question answering: This task and its data are derived from He+Lewis+Zettlemoyer's work on Question-Answer Driven Semantic Role Labelling. The input is a sentence and a question related to one of the predicates in the sentence, and the output is a span of the sentence that answers the question.
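
To make the input and output of the second task concrete, here is a minimal sketch of one instance. The announcement does not specify the data format, so the `QASRLExample` class, its field names, the token-index encoding of the span, and the example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QASRLExample:
    """One hypothetical instance: a sentence, a question, and an answer span."""
    tokens: tuple
    question: str
    answer_start: int  # index of the first answer token (inclusive)
    answer_end: int    # index just past the last answer token (exclusive)

    def answer_text(self):
        """Return the answer span as a string of space-joined tokens."""
        return " ".join(self.tokens[self.answer_start:self.answer_end])


# Invented sentence; the question targets the predicate "praised".
example = QASRLExample(
    tokens=tuple("The critic praised the film yesterday".split()),
    question="What did someone praise?",
    answer_start=3,
    answer_end=5,
)
print(example.answer_text())  # prints "the film"
```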

The contest will run from late March through the end of May and you're welcome to participate as a builder, a breaker, or both!

Linguistic Field(s): Computational Linguistics

Page Updated: 15-Mar-2017