LINGUIST List 25.4407

Tue Nov 04 2014

FYI: SemEval-2015 Task 3: Community Question Answering

Editor for this issue: Uliana Kazagasheva <>

Date: 03-Nov-2014
From: Preslav Nakov <>
Subject: SemEval-2015 Task 3: Community Question Answering

SemEval-2015 Task 3: Answer Selection in Community Question Answering

Google Group:!forum/semeval-cqa
Evaluation period: December 5 - 22, 2014
Paper submission: January 30, 2015

Answer selection in community question answering data (i.e., user-generated content).

- The task is related to a real application scenario, but it has been decoupled from the IR component to facilitate participation and to focus on the aspects most relevant to the SemEval community
- A more challenging task than traditional question answering
- Related to textual entailment, semantic similarity, and natural language inference
- Multilingual: Arabic and English

We target semantically oriented solutions using rich language representations to see whether they can improve over simpler bag-of-words and word matching techniques.

Task Description:
Community question answering (QA) systems are gaining popularity online. Such systems are seldom moderated, quite open, and thus place few restrictions, if any, on who can post and who can answer a question. On the positive side, this means that one can freely ask any question and expect some good, honest answers. On the negative side, it takes effort to go through all possible answers and make sense of them. For example, it is not unusual for a question to have hundreds of answers, which makes it very time-consuming for the user to inspect and winnow.

We propose a task that can help automate this process by identifying the posts in the answer thread that answer the question well, those that are potentially useful to the user (e.g., because they help educate him/her on the subject), and those that are simply bad or useless.
Moreover, for the special case of YES/NO questions, we propose an extreme summarization version of the task, which requires producing a simple YES/NO summary of all valid answers.

In short:
Subtask A:
Given a question (short title + extended description), and several community answers, classify each of the answers as
- definitely relevant (good),
- potentially useful (potential), or
- bad or irrelevant (bad, dialog, non-English, other).

Subtask B:
Given a YES/NO question (short title + extended description) and a list of community answers, decide whether the global answer to the question should be yes, no, or unsure, based on the individual good answers. This subtask is only available for English.
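To make the two subtasks concrete, here is a minimal illustrative sketch (not the official scorer or baseline): given per-answer quality labels as in Subtask A, it derives a Subtask B global decision by majority vote over the yes/no stances of the "good" answers. The label names, the stance field, and the voting rule are assumptions for illustration only.

```python
# Sketch: aggregate Subtask A-style labels into a Subtask B decision.
from collections import Counter

def global_answer(answers):
    """answers: list of (quality_label, stance) pairs,
    e.g. ("good", "yes"), ("potential", None), ("bad", None).
    Only "good" answers with a yes/no stance contribute votes."""
    votes = Counter(stance for quality, stance in answers
                    if quality == "good" and stance in ("yes", "no"))
    # No usable votes, or a tie, gives "unsure".
    if votes["yes"] == votes["no"]:
        return "unsure"
    return "yes" if votes["yes"] > votes["no"] else "no"

thread = [("good", "yes"), ("good", "yes"), ("potential", None),
          ("good", "no"), ("bad", None)]
print(global_answer(thread))  # -> yes
```

A real system would, of course, first predict the quality labels and stances from the answer text; the sketch only shows how the two subtasks relate.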

For a more detailed description of the English and Arabic datasets, please check here:

Register to participate here:
Finally, do not miss the important dates (the evaluation period is from December 5 to December 22).

Important Dates:
- Evaluation period starts: December 5, 2014
- Evaluation period ends: December 22, 2014
- Paper submission due: January 30, 2015
- Paper notification: Early March, 2015
- Camera-ready due: March 30, 2015
- SemEval-2015 workshop: June 4-5, 2015 (co-located with NAACL 2015)

For any questions, please check our Google Group:

Organizers:
Lluís Màrquez, Qatar Computing Research Institute
James Glass, CSAIL-MIT
Walid Magdy, Qatar Computing Research Institute
Alessandro Moschitti, Qatar Computing Research Institute
Preslav Nakov, Qatar Computing Research Institute
Bilal Randeree, Qatar Living

Linguistic Field(s): Computational Linguistics

Subject Language(s): Arabic, Standard (arb)
                            English (eng)

Page Updated: 04-Nov-2014