LINGUIST List 22.1898

Mon May 02 2011

FYI: 7th RTE Challenge at TAC 2011

Editor for this issue: Brent Miller <brent@linguistlist.org>


To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.cfm.
Directory
        1. Danilo Giampiccolo, 7th RTE Challenge at TAC 2011

Message 1: 7th RTE Challenge at TAC 2011
Date: 02-May-2011
From: Danilo Giampiccolo <giampiccolo@celct.it>
Subject: 7th RTE Challenge at TAC 2011

Seventh Recognizing Textual Entailment Challenge at TAC 2011

http://www.nist.gov/tac/2011/RTE/

The Recognizing Textual Entailment (RTE) task consists of developing a
system that, given two text fragments, can determine whether the meaning of
one text is entailed by, i.e. can be inferred from, the other text. The
task has proved to be a common framework in which to analyze, compare, and
evaluate the different techniques used in NLP applications to deal with
semantic inference.
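
As a minimal illustration of the task's input/output contract, the toy
predicate below treats H as entailed when most of its words also appear in
T. This is only a sketch: the function name, the word-overlap heuristic,
and the threshold are all illustrative assumptions, and real RTE systems
use far richer inference.

```python
def entails(text: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Toy lexical-overlap baseline: H counts as entailed if at least
    `threshold` of its words appear in T. Purely illustrative."""
    t_words = set(text.lower().split())
    h_words = set(hypothesis.lower().split())
    if not h_words:
        return False
    overlap = len(h_words & t_words) / len(h_words)
    return overlap >= threshold
```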

Since the introduction of the RTE task in 2005, RTE challenges have been
organized annually. After three highly successful PASCAL RTE Challenges
held in Europe, in 2008 RTE became a track at the Text Analysis Conference
(TAC). In recent years RTE has evolved steadily, in an effort to apply
Textual Entailment to specific application settings and move it towards
more realistic scenarios. After experimenting with textual entailment
recognition over a corpus in the RTE-5 Pilot Search task, RTE-6 introduced
further innovations. First of all, the task of Recognizing Textual
Entailment within a Corpus, a close variant of the RTE-5 Pilot task,
replaced the traditional Main task. Furthermore, a Novelty Detection
subtask, in which a system must detect whether a statement is novel with
respect to the content of a prior set of documents, was proposed to
address the needs of the Update Summarization scenario. Finally, a new
Knowledge Base Population (KBP) Validation Pilot, based on the TAC KBP
Slot Filling task, was set up to investigate the potential utility of RTE
systems for Knowledge Base Population.

Encouraged by the positive response obtained so far, the RTE Organizing
Committee is glad to launch the Seventh Recognizing Textual Entailment
Challenge at TAC 2011.

Organizations interested in participating in the RTE-7 Challenge are
invited to submit a track registration form by June 3, 2011, at the TAC
2011 web site (http://www.nist.gov/tac/2011/).

The RTE-7 Tasks:

To ensure continuity with the previous campaign and allow participants to
become acquainted with the novelties introduced in RTE-6, the same tasks
are proposed in RTE-7 without significant changes, namely:

Main Task - Recognizing Textual Entailment within a Corpus:

In the RTE-7 Main task, given a corpus, a hypothesis H, and a set of
"candidate" entailing sentences for that H retrieved from the corpus by
Lucene, RTE systems are required to identify all the candidate sentences
that entail H.
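
The required system behavior can be sketched as a simple judging loop.
The function and parameter names below are hypothetical; the entailment
predicate itself is whatever the participating system implements.

```python
def judge_candidates(hypothesis, candidates, entails):
    """Sketch of the Main task interface. `candidates` maps a sentence id,
    e.g. a (doc_id, s_id) pair, to its text; `entails` is the system's own
    T->H entailment predicate. Returns the ids of all candidate sentences
    judged to entail H."""
    return {sid for sid, text in candidates.items() if entails(text, hypothesis)}
```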

The RTE-7 Main data set is based on the data created for the TAC 2008 and
2009 Update Summarization task, consisting of a number of topics, each
containing two sets of documents, namely i) Cluster A, made up of the first
10 texts in chronological order of publication date, and ii) Cluster B,
made up of the last 10 texts. H's are standalone sentences taken from the
TAC Update Summarization corpus, while candidate entailing sentences
(T's) are the 100 top-ranked sentences retrieved for each H by Lucene from
the Cluster A corpus, using H verbatim as the search query. Although only
this subset of candidate sentences must be judged for entailment, the
sentences are not to be considered in isolation: the entire Cluster A
corpus to which they belong must be taken into account in order to resolve
discourse references and judge the entailment relation appropriately.
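
The retrieval step that produces the candidate lists can be approximated
as follows. In the distributed data the 100 candidates per H come from
Lucene; here plain term overlap replaces Lucene's actual scoring, which is
an assumption made purely for illustration.

```python
def top_candidates(hypothesis, corpus, k=100):
    """Stand-in for the Lucene retrieval step: rank Cluster A sentences by
    the number of query terms they share with H (used verbatim as the
    query) and keep the top k. `corpus` maps sentence ids to text."""
    query = set(hypothesis.lower().split())
    ranked = sorted(corpus,
                    key=lambda sid: len(query & set(corpus[sid].lower().split())),
                    reverse=True)
    return ranked[:k]
```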

The example below presents a hypothesis referring to a given topic and some
of the entailing sentences found in the subset of candidate sentences:

<H_sentence>Lance Armstrong is a Tour de France winner.</H_sentence>
<text doc_id="AFP_ENG_20050824.0557" s_id="1" evaluation="YES">Claims by
a French newspaper that seven-time Tour de France winner Lance Armstrong
had taken EPO were attacked as unsound and unethical by the director of
the Canadian laboratory whose tests saw Olympic drug cheat Ben Johnson
hit with a lifetime ban.</text>
<text doc_id="AFP_ENG_20050824.0557" s_id="2" evaluation="YES">L'Equipe
on Tuesday carried a front page story headlined "Armstrong's Lie"
suggesting the Texan had used the illegal blood booster EPO
(erythropoietin) during his first Tour win in 1999.</text>

The second sentence in the example entails H because "the Texan" and
"Tour" can be resolved as "Lance Armstrong" and "Tour de France"
respectively, on the basis of the context in which they occur in the
Cluster A document.
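
Annotations in this per-sentence markup are straightforward to process
programmatically. The sketch below parses an abbreviated version of the
example; the wrapping <pairs> element is an assumption made so the
fragment forms a single well-formed XML document.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the example above (sentence texts shortened).
snippet = """<pairs>
  <H_sentence>Lance Armstrong is a Tour de France winner.</H_sentence>
  <text doc_id="AFP_ENG_20050824.0557" s_id="1" evaluation="YES">Claims by a French newspaper ...</text>
  <text doc_id="AFP_ENG_20050824.0557" s_id="2" evaluation="YES">L'Equipe on Tuesday ...</text>
</pairs>"""

root = ET.fromstring(snippet)
hypothesis = root.find("H_sentence").text
# Collect the ids of every candidate sentence annotated as entailing H.
entailing = [(t.get("doc_id"), t.get("s_id"))
             for t in root.findall("text")
             if t.get("evaluation") == "YES"]
```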

Novelty Detection Subtask:

The Novelty Detection subtask is based on the Main task and is aimed at
specifically addressing the interests of the Summarization community, in
particular with regard to the Update Summarization task, focusing on
detection of novelty in Cluster B documents.

The task consists of judging whether the information contained in each H
(drawn from the Cluster B documents) is novel with respect to the
information contained in the set of Cluster A candidate entailing
sentences. If one or more entailing sentences are found for a given H, the
content of that H is not new; conversely, if no entailing sentences are
detected, the information contained in the H is regarded as novel.
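
The novelty decision thus falls out of the entailment judgments directly,
as the sketch below shows (names are illustrative):

```python
def novel_hypotheses(entailment_results):
    """`entailment_results` maps each H (drawn from Cluster B) to the set
    of Cluster A candidate sentences judged to entail it; an H with no
    entailing sentences is reported as novel."""
    return {h for h, sentences in entailment_results.items() if not sentences}
```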

The Novelty Detection task requires the same output format as the Main task
- i.e. no additional type of decision is needed. Nevertheless, the Novelty
Detection task differs from the Main task in the following ways:

1) The set of H’s is not the same as that of the Main task;

2) The system outputs are scored differently, using specific scoring
metrics designed for assessing novelty detection.

The Main and Novelty Detection task guidelines and Development set are
available at the RTE-7 Website (http://www.nist.gov/tac/2011/RTE/).

KBP Validation Task:

Based on the TAC Knowledge Base Population (KBP) Slot Filling task, the
KBP Validation task is to determine whether a given relation (Hypothesis)
is supported by an associated document (Text). Each slot fill that is
proposed by a system for the KBP Slot-Filling task would create one
evaluation item for the RTE-KBP Validation task: the Hypothesis would be a
simple sentence created from the slot fill, while the Text would be the
source document that was cited as supporting the slot fill.
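
The construction of an evaluation item can be sketched as below. The slot
names and sentence templates here are assumptions made for illustration,
not the official RTE-KBP generation rules.

```python
def slot_fill_to_hypothesis(entity, slot, fill):
    """Illustrative only: build a simple declarative hypothesis sentence
    from a KBP slot fill. Templates and slot names are hypothetical."""
    templates = {
        "per:city_of_birth": "{entity} was born in {fill}.",
        "org:founded_by": "{entity} was founded by {fill}.",
    }
    template = templates.get(slot, "{entity}'s {slot} is {fill}.")
    return template.format(entity=entity, fill=fill, slot=slot)
```

The resulting sentence becomes the Hypothesis, paired with the cited
source document as the Text.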

The KBP Validation task guidelines are available at the RTE-7 website
(http://www.nist.gov/tac/2011/RTE/), together with instructions for
obtaining the Development data
(http://www.nist.gov/tac/2011/RTE/registration.html).

Resource and Tool Evaluation through Ablation Tests:

The exploratory effort on resource evaluation, begun in RTE-5 and
extended to tools in RTE-6, will continue in RTE-7. Ablation tests are
required for systems participating in the RTE-7 Main task, in order to
collect data that clarify the impact of the knowledge resources and tools
used by RTE systems and to evaluate their contribution to system
performance.

In an ablation test, a single resource or tool is removed from or added
to a system, which is then rerun. By comparing the results with those of
the original system, the practical contribution of the individual
resource or tool can be assessed.
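
The comparison amounts to a difference in evaluation scores between the
two runs. The sketch below assumes an F-measure over the entailment
decisions, in the spirit of earlier RTE campaigns; the exact metric is
defined in the task guidelines.

```python
def f1(tp, fp, fn):
    """F1 over the (H, candidate-sentence) entailment decisions."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def ablation_impact(full_run, ablated_run):
    """Each run is a (tp, fp, fn) triple; the F1 difference estimates the
    practical contribution of the removed resource or tool (negative
    values mean the system did better without it)."""
    return f1(*full_run) - f1(*ablated_run)
```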

More details on ablation tests are given in the Main and Novelty Detection
Task guidelines:
http://www.nist.gov/tac/2011/RTE/RTE7_Main_NoveltyDetection_Task_Guidelines.pdf.

The RTE Resource Pool:

http://www.aclweb.org/aclwiki/index.php?title=Textual_Entailment_Resource_Pool

The RTE Resource Pool, set up for the first time during RTE-3, serves as a
portal and forum for publicizing and tracking resources, and reporting on
their use. All the RTE participants and other members of the NLP community
who develop or use relevant resources are encouraged to contribute to this
important resource.

The RTE Knowledge Resource Page:

http://www.aclweb.org/aclwiki/index.php?title=Textual_Entailment_Resource_Pool#Knowledge_Resources

The page contains a list of the "standard" RTE resources, i.e. those
most widely used in the design of RTE systems during the challenges held
so far, together with links to the locations where they are made
available. Furthermore, the results of the ablation tests carried out in
RTE-5 and RTE-6, together with their descriptions, are also provided.

Proposed RTE-7 Schedule:

April 29: KBP Validation task: Release of Development Set
April 29: Main task: Release of Development Set
June 3: Deadline for TAC 2011 track registration
August 17: KBP Validation task: Release of Test Set
August 29: Main task: Release of Test Set
September 8: Main task: Deadline for task submissions
September 15: Main task: Release of individual evaluated results
September 16: KBP Validation task: Deadline for task submissions
September 23: KBP Validation task: Release of individual evaluated results
September 25: Deadline for TAC 2011 workshop presentation proposals
September 29: Main task: Deadline for ablation tests submissions
October 6: Main task: Release of individual ablation test results
October 25: Deadline for systems' reports
November 14-15: TAC 2011 workshop in Gaithersburg, Maryland, USA

Track Coordinators and Organizers:

Luisa Bentivogli, CELCT and FBK, Italy
Peter Clark, Vulcan Inc., USA
Ido Dagan, Bar Ilan University, Israel
Hoa Trang Dang, NIST, USA
Danilo Giampiccolo, CELCT, Italy

Linguistic Field(s): Computational Linguistics

