Academic Paper

Title: A transformation-driven approach for recognizing textual entailment
Linguistic Field: Computational Linguistics
Abstract: Textual Entailment is a directional relation between two text fragments. The relation holds whenever the truth of one text fragment, called Hypothesis (H), follows from another text fragment, called Text (T). Up until now, using machine learning approaches for recognizing textual entailment has been hampered by the limited availability of data. We present an approach based on syntactic transformations and machine learning techniques which is designed to fit well with a new type of available data sets that are larger but less complex than data sets used in the past. The transformations are not predefined, but calculated from the data sets, and then used as features in a supervised learning classifier. The method has been evaluated using two data sets: the SICK data set and the EXCITEMENT English data set. While both data sets are of a larger order of magnitude than data sets such as RTE-3, they are also of lower levels of complexity, each in its own way. SICK consists of pairs created by applying a predefined set of syntactic and lexical rules to its T and H pairs, which can be accurately captured by our transformations. The EXCITEMENT English data contains short pieces of text that do not require a high degree of text understanding to be annotated. The resulting AdArte system is simple to understand and implement, but also effective when compared with other existing systems. AdArte has been made freely available with the EXCITEMENT Open Platform, an open source platform for textual inference.
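The idea of deriving transformations from the data and using them as classifier features can be illustrated with a minimal sketch. It is not the AdArte implementation: the abstract describes syntactic transformations, whereas this sketch substitutes token-level edit operations computed with Python's `difflib`, and a simple count-based voting scheme stands in for the supervised classifier. The function names (`transformations`, `train`, `predict`), the label strings, and the toy pairs are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def transformations(text, hypothesis):
    """Return a Counter of token-level edit operations that turn T into H.

    Assumption: token edits stand in for the syntactic transformations
    described in the abstract; nothing is predefined, every feature is
    derived from the pair itself.
    """
    t = text.lower().split()
    h = hypothesis.lower().split()
    feats = Counter()
    matcher = SequenceMatcher(a=t, b=h, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans carry no transformation
        src = " ".join(t[i1:i2])
        dst = " ".join(h[j1:j2])
        feats[f"{tag}:{src}->{dst}"] += 1
    return feats


def train(pairs):
    """Count how often each transformation occurs under each label."""
    counts = defaultdict(Counter)
    for text, hypothesis, label in pairs:
        counts[label].update(transformations(text, hypothesis))
    return counts


def predict(counts, text, hypothesis):
    """Pick the label whose training pairs shared the most transformations.

    Ties (including pairs with no known transformation) fall back to the
    first label seen in training; a real classifier would handle this better.
    """
    feats = transformations(text, hypothesis)
    scores = {
        label: sum(seen[f] * n for f, n in feats.items())
        for label, seen in counts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy pairs, invented for this sketch (not taken from SICK
# or the EXCITEMENT data).
TOY_PAIRS = [
    ("a young boy is running", "a boy is running", "ENTAILMENT"),
    ("a man is playing guitar", "a man is not playing guitar", "CONTRADICTION"),
]

model = train(TOY_PAIRS)
print(predict(model, "a young girl is singing", "a girl is singing"))
```

In this toy setting, dropping the modifier "young" is a transformation observed only under the entailment label, so an unseen pair exhibiting the same edit is assigned that label. A fuller treatment would operate on parse trees and feed the features to a standard learning algorithm.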


This article appears in Natural Language Engineering Vol. 23, Issue 4, which you can read on Cambridge's site.
