Publishing Partner: Cambridge University Press CUP Extra Publisher Login

New from Cambridge University Press!

ad

Revitalizing Endangered Languages

Edited by Justyna Olko & Julia Sallabank

Revitalizing Endangered Languages "This guidebook provides ideas and strategies, as well as some background, to help with the effective revitalization of endangered languages. It covers a broad scope of themes including effective planning, benefits, wellbeing, economic aspects, attitudes and ideologies."


We Have a New Site!

With the help of your donations we have been making good progress on designing and launching our new website! Check it out at https://linguistlist.org/!
***We are still in our beta stages for the new site--if you have any feedback, be sure to let us know at webdevlinguistlist.org***

Academic Paper


Title: A Semantic Scattering model for the automatic interpretation of English genitives
Linguistic Field: Computational Linguistics; Text/Corpus Linguistics
Subject Language: English
Abstract: An important problem in knowledge discovery from text is the automatic extraction of semantic relations. This paper addresses the automatic classification of the expressed by English genitives. A learning model is introduced based on the statistical analysis of the distribution of genitives' semantic relations in a corpus. The semantic and contextual features of the genitive's noun phrase constituents play a key role in the identification of the semantic relation. The algorithm was trained and tested on a corpus of approximately 20,000 sentences and achieved an f-measure of 79.80 per cent for of-genitives, far better than the 40.60 per cent obtained using a Decision Trees algorithm, the 50.55 per cent obtained using a Naive Bayes algorithm, or the 72.13 per cent obtained using a Support Vector Machines algorithm on the same corpus using the same features. The results were similar for s-genitives: 78.45 per cent using Semantic Scattering, 47.00 per cent using Decision Trees, 43.70 per cent using Naive Bayes, and 70.32 per cent using a Support Vector Machines algorithm. The results demonstrate the importance of word sense disambiguation and semantic generalization/specialization for this task. They also demonstrate that different patterns (in our case the two types of genitive constructions) encode different semantic information and should be treated differently in the sense that different models should be built for different patterns.

CUP AT LINGUIST

This article appears IN Natural Language Engineering Vol. 15, Issue 2.

Return to TOC.

Add a new paper
Return to Academic Papers main page
Return to Directory of Linguists main page