Title: Computational Morphosyntactic Analysis of Modern Greek
Author: Giorgos Orphanos
Degree Awarded: University of Patras, Department of Computer Engineering and Informatics
Degree Date: 2000
Linguistic Subfield(s): Computational Linguistics
Subject Language(s): Greek
Director(s): Dimitris Christodoulakis
Georgios Philokiprou
Panagiotis Pintelas

Abstract:

This dissertation addresses the computational morphosyntactic analysis of Modern Greek, an inflected natural language. Morphosyntactic analysis is a cognitive process that forms an intermediate layer between morphological and syntactic analysis and aims to assign unambiguous morphosyntactic information to the words of a text. By morphosyntactic information we mean the morphological origin and the morphosyntactic properties of a word (e.g. the word ανθρώπου is the genitive singular form of the masculine noun [άνθρωπος]). The primary concern of morphosyntactic analysis is to resolve the morphosyntactic ambiguity introduced by morphological analysis (e.g. the word απαντήσεις is either a form of the verb [απαντώ] or a form of the noun [απάντηση]), so as to ease the already difficult task of syntactic analysis.
After an overview of the models that have been applied to the morphosyntactic disambiguation of various natural languages, we propose and implement a new model for Modern Greek. Our model comprises two layers. The first layer follows the machine learning approach. It resolves a significant part of the ambiguity, and the most difficult one, with the aid of automatically induced decision trees. Decision tree induction is performed with three different algorithms; all three are variants of the standard ID3 algorithm, adapted to the linguistic nature of the training datasets. The second layer follows the linguistic approach. It resolves the remainder of the ambiguity with the aid of handcrafted syntactic rules, which are described in the definite-clause grammar formalism. To evaluate our model we used a manually disambiguated corpus of Greek running text. The evaluation results confirm the success of our approach.
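To make the first layer more concrete, the following is a minimal sketch of decision tree induction for a verb/noun ambiguity such as απαντήσεις. It implements plain textbook ID3, not the three adapted variants developed in the dissertation, whose details the abstract does not give. The context features (`prev`, `next`), their values, and the five training examples are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the examples on one feature."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, features):
    """Standard ID3: returns a nested dict, or a label string at a leaf."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    node = {"feature": best, "default": majority, "branches": {}}
    remaining = [f for f in features if f != best]
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["branches"][value] = id3([r for r, _ in pairs],
                                      [l for _, l in pairs],
                                      remaining)
    return node


def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row.get(tree["feature"]), tree["default"])
    return tree


# Hypothetical training examples: the category of the neighbouring words
# for occurrences of a verb/noun-ambiguous word, with the correct reading.
rows = [
    {"prev": "article", "next": "verb"},
    {"prev": "article", "next": "punct"},
    {"prev": "particle", "next": "punct"},
    {"prev": "particle", "next": "article"},
    {"prev": "adjective", "next": "verb"},
]
labels = ["noun", "noun", "verb", "verb", "noun"]

tree = id3(rows, labels, ["prev", "next"])
prediction = classify(tree, {"prev": "particle", "next": "verb"})
```

On this toy data the preceding word separates the two readings completely, so the induced tree tests `prev` at the root. A context whose `prev` value was never seen in training falls back to the majority reading stored at the node; in the model described above, cases the trees cannot settle would instead be passed on to the rule-based second layer.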
The practical outcome of the research presented herein was the development of a morphosyntactic tagger (better known as a part-of-speech tagger). The major characteristic of this tagger is its robustness, i.e. its capability to process any text written in Greek. Apart from its utility as a standalone text-analysis tool, the morphosyntactic tagger plays a key role in almost all natural language processing applications: corpus annotation, syntactic analysis, grammar checking, word sense disambiguation, information retrieval, information extraction, summarization, text classification, etc.
Page Updated: 26-Nov-2009
