Editor for this issue: Takako Matsui <tako linguistlist.org>
Institution: University of Hyderabad
Program: Centre for Applied Linguistics and Translation Studies
Dissertation Status: Completed
Degree Date: 2003

Author: Vaishnavi Ramaswamy

Dissertation Title: A Morphological Analyzer for Tamil

Linguistic Field: Computational Linguistics

Dissertation Director 1: G. Uma Maheshwara Rao

Dissertation Abstract:

This thesis deals with the design and implementation of a morphological analyzer for the Tamil language. It also includes a comparative study of certain other models of morphological processing, in order to assess the advantages of each in terms of its suitability for adaptation to a language like Tamil. The primary aim is to construct a complete morphological module for Tamil that could be used in any NLP application, such as a spell checker, POS tagger, or parser.

Aspects of designing a computational model for morphological analysis include:
1) Deciding on a model based on psycholinguistic factors.
2) Designing formal methods and techniques that enable theoretical descriptions to be converted into computational models.

The analyzer under consideration relies on a theoretical blend of the Item-and-Arrangement (IA) and Item-and-Process (IP) approaches to morphological decomposition. Where automatic phonological rules largely operate, IP is incorporated. In areas where complex but non-automatic morphophonemics (sandhi) is involved, IA is the choice.

Qualitative and quantitative methods in corpus linguistics were employed to extract frequency counts and collocations of words. All possible contexts of occurrence and usage of a word were studied. For every grammatical category of the language, a list was extracted of the minimum number of word-forms required for sufficient coverage. Based on these attributes, and considering the coverage and efficiency required of a morphological analyzer, an essential set of morphological paradigms was established for each word class in Tamil.
This served as a database comprising tables of the inflectional forms of each word, for all the words in the language.

Two other well-established models of morphological analysis, AMPLE and KIMMO, were also analyzed for the purpose of comparison. Both are good platforms for implementing morphological analyzers in various languages. Their implementations were compared with the Tamil Morph developed here, taking into account factors such as the cost of implementation in terms of effort and time, coverage, and efficiency.