Academic Paper |
| Title: | Developing a Sanskrit Analysis System for Machine Translation |
| Author: | Subhash Chandra |
| Homepage: | http://sanskrit.jnu.ac.in/rstudents/subhash.html |
| Institution: | Centre for Development of Advanced Computing |
| Linguistic Field: | Computational Linguistics; Translation |
| Subject Language: | Sanskrit |
| Abstract: |
In the present paper, the authors outline the parameters of a Sanskrit Analysis System (SAS). The system matters because no translation system from Sanskrit to other Indian languages can be developed without first building such an analysis system. The paper details each required component and the current state of its development.
In particular, the paper will focus on the following components:
· Building linguistic resources for translation
· Reverse Sandhi module for initial segmentation
· POS tagging module
· Verb inflection morphology (tinanta) analysis module
· Nominal inflection morphology (subanta) analysis module
· Derivational morphology (krit, taddhita, stri, samaasa) analysis module
· Kaaraka analysis module
Building a robust, high-quality Machine Translation System (MTS) has been one of the most challenging endeavors for the Artificial Intelligence (AI) and Computational Linguistics (COLING) community so far. The dynamics of the Source Language (SL) and Target Language (TL) and the direction of translation are difficult variables in determining the complexity of a translation system. Building an MTS with Sanskrit as the SL is important and interesting, as well as challenging. It is important because Sanskrit is the only language in India which can truly be considered a donor language. The vast knowledge reserves in Sanskrit can be transferred (translated) to other Indian languages with the help of the computer. For various historical reasons, this knowledge may not have been allowed to travel in its undiluted form to regional languages and cultures. It is interesting because Sanskrit has a unique status in several senses, one being that it is a language frozen in time. It is a highly standardized language with little scope for deviation from the system of Paanini, and it is the only national language of India with no state of its own in India. Having Sanskrit as the SL is challenging because it is difficult to parse: owing to its synthetic nature, a single word (sentence) can run up to 32 pages (Banbhatta's Kaadambree). |
| Type: | Individual Paper |
| Status: | Completed |
| Venue: | Kerala, India |
| Publication Info: | Kerala |


