Academic Paper


Title: Developing a Sanskrit Analysis System for Machine Translation
Author: Subhash Chandra
Homepage: http://sanskrit.jnu.ac.in/rstudents/subhash.html
Institution: Centre for Development of Advanced Computing
Linguistic Field: Computational Linguistics; Translation
Subject Language: Sanskrit
Abstract: In the present paper, the authors outline the parameters of a Sanskrit Analysis System (SAS). Such a system matters because no translation system from Sanskrit to other Indian languages can be developed without first building it. The paper gives details of each required component and reports on current developments.

In particular, the paper will focus on the following components:
· Building linguistic resources for translation
· Reverse Sandhi module for initial segmentation
· POS tagging module
· Verb inflection morphology (tinanta) analysis module
· Nominal inflection morphology (subanta) analysis module
· Derivational morphology (krit, taddhita, stri, samaasa) analysis module
· Kaaraka analysis module
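
As a rough illustration only, the components listed above could be chained as stages of one analysis pipeline. The sketch below is not the authors' implementation: the `Analysis` record, the stage functions, and the stage order are all hypothetical, and every stage is a placeholder (the "reverse sandhi" step here merely splits on whitespace, which real sandhi splitting does not reduce to).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Analysis:
    """Accumulates the output of each stage for one input sentence."""
    text: str
    tokens: List[str] = field(default_factory=list)
    annotations: Dict[str, List[str]] = field(default_factory=dict)


Stage = Callable[[Analysis], Analysis]


def reverse_sandhi(a: Analysis) -> Analysis:
    # Placeholder: a real module would undo euphonic merging at word
    # boundaries. Here we only split on whitespace.
    a.tokens = a.text.split()
    return a


def make_placeholder_stage(name: str) -> Stage:
    # Hypothetical stand-in for the POS, tinanta, subanta, derivational
    # and kaaraka modules: it records one dummy label per token.
    def stage(a: Analysis) -> Analysis:
        a.annotations[name] = ["UNANALYSED"] * len(a.tokens)
        return a
    return stage


# Assumed order: segmentation first, then the analysis modules as listed.
STAGE_NAMES = ["pos", "tinanta", "subanta", "derivational", "kaaraka"]
PIPELINE: List[Stage] = [reverse_sandhi] + [
    make_placeholder_stage(n) for n in STAGE_NAMES
]


def analyse(text: str) -> Analysis:
    """Run every stage in order and return the accumulated analysis."""
    a = Analysis(text)
    for stage in PIPELINE:
        a = stage(a)
    return a


result = analyse("raamah vanam gacchati")
print(result.tokens)
print(sorted(result.annotations))
```

Keeping each module behind the same function signature means a placeholder can be swapped for a real analyser without touching the rest of the chain.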

Building a robust, high-quality Machine Translation System (MTS) has been one of the most challenging endeavors for the Artificial Intelligence (AI) and Computational Linguistics (COLING) community so far. The dynamics between the Source Language (SL) and the Target Language (TL), together with the direction of translation, are difficult variables in determining the complexity of a translation system. Building an MTS with Sanskrit as the SL is important and interesting, as well as challenging. It is important because Sanskrit is the only language in India which can be truly considered a donor language. The vast knowledge reserves in Sanskrit can be transferred (translated) to other Indian languages with the help of the computer. For various historical reasons, this knowledge may not have been allowed to travel to regional languages and cultures in its undiluted form. It is interesting because Sanskrit has a unique status in several senses, one of them being that it is a language frozen in time. It is a highly standardized language with little scope for deviation from the system of Paanini, and it is the only national language of India with no state of its own in India. Having Sanskrit as the SL is challenging because of the difficulty of parsing it: owing to its synthetic nature, a single word (sentence) can run up to 32 pages (Banbhatta's Kaadambari).
Type: Individual Paper
Status: Completed
Venue: Kerala, India
Publication Info: Kerala

