Date: 24-Feb-2010
From: Sarmad Hussain <sarmad.hussain@nu.edu.pk>
Subject: PANL10n Releases Linguistic Resources and Software
The PAN Localization project (http://www.panl10n.net) has been a regional initiative promoting language technology across developing Asia. The project, initiated in 2003, has developed and disseminated computing solutions for Bahasa Indonesia, Bangla, Dzongkha, Khmer, Lao, Mongolian, Nepali, Pashto, Sinhala, Tamil, Tibetan and Urdu.
On the occasion of the eleventh International Mother Language Day, 21st February 2010, the PAN Localization project is pleased to release its research, technology and resources through its website.
This project has been carried out in collaboration with IDRC, Canada (www.idrc.ca), the National University of Computer and Emerging Sciences, Pakistan (www.crulp.org; www.nu.edu.pk) and the following partner organizations:
- Afghan Computer Science Association, Afghanistan
- BRAC University and Development Research Network, Bangladesh
- Department of IT, Bhutan
- Ministry of Education, Youth and Sport, Institute of Technology, and National ICT Development Authority, Cambodia
- Tibet University, Institute of Science and Technology, Tibet Academy of Agricultural and Animal Husbandry Sciences, China
- University of Indonesia, Agency for the Assessment and Application of Technology, Indonesia
- National Authority for Science and Technology, Laos
- InfoCon Co. Ltd., Mongolian University of Science and Technology and National University of Mongolia, Mongolia
- Madan Puraskar Pustakalaya, and E-Network Research and Development, Nepal
- University of Colombo School of Computing, Sri Lanka
Salient Outputs (and more … on the project website: http://www.panl10n.net/):
Bahasa Indonesia
Statistical Machine Translation (Awarded), English-Bahasa Parallel Corpus (1 Million words), POS Tagged Bahasa Corpus (500,000 words), Part of Speech Tagset and Tagger

Bangla
Text to Speech System (Awarded), Optical Character Recognition System (Shortlisted for Award), Bangla Pad, Spell Checker, Lexicon, Language Table for IDNs, Part of Speech Tagset and Tagger, Wordnet (1000 words), Tagged Corpus (5 Million words), English-Bangla Parallel Corpus

Dzongkha
DzongkhaLinux, Optical Character Recognition System, Language Table for IDNs, Part of Speech Tagset, Corpus (600,000 words), Lexicon (23,000 words), Text to Speech System (prototype), Dzongkha Terminology, Collation, Locale, Fonts and Keyboard

Khmer
Optical Character Recognition System, Java Applications and OpenOffice.org Plug-ins for Collation, Encoding Conversion, Word Segmentation, Locale, Mobile SMS, Language Table for IDNs, Part of Speech Tagset and Tagger, Lexicon, Text to Speech System (prototype), Tagged Corpus (150,000 words), Online Khmer Content on Veticar.com

Lao
Optical Character Recognition System, OpenOffice.org and MS Office Plug-in for Word Segmentation, Collation, Spell Checker, Lao Pad, Fonts, Keyboard, Language Table for IDNs, Part of Speech Tagset, POS Tagged Corpus, Parallel Corpus (37,000 words), Online Lao Content

Mongolian
Part of Speech Tagset and Tagger, Spell Checker, Corpus (1,000,000 words), Tagged Corpus (100,000 words), Lexicon (10,000 words), Automatic Speech Recognition, Localization of Pidgin and SeaMonkey

Nepali
NepaLinux (Awarded), Spell Checker, Grammar Checker, Parallel Corpus (100,000 words), Tagged Corpus (80,000 words), Lexicon (37,000 words), Optical Character Recognition System (prototype)

Pashto
Localized SeaMonkey (Awarded), Keyboard, Fonts

Sinhala & Tamil
Sinhala Optical Character Recognition System, Sinhala Text to Speech System (Awarded), Screen Reader for Sinhala for Blind, Language Learning Tool for Tamil in Sinhala and English, Sinhala Wordnet, Localized OpenTM, Collation Standard, Encoding Conversion tool
Linguistic Field(s):
Computational Linguistics
General Linguistics
Translation
Writing Systems
Subject Language(s): Dzongkha (dzo)
Indonesian (ind)
Mongolian, Halh (khk)
Khmer, Central (khm)
Samba Leko (ndi)
Nepali (nep)
Lao (lao)
Pashto, Central (pst)
Sinhala (sin)
Tamil (tam)
Urdu (urd)