Editor for this issue: Andrea Berez <andrea linguistlist.org>
Computational Approaches to Arabic Script-based Languages

Short Title: coling2004 workshop
Date: 28-Aug-2004 - 28-Aug-2004
Location: Geneva, Switzerland
Contact: Karine Megerdoomian
Contact Email: karinem inxight.com
Meeting URL: http://members.cox.net/karinem/COLING2004

Linguistic Sub-field: Computational Linguistics
Subject Language: Arabic, Standard; Kurdi; Pashto, Southern; Farsi, Western; Urdu

Call Deadline: 01-Apr-2004

This is a session of the following conference: 20th International Conference on Computational Linguistics

Meeting Description:

Recently, there has been a surge of interest in the study of the languages of the Middle East, especially Arabic, Persian (Farsi), Pashto and Urdu. Computational applications for proper name identification, entity recognition, categorization, information retrieval, summarization, machine translation and other implementations are currently in high demand.

The goal of this workshop, being held as a session of COLING 2004, is to provide a forum for those involved in the development of NLP systems in Arabic script languages to exchange ideas, approaches and implementations of computational systems; to discuss the common challenges faced by all practitioners; and to assess the state of the art in the field.

The workshop is to be held at the University of Geneva, Switzerland, on August 28th, 2004.

Invited Speaker: Martin Kay (Stanford University)

Workshop website: http://members.cox.net/karinem/COLING2004

===================
WORKSHOP PROGRAM
===================

OPENING AND OVERVIEW

8:30-9:00    Computer Processing of Arabic Script-based Languages: Current State and Future Directions
             Ali Farghaly

SESSION 1: LEXICON AND CORPORA

9:00-9:30    Developing an Arabic Treebank: Methods, Guidelines, Procedures, and Tools
             Mohamed Maamouri and Ann Bies

9:30-10:00   Preliminary Lexical Framework for English-Arabic Semantic Resource Construction
             Anne R. Diekema

10:00-10:30  The Architecture of a Standard Arabic Lexical Database: Some Figures, Ratios, and Categories from the DIINAR.1 Source Program
             Ramzi Abbès, Joseph Dichy and Mohamed Hassoun

10:30-10:45  BREAK

SESSION 2: MORPHOLOGY

10:45-11:15  Systematic Verb Stem Generation for Arabic
             Jim Yaghi and Sane Yagi

11:15-11:45  Issues in Arabic Orthography and Morphology Analysis
             Tim Buckwalter

11:45-12:15  Finite-State Morphological Analysis of Persian
             Karine Megerdoomian

12:15-2:00   LUNCH & DEMO SESSIONS

DEMONSTRATIONS

             Urdu Localization Project
             Sarmad Hussain

             FarsiSum - A Persian Text Summarizer
             Martin Hassel and Nima Mazdak

             Stemming the Qur'an
             Naglaa Thabet

             Language Weaver Arabic->English MT
             Daniel Marcu, Alex Fraser, William Wong and Kevin Knight

INVITED SPEAKER

2:00-2:45    Arabic Script-Based Languages Deserve to be Studied Linguistically
             Martin Kay

SESSION 3: STATISTICAL APPROACHES

2:45-3:15    An Unsupervised Approach for Bootstrapping Arabic Sense Tagging
             Mona T. Diab

3:15-3:45    Automatic Arabic Document Categorization Based on the Naive Bayes Algorithm
             Mohamed El Kourdi, Amine Bensaid and Tajje-eddine Rachidi

3:45-4:00    BREAK

SESSION 4: SPEECH PROCESSING

4:00-4:30    A Transcription Scheme for Languages Employing the Arabic Script Motivated by Speech Processing Applications
             Shadi Ganjavi, Panayiotis G. Georgiou and Shrikanth Narayanan

4:30-5:00    Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition
             Dimitra Vergyri and Katrin Kirchhoff

5:00-5:30    Letter-to-Sound Conversion for Urdu Text-to-Speech System
             Sarmad Hussain

DISCUSSION AND CLOSING

5:30-6:00    Ali Farghaly and Karine Megerdoomian

Accepted papers and formal demonstrations will be published in a proceedings volume, which will be made available at the workshop.

=======================
WORKSHOP REGISTRATION
=======================

For the workshop to take place, the COLING 2004 organizers require at least 20 participants to register for it. Speakers and participants are therefore asked to register as soon as possible via the official COLING 2004 website: http://www.issco.unige.ch/coling2004/

Workshop fees (in Swiss Francs):

* Student early     CHF 90
* Student late      CHF 120
* Student on-site   CHF 150
* Regular early     CHF 120
* Regular late      CHF 150
* Regular on-site   CHF 180

======================
ORGANIZING COMMITTEE
======================

Ali Farghaly (SYSTRAN Software, Inc.)
Karine Megerdoomian (Inxight Software and University of California, San Diego)

===================
PROGRAM COMMITTEE
===================

Jan W. Amtrup (Bowne Global Solutions)
Tim Buckwalter (Linguistic Data Consortium)
Miriam Butt (Konstanz University, Germany)
Violetta Cavalli-Sforza (Carnegie Mellon University)
Joseph Dichy (Lyon University)
Abdelkadir Fassi Fehri (Mohammed V University-Souissi Rabat, Morocco)
Andrew Freeman (University of Washington)
Nizar Habash (University of Maryland, College Park)
Masayo Iida (Inxight Software, Inc.)
Simin Karimi (University of Arizona)
Martin Kay (Stanford University)
Kevin Knight (USC/Information Sciences Institute)
Farhad Oroumchian (University of Wollongong in Dubai)
Ahmed Rafea (The American University in Cairo)
Jean Senellart (SYSTRAN Software)
Bonnie Glover Stalls (University of Southern California)
Rémi Zajac (SYSTRAN Software)