Academic Paper

Title: Rewriting the orthography of SMS messages
Author: Francois Yvon
Institution: Université Paris Sud 11
Linguistic Field: Computational Linguistics; Writing Systems
Subject Language: French
Abstract: Electronic written texts used in computer-mediated interactions (emails, blogs, chats, and the like) contain significant deviations from the norm of the language. This paper presents the details of a system that aims to normalize the orthography of French SMS messages: after discussing the linguistic peculiarities of these messages and possible approaches to their automatic normalization, we present, compare, and evaluate various instantiations of a normalization device based on weighted finite-state transducers. These experiments show that, by using an intermediate phonemic representation and training, our system outperforms an alternative normalization system based on phrase-based statistical machine translation techniques.


This article appears in Natural Language Engineering Vol. 16, Issue 2.
