Academic Paper

Title: The automatic identification of lexical variation between language varieties
Author: Yves Peirsman
Institution: Université Catholique de Louvain
Author: Dirk Geeraerts
Institution: Université Catholique de Louvain
Author: Dirk Speelman
Institution: Université Catholique de Louvain
Linguistic Field: Lexicography; Semantics; Text/Corpus Linguistics
Subject Language: Dutch
Abstract: Languages are not uniform. Speakers of different language varieties use certain words differently – more or less frequently, or with different meanings. We argue that distributional semantics is the ideal framework for the investigation of such lexical variation. We address two research questions and present our analysis of the lexical variation between Belgian Dutch and Netherlandic Dutch. The first question involves a classic application of distributional models: the automatic retrieval of synonyms. We use corpora of two different language varieties to identify the Netherlandic Dutch synonyms for a set of typically Belgian words. Second, we address the problem of automatically identifying words that are typical of a given lect, either because of their high frequency or because of their divergent meaning. Overall, we show that distributional models are able to identify more lectal markers than traditional keyword methods. Distributional models also have a bias towards a different type of variation. In summary, our results demonstrate how distributional semantics can help research in variational linguistics, with possible future applications in lexicography or terminology extraction.
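
The first research question in the abstract, retrieving a Netherlandic Dutch synonym for a typically Belgian word, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`context_vectors`, `cosine`, `nearest_neighbours`), the window size, the raw co-occurrence counts, and the two toy corpora are all assumptions made for illustration. The idea shown is only the general one behind distributional models: a word is represented by the words it co-occurs with, and words from two corpora are compared through the context words they share.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def nearest_neighbours(word, source_vectors, target_vectors, n=3):
    """Rank words of the target variety by similarity to `word` from the source variety."""
    query = source_vectors[word]
    scored = [
        (cosine(query, vec), candidate)
        for candidate, vec in target_vectors.items()
        if candidate != word
    ]
    scored.sort(reverse=True)
    return [(candidate, round(score, 3)) for score, candidate in scored[:n]]


# Toy corpora invented for this sketch; they are not data from the article.
belgian = [
    "de camion rijdt op de snelweg",
    "de chauffeur laadt de camion",
]
netherlandic = [
    "de vrachtwagen rijdt op de snelweg",
    "de chauffeur laadt de vrachtwagen",
    "de fiets staat in de schuur",
]

belgian_vectors = context_vectors(belgian)
netherlandic_vectors = context_vectors(netherlandic)
print(nearest_neighbours("camion", belgian_vectors, netherlandic_vectors))
```

On these toy sentences the Belgian word "camion" and the Netherlandic word "vrachtwagen" occur in identical contexts, so "vrachtwagen" is ranked first. Real corpora would call for association weighting, frequency thresholds, and much larger samples, none of which are shown here.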


This article appears in Natural Language Engineering, Vol. 16, Issue 4.
