


Academic Paper

Title: Lörres, Möppes, and the Swiss. (Re)Discovering regional patterns in anonymous social media data
Author: Christoph Purschke
Author: Dirk Hovy
Linguistic Field: Sociolinguistics
Subject Language: German
Abstract: We study regional similarities and differences in language use on an anonymous mobile chat application in the German-speaking area. We use a neural network on 2.3 million online conversations to automatically learn representations of words and cities. These linguistic-use-based representations capture regional distinctions in a high-dimensional vector space that can be clustered and visualized to discover patterns in the data. We find that the resulting regional patterns are closely linked to the traditional division of German dialects, even though most of the conversations are written in standard German. The resulting maps correspond to traditional dialect divisions and language-external spatial structures, with a few notable exceptions that can be explained through external factors.

Our method also facilitates two qualitative analyses, allowing us to discover geographically-pertinent words for various regional levels, as well as creating regional group-specific style profiles based on various linguistic resources. The results of our study strongly suggest the existence of region-specific patterns of language use (“digital regiolects”) representing distinctive strategies of linguistic stylization in relation to linguistic resources and topics. As a methodological contribution, we show how linguistic theory can drive the application and direction of neural network-based representation learning, and how their judicious application provides the basis for qualitative analysis of large-scale data collections.


This article appears in Journal of Linguistic Geography Vol. 7, Issue 2, which you can read on Cambridge's site.

