
FYI: New Linguistic Corpus of Sina Weibo Messages


Author: Daan van Esch

Linguistic Field(s): Text/Corpus Linguistics

FYI Body: It is my pleasure to announce the Leiden Weibo Corpus (LWC),
an annotated linguistic corpus of 100 million words containing 5.1 million
messages from Sina Weibo, China’s premier Twitter-like microblogging
service.

The LWC is freely available online at http://lwc.daanvanesch.nl/. The
data were collected in January 2012, so the corpus contains many
linguistic phenomena that may not be found in older corpora, such as
suffixation with "-ing", an aspect marker borrowed from English.

Furthermore, Sina Weibo messages come with valuable metadata,
such as the user's gender and location. This information allows
the LWC to calculate how often words are used in different provinces
and cities, which is useful for research into lexical
variation across China.
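
As a rough illustration of this kind of regional count (a minimal sketch,
not the LWC's actual implementation), the Python snippet below computes how
often a word occurs per million tokens in each region. The function name,
the (region, tokens) input format, and the toy messages are all assumptions
made for the example; real messages would first need word segmentation.

```python
from collections import Counter


def relative_frequency_by_region(messages, word):
    """Return occurrences of `word` per million tokens for each region.

    `messages` is an iterable of (region, tokens) pairs; this input
    format is hypothetical and chosen only for illustration.
    """
    hits = Counter()    # occurrences of `word` per region
    totals = Counter()  # total token count per region
    for region, tokens in messages:
        totals[region] += len(tokens)
        hits[region] += tokens.count(word)
    # Normalise so that regions with different amounts of text are comparable.
    return {
        region: hits[region] * 1_000_000 / totals[region]
        for region in totals
        if totals[region]
    }


# Toy, hand-segmented messages (invented for the example).
sample = [
    ("Beijing", ["我", "在", "吃饭", "ing"]),
    ("Beijing", ["今天", "很", "冷"]),
    ("Guangdong", ["上班", "ing"]),
]

frequencies = relative_frequency_by_region(sample, "ing")
```

Normalising by the total number of tokens per region matters because some
provinces contribute far more messages than others; raw counts alone would
mostly reflect where the users are.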

Naturally, the LWC also supports searching for single words or
grammar patterns, such as "any verb followed by an aspectual particle
and then a noun". This feature may also be of interest to students and
teachers of Mandarin who are looking for example sentences.
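
To give an idea of what such a grammar pattern amounts to, here is a
minimal Python sketch that looks for "verb + aspectual particle + noun" in
a part-of-speech-tagged sentence. It is not how the LWC itself is
implemented: the function name is hypothetical, and the tag labels (VV, AS,
NN, PN, in the style of the Penn Chinese Treebank) are an assumption, since
the tagset used by the LWC is not specified here.

```python
def find_pattern(tagged, pattern):
    """Return every run of words whose tags match `pattern` in order.

    `tagged` is a list of (word, tag) pairs; `pattern` is a list of sets,
    one set of acceptable tags per position.
    """
    n = len(pattern)
    matches = []
    # Slide a window of the pattern's length over the sentence.
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if all(tag in allowed for (_, tag), allowed in zip(window, pattern)):
            matches.append([word for word, _ in window])
    return matches


# Assumed tags: VV = verb, AS = aspect particle, NN = noun, PN = pronoun.
sentence = [("我", "PN"), ("吃", "VV"), ("了", "AS"), ("饭", "NN")]
verb_aspect_noun = [{"VV"}, {"AS"}, {"NN"}]

results = find_pattern(sentence, verb_aspect_noun)
```

Using a set of tags at each position makes it easy to widen a slot, for
example to accept several verb tags, without changing the matching code.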

Please feel free to forward this announcement to anyone who might be
interested. Any feedback regarding the LWC would be greatly
appreciated; please send it to daanvanesch@gmail.com.

Best wishes,

Daan van Esch
Graduate Student in Chinese linguistics
Leiden University
