LINGUIST List 24.25

Mon Jan 07 2013

Calls: Typology, Computational Linguistics/Germany

Editor for this issue: Alison Zaharee <alison@linguistlist.org>

From: Sebastian Nordhoff <sebastian.nordhoff@glottotopia.de>
Subject: Linked Data in Typology ALT10
Full Title: Linked Data in Typology ALT10
Short Title: LDT

Date: 15-Aug-2013 - 18-Aug-2013
Location: Leipzig, Germany
Contact Person: Sebastian Nordhoff
Web Site: http://www.eva.mpg.de/lingua/conference/2013_ALT10/files/theme_sessions.html

Linguistic Field(s): Computational Linguistics; Typology

Call Deadline: 15-Jan-2013

Meeting Description:

Typology lives on data. Typologists produce, curate, extract, aggregate, and analyze data on a daily basis. One major issue is the interoperability of digital data thus gathered. This workshop will deal with the production, publication, and interlinking of typological data according to Semantic Web principles (Linked Open Data).

Several attempts at standardizing typological data have been made, e.g. LDS (Comrie & Smith 1977) and GOLD (Farrar & Langendoen 2003). These top-down approaches have had some success, but large-scale adoption is still lacking. A bottom-up approach, such as that employed by TDS (http://tds2.dans.knaw.nl/) and ISO-CAT (http://www.isocat.org/), could be more promising, as it takes into account the often strong feelings linguists have about data categories.

Numerous projects around the world gather heterogeneous typological data, but data representation is by and large project-specific and not guided by general principles. This often results in serious problems over time, including issues with regard to persistence, provenance, interoperability, and accessibility.

These problems are well known in other data-heavy subdisciplines, e.g. lexicography and corpus linguistics. The lemon project (McCrae et al. 2012) tackles these issues for lexicography; OLiA does the same for corpus linguistics (Chiarcos 2012). In this workshop, we want to explore to what extent the solutions developed in the other subdisciplines can be applied to typology, building upon more general concepts of interlinking heterogeneous data sets in the context of Linked Open Data (Berners-Lee 2006, Heath & Bizer 2011).

The working group on Open Data in Linguistics of the Open Knowledge Foundation has recently started working on interlinking data from various subdisciplines (Chiarcos et al. 2012a). The insights and experiences gained there can fruitfully be applied to typology, as the integration of WALS, WOLD, ASJP, Glottolog, and IDS into the Linguistic Linked Open Data Cloud shows (Nordhoff 2012, Hellmann et al. forthcoming). Chiarcos et al. (2012b) show how such data can then be cross-queried across knowledge bases to gain new insights and test hypotheses.

The major advantages of the Linked Open Data approach advocated in Chiarcos et al. (2012a) are the potential for cross-querying data and the possibility of a federated approach to data production (crowdsourcing).

The aim of this workshop is to bring together typologists who create or curate large data sets and practitioners of Linked Open Data, to leverage the potential of creating a linked data cloud for linguistic typology.

Call for Papers:

We welcome presentations about novel techniques of publishing data on the web, about interlinking and cross-querying databases, and about federating data production.


Send your abstract as an email attachment to: ALT10 AT eva.mpg.de.

Subject header: (your name) ALT 10 abstract

Include these things in the body of the email:

- Authors’ names
- Abstract title
- Contact information: email, phone, fax

Note: One individual may be involved in a maximum of two abstracts (maximum of one as sole author), regardless of category (oral, poster, theme-session talk).

Maximum length: 500 words or 1 single-spaced page

Please put this information at the top of your abstract:

- Abstract title
- Abstract category (oral, poster, oral/poster)
- Theme session

Format: If at all possible, please send your abstract as a pdf.

Name: Give your pdf a filename similar to the subject header.

Anonymity: Abstracts must be anonymous: do not put your name or other identifying information on the abstract. Also, please anonymize your pdf by removing identifying information.

Further Information:



Berners-Lee, Tim. 2006. Design Issues: Linked Data. July 2006. http://www.w3.org/DesignIssues/LinkedData.html

Chiarcos, Christian. 2012. Ontologies of Linguistic Annotation: Survey and Perspectives. LREC 2012, Istanbul.

Chiarcos, Christian, Nordhoff, Sebastian & Hellmann, Sebastian (eds.). 2012a. Linked Data in Linguistics: Representing and Connecting Language Data and Language Metadata. Heidelberg: Springer.

Chiarcos, Christian, Hellmann, Sebastian & Nordhoff, Sebastian. 2012b. Linking Linguistic Resources: Examples from the Open Linguistics Working Group. In Chiarcos et al. (eds.) 2012a.

Comrie, Bernard & Smith, Norval. 1977. The Lingua Descriptive Studies Questionnaire. Lingua 41. 1-74.

Farrar, Scott & Langendoen, Terry. 2003. A linguistic ontology for the semantic web. GLOT International 7. 200-203.

Heath, Tom & Bizer, Chris. 2011. Linked Data: Evolving the Web into a Global Data Space. San Rafael: Morgan & Claypool.

Hellmann, Sebastian, Moran, Steven, Brümmer, Martin & McCrae, John (eds.). Forthcoming. Multilingual Linked Open Data. Special Issue of the Semantic Web Journal.

McCrae, John, Montiel-Ponsoda, Elena & Cimiano, Philipp. 2012. Integrating WordNet and Wiktionary with lemon. In Chiarcos et al. (eds.) 2012a.

Nordhoff, Sebastian. 2012. Linked Data for Linguistic Diversity Research: Glottolog/Langdoc and ASJP Online. In Chiarcos et al. (eds.) 2012a.

Page Updated: 07-Jan-2013

