|Full Title:||Linked Data in Typology @ALT10|
|Start Date:||15-Aug-2013 - 18-Aug-2013|
|Meeting Description:||Typology lives on data. Typologists produce, curate, extract, aggregate, and analyze data on a daily basis. One major issue is the interoperability of digital data thus gathered. This workshop will deal with the production, publication, and interlinking of typological data according to Semantic Web principles (Linked Open Data).
Several attempts at standardizing typological data have been made, e.g. LDS (Comrie & Smith 1977) and GOLD (Farrar & Langendoen 2003). These top-down approaches have had some success, but large-scale adoption has yet to follow. A bottom-up approach, such as that employed by TDS (http://tds2.dans.knaw.nl/) and ISO-CAT (http://www.isocat.org/), could be more promising, as it takes into account the often strong feelings linguists have about data categories.
Numerous projects around the world gather heterogeneous typological data, but data representation is by and large project-specific and not guided by general principles. This often results in serious problems over time, including issues with regard to persistence, provenance, interoperability, and accessibility.
These problems are well known in other data-heavy subdisciplines, e.g. lexicography and corpus linguistics. The lemon project (McCrae et al. 2012) tackles these issues for lexicography, and OLiA (Chiarcos 2012) does the same for corpus linguistics. In this workshop, we want to explore to what extent the solutions developed in these other subdisciplines can be applied to typology, building upon more general concepts of interlinking heterogeneous data sets in the context of Linked Open Data (Berners-Lee 2006, Heath & Bizer 2009).
The working group on Open Data in Linguistics of the Open Knowledge Foundation has recently started working on interlinking data from various subdisciplines (Chiarcos et al. 2012a). The insights and experiences gained there can fruitfully be applied to typology, as the integration of WALS, WOLD, ASJP, Glottolog, and IDS into the Linguistic Linked Open Data Cloud shows (Nordhoff 2012, Hellmann et al. forthcoming). Chiarcos et al. (2012b) show how such data can then be cross-queried across knowledge bases to gain new insights and test hypotheses.
The major advantages of the Linked Open Data approach advocated in Chiarcos et al. (2012a) are the potential for cross-querying data and the possibility of a federated approach to data production (crowdsourcing).
The aim of this workshop is to bring together typologists who create or curate large data sets and practitioners of Linked Open Data, to leverage the potential of creating a linked data cloud for linguistic typology.
|Linguistic Subfield:||Computational Linguistics; Typology|
|This is a session of the following meeting:||Association for Linguistic Typology Biennial Conference|