LINGUIST List 23.819

Sat Feb 18 2012

Calls: Discipline of Linguistics/Turkey

Editor for this issue: Alison Zaharee

Date: 17-Feb-2012
From: Thorsten Trippel
Subject: Workshop on Describing Language Resources with Metadata

Full Title: Workshop on Describing Language Resources with Metadata
Date: 22-May-2012 - 22-May-2012
Location: Istanbul, Turkey
Contact Person: Thorsten Trippel
Web Site:
Linguistic Field(s): Discipline of Linguistics

Call Deadline: 24-Feb-2012

Meeting Description:

Workshop on Describing Language Resources with Metadata: Towards Flexibility and Interoperability in the Documentation of Language Resources
To be held in conjunction with the 8th International Language Resources and Evaluation Conference (LREC 2012)
22 May 2012
Lütfi Kirdar Istanbul Exhibition and Congress Centre, Istanbul, Turkey

The description of Language Resources (LRs) continues to be a crucial point in the life cycle of LRs and, more particularly, in their sustainable exchange. This has been the case for a number of existing repositories and LR distribution centres (ELRA, GSK, LDC, OLAC, TST-Centrale, BAS, among others), which house LR catalogues that follow some proprietary metadata schema. A number of projects and initiatives have also focused in the past few years on the sharing of LRs (ENABLER, CLARIN, FLaReNet, PANACEA, META-SHARE), for example for Language Technology (LT).

Based on these initiatives, a consensus is emerging on a number of requirements for standardized metadata:

1. There should be a common publication channel for the LR descriptions in the world.
2. This channel allows users to carry out easy and efficient LR data discovery and possible subsequent retrieval of LRs.
3. Expert knowledge is required to create the data model for the metadata description.
4. Subject matter experts (both researchers and LR/LT providers and developers) are required to provide the content for the data model.
5. The data model needs to be clear, expressive, flexible, customizable and interoperable.
6. Metadata have to provide for different user groups, ranging from providers to consumers (both individuals and organisations). This applies both to the information contained in the metadata and the supporting tool infrastructure for creating, maintaining, distributing, harvesting and searching the metadata.

Currently, several initiatives focus on metadata. Work done within initiatives like ENABLER and CLARIN gave rise to the Component MetaData Infrastructure (CMDI, ISO TC 37 SC 4 work item for ISO 24622), which allows standard data categories (for example from ISO 12620) to be combined into components, which are in turn combined into metadata profiles. Early versions of this model have been operational in repositories such as ELRA's, which complied with the work done within INTERA. FLaReNet, as the result of a permanent and cyclical consultation, has issued a set of main recommendations in which a global infrastructure of uniform and interoperable metadata sets appears among the Top Priorities for the field of LRs. For use within HLT, META-SHARE provides a fully-fledged schema for the description of LRs, in the framework of the component model, covering all the current resource types and media types in use, in all the stages of a resource's life cycle. Our aim is to learn from one another's experiences and plans in this area.

Making resources available to others and putting them to a second use in other projects has never been more widely accepted as a sensible and efficient way to avoid wasting effort and resources. However, when it comes to the details, a vast number of problems remain. This workshop will be a forum to address issues and challenges in the concrete work with metadata for LRs, not restricted to a single initiative for archiving LRs.

The current state of the art for metadata provision allows for a very flexible approach, catering for the needs of different archives and communities and referring to common data category registries that describe the meaning of a data category, at least to authors of metadata. Component models for metadata provision are used, for example, by CLARIN and META-SHARE, but there is also increased flexibility in other metadata schemas such as Dublin Core, which is usually not seen as appropriate for meaningful description of language resources.

Call for Papers:

Extended deadline for submission: 24 February 2012

Topics of interest are:

1. Infrastructures for creating components and profiles for metadata
2. Editing and creating metadata
3. Porting legacy metadata
4. Metadata as a resource
5. Maintenance of metadata
6. Classification of language resources
7. Providing metadata concepts
8. Creating components and profiles
9. Services harvesting and interpreting metadata
10. Experience from the large LR data center catalogues: LDC, ELRA, BAS, and how to interoperate with them
11. Controlled vocabularies, terminology and metadata description
12. Formal models for metadata representation and standardized models of serialisation
13. Customization and reuse of metadata schemas
14. Plans or experiences with emerging metadata infrastructures as for example from CLARIN & META-SHARE
15. Experiences with the Component based metadata infrastructures
16. Integration and conversion of multiple repositories: experiences from META-SHARE, CESAR, METANET4U and META-NORD, etc.
17. Standardization issues for metadata

We invite submissions of full papers and system demonstrations that address these questions and other related issues relevant to the workshop.

Workshop Programme and Audience Addressed:

This full-day workshop aims to bring together technology-oriented working groups on metadata modeling or schema creation with researchers and producers who create metadata in the course of their work. Those interested in using metadata in their projects should gain insights and come away with a clear idea of how to either describe their LRs or convert their schema. Those who have recently developed a model can share their experience, and those who have specific concerns about the interoperability of metadata schemas as developed by the various initiatives can open the discussion in search of joint solutions.

Tools and tool infrastructures should also be part of the discussion, given that the initiatives also provide editors, mappings, search interfaces, and component and profile registries.


Authors should use the START system accessible from and the LREC author's kit for submitting a two-column article of 4 to 8 pages.

For further queries, please contact Victoria Arranz at

Programme Committee:

Helen Aristar-Dry (Eastern Michigan University, USA)
Núria Bel (UPF, Barcelona, Spain)
Antonio Branco (University of Lisbon, Portugal)
Lars Borin (Språkbanken, Sweden)
Khalid Choukri (ELDA/ELRA, Paris, France)
Thierry Declerck (DFKI, Germany)
Matej Durco (Austrian Academy of Sciences, Austria)
Gil Francopoulo (CNRS-LIMSI-IMMI + TAGMATICA, Paris, France)
Francesca Frontini (CNR-ILC, Pisa, Italy)
Erhard Hinrichs (Universität Tübingen, Germany)
Penny Labropoulou (ILSP-Athena, Athens, Greece)
Valérie Mapelli (ELDA/ELRA, Paris, France)
Jan Odijk (Universiteit Utrecht, The Netherlands)
Elena Pierazzo (King's College, London, UK)
Laurent Romary (INRIA, France)
Mike Rosner (University of Malta, Malta)
Andreas Witt (IDS, Germany)
Peter Wittenburg (MPI, The Netherlands)
Tamás Varadi (Hungarian Academy of Sciences, Hungary)
Marta Villegas (UPF, Barcelona, Spain)
Sue Ellen Wright (Kent State University, USA)

Page Updated: 18-Feb-2012