

About OLAC

If you are a linguist, you are no doubt familiar with the difficulty of finding information relevant to your research. To a certain extent, the internet has made this easier, but you still have to spend time searching each archive or website individually. Furthermore, because the same thing can be described in several different ways (dictionary and lexicon; genitive and possessive; Lappish and Sami), you might never find what you are looking for.
To make linguistic data more easily accessible, the Open Language Archives Community (OLAC) is assembling an online database, similar to a huge library catalog. This catalog stores information on language resources, such as field notes, grammars, audio/video recordings, descriptive papers, and so on. The information is stored as metadata in XML, a structured format that the OLAC search engine can read.
OLAC is encouraging linguists everywhere to submit information about what they have; that is, to become a 'data provider'. Even if the resource itself is not available on the internet (a collection of cassettes, for example), people will still be able to find out what resources exist and where to find them.
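To give a feel for what "metadata in XML format" means, here is a minimal sketch in Python. The element names (`record`, `title`, `type`, `language`, `format`) and the sample values are made up for illustration; they are not taken from the official OLAC schema, which should be consulted before creating real records.

```python
import xml.etree.ElementTree as ET


def build_record(fields):
    """Build one XML metadata record from a dict of field name -> value.

    The field names are hypothetical placeholders, not the OLAC schema.
    """
    record = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(record, name)
        child.text = value
    return record


# Made-up description of a resource that is not itself online.
example = {
    "title": "Field notes on Sami",
    "type": "field notes",
    "language": "Sami",
    "format": "cassette",
}

record = build_record(example)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The point is only that each piece of information about a resource gets its own labeled element, so a search engine can tell a language name from a title without guessing.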
Become a data provider
So that as many people as possible can participate, there are three different ways to provide data.
  • Talkbank's Metamaker: This is the simplest way of becoming a data provider, as no programming knowledge is required. You enter the information using an online form, and this information is automatically converted into a format that the OLAC search engine can read. The Metamaker is suitable for individuals with small projects.
  • The Virtual Data Provider (Vida): If you know XML, Vida is a quicker way of providing data. If you can create an XML document yourself and put it on a publicly accessible website, Vida can take it and submit it to the OLAC search engine. Vida is suitable for larger projects that have no pre-existing catalog database.
  • Conventional Approach for large archives: If you already have a catalog and programming skills, this is the best method. Here, you implement a software interface to your database, which creates the XML metadata and sends it to the OLAC search engine.
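The conventional approach can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual OLAC interface: the `resources` table, its columns, the XML element names, and the sample rows are all hypothetical, and the step of sending the result to the OLAC search engine is left out entirely.

```python
import sqlite3
import xml.etree.ElementTree as ET


def catalog_to_xml(conn):
    """Turn every row of a hypothetical 'resources' table into an XML record."""
    root = ET.Element("records")
    cursor = conn.execute(
        "SELECT title, type, location FROM resources ORDER BY title"
    )
    # Use the column names as element names (an illustrative choice only).
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


# An in-memory stand-in for an archive's existing catalog database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (title TEXT, type TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?)",
    [
        ("Example grammar", "grammar", "university library"),
        ("Example recordings", "audio recording", "department archive"),
    ],
)

catalog_xml = catalog_to_xml(conn)
print(catalog_xml)
```

Because the XML is generated from the catalog each time it is requested, the archive keeps maintaining a single database rather than a separate set of hand-written metadata files.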
More ways to submit data are being developed, such as one that can generate the data from a spreadsheet.
Learn more about OLAC
  • Frequently Asked Questions about OLAC
  • How to join an OLAC discussion list
The OLAC initiative could bring great benefits to the academic linguistics community. But to realize that potential, the organizers will need your continuing advice, participation, and feedback.
If you would like to help with the OLAC enterprise, please let us know!
Thank you in advance for your help!