New: Working Group Responses
Request for the Markup Working Group:
LINGUIST
has received funding from the National Science Foundation to digitize
data from ten minority languages, as part of the general E-MELD
project. An essential component of this work is to mark
up the data in a form which allows the maximum amount of interchangeability
and interoperability between different linguistic servers, and this entails
that we come to a consensus on what best practice is in this area. What
we are soliciting from you, the members of the Markup Working Group, is
suggestions about the kind of markup our software should be designed to
handle (morphological markup, annotation for sound alignment, formatting),
and what the nature of that markup should be. Essentially, we would like
you to try out one of two variant markups which we
hope to have for you, and write a brief (1 page or less)
report which we can use as a springboard for discussion at the workshop.
The
two sets we ask you to use are:
- The
Dobes Markup (Dokumentation Bedrohter Sprachen): This project,
funded by the VolkswagenStiftung, aims to document endangered
languages.
- The LACITO Markup
(Langues et Civilisations
à Tradition Orale): The goal of the LACITO is to archive linguistic
documents associating transcription and recorded speech in a format
which guarantees their conservation and their availability for research,
and to disseminate the results.
We hope you will:
- Try them out:
Please take some of your data and simply try to mark it up using one
of these schemes. If you have no suitable resource to annotate, just
look at one or both of the sets and try to draw some conclusions. We
are interested in the answers to questions like:
- Are the tags
and attributes clearly named and described?
- Do they allow
you to target the right information, i.e., the aspects of your data
that you consider important and/or that other linguists might want
to search for?
- Do you think
the system(s) would be reasonably easy to use, given appropriate
software, e.g., a tag editor?
- Do you have
any other suggestions for markup schema? For example, what markup
are you currently using on your data, and how does it compare to
these?
- Write
a brief (1 page) report of your results:
If you email your report to Helen
Aristar-Dry (hdry@linguistlist.org) by June 14, we will put it on
the website prior to the workshop. Otherwise, we ask you to bring 12
copies of your report to the workshop. Your conclusions and suggestions
will be the springboard for the discussion in the Markup Working Group
sessions.
We have put together a
page providing some background on markup, in which we attempt briefly
to answer questions like:
- What is markup?
- What other standards exist?
- Do we all have to use
the same markup system?
For additional information,
consult the pages on linguistic
annotation at the Linguistic Data Consortium.