
Discussion Details

Title: Re: 16.1288, Disc: Re: A Challenge to the Minimalist Communi
Submitter: Stefan Müller
Description: Carson Schutze:
  So, I would like to suggest a revised version of the challenge that
  incorporates a second corpus consisting of ungrammatical sentences that
  are to be identified as such. (Earlier P&P parsers such as Fong's were
  designed to do this, but it's not obvious that this ability will easily
  scale up with broader coverage, so I don't think this is a sucker's bet.)
  Furthermore, since the computationalists got to choose the corpus of good
  sentences, it would seem only fair that the theoreticians get to choose the
  corpus of bad sentences :-)

This is a very important point. Negative data has been collected
and is used to evaluate deep linguistic processing systems.

A nice software package for evaluating systems and working with test
suite databases can be found at:

http://www.delph-in.net/itsdb/

Test suites for German, English, French, Spanish, and other
languages are also available there.

You may find test suites for German at:

http://www.cl.uni-bremen.de/Software/TS/

These test suites contain (normalized) examples from the descriptive
literature and from the P&P, HPSG, and other theoretical literature. With
the [incr TSDB()] software it is possible to select the sentences that are
relevant for a certain phenomenon. Sentences are cross-classified
according to the phenomena they are relevant for.
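
To make the cross-classification concrete, here is a minimal sketch in Python. It is not the [incr TSDB()] interface or its database format; the class `SuiteItem`, the function `select`, the phenomenon labels, and the four German sentences are all made up for illustration and are not taken from the actual test suites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteItem:
    """One test suite entry (hypothetical layout, not the real database schema)."""
    text: str
    grammatical: bool       # False marks a negative (ungrammatical) example
    phenomena: frozenset    # every phenomenon the sentence is relevant for


# Illustrative items only; labels and sentences are assumptions.
ITEMS = [
    SuiteItem("Der Mann schläft.", True, frozenset({"agreement"})),
    SuiteItem("Der Mann schlafen.", False, frozenset({"agreement"})),
    SuiteItem("Das Buch hat der Mann gelesen.", True,
              frozenset({"fronting", "agreement"})),
    SuiteItem("Der Mann das Buch gelesen hat.", False,
              frozenset({"verb position"})),
]


def select(items, phenomenon, grammatical=None):
    """Return the items tagged with `phenomenon`.

    If `grammatical` is True or False, keep only items with that status;
    if it is None, keep both positive and negative examples.
    """
    return [
        item for item in items
        if phenomenon in item.phenomena
        and (grammatical is None or item.grammatical == grammatical)
    ]


if __name__ == "__main__":
    for item in select(ITEMS, "agreement", grammatical=False):
        print(item.text)
```

Because each item carries a set of labels rather than a single one, the same sentence shows up in the selection for every phenomenon it bears on.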

The idea is to develop these collections further into a generally
accepted benchmark for linguistic theories in general and for deep
linguistic processing in particular. Of course, the negative sentences
can also be used to check what statistical parsers have to say about
them in comparison to the well-formed examples.
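
As a rough sketch of how such a benchmark could be scored, the Python below computes two numbers from a labelled collection: the share of well-formed sentences a system accepts and the share of ill-formed sentences it wrongly accepts. The names `evaluate`, `LABELLED`, and `toy_accepts` are hypothetical, the sentences are illustrative rather than taken from the test suites, and `toy_accepts` is only a stand-in for a real parser.

```python
def evaluate(labelled, accepts):
    """Score a system on positive and negative data.

    `labelled` is a list of (sentence, is_grammatical) pairs.
    `accepts` is any callable returning True if the system assigns the
    sentence an analysis. Returns (coverage, overgeneration), where
    coverage is the fraction of grammatical sentences accepted and
    overgeneration is the fraction of ungrammatical sentences accepted.
    """
    good = [s for s, ok in labelled if ok]
    bad = [s for s, ok in labelled if not ok]
    coverage = sum(accepts(s) for s in good) / len(good) if good else 0.0
    overgeneration = sum(accepts(s) for s in bad) / len(bad) if bad else 0.0
    return coverage, overgeneration


# Illustrative data only (assumption, not from the actual collections).
LABELLED = [
    ("Der Mann schläft.", True),
    ("Das Buch hat der Mann gelesen.", True),
    ("Der Mann schlafen.", False),
    ("Der Mann das Buch gelesen hat.", False),
]

# Stand-in for a parser: it "accepts" exactly the sentences in this set.
_TOY_ACCEPTED = {"Der Mann schläft.", "Der Mann schlafen."}


def toy_accepts(sentence):
    return sentence in _TOY_ACCEPTED


if __name__ == "__main__":
    coverage, overgeneration = evaluate(LABELLED, toy_accepts)
    print(f"coverage={coverage:.2f} overgeneration={overgeneration:.2f}")
```

For a statistical parser that always returns some analysis, `accepts` could be replaced by a threshold on the parser's score, so that scores for the negative sentences can be compared with those for the well-formed ones.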

So if somebody has a look at the German collection and wants to
contribute, please send me the relevant examples and pointers to the
publications in which the examples are discussed.

Best wishes

Stefan Müller
Universität Bremen/Fachbereich 10

http://www.cl.uni-bremen.de/~stefan/

http://www.cl.uni-bremen.de/~stefan/Babel/Interaktiv/
Date Posted: 29-Apr-2005
Linguistic Field(s): Computational Linguistics; Linguistic Theories; Discipline of Linguistics
Language Specialty: German
LL Issue: 16.1364
Posted: 29-Apr-2005
