LINGUIST List 16.1491

Tue May 10 2005

Disc: Re: A Challenge to the Minimalist Community

Editor for this issue: Michael Appleby <michaellinguistlist.org>


To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.html.
Directory
        1.    Lance Nathan, Re: A Challenge to the Minimalist Community
        2.    Peter Svenonius, Re: A Challenge to the Minimalist Community
        3.    John Frampton, Re: A Challenge to the Minimalist Community


Message 1: Re: A Challenge to the Minimalist Community
Date: 10-May-2005
From: Lance Nathan <tahnanMIT.EDU>
Subject: Re: A Challenge to the Minimalist Community


On Mon, 9 May 2005, Marc Hamann wrote:

> A number of people have responded to Sproat and Lappin's
> challenge with the objection that P & P has as its object of study the
> human language faculty in its full generality, and that it is therefore
> unreasonable to expect it to do well or for it to be relevant on a
> small, particular subset of language phenomena as represented by a
> corpus.
>
> Unfortunately this is a rather peculiar notion of what it means to be
> "more general" or "most general". Generality of a theory is usually a
> claim that you can handle or explain ALL possible instances of the
> phenomenon you cover. It is surely MUCH easier to show that you
> can handle one such instance, especially the somewhat restricted
> instance represented by a limited corpus.

I must admit that, having taken some courses in computational
linguistics as an undergraduate and found myself not particularly
adept at them, I don't have any strong feelings about
Sproat and Lappin's challenge. (Certainly I know that *I* won't be any
help in meeting it.)

But it seems to me that Hamann's objection to the objection is not
entirely fair. Certainly if a theory has *achieved* full generality, then it
can explain the instances of a limited corpus. But if a theory's *goal* is
full generality, that doesn't mean it's already prepared to handle any
given limited instance.

To oversimplify: suppose that there are 100 facts (numbered 1 to
100), such that any complete theory of language explains all 100
facts; suppose that a limited corpus covers the facts from 1 to 10. A
statistical parser might cover 90% of this limited corpus; that's fairly
successful in terms of covering the corpus, perhaps, though it only
really covers 9% of language as a whole. Meanwhile, the theory of
Principles and Parameters, as it currently stands, can only explain
every fourth point of its domain (i.e. facts 4, 8, 12, ..., 96, 100).

P&P (Principles and Parameters) covers 25% of language, and in that
sense is more successful than the statistical parser's 9%. But a
parser built on P&P will cover a mere 20% of the limited corpus,
making it seem far less successful than the statistical parser's 90%.
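
The arithmetic of this toy example can be checked with a short sketch.
The set names and the coverage() helper are made up for illustration,
and the choice of *which* nine corpus facts the statistical parser
handles is an arbitrary assumption; only the counts matter.

```python
# Toy model of the analogy: 100 facts, of which the corpus covers facts 1-10.
ALL_FACTS = set(range(1, 101))
CORPUS = set(range(1, 11))

# Assumption: the statistical parser handles facts 1-9, i.e. 9 of the 10
# corpus facts and nothing outside the corpus. Which nine is arbitrary.
STATISTICAL = set(range(1, 10))

# P&P as described: every fourth fact (4, 8, 12, ..., 100).
PP = set(range(4, 101, 4))


def coverage(explained, domain):
    """Fraction of the facts in `domain` that `explained` accounts for."""
    return len(explained & domain) / len(domain)


for name, explained in [("statistical parser", STATISTICAL), ("P&P", PP)]:
    print(f"{name}: corpus {coverage(explained, CORPUS):.0%}, "
          f"language {coverage(explained, ALL_FACTS):.0%}")
```

Under these assumptions the statistical parser scores 90% on the corpus
and 9% on language as a whole, while P&P scores 20% on the corpus (facts
4 and 8) and 25% on language as a whole.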

This is, of course, a wild simplification. Language doesn't break down
into a hundred simple independent points, nor does a corpus contain
a simple 10% of the range of linguistic facts, nor does...and so forth.
Nevertheless, I hope that as an analogy it might explain the flaw I see
in Hamann's reasoning: a theory dedicated to explaining "the human
language faculty in its full generality" is not necessarily well-suited to
explaining "a small, particular subset of language phenomena"; and
more to the point, that failure is not a failure of the program.

--Lance Nathan


Linguistic Field(s): Computational Linguistics; Discipline of Linguistics; Syntax

Message 2: Re: A Challenge to the Minimalist Community
Date: 10-May-2005
From: Peter Svenonius <peter.svenoniushum.uit.no>
Subject: Re: A Challenge to the Minimalist Community


As a theoretical linguist, I remain unconvinced from the discussion so
far that building a parser of the kind proposed by Sproat & Lappin
(LINGUIST 16-1156) would be as important as they suggest. The
proposal, if I understand it correctly, is to get a computer to match a
corpus of e.g. newspaper texts to a set of "hand-constructed" trees
for the sentences in that text.

The allowable training procedure consists in feeding the machine
pairs of sentences and trees, I gather. Unless the trees contain more
information than is usual, it is not clear that this procedure resembles
what a child does when learning a language. Recent acquisitional
work stresses the importance of child-directed speech in the
acquisition process, and the importance of supporting context. An
important clue to the difference between "wipe" and "clean" (to take a
well-studied example) is the contexts in which they're used. The
meaning difference, inferrable from the contexts of use, has subtle
syntactic effects that might or might not turn up in strings in a given
corpus. But such contextual evidence, abundant to the learner, is
necessarily ignored in the proposed scenario, because the trees
don't indicate what kind of thematic relation an object has to the event
it participates in. Certain aspects of intonation also turn out to be
extremely important in the acquisition process, but intonation is barely
indicated at all in written texts, and is underdetermined in standard
trees.

So the proposal seems to be to build a machine that works like
another machine (i.e. the kind that Sproat & Lappin have in mind), not
to build a machine that works like a human. There is a good chance
that such an exercise would simply fail to advance our understanding
of the human language faculty, the way the program Eliza fails to
advance our understanding of human intelligence.

I suppose that to make a human-like learning machine, I would first
want to build a corpus that resembled the actual input to which a child
typically attends, with intonation and supporting context. The input
would include such information as whether a discourse referent was
the same as one previously referred to or not, and whether a
discourse referent appeared to be proactive or simply passive in its
participation in a given event. These might be important clues for a
child deciding whether something is a definite article or whether
something is the syntactic subject (and these two matters might be
interrelated).

Then I would use that corpus as the training ground for testing my
simulacrum, because P&P (Principles and Parameters) theory is not
trying to describe a Language Acquisition Device that can learn a
language from the Wall Street Journal (with or without labeled
brackets), but a Language Acquisition Device that can learn a
language from a learning environment like the one described in the
preceding paragraph.

If my concerns are well-founded, then building a parser of the kind
described by Sproat & Lappin would not even be a milestone on the
road to a workable model of language; it would be a detour.

Peter Svenonius
CASTL (Center for Advanced Study in Theoretical Linguistics)
University of Tromsoe, Norway


Linguistic Field(s): Language Acquisition; Syntax

Message 3: Re: A Challenge to the Minimalist Community
Date: 10-May-2005
From: John Frampton <j.framptonneu.edu>
Subject: Re: A Challenge to the Minimalist Community


Suppose I grant that "your parser can beat my parser". What should
we conclude? What we are interested in is theories of language, not
parsers. The suitability of a theory of language to serve as the basis
for a parser is one factor that weighs in the balance in judging
theories. So what I want to know is whether "your theory of syntax
can beat my theory of syntax".

What alternatives to Minimalist Syntax are on the table? What
theories is the discussion about? Unless the discussion is about
comparison of theories of syntax, it is irrelevant.


Linguistic Field(s): Computational Linguistics; Discipline of Linguistics; Syntax


