Re: A Challenge to the Minimalist Community

On Mon, 9 May 2005, Marc Hamann wrote:
A number of people have responded to Sproat and Lappin's
challenge with the objection that P & P has as its object of study the
human language faculty in its full generality, and that it is therefore
unreasonable to expect it to do well or for it to be relevant on a
small, particular subset of language phenomena as represented by a corpus.
Unfortunately this is a rather peculiar notion of what it means to be
"more general" or "most general". Generality of a theory is usually a
claim that you can handle or explain ALL possible instances of the
phenomenon you cover. It is surely MUCH easier to show that you
can handle one such instance, especially the somewhat restricted
instance represented by a limited corpus.
I must admit that, having taken some courses in computational
linguistics as an undergraduate and found myself not particularly
adept at it, I don't have any strong feelings on the subject of
Sproat and Lappin's challenge. (Certainly I know that *I* won't be any
help in meeting it.)
But it seems to me that Hamann's objection to the objection is not
entirely fair. Certainly if a theory has *achieved* full generality, then it
can explain the instances of a limited corpus. But if a theory's *goal* is
full generality, that doesn't mean it's already prepared to handle any
given limited instance.
To oversimplify: suppose that there are 100 facts (numbered 1 to
100), such that any complete theory of language explains all 100
facts; suppose that a limited corpus covers the facts from 1 to 10. A
statistical parser might cover 90% of this limited corpus; that's fairly
successful in terms of covering the corpus, perhaps, though it only
really covers 9% of language as a whole. Meanwhile, the theory of
Principles and Parameters (P&P), as it currently stands, can only explain
every fourth fact in its domain (i.e. facts 4, 8, 12, ..., 96, 100). In this
scenario, P&P covers 25% of language, and in that
sense is more successful than the statistical parser's 9%. But a
parser built on P&P will cover a mere 20% of the limited corpus,
making it seem far less successful than the statistical parser's 90%.
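For concreteness, here is a minimal sketch in Python of that toy tally; the numbered facts and the coverage figures are just the made-up assumptions above, not measurements of any real parser or theory:

    # Toy model: 100 numbered "facts of language", a corpus covering
    # facts 1-10, a statistical parser handling 9 of those 10 facts,
    # and P&P handling every fourth fact. All figures are illustrative.
    language = set(range(1, 101))
    corpus = set(range(1, 11))
    stat_parser = set(range(1, 10))       # facts 1-9
    pp_theory = set(range(4, 101, 4))     # facts 4, 8, ..., 100

    def coverage(covered, domain):
        # Percentage of `domain` accounted for by `covered`.
        return 100 * len(covered & domain) / len(domain)

    print(coverage(stat_parser, corpus))    # 90.0 -- impressive on the corpus
    print(coverage(stat_parser, language))  #  9.0 -- modest on language as a whole
    print(coverage(pp_theory, language))    # 25.0 -- broader coverage of language
    print(coverage(pp_theory, corpus))      # 20.0 -- yet weak-looking on the corpus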
This is, of course, a wild simplification. Language doesn't break down
into a hundred simple independent points, nor does a corpus contain
a simple 10% of the range of linguistic facts, nor does...and so forth.
Nevertheless, I hope that as an analogy it might explain the flaw I see
in Hamann's reasoning: a theory dedicated to explaining "the human
language faculty in its full generality" is not necessarily well-suited to
explaining "a small, particular subset of language phenomena"; and
more to the point, that shortcoming is not, in itself, a failure of the research program.