Discussion Details
| Title: | Re: A Challenge to the Minimalist Community |
| Submitter: | Lance Nathan |
| Description: | On Mon, 9 May 2005, Marc Hamann wrote:
A number of people have responded to Sproat and Lappin's challenge with the objection that P & P has as its object of study the human language faculty in its full generality, and that it is therefore unreasonable to expect it to do well or for it to be relevant on a small, particular subset of language phenomena as represented by a corpus. Unfortunately this is a rather peculiar notion of what it means to be "more general" or "most general". Generality of a theory is usually a claim that you can handle or explain ALL possible instances of the phenomenon you cover. It is surely MUCH easier to show that you can handle one such instance, especially the somewhat restricted instance represented by a limited corpus.

I must admit that, having taken some courses in computational linguistics as an undergraduate and found myself not particularly adept at the subject, I don't have any strong feelings on the subject of Sproat and Lappin's challenge. (Certainly I know that *I* won't be any help in meeting it.) But it seems to me that Hamann's objection to the objection is not entirely fair.

Certainly if a theory has *achieved* full generality, then it can explain the instances of a limited corpus. But if a theory's *goal* is full generality, that doesn't mean it's already prepared to handle any given limited instance.

To oversimplify: suppose that there are 100 facts (numbered 1 to 100), such that any complete theory of language explains all 100 facts; suppose that a limited corpus covers the facts from 1 to 10. A statistical parser might cover 90% of this limited corpus; that's fairly successful in terms of covering the corpus, perhaps, though it only really covers 9% of language as a whole. Meanwhile, the theory of Principles and Parameters, as it currently stands, can only explain every fourth point of its domain (i.e. facts 4, 8, 12, ..., 96, 100). P&P (Principles and Parameters) covers 25% of language, and in that sense is more successful than the statistical parser's 9%. But a parser built on P&P will cover a mere 20% of the limited corpus, making it seem far less successful than the statistical parser's 90%.

This is, of course, a wild simplification. Language doesn't break down into a hundred simple independent points, nor does a corpus contain a simple 10% of the range of linguistic facts, nor does...and so forth. Nevertheless, I hope that as an analogy it might explain the flaw I see in Hamann's reasoning: a theory dedicated to explaining "the human language faculty in its full generality" is not necessarily well-suited to explaining "a small, particular subset of language phenomena"; and more to the point, that failure is not a failure of the program.

--Lance Nathan |
| Date Posted: | 10-May-2005 |
| Linguistic Field(s): | Computational Linguistics; Syntax; Discipline of Linguistics |
| LL Issue: | 16.1491 |
| Posted: | 10-May-2005 |
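
The percentages in the 100-fact analogy in the post can be checked with a few lines of Python. This is only a sketch of the toy model as stated there: the set names and the `coverage` helper are hypothetical, and the choice of *which* nine corpus facts the statistical parser handles is an arbitrary assumption (the post only says it covers 90% of the corpus).

```python
# Toy model from the analogy: "facts" are just the integers 1..100.
language = set(range(1, 101))   # all 100 facts
corpus = set(range(1, 11))      # the limited corpus: facts 1..10

# Assumption: the statistical parser handles nine of the ten corpus facts
# (here facts 1..9, chosen arbitrarily) and nothing outside the corpus.
statistical = set(range(1, 10))

# P&P, in the analogy, explains every fourth fact: 4, 8, 12, ..., 100.
pp = set(range(4, 101, 4))


def coverage(explained, domain):
    """Percentage of the facts in `domain` that fall in `explained`."""
    return 100 * len(explained & domain) / len(domain)


for name, explained in [("statistical parser", statistical), ("P&P", pp)]:
    print(f"{name}: {coverage(explained, corpus):.0f}% of corpus, "
          f"{coverage(explained, language):.0f}% of language")
```

Under these assumptions the ranking of the two approaches flips depending on whether coverage is measured against the corpus or against the full set of facts, which is the point of the analogy.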

