LINGUIST List 16.1505
Wed May 11 2005
Disc: Re: A Challenge to the Minimalist Community
Editor for this issue: Michael Appleby
<michael linguistlist.org>
To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.html.
Directory
1. Carson Schutze, Re: A Challenge to the Minimalist Community
2. Charles Yang, Re: A Challenge to the Minimalist Community
3. Anjum Saleemi, Re: A Challenge to the Minimalist Community
Message 1: Re: A Challenge to the Minimalist Community
Date: 10-May-2005
From: Carson Schutze <cschutze ucla.edu>
Subject: Re: A Challenge to the Minimalist Community
I see my attempt at a simple metaphor has gone awry. And we seem to be spiraling down into a general discussion of "How can proponents of theory X ever show that it is right/wrong/nonvacuous, etc." Over the years such discussions on the List have not been very fruitful, in my opinion. But I think the Sproat & Lappin challenge raised a much more specific point that risks getting lost.

[Sorry for consuming so much bandwidth. I don't foresee the need to say anything further. And let me acknowledge Richard Sproat for some off-list discussion that helped me to clarify some points; he is of course not responsible for anything I say below.]

Emily Bender drew the following conclusion from my metaphor:

> If I've understood the point of this analogy, it is that building a system
> which can take UG and some natural language input and produce a
> grammar which can be used to assign structures to (at least the
> grammatical) strings in some corpus of language is somehow outside
> the original point of what P&P was trying to do.

No, that was not the point. The point was that trying to compare the success of two systems (vehicles) at accomplishing a single task (going really fast) is pretty meaningless if you totally ignore all the other things the systems can or cannot do, e.g. support family transportation needs (something that one of the candidates, the Corvette, was never designed to do and shows no signs of being able to do). [Of course opinions differ on whether something shows signs of being able to do X; see below.] This is not to say that going fast was not *a* goal in the design of the SUV as well (does anyone ever design a vehicle with the intent of it NOT being able to go fast? perhaps a go-kart); it's simply that other desiderata were considered higher priorities to worry about first (for what many of us consider principled reasons).

Just to be crystal clear (and I don't pretend to speak for all P&Pers here): I have no objection to the suggestion that P&P might benefit from trying to build a wide-coverage parser, or implementing aspects of the theory in some other way, or pursuing proofs as to whether it is capable of (learning to) parse. Others may have strong feelings that this would be unproductive at this stage; I'm agnostic, and that's not relevant to my point. My point is that the comparison, which was fairly explicit in S&L's original posting, between P&P and statistical (and other, though they focused on statistical) parsers doesn't make sense. Here's some text from the challenge:

> What is particularly notable about the Klein-Manning grammar
> induction procedures is that they do what Chomsky and others
> have argued is impossible: They induce a grammar using general
> statistical methods which have few, if any, built-in assumptions
> that are specific to language.

To even debate this, we would have to establish a definition for "grammar"; earlier in the paragraph this system is described as inferring a "parser", which, as has been discussed, is crucially not the same thing under usual interpretations of these terms. The important point is the suggestion that some 'alternative(s)' to P&P can supposedly do "what Chomsky and others have argued is impossible ... induce a grammar". Here we have a comparison based on a false premise, it seems to me. What is the evidence that the Klein/Manning algorithms induce a grammar that has the properties Chomsky argued required innate structure to learn? All we've been told about it is that it parses some corpora at some rate less than 80% but is "quickly converging" on that level of accuracy. No one in P&P ever claimed that inducing the ability to parse a representative subset of a corpus of everyday speech to a certain approximation (given POS tags) required innate linguistic machinery. That's not the basis of any poverty-of-the-stimulus argument. We haven't even been told whether this statistical learner systematically distinguishes well-formed from ill-formed novel input, a sine qua non for the sorts of systems Chomsky is talking about.

Later on we find the following:

> If the claims on behalf of P&P approaches are to be taken seriously,
> it is an obvious requirement that someone provide a computational
> learner that incorporates P&P mechanisms, and uses it to
> demonstrate learning of the grammar of a natural language.
>
> **With this in mind, we offer the following challenge to the
> community.**
>
> We challenge someone to produce, by May of 2008, a working P&P
> parser that can be trained in a supervised fashion on a standard
> treebank, such as the Penn Treebank, and perform in a range
> comparable to state-of-the-art statistical parsers.

What are we to make of "with this in mind" as a connective between the upper (and preceding) paragraphs and the lower? The former talks about learning a grammar of a natural language. The latter talks about correctly parsing 90% of examples sampled from some corpus the system was trained on. Accomplishing the very narrow parsing task in S&L's challenge hardly tells us anything about whether some system is or is not able to learn a natural language grammar, so if our goal is really studying how humans acquire grammars, the challenge is virtually irrelevant to that goal.

I suppose that someone of the S&L persuasion might sum up the argument thus [I'm speaking purely hypothetically, following the lead of S&L in suggesting what "the other side" might say]:

"How do humans learn and parse human language? Chomsky says this ability relies on innate language-specific knowledge. But *we* have statistical systems that we claim can achieve part of what humans do, without any innate language-specific knowledge. We've solved/are on the verge of solving (part of) the problem you said only your approach could solve, so you'd better convince us that at the very least you can indeed solve that problem too. Then we'll have two promising theories that we can try out on other parts of the bigger problem."

To show what's wrong with this, despite some trepidation I cannot resist one final vehicular analogy.

"What makes a car work in its primary function (as a self-propelled device)? You claim that an engine is absolutely crucial. Now we observe that one of the properties that cars have is that if you push them, they will roll for a while (e.g. when the battery is dead). I've built a contraption (a little red wagon, say) that will roll for a while if you push it. Therefore, your claim that an engine is necessary to make a car work is now seriously in jeopardy, because my little red wagon doesn't have an engine, and look, it rolls almost as well as a fast car, and better than an SUV. We should explore little red wagons as alternatives to cars."

To avoid misinterpretation:

engine = innate knowledge
roll on wheels = (learn to) approximately parse a corpus after training on it
self-propulsion = acquiring human language
car = human: can do lots of things, of which rolling after a push is one, and obviously not totally unrelated to its critical function of self-propulsion, but not one of the more difficult things to get it to do either
SUV = current-day P&P model, according to S&L, who might say it doesn't roll at all

Carson

Linguistic Field(s): Computational Linguistics; Discipline of Linguistics; Linguistic Theories
Message 2: Re: A Challenge to the Minimalist Community
Date: 11-May-2005
From: Charles Yang <charles.yang alum.mit.edu>
Subject: Re: A Challenge to the Minimalist Community
I would like to add two points to the current discussion.

First, the challenge probably has been met - and many years ago. Broad-coverage parsers based on Government Binding / Minimalism DO exist. The earliest commercial application I am aware of was Bob Kuhns' GB parser that was used to summarize newswire stories in the 1980s, published at the COLING conference in 1990. A more glaring omission is Dekang Lin's Principles & Parameters based parsers - unambiguously dubbed PRINCIPAR and MINIPAR respectively - which have been used in a variety of applications, and have figured prominently in computational linguistics. For instance, for the task of pronoun antecedent resolution, Lin's P&P-based system compared favorably against the much larger and more expensive programs at DARPA's 6th Message Understanding Conference (MUC) in 1995. One of the reasons for its success was the implementation of - God forbid - the binding theory, in addition to other discourse constraints on pronoun resolution.

MINIPAR is a parsing system based on the Minimalist formalism, and has been around for at least 8 years: I evaluated - and recommended - the parser for a major computer company in the summer of 1997. According to Lin's website, http://www.cs.ualberta.ca/~lindek/minipar.htm, "MINIPAR is a broad-coverage parser for the English language. An evaluation with the SUSANNE corpus shows that MINIPAR achieves about 88% precision and 80% recall with respect to dependency relationships. MINIPAR is very efficient, on a Pentium II 300 with 128MB memory, it parses about 300 words per second." You can even download a copy. I suspect that no reward is necessary: Dekang Lin is currently at Google, Inc.

My second point has to do with the success of statistical parsing. In my experience, most linguists don't give a damn about parsing, or computers, for that matter: they are not paid to develop technologies that may one day interest Microsoft. Yet I invite those who are in the business of (statistical) parsing to reflect on their success. On my view, the improvement in parsing quality over the past decade or so has less to do with breakthroughs in machine learning than with the enrichment in the representation of syntactic structures over which statistical induction can take place. The early 1990s parsers using relatively unconstrained stochastic grammars were disastrous (Charniak 1993). By the mid 90s, notions like head and lexical selection, both of which are tried and true ideas in linguistics, had been incorporated in statistical parsers (de Marcken 1995, Collins 1997). The recent, and remarkable, work of Klein and Manning (2002) takes this a step further. So far as I can tell, in the induction of a grammatical constituent, Klein & Manning's model not only keeps track of the constituent itself, but also its aunts and sibling(s) in the tree structure. These additional structures are what they refer to as "context"; those with a more traditional linguistics training may recall "specifier", "complement", "c-command", and "government".

If this interpretation is correct, then the rapid progress in statistical parsing offers converging evidence that the principles and constraints linguists have discovered are right on the mark, and if one wishes, can be put into use for practical purposes. (And perhaps linguists deserve a share of the far larger pot of research funds available to natural language engineers.) This, then, would seem to be a time to rejoice and play together, rather than driving a wedge of "challenge" between the two communities.

Charles Yang
Yale University

References

Charniak, E. 1993. Statistical natural language processing. Cambridge, MA: MIT Press.
Collins, M. 1997. Three generative, lexicalized models for statistical parsing. ACL97, Madrid.
de Marcken, C. 1995. On the unsupervised induction of phrase structure grammars. Proceedings of the 3rd workshop on very large corpora. Cambridge, MA.
Klein, D. & Manning, C. 2002. Natural language grammar induction using a constituent-context model. NIPS 2001.

Linguistic Field(s): Computational Linguistics; Discipline of Linguistics; Linguistic Theories
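[The MINIPAR figures quoted in Message 2 are precision and recall over dependency relationships. As a minimal sketch of how such scores are usually computed, assuming each dependency is represented as a (head, relation, dependent) triple: the triple format, the function name, and the toy sentence are illustrative assumptions, not taken from Lin's evaluation.]

```python
def dependency_precision_recall(predicted, gold):
    """Score predicted dependencies against a gold standard.

    Both arguments are iterables of hashable dependency records,
    here assumed to be (head, relation, dependent) triples.
    Returns (precision, recall).
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # dependencies the parser got right
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy analysis of "John saw the dog yesterday".
gold = {
    ("saw", "subj", "John"),
    ("saw", "obj", "dog"),
    ("dog", "det", "the"),
    ("saw", "mod", "yesterday"),
}
predicted = {
    ("saw", "subj", "John"),
    ("saw", "obj", "dog"),
    ("dog", "mod", "yesterday"),  # wrong attachment
}

precision, recall = dependency_precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

[Precision is the share of proposed dependencies that are correct; recall is the share of gold dependencies that were found. A parser can therefore score well on one and poorly on the other, which is why both figures are reported.]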
Message 3: Re: A Challenge to the Minimalist Community
Date: 11-May-2005
From: Anjum Saleemi <saleemi ncnu.edu.tw>
Subject: Re: A Challenge to the Minimalist Community
Much of the recent discussion about Minimalism reminds me of a prevalent trend witnessed many times before on the LINGUIST list in the course of other similar discussions. As linguists we seem to be far too much driven by some supposedly significant methodological and computational imperatives, or even by mere notational determinants. My recollection of most past debates of this nature is that they often deteriorate into sterile argumentation. While issues bearing on methodology, computational tractability, and so forth, should remain important, surely none of them can be considered to constitute a decisive testing ground for what is or isn't a good theory. Usually we come to know that a theory is good only after the fact, that is, after it has been formulated and found to be successful (and, therefore, true). As John Frampton and others have implied in some of the recent postings, a good parser is primarily just that: a good parser. How exactly to anticipate the success (or otherwise) of a linguistic theory even before it has been fleshed out is a question that's not only unfair but misguided: if we already knew what a good theory in a relatively unexplored domain was supposed to look like, we wouldn't be in the business of striving for one in the first place! In the end, the generative paradigm may indeed turn out to be wrong, but over the decades it has provided most of the leading ideas in our field, and has in addition helped us dig up a lot of new data. To the extent that this is any indication of eventual success, I believe it wouldn't be wise to let its fate be judged by any programming sleights of hand.

Anjum Saleemi
National Chi Nan University
Taiwan

Linguistic Field(s): Computational Linguistics; Discipline of Linguistics; Linguistic Theories