Discussion Details
| Title: | Re: 16.1156, A Challenge to the Minimalist Communi |
| Submitter: | Sean Fulop |
| Description: | I believe the challenge offered by Sproat and Lappin, for someone to
build a successful Principles and Parameters parser that learns from treebanks, is mixed up in a number of respects. Let me point out some of the ways the original challenge fails to take into account the goals of theoretical linguistics, and misses an important distinction between applied natural language processing and theoretical computational linguistics.
First, let me say that I am not in favor of the Principles and Parameters framework as a linguistic theory, nor am I in favor of Minimalism (whatever that actually turns out to be). I promote type-logical categorial grammar as a framework for Universal Grammar, but my goals are broadly the same as those of P&P proponents, and Sproat & Lappin’s challenge might equally be applied to type-logical grammar. There have been some spirited defenses of P&P posted already by various proponents, and I apologize if what follows overlaps a little with what is said there.
P&P is an effort to describe and explain what the human language faculty is, and what it does during learning. Personally, I don’t think there is much hope for its approach, but that remains debatable, so let’s presume it’s a good theory, meaning some future tweak of it will turn out to effectively describe, to every linguist’s satisfaction, the fundamental grammars of every language and how those could be learned given the specific Universal Grammar that would also be needed.
Now, who said anything about computational tractability? This is a complete theory, not an efficiency contest. I think Ed Stabler’s work on GB and Minimalism over the years should be convincing enough that a computational implementation is at least possible, and that it won’t be tractable because it is bereft of performance limitations. But we have to be happy with our theories when they correctly describe the input-output relation of the human language acquisition process, never mind the actual process.
It is hopeless to try to conform to the “actual process” used by the mind; we have no way of knowing what that is or whether our particular computers are even up for the job. Some cognitive scientists have even suggested that the brain is not constrained by computability at all, since mathematical computability is defined in reference to computational procedures that we now fathom. Worrying about tractability in P&P is like denigrating relativity theory because it makes it needlessly harder to calculate artillery ballistics. That’s of course true, but that’s also why no one invokes such a complete theory of relative motion simply to calculate artillery ballistics.
Sproat and Lappin note that learning from treebanks is supervised, meaning the parser is trained by a subset of the right answers. They go on to reference work by Klein and Manning which induces grammars in an “unsupervised” fashion from text. Well, first of all, it is still debatable whether anything can ever actually do this (see the algorithmic learning theory literature, summarized in Jain et al. 1999), and Sproat and Lappin also note that Klein and Manning’s scheme uses part-of-speech tagged text, which is a far cry from raw text. This is a huge annotation, and could be taken as a component of Universal Grammar. It is a component that is argued for in P&P, as well.
I agree with Sproat and Lappin that P&P would be better off with an attempt to implement its general learning scheme; an analysis of the informational complexity of this problem is provided in Partha Niyogi’s book. I disagree that this computational effort should be expected to succeed in practice, since that would require tractability, and that is too much to ask of a generally correct theory. The effort should be mathematically proven to succeed in theory, by defining just exactly what it would learn if you could wait long enough for it, and by proving that it would terminate if you waited long enough for it.
This is the sort of learnability result that I’ve obtained for type-logical grammar, given a certain format for Universal Grammar and certain assumed mathematical properties of all human languages. What I don’t require is the parts of speech; we learn those, surely. The learning scheme is outlined in my book and also some current papers (see references). Instead of the POS annotation, it requires annotation by skeletal semantic structures, but there is nothing wrong with requiring annotation. Children certainly receive “annotated” sentences, in that they get other clues to meaning when sentences are spoken.
While I support the current efforts in statistical NLP, I don’t believe it has any future as a linguistic theory. It is a computational methodology, inspired by the recognition that it would be too hard/slow to invoke the most complete form of linguistic theory to perform basic NLP tasks. As I’ve often told my comp ling students, “if you’ve made software that works well, it probably doesn’t actually solve a theoretical problem, which is what makes it part of applied NLP; if you’ve made software that implements a full theoretical solution, it probably won’t finish by the end of the day, which is what makes it theoretical computational linguistics.” That doesn’t mean the twain shall never meet. But not yet.
References:
Fulop, Sean A. (2005) “Semantic bootstrapping of type-logical grammar,” Journal of Logic, Language and Information 14:49-86.
Fulop, Sean A. (2004) On the Logic and Learning of Language. Victoria, Canada: Trafford.
Fulop, Sean A. (2003) “Discovering a new class of languages,” Proceedings of Mathematics of Language 8, available online at MOL website.
Niyogi, Partha (1998) The Informational Complexity of Learning. Kluwer.
Stabler, Edward (1992)? The Logical Approach to Syntax. Cambridge, MA: MIT Press. |
| Date Posted: | 29-Apr-2005 |
| Linguistic Field(s): |
Computational Linguistics
Syntax
Discipline of Linguistics |
| LL Issue: | 16.1364 |
| Posted: | 29-Apr-2005 |

