Title: Re: 16.1156, A Challenge to the Minimalist Community
I believe the challenge offered by Sproat and Lappin, for someone to
build a successful Principles and Parameters parser that learns from
treebanks, is mixed up in a number of respects. Let me point out
some of the ways the original challenge doesn’t take into account the
goals of theoretical linguistics, and misses an important distinction
between applied natural language processing and theoretical
linguistics.
First, let me say that I am not in favor of the Principles and
Parameters framework as a linguistic theory, nor am I in favor of
Minimalism (whatever that actually turns out to be). I promote type-
logical categorial grammar as a framework for Universal Grammar, but
whatever the framework, my goals are broadly the same as those of P&P
proponents, and Sproat & Lappin’s challenge might equally be applied
to type-logical grammar.
There have been some spirited defenses of P&P posted already by
various proponents, and I apologize if, in what follows, I overlap a
little with what is said there.
P&P is an effort to describe and explain what the human language
faculty is, and what it does during learning. Personally, I don’t think
there is much hope for its approach, but that remains debatable, so
let’s presume it’s a good theory, meaning some future tweak of it will
turn out to effectively describe, to every linguist’s satisfaction, the
fundamental grammars of every language and how those could be
learned, given the specific Universal Grammar that the theory would
also describe.
Now, who said anything about computational tractability? This is a
complete theory, not an efficiency contest. I think Ed Stabler’s work
on GB and Minimalism over the years should be convincing enough
that a computational implementation is at least possible, and that it
won’t be tractable, precisely because the theory is bereft of
performance limitations. But we have
to be happy with our theories when they correctly describe the input-
output relation of the human language acquisition process, never mind
the actual process. It is hopeless to try to conform to the “actual
process” used by the mind, we have no way of knowing what that is or
whether our particular computers are even up for the job. Some
cognitive scientists have even suggested that the brain is not
constrained by computability at all, since mathematical computability is
defined in reference to computational procedures that we can currently
fathom. Worrying about tractability in P&P is like denigrating relativity
theory because it makes it needlessly harder to calculate artillery
ballistics. That’s of course true, but that’s also why no one invokes
such a complete theory of relative motion simply to calculate artillery
ballistics.
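To put the tractability point in concrete terms, here is a toy sketch (everything in it, including the `generates` stand-in, is hypothetical and is not any actual P&P implementation): a learner that simply searches the space of n binary parameter settings is trivially correct, but in the worst case it must inspect all 2^n candidate grammars.

```python
from itertools import product

def generates(params, sentence):
    # Hypothetical stand-in for a real grammar: a parameter setting
    # "generates" a sentence iff the sentence's feature vector matches it.
    return sentence == params

def learn(data, n_params):
    """Return the first binary parameter setting consistent with all data,
    or None if no setting is.  Worst case: all 2**n_params candidates."""
    for params in product((0, 1), repeat=n_params):
        if all(generates(params, s) for s in data):
            return params
    return None

print(learn([(1, 0, 1)], 3))  # → (1, 0, 1)
```

The point is not the toy but its shape: correctness of the search is immediate, while its running time grows exponentially with the number of parameters, which is exactly the situation a complete competence theory may be in.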
Sproat and Lappin note that learning from treebanks is supervised,
meaning the parser is trained by a subset of the right answers. They
go on to reference work by Klein and Manning which induces
grammars in an “unsupervised” fashion from text. Well, first of all, it is
still debatable whether anything can ever actually do this (see the
algorithmic learning theory literature, summarized in Jain et al. 1999),
and Sproat and Lappin also note that Klein and Manning’s scheme
uses part-of-speech tagged text, which is a far cry from raw text. This is a
huge annotation, and could be taken as a component of Universal
Grammar. It is a component that is argued for in P&P, as well.
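A small illustration of how much work that annotation does (the sentence and tag inventory here are invented for the example, not taken from any treebank): the tags collapse an open-ended vocabulary into a tiny closed set, so a learner fed tag sequences has already been handed a large piece of the grammatical analysis.

```python
# Invented example sentence with a hypothetical tag inventory.
raw = ["the", "dog", "chased", "a", "cat"]
tagged = [("the", "DET"), ("dog", "N"), ("chased", "V"),
          ("a", "DET"), ("cat", "N")]

# What a tag-based "unsupervised" inducer actually sees:
tag_sequence = [tag for _, tag in tagged]
print(tag_sequence)  # → ['DET', 'N', 'V', 'DET', 'N']

# Distinct symbols in each input regime:
print(len(set(raw)), len(set(tag_sequence)))  # → 5 3
```

In real corpora the gap is of course vastly larger: tens of thousands of word types against a few dozen tags.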
I agree with Sproat and Lappin that P&P would be better off with an
attempt to implement its general learning scheme---an analysis of the
informational complexity of this problem is provided in Partha Niyogi’s
book. I disagree that this computational effort should be expected to
succeed in practice, since that would require tractability, and that is
too much to ask of a generally correct theory. The effort should
instead be mathematically proven to succeed in principle, by defining
exactly what it would learn given unbounded time, and by proving
that it would indeed terminate if you waited long enough.
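As a sketch of what such a result looks like, here is the textbook Gold-style learner for the class of finite languages, a standard construction from the algorithmic learning theory literature (it is not the type-logical procedure, just the simplest case of identification in the limit): the learner conjectures the set of strings seen so far, and for any finite target language the conjecture provably stops changing once every string has appeared in the text.

```python
def learner(text):
    """Yield the learner's conjecture after each observed string:
    simply the set of all strings seen so far."""
    seen = set()
    for s in text:
        seen.add(s)
        yield frozenset(seen)

# A text (presentation) for the finite language {a, ab, abb}:
text = ["a", "ab", "a", "abb", "ab"]
conjectures = list(learner(text))
print(conjectures[-1] == frozenset({"a", "ab", "abb"}))  # → True
```

After the fourth datum the conjecture never changes again; that convergence, not speed, is what a learnability claim of this kind asserts.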
This is the sort of learnability result that I’ve obtained for type-logical
grammar, given a certain format for Universal Grammar and certain
assumed mathematical properties of all human languages. What I
don’t require is the parts of speech—we learn those, surely. The
learning scheme is outlined in my book and also some current papers
(see references). Instead of the POS annotation, it requires
annotation by skeletal semantic structures, but there is nothing wrong
with requiring annotation. Children certainly receive “annotated”
sentences, in that they get other clues to the meaning when sentences
are uttered.
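To give a flavor of the idea, here is a drastically simplified toy (my actual algorithm is in the references; the `|` connective below is a direction-neutral slash, with the choice between / and \ left unresolved, and every lexical argument is simply assumed to be an "np"): the function-argument shape of a skeletal semantic structure already fixes the shape of each word's categorial type.

```python
def infer(skeleton, result_type="s", types=None):
    """Assign a direction-neutral categorial type to each word, given a
    skeletal semantic structure: a word (str) or a (functor, argument) pair.
    Simplifying assumption: every lexical argument is a noun phrase "np"."""
    if types is None:
        types = {}
    if isinstance(skeleton, str):
        types.setdefault(skeleton, result_type)
        return types
    functor, argument = skeleton
    arg_type = "np"  # simplifying assumption, see docstring
    infer(argument, arg_type, types)
    infer(functor, f"({result_type}|{arg_type})", types)
    return types

# "John loves Mary" with the skeleton ((loves Mary) John):
print(infer((("loves", "Mary"), "John")))
# → {'John': 'np', 'Mary': 'np', 'loves': '((s|np)|np)'}
```

The transitive verb comes out as a function from two noun phrases to a sentence, which is the categorial type it receives in type-logical grammar once directionality is settled.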
While I support the current efforts in statistical NLP, I don’t believe
the statistical approach has any future as a linguistic theory. It is a
computational
methodology, inspired by the recognition that it would be too
hard/slow to invoke the most complete form of linguistic theory to
perform basic NLP tasks. As I’ve often told my comp ling students, “if
you’ve made software that works well, it probably doesn’t actually
solve a theoretical problem, which is what makes it part of applied
NLP; if you’ve made software that implements a full theoretical
solution, it probably won’t finish by the end of the day, which is what
makes it theoretical computational linguistics.” That doesn’t mean the
twain shall never meet. But not yet.
Fulop, Sean A. (2005) “Semantic bootstrapping of type-logical
grammar,” Journal of Logic, Language and Information 14:49-86.
Fulop, Sean A. (2004) On the Logic and Learning of Language.
Victoria, Canada: Trafford.
Fulop, Sean A. (2003) “Discovering a new class of languages,”
Proceedings of Mathematics of Language 8, available online at MOL
Jain, Sanjay, Daniel Osherson, James S. Royer & Arun Sharma (1999)
Systems That Learn: An Introduction to Learning Theory, 2nd ed.
Cambridge, MA: MIT Press.
Niyogi, Partha (1998) The Informational Complexity of Learning.
Stabler, Edward (1992) The Logical Approach to Syntax.
Cambridge, MA: MIT Press.