LINGUIST List 14.2530

Tue Sep 23 2003

Disc: New: Review of Comp Ling: Copestake, Ann (2002)

Editor for this issue: Sarah Murray <sarah@linguistlist.org>


Directory

  1. Mike Maxwell, Review of Computational Linguistics: Copestake (2002)

Message 1: Review of Computational Linguistics: Copestake (2002)

Date: Mon, 22 Sep 2003 12:27:52 -0400
From: Mike Maxwell <maxwell@ldc.upenn.edu>
Subject: Review of Computational Linguistics: Copestake (2002)

This is a discussion note, concerning my review of Copestake, Ann
(2002) "Implementing Typed Feature Structure Grammars", which appeared
in Linguist 14.2409.

Miriam Butt (of the Centre for Computational Linguistics at UMIST) and
I have been discussing the issue of syntactic parsing of free word
order languages, and Miriam agreed to my suggestion to summarize our
discussion for the benefit of LL readers.

In the original review, I had said concerning Copestake's "Linguistic
Knowledge Building system" (LKB):

> ...it will be difficult to treat 'free word order' languages. (I
> hasten to add that this is a fundamental problem which virtually all
> parsing programs face.)

Miriam's response was that the treatment of free word order languages
(FWOLs) was built into Lexical Functional Grammar (LFG). Concerning
computational parsing programs in particular (as opposed to linguistic
theories like LFG), she suggested that there was at least one parser
which implemented LFG in an efficient manner:

> ...the XLE parser developed at PARC by John Maxwell (mainly)
> and Ron Kaplan. They have some papers about how to get an
> LFG-based unification grammar to be efficient. The place to
> look for information on that is:
>
> http://www2.parc.com/istl/groups/nltt/xle/default.html

As I understand it, one reason why a parser implementing LFG could be
efficient with FWOLs is that LFG separates c-structure and
f-structure considerations. I'm speculating a bit here, but: consider
a FWOL in which NPs can be freely scrambled within a clause, and their
grammatical function (Subject, Direct Object etc.) is determined by
case marking. A c-structure grammar could be written whose main
clause rule is:

 S --> NP* V NP*

Given a sequence

 ...V NP V...

(where the second V belongs to an embedded clause), it would not be
apparent which V the NP belonged to. The ambiguity would be resolved
(and some c-structures might be ruled out) by mapping the c-structures
to f-structures. Crucially, since scrambling is (by hypothesis)
clause-bounded, and clause boundaries are specified by the
c-structure, each mapping only has to consider the NPs within a
single clause, and so should require only a small amount of time.
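
To make the idea concrete, here is a minimal Python sketch of the
two-step approach described above. It is a toy illustration only, not
how XLE or any real LFG parser works. Everything in it is an
assumption made for the example: the case-to-function table, the
token format, the verb frames, and the simplification that an NP may
attach to the nearest V on either side (embedding of one clause inside
another is not modelled). Step 1 over-generates attachments in the
spirit of S --> NP* V NP*; step 2 filters them by building a small
f-structure for each clause.

```python
from itertools import product

# Hypothetical case-to-function table for the toy language (an assumption).
CASE_TO_FUNCTION = {"nom": "SUBJ", "acc": "OBJ", "dat": "OBJ2"}


def candidate_attachments(tokens):
    """Step 1 (c-structure): enumerate every way of attaching each NP
    to the nearest V on its left or on its right."""
    verb_positions = [i for i, (cat, _) in enumerate(tokens) if cat == "V"]
    np_positions, options = [], []
    for i, (cat, _) in enumerate(tokens):
        if cat != "NP":
            continue
        left = [v for v in verb_positions if v < i]
        right = [v for v in verb_positions if v > i]
        choices = []
        if left:
            choices.append(left[-1])   # nearest V to the left
        if right:
            choices.append(right[0])   # nearest V to the right
        np_positions.append(i)
        options.append(choices)
    for combo in product(*options):
        yield dict(zip(np_positions, combo))


def f_structures(tokens, attachment):
    """Step 2 (f-structure): map one attachment to per-clause
    function tables, or return None if the result is ill-formed."""
    clauses = {i: {} for i, (cat, _) in enumerate(tokens) if cat == "V"}
    for np_pos, verb_pos in attachment.items():
        function = CASE_TO_FUNCTION[tokens[np_pos][1]]
        if function in clauses[verb_pos]:
            return None  # two NPs compete for the same function
        clauses[verb_pos][function] = np_pos
    for verb_pos, functions in clauses.items():
        if set(functions) != set(tokens[verb_pos][1]):
            return None  # the verb's frame is not exactly satisfied
    return clauses


def parse(tokens):
    """Keep only the attachments that yield well-formed f-structures."""
    results = []
    for attachment in candidate_attachments(tokens):
        mapped = f_structures(tokens, attachment)
        if mapped is not None:
            results.append(mapped)
    return results


# NP(nom) V NP(acc) V NP(nom): the middle NP could belong to either V.
example = [
    ("NP", "nom"),
    ("V", frozenset({"SUBJ", "OBJ"})),
    ("NP", "acc"),
    ("V", frozenset({"SUBJ"})),
    ("NP", "nom"),
]

if __name__ == "__main__":
    print(len(list(candidate_attachments(example))))
    print(parse(example))
```

In this example the c-structure step proposes two attachments for the
middle NP, and the f-structure step rejects the one that would leave
the first verb without an object. Because each check looks only at
the NPs attached to one verb, the filtering work per clause stays
small, which is the point made above.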

Such a solution to FWOLs would only be available in theories which
make something like the c-structure/f-structure distinction--not in
theories like HPSG, so far as I can tell.

Of course I greatly over-simplify in this example (e.g. by ignoring
PPs and adverbials). And as I say, this is mostly speculation on my
part.

In the end, though, there still seems to be some discussion going on
in other circles over whether LFG parsers are really that efficient
for FWOLs. In particular, c-structure to f-structure mapping is not
in general efficient, as discussed in the paper at the
above-referenced URL. At any rate, the purpose of this discussion
note is simply to make linguists who would like to parse FWOLs aware
of an alternative.

Oh, I should point out--John Maxwell, who is listed as the contact
point for the LFG parser in the parc.com URL given above, is not the
same as Mike Maxwell (me). Nor are we related, so far as I know...

 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu