LINGUIST List 5.677

Fri 10 Jun 1994

Misc: Endangered Languages, Protolanguage, NLP article


Directory

  1. Dan Everett, Support for Endangered Languages
  2. Mark Durie, Re: FWD>5.640 Protolanguage
  3. Jacques Guy, NLP: France Telecom vs Telecom Australia

Message 1: Support for Endangered Languages

Date: Mon, 30 May 1994 13:24:59
From: Dan Everett <deverisp.pitt.edu>
Subject: Support for Endangered Languages

Tony Woodbury's recent mailing about interest in a special
group of papers at the LSA meetings on field work caused me to
think that there might be a number of you interested in another aspect
of field work - aid to the speakers of languages we work on, in
particular endangered languages.

Recently, partially at my suggestion, the Summer Institute of
Linguistics initiated a grant for the study of endangered languages,
in the amount of $1000.00. The idea behind this is that the money
should go to a graduate student and that the funds be matched, if
possible, by the student's home institution. The first recipient of
this grant, a student from our department, will be studying the
Kootenai language. (For further information on this for future years,
I think that the person to contact is David Payne, International
Linguistics Coordinator, SIL 7500 W. Camp Wisdom Rd. Dallas, TX 75236;
I know nothing about the future plans for this grant).

This grant causes me to wonder how many other organizations and/or
individuals would be able and willing to support the study of
endangered languages. I have in mind two principal categories for
support: (i) graduate student research (faculty do, after all,
have other sources, e.g. the NSF which is quite interested in
worthy projects on such languages); (ii) money for the speakers
themselves.

For example, it costs many thousands of dollars in Brazil for the
government to demarcate Indian territory. In recent years, the rock
star, Sting, has contributed a significant amount to such efforts (in
a recent series of shows he raised nearly $600,000.00). Demarcation
is perhaps the single most important need in Brazil. For example, most
of the groups that I have worked with are smaller than 200 and
settlers are beginning to invade their traditional lands. The Piraha's
reserve was mapped out over eight years ago (partially funded by
Cultural Survival) and the resultant map was made law last year. But
(although the map is legal) absolutely nothing has yet been done to
identify the land (which involves cutting a 2-meter wide swath in the
jungle around the entire reserve, by Brazilian law). The reason is
simple - there are no funds available. Without their own land, chances
for survival, already very bleak, are almost nonexistent.

Another area of need is medical work. This summer, for example, I will
be accompanying a dental team to the Piraha. Next summer, I have about
three surgeons lined up to go to the Amazon. Although these people
usually pay their own way (they should!), there are numerous
additional expenses involved beyond international transport - medical
equipment, supplies, local transportation.

When I lived in Brazil, I was able to raise funds for humanitarian aid
for Indians from a group of companies (IBM, Bosch, and some of the
latter's subsidiaries). It seems to me that this ought to be a
priority for linguists working on endangered languages. In this
regard, I would like to see a discussion here of ideas others may have
had for providing aid of different kinds to speakers of endangered
languages. Actual potential funding agencies should not, of course, be
mentioned in a public forum, but those of us who are most interested
in such matters might organize an effort along these lines off-line.
One source would be multinationals with a record of exploiting the
relevant areas - these groups often want to buy their way into heaven
via grants.

The need for such additional funding struck me this year as I worked
with the Banawa. Out of a three-year grant I have plenty of money
for my own transportation, summer salary, etc. Even some money for
paying language teachers. But the percentage of the funds I have that
will go to actually directly benefit the community, beyond informant
fees, is about 1% of the entire grant. Yet my home institution
received a much larger percentage in overhead.

I would like to hear from people interested in this issue with
specific ideas on funding and applications of funds raised. The latter
issue ought to be taken up on Linguist, I think, in order to get the
maximum number of ideas. Maybe there are some obvious sources out
there that I am unaware of. But then, if I am unaware of them, so will
lots of others be.

Dan Everett

Message 2: Re: FWD>5.640 Protolanguage

Date: Mon, 06 Jun 1994 12:15:44
From: Mark Durie <mark_duriemuwayf.unimelb.edu.au>
Subject: Re: FWD>5.640 Protolanguage

 Reply to: RE>FWD>5.640 Protolanguage proof
By Jacques Guy's method, even one of the daughter languages could be a
proto-language, with 100% retention of vocabulary. Guy's 'proof' that the root can
be placed in infinitely many places only works on the assumption of
infinitely arbitrary variations in vocabulary replacement rates. Few
advocates of lexico-statistics (I am not one) would share the covert
assumption of Jacques' proof that replacement rates are maximally and
arbitrarily variable. Even if one does not hold the controversial opposite
assumption that vocab replacement rates are universally constant across time
and space, most of us would not wish to go so far the other way as to assume
that they vary completely freely and arbitrarily. It is bizarre to suggest
that a reconstruction that assumes retention rates ranging from 100% to 20%
in the one family is as plausible as one which assumes a range of
55%-65% retention rates across the family.

Jacques' argument is certainly not a once-and-for-all proof.

The more moderate statement is: the greater the variation in retention rates
that one permits, the more uncertain and imprecise one's location of the
'root' (i.e. the relative position of the proto-language) must be. Jacques'
proof is merely the extreme corollary of this: if one assumes maximal
uncertainty in the rate of retention across a family, then the
lexico-statistical method gives maximal uncertainty in the location of the
root.
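
The moderate statement can be made concrete with a small sketch. It assumes a deliberately simplified model that is not in the posting itself: two daughter languages A and B lose proto-vocabulary independently, so the share of cognates they still have in common is roughly the product of their two retention rates. The function name `root_position_range` and the log-scaled position measure are illustrative choices only, not part of any standard lexico-statistical toolkit.

```python
import math

def root_position_range(shared, r_min, r_max):
    """Feasible positions of the root on the path from A to B.

    Assumed model (not from the posting): shared = r_a * r_b, where r_a and
    r_b are the retention rates of A and B, each allowed in [r_min, r_max].
    Position 0.0 puts the root at A (A retained everything), 1.0 puts it at B.
    Returns (lowest, highest) position, or None if no pair of rates fits.
    """
    if not (0.0 < shared < 1.0 and 0.0 < r_min <= r_max <= 1.0):
        raise ValueError("need 0 < shared < 1 and 0 < r_min <= r_max <= 1")
    # r_b = shared / r_a must also lie in [r_min, r_max]; this bounds r_a.
    ra_low = max(r_min, shared / r_max)
    ra_high = min(r_max, shared / r_min)
    if ra_low > ra_high:
        return None
    total = math.log(shared)
    # A high retention rate for A means the root sits close to A.
    return (abs(math.log(ra_high) / total), abs(math.log(ra_low) / total))

# Two daughters sharing 36% of the test list, under the two ranges
# of retention rates mentioned in the text.
narrow = root_position_range(0.36, 0.55, 0.65)
wide = root_position_range(0.36, 0.20, 1.00)
print("55%-65% retention:", narrow)
print("20%-100% retention:", wide)
```

Under these assumptions the 55%-65% band keeps the root close to the midpoint between the two daughters, while the 20%-100% band allows every position on the path, including either daughter itself.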

Mark Durie

Message 3: NLP: France Telecom vs Telecom Australia

Date: Tue, 31 May 1994 11:02:35
From: Jacques Guy <j.guytrl.oz.au>
Subject: NLP: France Telecom vs Telecom Australia


There is an interesting article in "L'echo des RECHERCHES" No.146,
pp.51-60 (4th quarter 1991), on natural language processing, entitled
"Interrogation en langage naturel du Minitel Guide des Services (MGS)".
The authors use the term "syntactic analyzer", but that term is a bit of
a misnomer: their syntactic analyzer is nothing like a parser. It relies
purely on semantics and is in fact a semantic-network builder. The
resulting graph is also nothing like the trees you see in linguistic
analysis (typically an S with NP and VP nodes dangling, and more from
there). First, it contains circuits. Second, it has two types of arcs --
which they call "relatif_a" and "sorte_de" ("related to, having to do
with" and "kind of, subset of"). Third, those arcs are directed.
Finally, the graphs are not rooted.
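
The four properties just listed (circuits allowed, two arc types, directed arcs, no root) can be sketched as a small data structure. This is only an illustration of the description above, not the authors' implementation: the class name `SemanticNetwork`, its methods, and the example nodes are all hypothetical; only the two arc labels come from the article.

```python
from collections import defaultdict

# The two arc types named in the article.
ARC_TYPES = {"relatif_a", "sorte_de"}

class SemanticNetwork:
    """Directed graph with typed arcs, no distinguished root (hypothetical sketch)."""

    def __init__(self):
        self.nodes = set()
        self.arcs = defaultdict(set)  # source -> {(arc_type, target), ...}

    def add_arc(self, source, arc_type, target):
        if arc_type not in ARC_TYPES:
            raise ValueError(f"unknown arc type: {arc_type}")
        self.nodes.update((source, target))
        self.arcs[source].add((arc_type, target))

    def has_circuit(self):
        """Depth-first search; reaching a node still on the current path means a circuit."""
        WHITE, GREY, BLACK = 0, 1, 2
        color = {node: WHITE for node in self.nodes}

        def visit(node):
            color[node] = GREY
            for _, target in self.arcs.get(node, ()):
                if color[target] == GREY:
                    return True
                if color[target] == WHITE and visit(target):
                    return True
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in self.nodes)

# Invented example nodes, for illustration only.
net = SemanticNetwork()
net.add_arc("pizzeria", "sorte_de", "restaurant")
net.add_arc("restaurant", "relatif_a", "repas")
net.add_arc("repas", "relatif_a", "restaurant")
print(net.has_circuit())  # prints True: restaurant -> repas -> restaurant
```

A parse tree could not hold the last two arcs at once, which is the point of the contrast with the usual S/NP/VP trees.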
 I was particularly interested because:
1. It so happens that there is a group here working on the very same
 problem: direct access to our electronic Yellow Pages database in
 natural language.
2. I had come to a model of syntax/semantics very similar to Gilloux,
 Lassalle and Ombrouck's but from a very different starting point:
 given a bilingual text, extract the dictionary and the rules for
 translating from one language into the other. I seem to be ending up
 with two fundamental operations, one commutative, the other not,
 a bit like their "sorte_de" and "relatif_a", except these are
 relationships, not operations. It makes me think that they are onto
 something, namely, a proper model of language.

(1) is amusing. The system developed here so far relied on a syntactic
parser (in the classical sense) and a neural net. The parser parses the
customer query and sends its output to the neural net. But it does not
work very well. Actually, it has a strong propensity for not working,
period (from first-hand demos). So I suggested: "Junk the parser.
Parsing gets you nowhere, this is not how we process natural language,
only artificial, algorithmic languages like C, Lisp, whatever. Send
the query directly to the neural net, verbatim." To which was objected:
"We cannot get rid of the parser, it is an AI application, and this is
the AI department." The power of words and labels! Our NLP interface
works better now. The programmer in charge followed my advice and
modified the existing parser so that it now sends to the neural net the
query it received, unchanged. There still is a module called parser, so
everyone is happy.