LINGUIST List 3.8

Sun 05 Jan 1992

Disc: Are Languages Infinite?

Editor for this issue: <>


Directory

  1. Stavros Macrakis, Infinite languages
  2. "Bruce E. Nevin", is infinity relevant?

Message 1: Infinite languages

Date: Mon, 30 Dec 91 12:08:51 EST
From: Stavros Macrakis <macrakis@osf.org>
Subject: Infinite languages

Alexis Manaster Ramer says:

 ...the union of the set of even natural numbers with the set of
 primes which I happen to know is infinite although again not
 well-defined. (Perhaps it is not even a set, but whatever it is,
 it is infinite, because the set of natural numbers is infinite.)

If this is not a well-defined set for (Manaster?) Ramer, he is using a
radically different kind of set theory than anything used in
mathematics. What does he have in mind?
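
In ordinary (Zermelo-Fraenkel) set theory, at any rate, the set in question
is perfectly well-defined, and its infinitude already follows from the evens
alone; a minimal sketch in standard notation:

    E = \{\, 2n : n \in \mathbb{N} \,\}, \qquad
    P = \{\, p \in \mathbb{N} : p \text{ is prime} \,\}

    E \cup P \supseteq E, \quad |E| = |\mathbb{N}| = \aleph_0
    \;\Longrightarrow\; E \cup P \text{ is infinite.}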

	-s

Message 2: is infinity relevant?

Date: Mon, 30 Dec 91 13:40:40 EST
From: "Bruce E. Nevin" <bnevinccb.bbn.com>
Subject: is infinity relevant?

In 2.885 of last Saturday (12/28), Alexis quite rightly points out that the
ill-definedness of language does not vitiate arguments for the infinitude of
language based on properties of mathematical systems that have been
correlated with language. I agree. I intended to say that it is the
limited nature of the correlation between mathematical production
systems and language that vitiates those arguments. I apparently didn't
express this clearly enough (I don't have my earlier post at hand).
Thanks for bringing it out as follows:

>		 While Bruce makes a very strong
>case for languages being ill-defined, it seems to me that
>the connection between this and the cardinality issue is
>not what he assumes.

>I think that people who dislike mathematical claims
>about language . . . tend to argue . . .
>that language is not a well-defined set and hence it cannot
>have any such properties predicated of it. At the same time,
>people who like formal linguistics tend to make precisely
>the same connection and insist that languages are well-
>defined precisely so that they can then predicate such
>properties of them.

>But as a matter of fact, there is no such connection.

My aim in the discussion of language not being well-defined was to
indicate some of the ways in which the generally assumed correlation
fails.

When we talk of infinite sets of sentences (or of word-sequences, or of
morpheme-sequences, etc.) the topic is mathematical systems. Anything
whatsoever could be (abstractly!) substituted in the proposed generative
system in place of the words and morphemes, and the fact that the
results of such substitution would no longer resemble language
is of no consequence to the mathematical system.
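
A toy illustration of this point (mine, not anything from the earlier posts;
the one-rule grammar and the tokens are invented for the purpose): a trivial
production system generates unboundedly many strings whether its terminal is
read as an English clause or as an arbitrary token, and the mathematics is
indifferent to the substitution.

    # One-rule production system: S -> S "and" t | t, for a single terminal t.
    # Whether t is an English clause or an arbitrary token is irrelevant to
    # the mathematics: the generated set of strings is unbounded either way.

    def derive(depth, terminal):
        """Expand S through `depth` applications of the recursive rule."""
        if depth == 0:
            return terminal
        return derive(depth - 1, terminal) + " and " + terminal

    for terminal in ("John left", "qx#7"):   # language-like vs. arbitrary
        for depth in range(3):
            print(derive(depth, terminal))
    # Every additional level of `depth` yields a new, strictly longer string,
    # so the set of derivable strings is infinite -- regardless of what the
    # terminal is taken to stand for.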

Conversely, when we study human discourse, such mathematical systems are
useful tools, but we should no more confuse these tools with our topic,
language, than a surveyor should identify theodolite and chain and
records of measurement with the land surveyed. It is true that some
linguists see in language only examples testifying for or against the
aptness of one generative system or another, just as some real estate
persons seem to live in a world of acreage and parcels rather than of
stones and grasshoppers, but surely both are missing something
essential. And by this I mean something essential for a science of
language (or for real estate transactions, for that matter) and not
anything that might be decried as romantic and sentimental--though to be
sure it enters into that very definite human need as well.

Those aspects of language that elude the net of the familiar
mathematical constructs have sometimes been attributed to performance
limitations, such as inability to keep track of nested iterations.
Aside from these sorts of limitations, there are information-content
grounds for a limit to the set of sentences. Vocabulary is finite. It
does not take many iterations of a given operator word like "very" or
"think that" or "be a fact" before the repetitions cease to contribute
usefully to the information content of the sentence, and it does not take very many
layers of piling up even different operator words ("I think that John
said that it surprised Jane that . . .") before the relationship
between the topmost and the relatively concrete referents at the bottom
becomes so tenuous that the former make no clearly identifiable
informational contribution. This is illustrated in relatively
commonplace language use, where the usual gambit to make such
complexities manageable is to zero the lower layers, leaving higher ones
as nominalizations in the argument of the topmost--the opaque "abstract
vocabulary" of which philosophers like Hegel and Dewey can be so fond.
And we are only talking about a very few layers of iteration here!
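
A small sketch of the kind of iteration meant here (the particular operator
words and the bottom sentence are invented for illustration): stacking
operators is mechanically trivial, and the strain shows up within the first
few layers.

    # Stack "operator" words over a concrete bottom sentence, one layer at a
    # time.  The operators and the bottom sentence are invented examples.
    operators = ["I think that", "John said that", "it surprised Jane that",
                 "it is a fact that"]
    bottom = "the door was open"

    for k in range(1, len(operators) + 1):
        sentence = " ".join(operators[:k]) + " " + bottom
        print(f"{k} layer(s): {sentence}")
    # Nothing stops such stacking from going on indefinitely, but after a
    # handful of layers the topmost operator no longer makes any clearly
    # identifiable contribution to the information about the bottom referents.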

Now is this sort of limitation a matter of mere performance? I would
suggest that it has to do directly with the character and function of
language for error-free transmission of information. It is possible
that the mechanisms in the human organism for controlling language
(whatever they may be) *are* capable of generating an infinite set of
morpheme sequences, in principle (ignoring "performance" limitations
such as language change during the course of years during which one of
the relatively longer utterances in the set was produced and the
eventual death of the speaker--petits détails!). But this fact (if it
be fact) is at best marginal and quite possibly irrelevant for that
character, and for that function, and for any conceivable usefulness of
linguistics.

>			 someone should develop a
>suitable nonstandard set theory

Any characterization of language as a mathematical object must provide
for essential characteristics of language including variation, change,
conventionality, and multiplicity of standards (variable conformity to
several disparate idealizations). Ill-definition of the model of
language would follow from its (the model's) providing for these
characteristics. Some sort of fuzzy set theory might emulate the
ill-definition, but only accidentally so, that is, without providing for
these characteristics.

I believe a good start is to be found in the theory of language and
information developed by Harris. It does provide for these
characteristics. It describes mathematical structures whose
interpretation is precisely language, not an approximation with
exceptions to be pruned out or accounted for by other means (if at all).
The means by which it accounts for marginal cases at the growing/dying
edges of a language are precisely those required to account for
variation and change. Because of his emphasis on semantics, Harris has
advanced the study of subject-matter-specialized sublanguages,
especially sublanguages of science, something that has also been somewhat
developed in computational linguistics, albeit there in ad hoc ways
driven by the pragmatic requirements of database query systems. The study
of relations among sublanguages has scarcely been broached. Each
sublanguage grammar is tightly constrained. Vocabulary items for one
sublanguage (members of its "word classes") often turn out to be
analyzable as phrases or other constructions in other sublanguages.
Thus, as Naomi Sager has shown, "the beating of the heart" is an
unanalyzable "symptom" in a sublanguage of pharmacology, but is
decomposable into separate words in the antecedent science of
physiology. _The Form of Information in Science_ demonstrates change in
a sublanguage of immunology coordinated with (and enabling) important
changes taking place in the field at the time the analyzed sublanguage
texts were written.

It appears to me that we develop or "internalize" a multiplicity of
intersecting grammars for sublanguages and for regional and social
variation, and metagrammatical relations among these. A kind of
multilingualism is the norm. The form of this might be something like
Harris's notion of transfer grammar, which was developed for the study of
contrasted languages. Borrowing from one constituent grammar into others
is commonplace, but not at all without control.

Perhaps this suggests the kind of account that is required. Note that
the mechanisms for each sublanguage and each dialectal variant are
capable in principle of generating an infinite number of sentences, to
the extent that any of them is. But that imputed fact is much less
interesting, and much less productive of research ideas, than the
relations among them.

					 Bruce Nevin
					 bn@bbn.com