Editor for this issue: Martin Jacobsen <marty linguistlist.org>
I provide a summary of responses to a request made on LINGUIST, CORPORA, and E-LEX for information on subordinating conjunctions. First, I repeat the call, then summarize.

***Call***

"I am performing a "definitive" analysis of the meanings of subordinating conjunctions and would be interested in linking up with anyone who has focused on their representation in NLP systems. I am performing an analysis of subordinating conjunction definitions in Webster's 3rd International Dictionary, modeling these definitions using the theory of labeled directed graphs (digraphs), using principles for identifying primitives I have previously described (see Litkowski, K. C. (1988). On the search for semantic primitives. Computational Linguistics, 14(1), 52 for an overview). The "meaning" of subordinating conjunctions essentially consists of labeling clauses and establishing discourse relationships of time, contingency, place, condition, concession, contrast, reason, purpose, and result (see Quirk et al. pp. 1070-1112). I am aware that subordinating conjunctions are used as cue words in discourse processing, but I am not aware of any systematic bringing together of these "meanings" in a computational system. Characterizing these meanings is important in the digraph analysis, and while I can do it myself, it would be preferable not to reinvent the wheel. I would be grateful if anyone can point to computational representations of these meanings. A database of these "meanings" will eventually be made publicly available on the web for anyone to use."

***Responses***

Many respondents noted correctly that another term which subsumes subordinating conjunctions (SCs) is "discourse markers," for which there is a substantial literature. Megan Duque-Estrada has a very extensive bibliography of this literature available on the web at http://www.ufpa.br/~megan. This literature provides substantial information pertinent to my request.
More specific information going to the heart of my request for features and semantic labels associated with SCs was provided by Alex Eulenberg, Ken Barker, and Alistair Knott. Mary Dee Harris provided the link to Ali's work; I am very grateful for this link.

Alistair Knott (http://www.cogsci.ed.ac.uk/~alik/publications.html, particularly "A Data-Driven Method for Classifying Connective Phrases") and Alex Eulenberg (http://php.indiana.edu/~aeulenbe/, providing features for additive conjunctive sentence adverbials) identify features associated with comprehensive lists of "cue phrases" (that go beyond the smaller set of SCs). The feature names include MODAL STATUS, POLARITY, FOCUS OF POLARITY, PRESUPPOSITIONALITY, SOURCE OF COHERENCE, ANCHOR, PATTERN OF INSTANTIATION, and RULE TYPE.

Ken Barker (http://www.csi.uottawa.ca/~kbarker/, particularly "Interactive semantic analysis of clause-level relationships," or CLRs) identifies semantic labels assigned to clauses based on CLR markers or clausal connectives. The semantic relationships include CAUSAL (CAUSATION, ENABLEMENT, ENTAILMENT, PREVENTION, DETRACTION), TEMPORAL (CO-OCCURRENCE, PRECEDENCE), and CONJUNCTIVE (CONJUNCTION, DISJUNCTION). These two sets of information respond precisely to my needs and are quite useful in their own right.

Another respondent (who prefers to remain anonymous), who has investigated diachronic processes (including SCs in German), noted a possibly very interesting point that Old High German had no Complementizer Phrases, so that "das" evolved into "dass". This person also cited work describing subordinating conjunctions with prepositions (like: bevor, nachdem, indem) having a descriptive part (-vor, nach-) and a referential part (expressed by the d-words or the w-words), including pairs like "nachdem - wonach", "dadurch dass - wodurch", and "damit - womit".
There is thus the suggestion (to me, at least) that the use of subordinating conjunctions over time might reflect an evolutionary process of reasoning where particular feature values and semantic relationships have become lexicalized. The characterization of feature values and semantic relationships by Knott, Eulenberg, and Barker may facilitate this type of diachronic analysis.

When I complete the first phase of my research (the initial digraph analysis), I will provide notification of its availability. Then, when I fully incorporate the analysis of features and semantic relationships, I will post the data on the ACL SIGLEX Lexical Resources page (http://www.clres.com/siglex.html).

I thank everyone who responded and hope that this summary responds to those who asked to be kept informed of my findings.

Ken
-
Ken Litkowski
CL Research
20239 Lea Pond Place
Gaithersburg, MD 20879-1270 USA
TEL.: 301-926-5904
EMAIL: ken clres.com
Home Page: http://www.clres.com

On August 8, 1997, I posted a question about the existence of
_ls-_ and _lz-_ (i.e., a liquid plus a fricative in that order)
configurations on the Linguist List electronic bulletin board. I
stated that such articulations seemed to me to be phonologically
improbable and that they might naturally metathesize to _zl-_, _sl-_,
etc., or that, if they did occur, they would be highly marked. Among
those who kindly replied to my query were Victor Peppard via Jacob
Caflisch (University of South Florida), Mark Liberman (University of
Pennsylvania), Peter Chew (Oxford University), David Robertson
(tincan), John E. Koontz (Boulder), Subhadra Ramachandran (cyantic),
Robert Beard (Bucknell University), Sondra Ahlen (cmu), James Giangola
(General Magic), Christopher Miller (University of Quebec), Colin
Whiteley (Barcelona), Ronald Cosper (Saint Mary's University,
Halifax), Alain Theriault (University of Montreal), Jakob Dempsey
(Yuan-ze University, Taiwan), Kimmo Huovila (kielikone, Finland),
Michael Betsch (Tuebingen University), Sandra Paoli (University of
York, England), Mark Donohue (United Kingdom), David Gohre (Indiana),
Geoffrey Sampson (University of Sussex), James Kirchner (no address or
affiliation), Olga Shaumyan (University of Sussex), Steve Seegmiller
(Montclair State University), Manaster (probably Alexis Manaster
Ramer, Michigan), Paul Boersma (Instituut voor Fonetische
Wetenschappen, Amsterdam), Wolfgang Behr (Frankfurt University), Keith
Goeringer (University of California at Berkeley), Heli Harrikari
(University of Helsinki), Charles Gribble (OSU), and Elena Andonova
(Bulgaria[?]). Several graduate students at the University of
California (Los Angeles) and elsewhere requested that their names not
be listed in my response because they did not want to get in trouble
with their advisers for spending too much time on the Internet. I hope
that I have not inadvertently forgotten any others. My profound
gratitude is due to each and every one who responded.
The gist of the information which the above-named individuals
provided to me is that there certainly do exist _ls_, _lz_, and
similar configurations, even in English (e.g., "else," "holster,"
"also," "balsam," "pulse," "calcium," "dulcimer," "bells," "pulls,"
"files," and "celsius"), but note that these are all internal or
final. Other languages with internal _-ls-_, _-lz-_, etc. (often
separated in two adjoining syllables) cited in the responses include
Coast Salish, Malayalam, Bulgarian, French, Spanish, Portuguese,
Italian, and Finnish. It was reported that some Athapaskan languages
may have such clusters in final position. As indicated by the dashes,
however, I was thinking of syllable-initial _ls-_, _lz-_, etc.; it
would appear that such articulations are quite rare throughout the
world.
Levantine and Western dialects of Arabic (including Maltese)
were mentioned among the replies I received, although without
indication of the location (initial, internal, or final) of these
consonant combinations and without citation of specific words. Also
mentioned was the mysterious language Lvova, said to be from the Santa
Cruz Islands, Solomons, and written about by Wurm in articles for
numerous Pacific linguistics publications. The languages of the
Caucasus were noted as being particularly rich in initial consonant
clusters, but _ls-_ and _lz-_ were not cited specifically.
The overwhelming majority of the citations for such
configurations were from Slavic languages, in which some of my
correspondents declared that virtually any combination of consonants
is possible! (For example, there is a Russian word, _vzbzdnut'_,
which you will not find in any dictionary, that means "to emit a
silent but very smelly fart." And Czech, amazingly, even has whole
words that are spelled without any vowels, although out of
physiological necessity a kind of epenthetic schwa is used when they
are pronounced. Geoffrey Sampson cites the Czech word _vlh_ ["wolf"]
which consists wholly of an _-l-_ sound surrounded by fricatives on
both sides [the _-h_ in this word is actually a voiceless velar
fricative, IPA [x]]!) As Victor Peppard put it, "One of the reasons
Slavic has so many complex consonant clusters is that in about the
ninth century Common Slavic lost a pair of semi-vowels, one back and
one front, precipitating in a lot of places, to put it colloquially, a
tremendous collision of consonants." Nonetheless, even in Slavic,
_ls_ or _lsh_ and _lz_ or _lzh_ are usually found intervocalically,
but are much less common (and HARDER TO PRONOUNCE) in initial position
(cf. _lzh-_ ["false"], _lze_ ["possible"], etc.). Often, as with
Russian _l'stit_ ("to flatter") and _l'viny_ ("lion's"), an initial
_l-_ in such combinations tends to become palatalized, perhaps to ease
pronunciation.
The difficulty of pronouncing syllable-initial _ls-_, _lz-_,
etc. was commented upon by Sondra Ahlen as follows: "In that case I would
not be surprised to see some phonological process occur since as I
recall syllable initial sequences tend to involve increasing levels of
sonority as you get closer to the nucleus, with the common exception
of fricatives before stops as in _str-_. Metathesis is one of several
phonological processes that might affect an underlying syllable
initial (or potentially syllable initial) such as _lz-_, _ls-_. Other
options might include vowel epenthesis, consonant deletion,
syllabification of the liquid, etc."
Paul Boersma cited one instance of metathesis in Czech:
_ml-ha_ ("fog," two syllables, the /_l_/ being syllabic) from an older
_mgla_ which still exists in Polish.
A check of all the roots beginning with _l-_ in the
_Etimologicheskii Slovar' Slavyanskikh Yaz'ikov_, vols. 15-17,
revealed that whenever the _l-_ was followed by something other than a
vowel, the letter to be found was either the hard or the soft sign,
both of which I presume
indicate some sort of palatalization or yodization. My interpretation
of this pattern would be that it reflects a phonological process
designed to ease the pronunciation of the following consonant
(including _-s-_ , _-z-_, and _-zh-_) after the _l-_.
Jakob Dempsey provided extremely valuable data from Tibetan which
lends support for the possibility of metathesis: "Old Tibetan 'moon' was
_*sla_ which assimilated to _zla_ in the classical period, but in the
western dialects this underwent initial-cluster metathesis (seen in many
examples of western Tibetan): _zla_ > _lza_. That form remains in the
extreme west (Balti), but in central Tibet we have: _nda_ < _lda_ which
in turn seems to come from _lza_. It has been proposed that _lce_
('tongue') came from _*sle_ (via _*lse_), but since there are still
dialects in Tibetan which preserve _cle_, this is yet another example of
that metathesis, with the _c-_ in _cle_ probably a palatalization of
earlier _*tle_ which in turn may be from _*ple_, cf. Drung _p-lai_ (Drung
has many old loans from Tibetan). 'Tongue' in many other Tibeto-Burman
languages is from _*ble_."
Wolfgang Behr observed that "Qiangic [a Tibeto-Burman language
found in Sichuan Province of China] allows _rp-, rk-, rt-, rb-, rg-,
rts-, rdz-, rtsh-, rdzh-, rdzh-, rm-, rng-, rl-_ [!!], _rw_ (with
distinctive syllabic and non-syllabic _r-_), but no _*ls-_ or _*lz-_
(neither _*rs-_ or _*rz-_). Jiarong [another Tibeto-Burman language from
the same area of southwest China], although equipped with one of the most
curious initial cluster systems known (> 170 types), has such things as
_ltsh-, ldz-, ldzh-, lj-_, but again, no _*ls_ or _*lz-_." As for the
anomalous distribution of preinitial resonants in Written Tibetan (e.g.,
<_rts_> but not *<_lts_>, etc.), this phenomenon has apparently never
been explained in the literature. It is not known for sure whether these
clusters were ever pronounced as they were written in the Old Tibetan and
Pre-Tibetan periods (we may notice the great variation of written cluster
representations in the Dunhuang documents), or if they were pronounced
sesquisyllabically, or if the preinitials came into being as mere
graphical conventions marking tone. Similar clusters, violating not only
basic sonority hierarchy restrictions but even such notions as
Hjelmslev's "resolvability principle" (i.e., every language L that allows
C1C2C3- initials of a given shape in its phonotactics must allow for all
adjacent subsets of the cluster, viz., C1C2-, C2C3-), have been set up for
Old Sinitic by "proto-form stuffers" (to use James A. Matisoff's term).
Those who have done so, again quoting Matisoff, lack an adequate
"Proto-Sprachgefuehl."
Finally, Wolfgang Behr also offered some very interesting
theoretical perspectives, complete with an extensive bibliography,
concerning the "sonority sequencing principle" (SSP) and its violations.
A basic assumption of the SSP is that the least sonorant segments occur
toward the margins of a syllable. Among the finer differentiations of
the sonority scale are those proposed by Th. Vennemann in his _Preference
Laws for Syllable Structure_ (Berlin: Mouton, 1988). According to the
sonority restrictions applying to the distribution of segments in a
syllable on Vennemann's fine-grained scale, predictions may be made about
statistical frequencies or markedness properties. By these standards,
_ls-_ and _lz-_ would have to be classified as marked.
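
The sonority reasoning above can be illustrated with a minimal Python sketch. It assumes a deliberately coarse ranking (stops < fricatives < nasals < liquids < glides), which is an illustrative simplification and not Vennemann's fine-grained scale; the names SONORITY, onset_obeys_ssp, and violations are hypothetical and chosen only for this example. The check simply asks whether sonority rises strictly from the syllable margin toward the nucleus.

```python
# Coarse sonority ranks: an illustrative assumption, NOT Vennemann's scale.
SONORITY = {
    "p": 1, "t": 1, "k": 1, "b": 1, "d": 1, "g": 1,  # stops
    "s": 2, "z": 2, "f": 2, "v": 2, "x": 2,          # fricatives
    "m": 3, "n": 3,                                  # nasals
    "l": 4, "r": 4,                                  # liquids
    "j": 5, "w": 5,                                  # glides
}


def onset_obeys_ssp(onset):
    """Return True if sonority rises strictly across the onset segments."""
    ranks = [SONORITY[seg] for seg in onset]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def violations(onset):
    """List adjacent pairs whose sonority fails to rise toward the nucleus."""
    return [
        (a, b)
        for a, b in zip(onset, onset[1:])
        if SONORITY[a] >= SONORITY[b]
    ]
```

Under these assumed ranks, onset_obeys_ssp(["s", "l"]) and onset_obeys_ssp(["z", "l"]) hold, while ["l", "s"] and ["l", "z"] fail because a liquid outranks a fricative. The onset ["s", "t", "r"] is also flagged at the pair ("s", "t"), which matches the fricative-before-stop exception noted by Sondra Ahlen.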
-
******************************************************************************
Victor H. Mair
Dept. of Asian & Middle Eastern Studies
University of Pennsylvania
Philadelphia, PA 19104-6305
USA
Tel.: 215-898-8432
Fax.: 215-573-9617
e-mail: vmair sas.upenn.edu (read once or twice a week)
******************************************************************************