LINGUIST List 9.1235

Mon Sep 7 1998

Disc: POS is well-formed or not well-formed?

Editor for this issue: Martin Jacobsen <martylinguistlist.org>


Directory

  1. Ji Donghong, Disc: POS is well-formed or not well-formed?

Message 1: Disc: POS is well-formed or not well-formed?

Date: Fri, 4 Sep 1998 16:38:53 +0800
From: Ji Donghong <dhjikrdl.org.sg>
Subject: Disc: POS is well-formed or not well-formed?


In response to one of the open questions listed in the summary on
part-of-speech (LINGUIST List 9.1186), which I repeat in section 1,
Dr. John Kovarik proposed a two-step method and drew a conclusion
(LINGUIST List 9.1202), which I list in section 2. I present my
comments in section 3 and some further discussion in section 4.

 


1. Open Question: 

Suppose that we are given a language that is just like English but
without any affixes, e.g., -ment, -ing, -ed, -tion, -sion, etc. The
following are then all possible phrases in the language: make
develop; develop country; develop product, etc. Now the problem is:
how do we determine the distribution-based POS system for the
language? (The case is roughly like that in Chinese.)

2. The method and conclusion by Dr. John Kovarik

 Two-step Method:

a) word segmentation based on some standard;
 
b) producing POS guidelines by analyzing distributional patterns;

 Conclusion:

The resulting (POS) system is not the one and only, but it is well-formed.

3. My comments

 1) a) is not necessary for the open question, because the assumed
language is like English, with spaces between words.
 
 2) b) is the common, standard method for the open question, as
we can find in the textbooks. The procedure may be like this: first,
select some typical distributional patterns for every word; second,
classify the words based on the selected patterns.

 3) It is obvious that the resulting POS system is not the one and
only; in fact, it may be just one among (possibly) infinitely many
POS systems for the language. The reason is that you may select
different patterns as the criteria, and may select different
classifications as the POS system.

 4) I am reluctant to say that the resulting POS system is
well-formed if we admit that it is just one among many possible
systems. In addition, it suffers from the three problems I listed in
the summary (LINGUIST List 9.1186).
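
The procedure in 2) and the non-uniqueness noted in 3) can be sketched
in Python. This is only an illustration under stated assumptions: the
corpus holds the three phrases from section 1 plus two invented ones
("make product", "make country"), the frames ("after make", "before
country") are hypothetical criteria chosen for the example, and the
function name partition is likewise made up. It is not Dr. Kovarik's
method, only a toy instance of "select patterns, then classify".

```python
# Toy corpus of two-word phrases. The first three come from the open
# question; the last two are invented for illustration.
corpus = ["make develop", "develop country", "develop product",
          "make product", "make country"]

def partition(corpus, frames):
    """Group words by which of the given frames they are attested in.

    Each frame is a (side, anchor) pair: ("after", "make") asks whether
    the word occurs right after "make"; ("before", "country") asks
    whether it occurs right before "country".
    """
    phrases = {tuple(p.split()) for p in corpus}
    words = sorted({w for p in phrases for w in p})
    groups = {}
    for w in words:
        # The signature records, frame by frame, whether w fits the slot.
        signature = tuple(
            ((anchor, w) if side == "after" else (w, anchor)) in phrases
            for side, anchor in frames
        )
        groups.setdefault(signature, []).append(w)
    return sorted(groups.values())

# Two different choices of criteria for the same corpus.
system_a = partition(corpus, [("after", "make")])
system_b = partition(corpus, [("before", "country")])

print(system_a)
print(system_b)
```

On this toy corpus the first criterion puts "develop" in one class with
"country" and "product", while the second puts it with "make". Both
classifications are consistent with the data, which is the point made
in 3).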

4. Further discussion

 1) It may be the case that distribution-based POS is subjective in
nature and a kind of constructional concept, with each kind of
construction corresponding to one kind of POS system.

 2) The Chinese word, like POS, is also a constructional concept. To
my knowledge, no theory has tried to define the concept clearly,
i.e., to answer the question "WHAT is a Chinese word?". The
segmentation guideline Dr. John Kovarik mentioned is just one among
many possible ones, each guideline corresponding to a segmentation
result. So many researchers suspect that the concept "word" may be
inappropriate for Chinese.

 3) What troubles me now is a question: if distribution-based POS is
constructional and subjective in nature, why do so many linguistic
theories adopt it? Consider the notion of "molecular structure" in
chemistry: if it varied with different standards, it would have
disappeared from the textbooks.