LINGUIST List 12.817

Fri Mar 23 2001

Qs: Ancient Chinese Taboo Words, Tokenization Ref

Editor for this issue: Karen Milligan

We'd like to remind readers that the responses to queries are usually best posted to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST; so we would appreciate your cooperating with it whenever it seems appropriate.


  1. Gabriele Bugada, Ancient Chinese taboo words
  2. Maite Taboada, tokenization reference

Message 1: Ancient Chinese taboo words

Date: Wed, 21 Mar 2001 14:31:29 +0100
From: Gabriele Bugada
Subject: Ancient Chinese taboo words

I am an Italian student taking a course in Sociolinguistics. I need
some information about words which in ancient Chinese dialects were
considered taboo not just for their common-use meaning, but because
their pronunciation contained taboo words, esp. with sexual
meaning. E.g., I heard that there was a taboo word which meant an
animal but whose pronunciation was 'composed' of sounds meaning penis
and homosexual. I would like to know if this is true, what word (and
meaning what animal) was implied, and if other examples are known.
Can anyone help me?

Thank you in advance.

Message 2: tokenization reference

Date: Fri, 23 Mar 2001 13:04:43 -0800
From: Maite Taboada
Subject: tokenization reference

I'm looking for references on how to do tokenization from scratch
(separate a stream into words, numbers, punctuation signs). I don't
want to have to explain the whole process, so I thought I'd just say
"we use a standard procedure, such as the one described in X".
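
For concreteness, the kind of from-scratch tokenization meant here could be
sketched roughly as follows. This is only an illustration, not a procedure
taken from any reference: the three token classes, the regular expression,
and the name `tokenize` are all assumptions made for the example.

```python
import re

# Assumed token classes, tried in this order at each position:
# numbers, then words, then any other single non-space character.
TOKEN_RE = re.compile(
    r"""
      (?P<number>\d+(?:[.,]\d+)*)             # 42, 3.50, 1,000
    | (?P<word>[^\W\d_]+(?:'[^\W\d_]+)*)      # letters, allowing internal apostrophes
    | (?P<punct>\S)                           # anything else that is not whitespace
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Split a character stream into (class, token) pairs, skipping whitespace."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    for kind, token in tokenize("We paid 3.50, not 4!"):
        print(kind, token)
```

Real text would probably need more cases than this (abbreviations, hyphenated
words, URLs), which is part of why a citable standard description would help.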

Can anyone help me find appropriate references?

Thanks a lot,

- Maite

Maite Taboada, Senior Computational Linguist Systems Inc. 