Query Details
| Query Subject: | Query re Unicode and tone languages |
| Author: | Musgrave, S. |
| Query: |
In developing a typological database which will include text data from numerous languages, we have encountered a problem with the representation of tone using Unicode fonts (we are using Lucida Sans Unicode in our application). The Unicode standard includes two diacritics which can be used to represent contour tones, those normally used for HL and LH contours. But many languages have more contour tones than these two: for example, Ngiti has three tone levels, and all combinations of levels are allowed in one contour tone: HM, HL, LH, LM, MH, ML.

In principle it should be possible to combine more than one diacritic with a text character in a Unicode font, and therefore (if the font in question includes the full diacritic set) it should be possible to provide diacritics for all contour tones. However, our attempts suggest that this method is not workable because the positioning of diacritics cannot be controlled finely enough. That is, the various diacritics tend to be positioned on top of one another, rather than beside each other. Our first question then is:

1) Has anyone else had more success in producing diacritics for contour tones using the Unicode standard, and if so, what technique was used?

If no satisfactory answers to this question emerge, we intend to explore the possibility of creating a set of contour tone diacritics for inclusion in Unicode, either as a part of the user-defined area which the standard makes available, or (preferably) as a part of the defined standard encoding. To this end, we also seek answers to a second question:

2) What range of contour tones has been reported for the languages of the world?

We will post a summary of responses to the list.

Simon Musgrave
Spinoza Program Lexicon and Syntax (SPLS)
University of Leiden |
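
The stacking approach described in the query can be sketched in Python as follows. This is a minimal illustration and not part of the original post: the mapping of H, M and L to the combining acute, macron and grave accents is an assumed convention, and the names `LEVEL_MARKS`, `NGITI_CONTOURS` and `contour` are hypothetical. The sketch only builds the code point sequences; how they are drawn depends entirely on the font and the rendering engine.

```python
import unicodedata

# Assumed convention (not taken from the query): one combining mark per tone level.
LEVEL_MARKS = {
    "H": "\u0301",  # COMBINING ACUTE ACCENT
    "M": "\u0304",  # COMBINING MACRON
    "L": "\u0300",  # COMBINING GRAVE ACCENT
}

# The six Ngiti contour tones listed in the query.
NGITI_CONTOURS = ["HM", "HL", "LH", "LM", "MH", "ML"]


def contour(base: str, tones: str) -> str:
    """Return the base character followed by one combining mark per tone level."""
    return base + "".join(LEVEL_MARKS[t] for t in tones)


if __name__ == "__main__":
    for tones in NGITI_CONTOURS:
        seq = contour("a", tones)
        # Show the code points and the canonical combining class of each mark.
        points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        classes = [unicodedata.combining(ch) for ch in seq[1:]]
        print(tones, points, classes)
```

All three marks have canonical combining class 230 (above), so normalization keeps them in the order given, and renderers conventionally stack marks of the same class vertically rather than placing them side by side. That is consistent with the behaviour reported in the query.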


