Editor for this issue: T. Daniel Seely <dseely@emunix.emich.edu>
I would like to get some estimate of what percentage of the world's written languages are represented orthographically in a phonemic manner. More specifically, how many written languages are such that one can predict the phonological properties of a word --- including stress, accent or tone --- merely by consulting the string of symbols used to write that word, and without further information, such as the morphological structure of the word?

For a language whose writing system is largely phonemic, one could write down a set of rules for word pronunciation, and in the ideal case the number of rules would be within an order of magnitude of the number of graphemes. (A few lexical exceptions don't matter, as long as there aren't hundreds of them.) I am leaving the sense of `phoneme' intentionally vague: normally a phonemic written representation implies that one can predict the surface phonemic representation from the written form of the word, but I would be perfectly happy to consider a system phonemic if it represented some more abstract level of phonological representation, from which the surface phonemic representation could be predicted by regular phonological rules/principles.

(I should also note, to clarify the question further, that I am interested primarily in the correspondence between the written form and the spoken form for the standard variety or varieties of the language, which the written form presumably reflects to some degree: I am not interested (at the moment) in dialects of the language which deviate to varying degrees from the standard.)

So, under this definition, Spanish would presumably count as very phonemic, since one can nearly always predict the pronunciation of a word, including its stress, from the orthography.
Romanian is less phonemic: while the actual set of phonemes in a word is mostly determinable from the set of graphemes used (with the representation of glides being a slight source of complication), the placement of stress requires some knowledge of the morphological class of the word (following work of Ioana Chitoran). English is presumably among the least phonemic, since the `regular rules' of pronunciation are themselves quite complex, and there are many lexical exceptions. The particular classification of the writing system as logographic, moraic or segmental is unimportant: in principle Chinese writing could be classed as phonemic (albeit with a rather large set of graphemes), but for the fact that, especially among the more common characters, there are quite a few with pronunciation ambiguities which can only be resolved using lexical information.

I am familiar with several of the recent books on writing systems, but while these typically contain in-depth analyses of particular systems, as far as I can tell nobody has done a survey of this kind. (If, on the contrary, someone can point me to a survey that answers this question, I would be most grateful.)

So, I would be very interested in getting as much information related to this question, on as many languages, as people are sufficiently familiar with. I think I already know the answer to these questions for the more familiar Western European languages (including some less familiar ones like Irish and Welsh), as well as Romanian, Russian, Hebrew, Arabic, Chinese, Japanese and Malagasy. I would be particularly interested in knowing about languages for which writing systems have only recently been developed, or for which the spelling system has recently undergone a massive restructuring: conventional wisdom has it that in such cases the writing system should be very phonemic, but perhaps that is not always true.
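To make the notion of a small rule set concrete, the Spanish stress case mentioned above can be sketched in a few lines of Python. This is only an illustration using the textbook orthographic stress rules (a written accent marks the stressed syllable; otherwise words ending in a vowel, n or s are stressed on the penultimate syllable, and all other words on the final one). The function names are hypothetical, and the syllable counting is a deliberate simplification: it ignores loanwords, adverbs in -mente, and vowel sequences separated by a silent h.

```python
import unicodedata

VOWELS = set("aeiouáéíóúü")
STRONG = set("aeoáéó")          # strong vowels, with or without a written accent
ACCENTED_WEAK = set("íú")       # an accented weak vowel always breaks a diphthong
ACCENTED = set("áéíóú")


def _normalize(word):
    # Assumption: compose accents (NFC) so that "é" is a single character.
    return unicodedata.normalize("NFC", word).lower()


def _hiatus(a, b):
    # Two adjacent vowel letters belong to separate syllables if either is an
    # accented weak vowel, or if both are strong; otherwise they form a diphthong.
    if a in ACCENTED_WEAK or b in ACCENTED_WEAK:
        return True
    return a in STRONG and b in STRONG


def syllable_nuclei(word):
    """Return one list of character indices per syllable nucleus."""
    word = _normalize(word)
    nuclei = []
    prev_vowel = False
    for i, ch in enumerate(word):
        if ch not in VOWELS:
            prev_vowel = False
            continue
        if prev_vowel and not _hiatus(word[i - 1], ch):
            nuclei[-1].append(i)    # extend the current diphthong/triphthong
        else:
            nuclei.append([i])      # start a new syllable nucleus
        prev_vowel = True
    return nuclei


def stress_from_end(word):
    """Position of the stressed syllable counted from the end (1 = final)."""
    word = _normalize(word)
    nuclei = syllable_nuclei(word)
    if not nuclei:
        raise ValueError("no vowel letters found")
    # Rule 1: a written accent marks the stressed syllable directly.
    for n, idxs in enumerate(nuclei):
        if any(word[i] in ACCENTED for i in idxs):
            return len(nuclei) - n
    if len(nuclei) == 1:
        return 1
    # Rule 2: final vowel, n or s -> penultimate stress. Rule 3: otherwise final.
    return 2 if word[-1] in VOWELS or word[-1] in "ns" else 1
```

Under these assumptions, `stress_from_end("casa")` gives 2 (penultimate), `stress_from_end("hablar")` gives 1 (final), and `stress_from_end("teléfono")` gives 3 (antepenultimate). The point of the sketch is that stress falls out of three rules over the written string alone, with no morphological information, which is the property that distinguishes the Spanish case from the Romanian one.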
Please send any replies to me, and if there are a sufficient number I will post the results of this survey to the List.

- Richard Sproat
Linguistics Research Department
AT&T Bell Laboratories            | tel (908) 582-5296
600 Mountain Avenue, Room 2d-451  | fax (908) 582-7308
Murray Hill, NJ 07974, USA        | rws@research.att.com