LINGUIST List 11.294

Sat Feb 12 2000

Sum: Reactions to Languages Listed in ISO 639

Editor for this issue: Karen Milligan <karenlinguistlist.org>

Directory

Nicholas Ostler, ISO 639: reactions from users/potential users

Message 1: ISO 639: reactions from users/potential users

Date: Sun, 6 Feb 2000 22:43:32 +0000
From: Nicholas Ostler <nostlerchibcha.demon.co.uk>
Subject: ISO 639: reactions from users/potential users

Note from Nicholas Ostler: As before, this is John Clews' message, (sent Sun, 06 Feb 2000 12:16:57 GMT ) and replies or comments should go to him direct at Endangersesame.demon.co.uk (John Clews)

Query Issue 11-230 This week I posted a questionnaire to various email lists, based on the comparative list of LC-MARC codes, ISO 639-2 codes and ISO 639-1 codes, that you have seen before.

There was a great deal of interest in this from various linguists in several parts of the world. Some of the comments may be useful, and some less so, and we may or may not want to deal with all the languages discussed.

I would like to thank all who sent replies about the codes or languages, and would like to feed this digest, without any comment at this stage, back to the lists whose members responded so well, and so quickly.

The responses cover several different groups of languages, and are arranged just in the order they were received, not by any language order. However, as it happens several responses on South Asian languages are all together, further down, as several responded from the South Asian Linguists list <VYAKARANLISTSERV.SYR.EDU>, and my apologies to subscribers to both lists who have therefore seen this twice. Text below this point is identical in the postings to both lists.

Some of the replies may suggest additional codes, or additional strategies, for what we do in the ISO 639 JAC, and/or in other codes lists.

If I receive any other replies after this, I shall also consider any further information which adds to what has already been sent.

Best regards

John Clews

NB: Replies are separated by dashes, thus:

- ----------------------------------------------------------

> Date: Thu, 03 Feb 2000 20:03:20 -0800 > To: John Clews <Endangersesame.demon.co.uk> > From: "John A. Halloran" <seagoatprimenet.com> > Subject: Sumerian language code > Mime-Version: 1.0 > Content-Type: text/plain; charset="us-ascii" > Status: R

How did Sumerian get to be sux, when sum is not in use? That doesn't make any sense.

Regards,

John Halloran http://www.sumerian.org/ e-mail: seagoatprimenet.com

- ----------------------------------------------------------

> Date: Thu, 03 Feb 2000 23:09:06 -0500 > To: John Clews <Endangersesame.demon.co.uk> > From: Claire Bowern <bowernfas.harvard.edu> > Subject: Language codes

Hi. I work on Australian languages. It's a shame you've only got two codes for about 800 languages - one for "Australian" - aus - and one for "Papuan-Australian (other)" - paa. No separate codes for the 180 odd Australian languages and no codes for the 700 odd Papuan ones? I'm not sure what the latter one would refer to - there aren't any real connections between Papuan and Australian languages. You've got codes for dead languages, but nothing for Arrernte, Warlpiri, Tolai, Motu or Pitjantjatjara? There's a lot more writing going on in Arrernte than there is in Zuni, but . How about including some more languages from Oceania?

Regards,

Claire Bowern

Department of Linguistics Harvard University 305 Boylston Hall Cambridge, MA 02138 ph: 617-493-4230 http://www.fas.harvard.edu/~lingdept/

- ----------------------------------------------------------

> Date: Wed, 3 Feb 1999 20:59:26 -0800 (PST) > From: David Robertson <droberttincan.tincan.org> > To: endangersesame.demon.co.uk > Subject: ISO 639 & ChInuk (Chinook Jargon)

Hello, John,

Thanks for your posting on LINGUISTLIST about ISO 639 codes.

It's good to see that there is a code, chn, for ChInuk (Chinook Jargon).

I want to let you know that if and when the ISO get down to the process of establishing a standard character-set for this language, I and the CHINOOK list stand ready to advise and help.

ChInuk is fortunate in that quite a few technically savvy people are involved in the preservation and dissemination of the language. In fact, we've discussed ISO before on our list. Between the linguists and computer science people in our ranks, we can provide good feedback to ISO if, as we hope, we are called upon.

I can be reached at the above address; CHINOOK list postings go to chinooklistserv.linguistlist.org.

Lhush pulakli! Dave

*VISIT the archives of the CHINOOK jargon and the SALISHAN & neighboring* <=== languages lists, on the Web! ===> http://listserv.linguistlist.org/archives/salishan.html http://listserv.linguistlist.org/archives/chinook.html

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 23:06:25 -0600 (CST) > From: "James L. Fidelholtz" <jfidelsiu.buap.mx> > To: John Clews <Endangersesame.demon.co.uk> > Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

[General comment: there are various cases (eg, alg, sal) where lg families are given, but not (all) the individual languages, which seems to go against the stated purpose of the list, insofar as I understand it. Specifically, there does not seem to be any reason NOT to be as complete as possible in such a list. In that sense, an appropriate tactic might be to use the Ethnologue (eg) basically in its entirety. If there are any criteria for excluding lgs., I missed them in your explanations. Being currently spoken seems a bad one, as does being spoken by X number of people as a minimum. Below I have interspersed my comments within brackets in the list, more or less in the appropriate place, although not always and only alphabetically well-placed]

>- ----------------------------------------------------------
> LC ISO 639-2 ISO 639-1 Language name in English
>- ----------------------------------------------------------
> alg Algonquian languages

[Note that there is also one of the Alg. lgs. called "Algonquin" (sometimes); among the missing is Menominee ("men")]

[add]

[Beothuk "beo" -- a possibly Algonquian lg. not spoken since 1832]

> lus Lushai [See Salish]

> mic Micmac

[Note: the PC name is now 'Mi'gmaq'; this could be "mi'" (if single quotes are permitted), or "mig"; Algonquianists tend to use "mc"]

[add]

[missing: Mixe "mix"]

> nah Nahuatl (LC listed earlier as Aztec)

[In some circles, this is written "Nauatl", without any accents]

> nai North American Indian (Other)

[a rather diffuse category, esp. considering the ambiguity of "North American" (eg, are Mexican lgs. included?--even many linguistic classification schemes do not include them)]

> sal Salishan languages

[Note: lots of individual lg names missing in this list for Salish lgs.; specifically, the lg. formerly known as Skagit (?ska), now with the PC name Lushootseed (?lus), in any case virtually no longer spoken (does this matter?)]

[add]

[Totonac(o), presumably "tot" is missing here, as is Tepehua, presumably "tep", the only two clear members of their family, possibly related to Huasteco--also missing, "hua"?-- and Mayan--your "myn", but actually consisting of many languages/varieties, eg Lacandon "lac"]

[missing: Zoque "zoq", possibly Mayan]

James L. Fidelholtz e-mail: jfidelsiu.buap.mx Maestría en Ciencias del Lenguaje Instituto de Ciencias Sociales y Humanidades Benemérita Universidad Autónoma de Puebla, MÉXICO

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 11:01:02 +0000 (GMT) > From: Dr Tony McEnery <mcenerycomp.lancs.ac.uk> > To: John Clews <Endangersesame.demon.co.uk> > Subject: Re: Languages listed in ISO 639: feedback sought

Hi John,

I think the work you are doing is splendid. The language I have been working most closely with recently is Sylheti - Nick beat me to the draw on getting that one added to your list. Though not an 'official' language, there is a growing movement in the UK Bengali community at least to identify it as such and revivify its writing system, Sylheti Nagri.

Best,

T

Dr. Tony McEnery, Reader in Multilingual Corpus Linguistics, Dept. Linguistics, Lancaster University, Lancaster,LA1 4YT, UK.

Tel: +44 (0) 1524 593024 Fax: +44 (0) 1524 843085 email: mcenerycomp.lancs.ac.uk (JANET)

- ----------------------------------------------------------

> From: "A.F. GUPTA" <engafgARTS-01.NOVELL.LEEDS.AC.UK> > Organization: University of Leeds > To: Endangersesame.demon.co.uk > Date: Fri, 4 Feb 2000 12:28:15 GMT

You seem to have only one code for CHINESE. That's OK for the written language, but I think you certainly need codes for at least some of the major varieties of Chinese in speech, in line with normal practice of using CHINESE mostly for the written language.

You certainly need:

Mandarin Cantonese Hokkien

and probably several others.

cpe, cpf, cpp should all be OTHER than named cpe/f/p s. You have Papiamento and Bislama -- but you need more named creoles. Omissions I noticed which should certainly be there are Haitian Creole, Kristang, PNG Pidgin.

Hope this is useful.

Anthea

Anthea Fraser GUPTA : http://www.leeds.ac.uk/english/$staff/afg School of English University of Leeds LEEDS LS2 9JT UK

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:32:28 -0500 > To: <Endangersesame.demon.co.uk> > From: dbeckumich.edu (David Beck) > Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

Below is an additional family, Totonacan-Tepehua (Mexico), which seems to have been omitted. This is a large family with over 100,000 speakers of eight or ten languages.

tot Totonacan-Tepehuan

David Beck Visiting Assistant Professor Programme in Linguistics University of Michigan Room 1087 Frieze Building 105 South State St. Ann Arbor, MI 48109-1285 office: (734) 647-2156 FAX: (734) 936-3406 e-mail: dbeckumich.edu http://www-personal.umich.edu/~dbeck/

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:08:55 -0600 (CST) > From: Gregory David Anderson <gdandersmidway.uchicago.edu> > X-Sender: gdandersharper.uchicago.edu > To: Endangersesame.demon.co.uk > Subject: Language list codes

Hi, I noticed your list of language codes and wanted to draw your attention to some oversights on the list. In particular, the languages of Siberia are very poorly represented in your list.

Namely, there seem to be entire families missing, including Ket (Yeniseian); Nivkh (aka Gilyak, a language isolate); Yukaghir (an isolate, or sister to Uralic); and Chukotko-Kamchatkan (Chukchi/Chukchee, Koryak, and Itel'men (Kamchadal)).

At least mention of these entire families should be made.

A propos of the Munda languages, I noticed you had Santali separate from the rest. I would suggest having Mundari-Ho (ca. 2 million speakers, with a literature) as well as the South Munda languages Kharia and Sora; the rest could/should be an 'other' category. These latter points are not as important, I think, as the oversight of entire language families which occupy a large area in central and eastern Siberia.

I hope this is of some use to you.

Greg Anderson

Gregory D. S. Anderson Department of Linguistics University of Chicago 1010 E. 59th St. Chicago, IL 60637 USA

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 09:12:31 -0600 (CST) > From: Gregory David Anderson <gdandersmidway.uchicago.edu> > X-Sender: gdandersharper.uchicago.edu > To: Endangersesame.demon.co.uk > Subject: Further omissions from list

hi, one more omission I noted in your list

Burushaski--a language isolate from Pakistan

Greg Anderson

Gregory D. S. Anderson Department of Linguistics University of Chicago 1010 E. 59th St. Chicago, IL 60637 USA

- ----------------------------------------------------------

> To: Endangersesame.demon.co.uk > Subject: ISO language codes > Date: Fri, 04 Feb 2000 11:34:04 -0500 > From: Taylor Roberts <trobertsMIT.EDU>

Hi, thanks for posting your list to LINGUIST! I don't have anything in particular to say about the codes themselves, but I thought I would contribute an alternate spelling of 'Pushto'. There are several ways this has been rendered, but I think 'Pashto' must be used just as often. If this alternate spelling could be included, it might be helpful--though I realize that the names are based on LOC, and that the codes are your main point of interest.

Thanks and best wishes, Taylor

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 22:59:11 -0600 > To: Endangersesame.demon.co.uk > From: Rick Mc Callister <rmccallisunmuw1.MUW.Edu> > Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

This is a very good idea. However, if it's something for international consumption, I strongly suggest that you use codes based on the speakers' name for the language. If that name is unknown, as in the case of extinct languages, you might wish to use the Latin name for Western European languages and other local prestige languages for other regions/continents. I think the use of English plays into the widespread notion that ASCII and the computing infrastructure in general is Anglocentric at best and possibly racist. Examples: esk or eus for Basque, esp for Spanish, deu for German, nih for Japanese, gai for Scots Gaelic/Gaidhlig, gae for Irish Gaelic/Gaeilge, cym for Welsh, etc. I have read that Wendish is a somewhat deprecative name for Sorbian/Lusatian.

- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 10:05:37 -0700 > To: John Clews <Endangersesame.demon.co.uk> > From: Caroline L Rieger <criegergpu.srv.ualberta.ca> > Subject: languages listed in ISO 639 > Mime-Version: 1.0 > Content-Type: text/plain; charset="us-ascii"

Dear John Clews,

I missed Luxembourgish (Letzeburgesch) in your list. It is a Germanic language that does not have too rich a literary tradition, but one that started in 1829. Recently, more and more authors from Luxembourg pride themselves on publishing in Luxembourgish. The corpus is thus growing rapidly.

If you need more information, please feel free to contact me.

Yours sincerely,

Caroline L. Rieger ********************** Caroline L. Rieger Ph.D. Candidate University of Alberta Dept.of Modern Languages & Cultural Studies 200 Arts Bldg. Edmonton, AB T6G 2E6 Canada Phone: 001 - 780 - 438 - 1062 Fax: 001 - 780 - 492 - 9106 E-mail: criegerualberta.ca

- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 11:12:24 -0600 > To: Endangersesame.demon.co.uk > From: Jill Wagner <jmwagneriastate.edu> > Subject: ISO lang names

When you said "omissions" I'm assuming you meant of languages.

I work on the Interior Salish language spoken in northern Idaho, USA, commonly referred to as Coeur d'Alene and commonly abbreviated CdA. In a cursory check of the list, I did not see this included. The indigenous name for the language, Snchitsu'umshtsn, is rarely used even by tribal members and speakers, but is worth noting.

- ----------------------------------------------------------

> From: Mark_MandelDragonsys.com > To: Endangersesame.demon.co.uk > Message-ID: <8525687B.006F42D7.00notes-mta.dragonsys.com> > Date: Fri, 4 Feb 2000 15:24:15 -0500 > Subject: Additions to ISO 639

With reference to the LINGUIST List announcement at http://linguistlist.org/issues/11/11-230.html :

(1) I suggest adding kli Klingon

This may seem a joke to you, but there is probably more material in Klingon than in Volapuk (vol), and unlike Volapuk the amount is steadily growing. The Klingon Language Institute (www.kli.org) and its quarterly journal _HolQeD_ (http://www.kli.org/study/HolQeD.html ; ISSN: 1061-2327; catalogued by MLA) are the centers of study of this language, originally developed by Dr. Marc Okrand.

(2) "Sign languages (not expanded further)" is about as acceptable as would be "Asian languages (not expanded further)". I will start with proposing asl American Sign Language and continue by posting a message about your request on SLLING-L, the Sign Language Linguistics List, to elicit contributions from sign linguists familiar with others of the dozens or hundred-plus of known sign languages in the world.

Mark A. Mandel : Senior Linguist and Manager of Acoustic Data Mark_Mandeldragonsys.com : Dragon Systems, Inc. 320 Nevada St., Newton, MA 02460, USA : http://www.dragonsys.com/ (speaking for myself)

- ----------------------------------------------------------

> Date: Fri, 04 Feb 2000 18:14:43 +0000 > From: António Emiliano <a.emilianomail.telepac.pt> > Reply-To: a.emilianomail.telepac.pt > Organization: Universidade Nova de Lisboa / Dep. de Linguística > To: John Clews <Endangersesame.demon.co.uk> > Subject: Languages listed in ISO 639: feedback sought

Dear Sir

I would like to propose that "Mirandês" (with a circumflex over the E) be included in ISO 639.

It is the sole minority language native to Portugal, and is spoken in the North-East. It was originally a dialect of Leonese, the language of the old kingdom of León.

An orthography has recently been developed, and Mirandês is now taught in school (from grammar school level).

I would suggest that the 2 letter code "MD" be included in ISO 639-1, and the 3 letter code "MIR" be included in ISO 639-2.

For more information on this language you can contact the Linguistics Centre of the University of Lisbon (Centro de Linguística da Universidade de Lisboa), where dialectologists can answer any query of yours. Their URL is: http://www.clul.ul.pt.

Best regards

António Emiliano

NB: phone numbers in Portugal have changed as of 31 Oct 99

Dr Antonio H A Emiliano, Asst. Professor of Linguistics UNIVERSIDADE NOVA DE LISBOA Faculdade de Ciencias Sociais e Humanas Departamento de Linguistica Avenida de Berna, 26 - C 1069-061 LISBOA PORTUGAL tel: +351-21 793 35 19 fax: +351-21 797 77 59 e-mail: a.emilianomail.telepac.pt

Centro de Linguistica da Universidade Nova de Lisboa http://www.fcsh.unl.pt/hp/unidades/cecllm.htm

Nucleo Cientifico de Estudos Medievais http://www.fcsh.unl.pt/hp/unidades/ncem/index.html

"Ælc mann the wisdom lufath bith gesælig" = Ælfric of Eynsham =

- --------------------------------------------------------

> Subject: Missing Language > To: John Clews <Endangersesame.demon.co.uk> > Bcc: > From: damonjunk.edu > Date: Fri, 4 Feb 2000 13:10:55 -0600

A language missing on your list is O'odham (formerly known as Papago), a language of southwestern Arizona. Its broader grouping is Piman, a Uto-Aztecan branch.

John Damon University of Nebraska at Kearney Kearney, NE 68849-1320 damonjunk.edu

- ----------------------------------------------------------

> To: SLLING-LADMIN.HUMBERC.ON.CA > cc: Endangersesame.demon.co.uk > Message-ID: <8525687B.0071D9EE.00notes-mta.dragonsys.com> > Date: Fri, 4 Feb 2000 15:52:31 -0500 > Subject: Languages listed in ISO 639: feedback sought

From LINGUIST List #11-230. Please send replies not to me but to John Clews <Endangersesame.demon.co.uk>. Do not simply post them to SLLING-L; he does not read it.

I attach an extract from a request for comments on and contributions to a list of codes for representing the names of languages, ISO [International Organization for Standardization] Standard #639. It was posted on the LINGUIST List. You can find the full text on the Web at http://linguistlist.org/issues/11/11-230.html

The reason I am posting it here is that the only mention of sign languages in the list is a single item, indicating that the Library of Congress uses the 3-letter code "sgn" for "Sign languages", without further distinction between them, and that neither of the existing versions of the ISO standard has anything at all for sign languages. (Please do not blame or flame Mr. [Dr.?] Clews. He is not responsible for this list!)

I have written to him as follows:

"Sign languages (not expanded further)" is about as acceptable as would be "Asian languages (not expanded further)". I will start with proposing asl American Sign Language and continue by posting a message about your request on SLLING-L, the Sign Language Linguistics List, to elicit contributions from sign linguists familiar with others of the dozens or hundred-plus of known sign languages in the world.

I strongly suggest to sign linguists who are specialists in or familiar with other sign languages that you email your suggestions for codes for those languages as soon as possible. The Joint Advisory Committee on ISO 639: Codes for representation of names of languages will be meeting in Washington, DC, February 17-18, and I suspect that information should be fed to them well in advance of those dates.

I also suggest that you look at the web site listed above for the existing list of language codes. It is possible that the usual sign linguists' abbreviation for a particular SL is already established in use for a spoken language, and some alternative code will have to be found for the SL. This has already been done with many spoken languages in the list, such as this set:

............................................................
LC ISO 639-2   ISO 639-1   Language name in English
............................................................
ara            ar          Arabic
arc                        Aramaic
arp                        Arapaho
arn                        Araucanian (Mapuche)
arw                        Arawak
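The clash check described above can be sketched in a few lines of Python. This is only an illustration: the table holds just the five codes quoted in the extract, and the function name `check_proposal` is hypothetical, not part of any ISO 639 tooling.

```python
# Illustrative subset only: the five codes quoted in the extract above.
EXISTING_CODES = {
    "ara": "Arabic",
    "arc": "Aramaic",
    "arp": "Arapaho",
    "arn": "Araucanian (Mapuche)",
    "arw": "Arawak",
}


def check_proposal(code, name, existing=EXISTING_CODES):
    """Return (ok, message) for a proposed three-letter language code."""
    code = code.lower()
    # A proposal must be exactly three ASCII letters.
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        return False, f"{code!r} is not a three-letter alphabetic code"
    # Reject anything already assigned to another language.
    if code in existing:
        return False, f"{code!r} is already assigned to {existing[code]}"
    return True, f"{code!r} is free for {name}"


print(check_proposal("arc", "a hypothetical sign language"))
print(check_proposal("asl", "American Sign Language"))
```

Against the full list the second call might well fail too; the point is only that every proposed abbreviation has to be looked up before it is suggested.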

A number of entries on the list refer to sets of languages, but almost all of these entries are for sets of *related* languages, such as

apa Apache languages

While it might make sense to have a listing for "French Sign Language and related SLs", each of those languages should also be listed; and other SLs, such as Japanese and Hong Kong SLs, could not be included in it.

The main purpose of this code is for use in computer systems. While most sign languages have no written form, that situation is changing rapidly with systems like SignWriting that are intended for signers to use. There are also systems like Stokoe notation and HamNoSys that are used by sign linguists.

Sincerely, Mark Mandel

Mark A. Mandel : Senior Linguist and Manager of Acoustic Data Mark_Mandeldragonsys.com : Dragon Systems, Inc. 320 Nevada St., Newton, MA 02460, USA : http://www.dragonsys.com/ (speaking for myself)

- ----------------------------------------------------------

> From: Rna8arnoldaol.com > Message-ID: <cc.12d30f9.25cca335aol.com> > Date: Fri, 4 Feb 2000 16:48:37 EST > Subject: Re: Languages listed in ISO 639: > To: SLLING-Ladmin.humberc.on.ca > CC: Endangersesame.demon.co.uk

There is a problem with using asl for American Sign Language in ISO 639: it leaves Australian Sign Language in limbo.... can't use "asl", can't use "aus" (it is for native Australian languages).

How are we going to resolve this?? "ams" would be OK for ASL (I hope...).

Another solution would be to start with an "s" to denote a signed language, then two other letters to make it unique.

Thus:

sam - for ASL (um... gives a new twist on Uncle Sam)
sau - for Australian SL
sot/soe/sor - for Austrian SL (Osterreich GS)
snz - for NZSL
snd - for Nederland SL (Holland)
sdg - DGS (German Sign Language)

Remember these are just codes for computer/internet usage and not necessarily for academic usage .... These codes would be used as a way to detect font types for the languages concerned.
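A rough Python sketch of the "s"-prefix idea follows. The helper name `signed_language_code` and the `taken` set are assumptions made for illustration; the set contains only a few codes mentioned elsewhere in this digest, not the real ISO 639-2 list.

```python
def signed_language_code(suffix, taken):
    """Build an 's'-prefixed code from a two-letter suffix, rejecting clashes."""
    suffix = suffix.lower()
    if len(suffix) != 2 or not suffix.isalpha():
        raise ValueError("suffix must be exactly two letters")
    code = "s" + suffix
    if code in taken:
        raise ValueError(f"{code} is already in use")
    return code


# Hypothetical starting set: a few s-initial codes mentioned in this digest.
taken = {"sal", "sux", "sgn"}

proposals = {
    "am": "American SL",
    "au": "Australian SL",
    "nz": "New Zealand SL",
    "nd": "Nederland SL",
    "dg": "DGS (German Sign Language)",
}

for suffix, name in proposals.items():
    code = signed_language_code(suffix, taken)
    taken.add(code)  # reserve it so later proposals cannot reuse it
    print(code, "-", name)
```

The scheme still competes with spoken-language codes that happen to begin with "s", which is why the check against the existing list cannot be skipped.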

My proposal is for a panel of SL experts (who are familiar with computer systems) to consult with ISO over this.

Richard Arnold

PS I have forwarded this to John Clews as well, but I thought to let you guys know and get us discussing this issue.

- ----------------------------------------------------------

> From: Rna8arnoldaol.com > Message-ID: <dd.ecbc6f.25cc9ee8aol.com> > Date: Fri, 4 Feb 2000 16:30:16 EST > Subject: re: ISO Standard codes for Sign Languages > To: Endangersesame.demon.co.uk

This is in response to the upcoming ISO standard codes for languages.

From the SLLING_L list email:

<<The main purpose of this code is for use in computer systems. While most sign languages have no written form, that situation is changing rapidly with systems like SignWriting that are intended for signers to use. There are also systems like Stokoe notation and HamNoSys that are used by sign linguists.>>

They could incorporate the following codes for these three notation (written) systems:

SGW - SignWriting
STK/SKE - Stokoe notation for sign languages
HNS - Hamburg Notation System for Sign Languages

For the sign languages themselves:

NZL New Zealand Sign Language

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 14:23:30 -0800 > To: John Clews <Endangersesame.demon.co.uk> > From: Valerie Sutton <SuttonSignWriting.org> > Subject: sign language codes > Cc: eversonINDIGO.IE

February 4, 2000

Dear John -

I have received a whole bunch of messages in the past half hour about a message you posted to the Linguist's List (smile...isn't the internet something?!)

I am not a member of the Linguists List...so I could not post this response myself, although a friend may do it for me later...and I did write to the Sign Language Linguists List (SLLING) a few minutes ago, and I sent you a copy of that message....

Meanwhile, I wanted to write to you personally. I am sending a copy of this message to Michael Everson, because Michael wrote and submitted an application to the Registration Authority on our behalf last September.

You can read about the application on this web page:

International Organization for Standardization (ISO) ...application for language codes for Sign Languages... http://www.indigo.ie/egt/standards/iso639/sign-language.html

We are already using these codes for signed languages in the SignWriter 5.0 computer program, typing signed languages from 18 countries in SignWriting, and they are working well. They are easy to recognize in the java source code...and our programmer likes them very much....

There were many arguments as to what "three letter codes" to use for signed languages....we discussed it for weeks on the SignWriting List...since obviously ASL could also be Austrian Sign Language and so forth...and of course German Sign Language is not GSL, because it is DGS in Germany and so it should be - since that is the terminology they use in Germany!

So finally, we placed "sgn" connected with the country code plus the region code of the country - so in other words:

sgn-DK

....means the signed language used in Denmark....

And if there are dialects...

sgn-ES-CT

stands for the signed language used in Espana (Spain) in the Catalonian region...etc...so that differentiates it from the signed language used in Madrid.

In other words it is pretty neutral, since the country or region code already established for the country or region is attached to the general three letter code "sgn" for sign language.

The reason this works for computer programmers is that they already know the code "DK" for Denmark, so attaching "sgn" to "dk" makes sense that it is the signed language used in Denmark...

And there is much more detail to this..Michael Everson hit upon an excellent way to determine "Signed Danish" versus Danish Sign Language:

sgn-dan-DK

means sign-Danish-Denmark

The three letter code "dan" is known for the spoken language of Denmark...so that is Signed Danish, since it is connected with spoken Danish...
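For programmers, the scheme might be unpacked roughly as in the Python sketch below. It assumes hyphen-separated tags of the shapes shown in this message ("sgn" plus a two-letter country code, an optional region code, and an optional three-letter spoken-language code before the country); the names `SignTag` and `parse_sign_tag` are made up for the example and are not taken from SignWriter or from the application itself.

```python
from typing import NamedTuple, Optional


class SignTag(NamedTuple):
    spoken: Optional[str]   # e.g. "dan" for Signed Danish, otherwise None
    country: str            # two-letter country code, e.g. "DK"
    region: Optional[str]   # region code within the country, e.g. "CT"


def parse_sign_tag(tag):
    """Split a tag such as 'sgn-ES-CT' or 'sgn-dan-DK' into its parts."""
    parts = tag.split("-")
    if len(parts) < 2 or parts[0].lower() != "sgn":
        raise ValueError(f"not a sign language tag: {tag!r}")
    rest = parts[1:]
    spoken = None
    # A three-letter element right after "sgn" names a spoken language.
    if len(rest[0]) == 3:
        spoken = rest[0].lower()
        rest = rest[1:]
    if not rest or len(rest[0]) != 2 or len(rest) > 2:
        raise ValueError(f"expected a country code in {tag!r}")
    country = rest[0].upper()
    region = rest[1].upper() if len(rest) == 2 else None
    return SignTag(spoken, country, region)


for example in ("sgn-DK", "sgn-ES-CT", "sgn-dan-DK"):
    print(example, "->", parse_sign_tag(example))
```

Because the country and region parts reuse codes that already exist, a program needs no separate table of sign language abbreviations to make sense of a tag.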

Just wanted to let you know, John, and thanks so much for caring about signed languages!!

Valerie Sutton mailto:SuttonSignWriting.org

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 15:56:51 -0700 (MST) > From: "Angus B. Grieve-Smith" <grvsmthunm.edu> > To: Endangersesame.demon.co.uk > Subject: Obvious omissions from ISO 639

I noticed that there are no signed languages in your list. As many signed languages are now being written on paper and by computer, this is an important issue. You can get a sense of the number of signed languages currently being written from <http://www.signwriting.org>.

Thanks for asking.

-Angus B. Grieve-Smith Linguistics Department The University of New Mexico grvsmthunm.edu

- ----------------------------------------------------------

> From: Sebastià Pla <sastiaretemail.es> > Organization: Blackadder & Co. > To: Endangersesame.demon.co.uk > Subject: Valencian = Catalan > Date: Sat, 5 Feb 2000 01:11:05 +0100

Hi John

I've read the list of languages for the ISO 639 standard. You include Valencian, marked with --- --- --- (??). There is no Valencian language. It is the name which people in the Valencian country give to the variant of the Catalan language they speak. There is a movement claiming it is a different language, but this movement is inspired by political reasons, without any linguistic grounds.

By the way, I'm Valencian, and of course I speak Catalan.

If you want further information, feel free to contact me.

Best regards. Sebastià Pla.

- History shows that people who don't value freedom enough to defend it will tend to lose it. Richard M. Stallman

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 16:55:30 -0500 (EST) > From: Martin Jansche <janscheling.ohio-state.edu> > Reply-To: Martin Jansche <janscheling.ohio-state.edu> > To: John Clews <Endangersesame.demon.co.uk> > Subject: Re: 11.230, Qs: Feedback sought on languages listed in ISO 639

Dear John Clews,

This is in response to a posting on the LINGUIST mailing list.

On 3 Feb 2000, The LINGUIST Network wrote:

> LINGUIST List: Vol-11-230. Thu Feb 3 2000. ISSN: 1068-4875.

> If possible could you embed your comments within my quoted table, > unless your comment is very simple on a few lines: that will enable > me to align comments.

I'm repeating the entire list, with some short comments appended to the appropriate lines, so you'd have to do a diff to get at them. Others are interspersed on separate lines. The usual disclaimers apply.

A huge problem is getting the level of granularity right. The genetic codes range from Indo-European down to Germanic, and similar distinctions could be made in other families (Malayo-Polynesian under Austronesian, etc.). A hierarchical, extensible standard seems to be more appropriate in the long run, but I realize that this is not the right place and time now.
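To make the idea of a hierarchical, extensible scheme concrete, here is a very small Python sketch. It is an assumption-laden illustration, not a proposal: the `PARENT` table and the `lineage` function are hypothetical, and language names stand in for whatever codes such a standard would actually use.

```python
# Hypothetical parent table: each entry points one level up the hierarchy.
PARENT = {
    "Germanic": "Indo-European",
    "Malayo-Polynesian": "Austronesian",
    "Cantonese": "Chinese languages",
}


def lineage(name, parent=PARENT):
    """Return the chain from a language or group up to its top-level group."""
    chain = [name]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


print(lineage("Germanic"))
print(lineage("Okinawan"))  # not in the table, so it stands alone

# Extending the scheme is a matter of adding entries, at any depth.
PARENT["Hakka"] = "Chinese languages"
print(lineage("Hakka"))
```

With a table of this kind, granularity stops being a one-off decision: a coarse group and its finer members can coexist, and new members can be added later without renaming anything.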

I'd very much like to see codes for the various Chinese languages/dialects in place. I realize this is a politically sensitive issue, but politics just has to come to grips with reality sometimes.

Thank you very much. Sincerely,

- martin jansche

> ara ar Arabic

Moroccan, Egyptian, Classical, ... Arabic

> aus Australian languages

Perhaps "Pama-Nyungan" instead, depending on what specific languages it is referring to.

[add]

can/yue Cantonese (Chinese)

> chi/zho * zh Chinese

replace with: Chinese languages

[add]

zht Middle Chinese (Sui, Tang Dynasties) zhz Old Chinese (Zhou Dynasty)

[add]

Dyirbal, Jirrbal (Pama-Nyungan)

[add]

flm Flemish

[add]

gan Gan (Chinese)

[add]

hac Haitian Creole

[add]

hak Hakka, Kejia-Hua (Chinese)

[add]

Kartvelian languages (Other)

[add]

bfh Mandarin, Beifang-Hua (Chinese)

[add]

Miao (or is that subsumed by Yao?)

[add]

mnb/fzh Min-Bei, Fuzhou-hua (Chinese) mnn/twn Min-Nan, Taiwanese (Chinese)

[add]

nad/den Na-Dene, Dene

[add]

Okinawan

[add]

pth Putonghua (Chinese) [NB separate from Mandarin]

[add]

Sundanese (Austronesian)

[add]

Vulgar Latin

[add]

Warlpiri (Walpiri, Walbiri), Australia

[add]

wuc Wu, "Shanghainese" (Chinese)

[add]

xia Xiang (Chinese)

- ----------------------------------------------------------

> From: Claudiu Costin <claudiucinterplus.ro> > Reply-To: claudiucinterplus.ro > To: Endangersesame.demon.co.uk > Subject: [ISO639 issue] Romanian language is good for "rom" > Date: Sat, 5 Feb 2000 02:26:54 +0200

Dear John,

> rum/ron * ro Romanian
> --- --- --- (ry) Romany; Romani
> rom Romany

Please note:

1) "rom" is more appropriate for the Romanian language.
2) It is unfair to assign "rom" to "romany" (or "romano"), which is the gipsy language.

If you insist on keeping the "romany" denomination, please note that the _REAL_ name would be something like "rommany": in the Romanian Academy, the mass media, and Gipsy books, Gipsy people ask to be called "romm" and not "ţigan" (gitanos in Spanish), because in Romania it is almost a shame to be called by the latter name (due to very high unsocial acts and behaviour).

I don't like to say these things, but the reality is more than that. That is why, after 1989 (after the Romanian Revolution of 17 December), the term "romm" appeared, along with the language name "romany".

These terms were born among Gipsy intellectuals (who have a good reputation), who are trying to "kill" the pejorative denomination "ţigan" (the quoted text is in ISO 8859-2; you may read it as "tzeegun").

CONCLUSIONS:

1) Make "rom" the code for the Romanian language.
2) A change is needed for Romany. It would be nice for it to be "rmy". You have more experience and can choose a better option.
3) Please contact me and tell me your opinion, and whether Romania has representatives on the ISO 639 commission. If not, I will try to make some waves to change this.

regards,

o-------------------------------------------o Claudiu COSTIN claudiucinterplus.ro sysad ADComm.pub.ro claudiucgeocities.com Linux-KDE Romania http://www.ro.kde.org Home page http://lion.ADComm.pub.ro/~claudiuc

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 14:10:08 -0800
> To: "For the discussion of linguistics and signed languages." <SLLING-LADMIN.HUMBERC.ON.CA>
> From: SignWriting <DACSignWriting.org>
> Subject: Re: Languages listed in ISO 639: feedback sought
> Cc: Endangersesame.demon.co.uk

February 4, 2000

Dear SLLING Members.... I noticed Mark's excellent message about recognition of signed languages in computer codes...

Thank you, Mark, for bringing this issue to everyone's attention..

As you all know, computer programmers refer to the "ISO 639-2 Registration Authority" for the standard codes used to represent the world's languages. This helps standardize software development.

In September, 1999, the Deaf Action Committee for SignWriting (the DAC), and the Irish National Body applied to the Registration Authority, with the help of Unicode specialist Michael Everson, requesting that the world's Sign Languages be included. The application is currently waiting for approval. It was supposed to be decided upon last November, and then the meeting was postponed until this month. I believe it will be voted on in the next two weeks.

You can read about the application to the ISO on these web pages:

International Organization for Standardization (ISO) ...application for language codes for Sign Languages... http://www.indigo.ie/egt/standards/iso639/sign-language.html

Recognition of Signed Languages http://www.SignWriting.org/unicod01.html

We are already using these codes for signed languages in the SignWriter 5.0 computer program, typing signed languages from 18 countries in SignWriting, and they are working well. They are easy to recognize in the Java source code...and our programmer likes them very much....

There were many arguments as to what "three letter codes" to use for signed languages....we discussed it for weeks on the SignWriting List...since obviously ASL could also be Austrian Sign Language and so forth...and of course German Sign Language is not GSL, because it is DGS in Germany and so it should be - since that is the terminology they use in Germany!

So finally, we placed "sgn" connected with the country code plus the region code of the country - so in other words:

sgn-DK

....means the signed language used in Denmark....

And if there are dialects...

sgn-ES-CT

stands for the signed language used in Espana (Spain) in the Catalonian region...etc...so that differentiates it from the signed language used in Madrid.

In other words it is pretty neutral, since the country or region code already established for the country or region is attached to the general three letter code "sgn" for sign language.

The reason this works for computer programmers is that they already know the code "DK" for Denmark, so attaching "sgn" to "DK" makes it clear that this is the signed language used in Denmark...

And there is much more detail to this..Michael Everson hit upon an excellent way to determine "Signed Danish" versus Danish Sign Language:

sgn-dan-DK

means sign-Danish-Denmark

The three letter code "dan" is known for the spoken language of Denmark...so that is Signed Danish, since it is connected with spoken Danish...
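As a rough illustration of the scheme described above, the following Python sketch splits such a tag into its parts. It assumes the hyphen-separated form seen in sgn-ES-CT and sgn-dan-DK, and it assumes that a three-letter subtag directly after "sgn" names a spoken language. The names SignTag and parse_sign_tag are hypothetical and are not taken from SignWriter or from the ISO application.

```python
from typing import NamedTuple, Optional


class SignTag(NamedTuple):
    spoken_language: Optional[str]  # e.g. "dan" for Signed Danish, else None
    country: str                    # two-letter country code, e.g. "DK"
    region: Optional[str]           # optional region code, e.g. "CT"


def parse_sign_tag(tag: str) -> SignTag:
    """Split a tag such as 'sgn-DK', 'sgn-ES-CT' or 'sgn-dan-DK' into parts."""
    parts = tag.split("-")
    if len(parts) < 2 or parts[0].lower() != "sgn":
        raise ValueError(f"not a sign-language tag: {tag!r}")
    rest = parts[1:]

    # Assumption: a three-letter subtag right after "sgn" is a spoken-language
    # code, marking a signed form of that spoken language.
    spoken = None
    if len(rest[0]) == 3:
        spoken = rest[0].lower()
        rest = rest[1:]

    # What remains must be a two-letter country code, optionally plus a region.
    if not rest or len(rest[0]) != 2 or len(rest) > 2:
        raise ValueError(f"expected country code and optional region: {tag!r}")
    country = rest[0].upper()
    region = rest[1].upper() if len(rest) == 2 else None
    return SignTag(spoken, country, region)


if __name__ == "__main__":
    for example in ("sgn-DK", "sgn-ES-CT", "sgn-dan-DK"):
        print(example, parse_sign_tag(example))
```

Keeping the spoken-language subtag separate from the country code is what lets a program tell "Signed Danish" apart from the signed language of Denmark without a table of language-specific abbreviations.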

Hope this helps a little!

Valerie Sutton mailto:SuttonSignWriting.org

- ----------------------------------------------------------

> From: Claudiu Costin <claudiucinterplus.ro>
> Reply-To: claudiucinterplus.ro
> To: Endangersesame.demon.co.uk
> Subject: [ISO639 issue] Moldavia have language Romanian; Moldavian is a dialect
> Date: Sat, 5 Feb 2000 01:03:36 +0200

Dear John,

Please note that no Moldavian language exists. It is a regional dialect. Moldavia is an old Romanian region that was hijacked by communist Russia 50 years ago.

**************************************************************
The people's language is Romanian, in its Moldavian dialect
**************************************************************

We have many dialects in Romania (I don't know the English translations):

- muntean
- ardelean
- moldovenesc <--- ("Moldavian" in English)
- aromân

All of these are the Romanian language. There is no Moldavian language, just a silly historical and political situation that caused my country to be divided in two parts.

The Moldavian education ministry has directed all official and educational resources to move away from the very heavy russification led by the KGB and towards the Romanian language (note this: Romanian, not Moldavian). This has been the case since 1994, if I recall correctly.

My current personal observation (based on National Moldavia TV) is that the official Romanian language (aka ~90% "muntean") is spoken and written everywhere.

Conclusion:
1) Correct the language of Moldavia to "rom".
2) Please let me know if you have any doubts.

regards,

o-------------------------------------------o Claudiu COSTIN claudiucinterplus.ro sysad ADComm.pub.ro claudiucgeocities.com Linux-KDE Romania http://www.ro.kde.org Home page http://lion.ADComm.pub.ro/~claudiuc

- ----------------------------------------------------------

Note: several of the replies which follow are a group mainly relating to South Asian languages

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 05:29:53 -0800
> Reply-To: South Asian Linguists <VYAKARANLISTSERV.SYR.EDU>
> Sender: South Asian Linguists <VYAKARANLISTSERV.SYR.EDU>
> From: Peter Claus <pclausCSUHAYWARD.EDU>
> Organization: California State University, Hayward
> Subject: Re: Indian and other Asian languages listed in ISO 639: feedback sought
> To: VYAKARANLISTSERV.SYR.EDU

VYAKARAN: South Asian Languages and Linguistics Net
Editors: Tej K. Bhatia, Syracuse University, New York
         John Peterson, University of Munich, Germany
Details: Send email to listservlistserv.syr.edu and say: INFO VYAKARAN
Subscribe: Send email to listservlistserv.syr.edu and say:
           SUBSCRIBE VYAKARAN FIRST_NAME LAST_NAME
           (Substitute your real name for first_name last_name)
Archives: http://listserv.syr.edu

Dear John,

For the state of Karnataka, India, there are at least three major languages left off. All have literatures and scholars working in and on them: Tulu, Badaga, and Kodagu.

Tulu, a Dravidian language, in particular, has a large number of speakers and a large amount of scholarship devoted to it. It also has at least two phonemes which are not shared with other Dravidian languages. There is a great need for a standard transliteration scheme since much of the scholarship includes a large amount of transcribed oral textual material and many people (myself included) would like to put translated text on the internet.

Kodagu, at present, has a smaller literature and a smaller number of scholars working on it, but enough for consideration as a significant Indian language.

Badaga may not have its own literature, but there are scholars working on this language and there are oral texts which have been collected and transliterated.

Toda, Kota, and Kuruba (maybe several separate languages), found along the border of Karnataka and Tamil Nadu, should also be included, since their phonemic systems are distinctive and there is a fair amount of scholarship on them, both past and present.

Please contact Ulrich Demmer (t45ix.urz.uni-heidelberg.de) or Gail Coelho (gailutxvms.cc.utexas.edu) for the internal differentiation within the Kuruba group of languages.

Peter Claus

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 08:49:25 -0600
> To: John Clews <EmeetSESAME.DEMON.CO.UK>
> From: John Clews <EmeetSESAME.DEMON.CO.UK> (by way of Hans Henrich Hock)
> Subject: Indian and other Asian languages listed in ISO 639: feedback sought

Dear Colleague,

The usual abbreviation for _Sanskrit_ in the fields of Linguistics and Indology is _Skt_.

Best wishes,

Hans Henrich Hock
Professor of Linguistics and Sanskrit
Linguistics, 4088 FLB MC-168, University of Illinois
707 S. Mathews, Urbana IL 61801-3652
telephone: (217) 333-0357 or 333-3563 (messages)
e-mail: hhhockstaff.uiuc.edu
fax: (217) 333-3466

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 09:38:25 -0600
> To: Emeetsesame.demon.co.uk
> From: Mark Southern <m.southernmail.utexas.edu>

Dear John,

re your recent VYAKARAN appeal for info, esp. omissions / language names:

A lot of 1. North-Cent. Amer. / 2. Australian / 3. African languages seem to be left out - possibly caught by the SIL codes?

e.g.:
1. Mixtec, Tarascan, Tuscarora, Hopi
2. Warlpiri, Dyirbal
3. Fe'Fe' Bamileke, Damara

and in Austronesian: Mentawai (btw the Polynesian lg. Truk, which you have ??? beside, is usually called Trukese)

Asia: Samoyed, Ainu, Chukchi, Yukaghir, Burushaski

Mark S.

Mark Southern
Dept. of Germanic Studies
EPS 3.102
University of Texas
Austin, TX 78712
512-232-6371

- ----------------------------------------------------------

> Date: Thu, 3 Feb 2000 11:07:16 -0500 (EST)
> From: "E. Bashir" <ebashirumich.edu>
> X-Sender: ebashirseawolf.gpcc.itd.umich.edu
> To: John Clews <EmeetSESAME.DEMON.CO.UK>
> Subject: Re: Indian and other Asian languages listed in ISO 639: feedback sought

Dear John,

There follow names of some languages which are not included in your list. Most of these are spoken in Pakistan; two of them are (probably) extinct by now (Tirahi, Wotapur-Qatarqalai). Most of these are names of important languages, having numerous sub-dialects (usually named for the village or region where they are spoken). Siraiki and Hindko are two names for important variants of western Panjabi which are often discussed under the names given, particularly in the context of sociolinguistic or language planning issues.

Just a thought: instead of relying on input from list-members, which may be spotty and miss many things, why not consult a standard work on the languages of the world, for example Ruhlen's Guide to the Languages of the World (Ruhlen, Merritt. 1987. Guide to the Languages of the World. Stanford: Stanford University Press)?

(Suggested codes are by me; I have not checked Ethnologue codes.)

Language            Genetic grouping      Suggested code

Balti               Tibeto-Burman         blt
Brahui              Dravidian             brh
Brokskat            Indo-Aryan (Dardic)   bro
Burushaski          Isolate               brs
Dameli              Indo-Aryan (Dardic)   dam
Domaki              Indo-Aryan            dom
Gawarbati           Indo-Aryan (Dardic)   gaw
Gojri               Indo-Aryan            goj
Grangali            Indo-Aryan (Dardic)   gra
Hindko              Indo-Aryan            hnk
Ishkashmi           Iranian               ish
Kalasha             Indo-Aryan (Dardic)   kls
Kanyawali           Indo-Aryan (Dardic)   kan
Khowar              Indo-Aryan (Dardic)   khw
Kohistani           Indo-Aryan (Dardic)   koh
Palula              Indo-Aryan (Dardic)   pll
Pashai              Indo-Aryan (Dardic)   psh
Sawi                Indo-Aryan (Dardic)   saw
Shina               Indo-Aryan (Dardic)   shi
Siraiki             Indo-Aryan            sir
Shumashti           Indo-Aryan (Dardic)   shu
Tirahi              Indo-Aryan (Dardic)   trh
Torwali             Indo-Aryan (Dardic)   tor
Wakhi               Iranian               wkh
Wotapur-Qatarqalai  Indo-Aryan (Dardic)   wot
Yazghulami          Iranian               yaz
Yidgah              Iranian               ydg
Zebaki              Iranian               zeb

Regards,

Elena Bashir

**************************************************************************
Elena Bashir, Ph.D.                      3070 Frieze Bldg.
Lecturer in Urdu and Hindi               The University of Michigan
Dept. of Asian Languages and Cultures    Ann Arbor, MI 48109
Phone: 734-763-9178                      Dept. Phone: 734-764-8286 (messages only)
Fax: 734-647-0157
**************************************************************************

- ----------------------------------------------------------

> Date: Fri, 4 Feb 2000 03:13:56 -0500
> Reply-To: South Asian Linguists <VYAKARANLISTSERV.SYR.EDU>
> Sender: South Asian Linguists <VYAKARANLISTSERV.SYR.EDU>
> From: Peter Hook <pehookUMICH.EDU>
> Subject: Indian languages to be listed
> X-cc: Emeetsesame.demon.co.uk
> To: VYAKARANLISTSERV.SYR.EDU

Dear John Clews,

There are at least 5000 named languages in the world. Will a 3-letter code be able to cover them all? Mathematically possible, yes, but many of the 3 letter sequences (like QQQ) will not be helpful if there has to be a relationship between the language name and the abbreviation.
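The arithmetic behind this point can be sketched in Python: three letters from a 26-letter alphabet give 26^3 = 17,576 possible codes, comfortably more than 5000, but far fewer are recognisably tied to any one name. The mnemonic rule used here (the code starts with the first letter of the name, and its letters occur in the name in the same order) is only an assumption made for illustration, not a rule of ISO 639, and the function names are hypothetical.

```python
import string
from itertools import product

ALPHABET = string.ascii_lowercase

# Every possible three-letter code over a-z: 26 * 26 * 26.
TOTAL_CODES = len(ALPHABET) ** 3


def is_mnemonic(code: str, name: str) -> bool:
    """Illustrative rule: the code begins with the name's first letter and
    its letters appear in the name in the same order (a subsequence)."""
    code, name = code.lower(), name.lower()
    if not code or not name or code[0] != name[0]:
        return False
    pos = 0
    for letter in code:
        pos = name.find(letter, pos)
        if pos == -1:
            return False
        pos += 1
    return True


def mnemonic_codes(name: str) -> list:
    """All three-letter codes that satisfy the illustrative rule for a name."""
    candidates = ("".join(letters) for letters in product(ALPHABET, repeat=3))
    return [code for code in candidates if is_mnemonic(code, name)]


if __name__ == "__main__":
    print("possible codes:", TOTAL_CODES)
    print("covers 5000 languages:", TOTAL_CODES >= 5000)
    for language in ("Poguli", "Shina"):
        print(language, mnemonic_codes(language))
```

Under this rule a name like Shina allows only six candidate codes (among them shi and shn), which shows how quickly mnemonic codes for similar names begin to collide.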

In any case, I would like to suggest adding a few more from India and Pakistan:

1. Poguli (POG?): spoken in Kashmir. (See my Webpage http://www-personal.umich.edu/~pehook/index.html for a link to more information on Poguli.)

2. Bangani (BAN?): spoken in Uttar Pradesh. Bangani is much in the South Asian linguistics news lately, as it is purported to have kentum vocabulary in it. (See my Webpage for a link to a Bangani page.)

3. Garhwali (GAR?) is spoken by about 2 million people in Uttar Pradesh.

4. Shina (SHN?) is spoken all over the Northern Areas of Pakistan and in several places in western Kashmir.

5. For many other languages spoken in Pakistan and Afghanistan please see Richard Strand's elaborate Webpage on Nuristan. A link to it is available from my page.

I won't continue because you may have some conditions in mind that render these suggestions pointless. But a glance at Grierson's Linguistic Survey of India will illustrate the problem of trying to be exhaustive.

Sincerely,

Peter Hook

http://www-personal.umich.edu/~pehook/index.html

- ----------------------------------------------------------

- John Clews, SESAME Computer Projects, 8 Avenue Rd, Harrogate, HG2 7PG
tel: 0171 412 7826 (day); 0171 272 8397 (evening); 01423 888 432 (w/e)
Email: Emeetsesame.demon.co.uk

Committee Chair of ISO/TC46/SC2: Conversion of Written Languages;
Committee Member of ISO/IEC/JTC1/SC22/WG20: Internationalization;
Committee Member of CEN/TC304: Information and Communications Technologies: European Localization Requirements;
Committee Member of the Foundation for Endangered Languages;
Committee Member of ISO/IEC/JTC1/SC2: Coded Character Sets