Editor for this issue: <>
When I joined the Voynich Manuscript interest group, I dusted off and
dolled up a few computer programs I had written for my selfish purposes
in the days when I was employed as a linguist, and contributed them
to their public ftp site. Amongst those was a curiosity which might
amuse and perhaps puzzle subscribers to Linguist. To spare you
the chore of uudecoding then unzipping the files only to find
that you deem it of no interest, I shall copy hereunder the DOC
file, and follow with the uuencoded zip file. Kindly let me know
what you think.
COGNATE
an apparently
wonderfully useless program
implementing an algorithm
by
Jacques B.M. Guy
Artificial Intelligence Systems
Telecom (Australia) Research Laboratories
WHAT IS "COGNATE"?
COGNATE is the implementation of a prototype algorithm for identifying
related words across languages.
My ultimate purpose in developing COGNATE was to take a first step towards
solving a far more interesting, and difficult, problem of automatic
machine translation: given a bilingual text, find the rules for
translating from either language into the other.
Given the same list of words in two different languages, COGNATE will
determine which words are likely to be regularly derivable from each
other, and which are not. The longer the list, or the more closely related
the two languages are, the better the performance of COGNATE. For instance,
suppose that you have typed into a file 200 words in English (one per
line), and in another file the same 200 words, in the same order, in German
(again one per line). English and German are fairly close languages. Given
these two files, and no other information whatsoever, COGNATE will be able
to tell for instance that English "TWENTY" and German "ZWANZIG" are almost
certainly derivable from each other, and so are English "HONEY" and German
"HONIG"; but it will also tell you that English "HORSE" and German "PFERD"
are not so related. COGNATE will also tell you, when comparing "TWENTY"
with "ZWANZIG", that English "T" corresponds to German "Z".
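The input format described above (two files, one word per line, in the same order) can be sketched as follows. This is only an illustration of the data layout and of a very crude letter tally. It is not the COGNATE algorithm, which this document does not spell out. The function names are hypothetical, and the word pairs other than TWENTY/ZWANZIG, HONEY/HONIG and HORSE/PFERD are added here purely for illustration.

```python
from collections import Counter

def pair_lines(text_a: str, text_b: str) -> list[tuple[str, str]]:
    """Pair the n-th word of one list with the n-th word of the other.

    Each text holds one word per line; blank lines are ignored.
    """
    words_a = [line.strip() for line in text_a.splitlines() if line.strip()]
    words_b = [line.strip() for line in text_b.splitlines() if line.strip()]
    if len(words_a) != len(words_b):
        raise ValueError("both lists must contain the same number of words")
    return list(zip(words_a, words_b))

def cooccurrences(pairs):
    """Count, for each (letter, letter) pair, how many word pairs contain both.

    A raw count like this is heavily biased towards frequent letters; it is
    a toy statistic, not the method COGNATE uses.
    """
    counts = Counter()
    for word_a, word_b in pairs:
        for a in set(word_a):
            for b in set(word_b):
                counts[(a, b)] += 1
    return counts

# Stand-ins for the contents of the two files (illustrative sample only).
english = "TWENTY\nTEN\nTWO\nTONGUE\nHONEY\nHORSE\n"
german = "ZWANZIG\nZEHN\nZWEI\nZUNGE\nHONIG\nPFERD\n"

pairs = pair_lines(english, german)
counts = cooccurrences(pairs)
print(counts[("T", "Z")])  # 4: T and Z turn up together in four word pairs
print(counts[("H", "P")])  # 1: H and P meet only in HORSE / PFERD
```

Even on so small a sample, English "T" and German "Z" keep turning up in the same pairs, which hints at why a longer list gives a program more to work with.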
Because of the very nature of the algorithm, you may encypher each file
using a simple-substitution code, without causing COGNATE to be confused.
For instance, if you have encoded the English data by shifting one letter
forward (so that "TWENTY" becomes "UXFOUZ") and the German data by shifting
one letter backward (so that "ZWANZIG" becomes "YVZMYHF"), COGNATE will
still be able to tell that "UXFOUZ" and "YVZMYHF" are related, and that
"IPSTF" ("HORSE") and "OEDQC" ("PFERD") are not.
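The encipherment used in this example is easy to reproduce. The short sketch below assumes upper-case words over the 26-letter alphabet A-Z, and the helper name shift is hypothetical. It only generates the enciphered forms quoted above and does not run COGNATE itself.

```python
def shift(word: str, k: int) -> str:
    """Shift every letter of an upper-case A-Z word by k places, wrapping round."""
    return "".join(chr((ord(c) - ord("A") + k) % 26 + ord("A")) for c in word)

# English data shifted one letter forward.
print(shift("TWENTY", 1))    # UXFOUZ
print(shift("HORSE", 1))     # IPSTF

# German data shifted one letter backward.
print(shift("ZWANZIG", -1))  # YVZMYHF
print(shift("PFERD", -1))    # OEDQC
```

Such a shift is a one-to-one relabelling of the letters: two letters that were the same before encipherment are still the same afterwards, and two that differed still differ, so the pattern of repeated letters within and across words is untouched.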
I thought up the algorithm behind COGNATE around 1981, and implemented it
first in Simula 67 on a DEC KL10. Then, as a self-inflicted challenge
which I did not expect to win, I tried to translate it into Turbo Pascal,
to run on my Kaypro II. It worked. On a Kaypro II, it would take COGNATE 40
seconds to analyze two files each containing 200 words, and find which were
related and which not. On a 386DX running at 33MHz, the same operation
looks as if it were instantaneous.
[Moderators' note: The full program file is available on the
server. To get the file, send a message to:
listserv@tamvm1.tamu.edu (if you are on the Internet)
OR
listserv@tamvm1 (if you are on the Bitnet)
The message should consist of the single line:
get cognate prog linguist
You will then receive the complete file.]
ANNOUNCEMENT OF NEW JOURNAL
CALL FOR PAPERS

The JOURNAL OF INTERPRETATION, a quarterly journal of the Registry of
Interpreters for the Deaf, Inc., publishes articles, research reports,
commentaries and squibs, review articles, and book reviews. The journal
reflects a broad, interdisciplinary approach to the interpretation and
translation of languages.

The journal expressly desires to serve as a forum for the
cross-fertilization of ideas from diverse theoretical and applied fields,
examining signed or spoken language interpretation and translation.
Articles addressing interpretation and translation theory and practice,
interpreter and translator education, and related topics are especially
welcome. In addition, research and commentaries examining the
interpretation and translation of signed and spoken languages from the
fields of linguistics, psycholinguistics, sociolinguistics,
neurolinguistics, applied linguistics, cognitive science, machine
translation, discourse analysis, conversational analysis, anthropology,
semiotics, and communication are appropriate for submission.

Any standard format for style, notes, and references is suitable for
editorial consideration. Authors of accepted articles will be required to
submit copy which conforms to the editorial standards of the journal.

Manuscripts may be submitted to the editor-in-chief at the address below.
One copy on 8 1/2" X 11" paper is requested. Manuscripts also will be
accepted in several electronic formats. Acceptable Macintosh formats are
Microsoft Word, MacWrite, WordPerfect, Nisus, PageMaker, and FrameMaker;
MS-DOS files may be submitted in WordPerfect 5.x, PageMaker, and Microsoft
Word. Only files on 3 1/2" disks (Macintosh and MS-DOS) are acceptable.
Electronic submissions should be accompanied by hard copy of the
manuscript.
Manuscripts should be submitted to:

Sherman Wilcox, Editor-in-Chief
Journal of Interpretation
Department of Linguistics
University of New Mexico
Albuquerque, NM 87131

The Internet address for the journal is wilcox@carina.unm.edu

Subscription and membership information, advertisements, and all other
communication should be addressed to:

Sylvia Straub
Executive Director
Registry of Interpreters for the Deaf, Inc.
8719 Colesville Road
Suite #310
Silver Spring, MD 20901-3919