A sincere thanks to all who replied. The following will definitely get me off the ground.

Cheers, AMcE.

=====================================================================
From: "C.M.Thomson" <tom@fiveg.icl.co.uk>

One fairly modern MT system that has been something of a commercial success is the METAL system from SNI. Some information is given in:

Hammer, C. Parallel Lisp and the Text Translation System METAL on the European Declarative System. ICL Technical Journal vol 8 iss 4, Nov 1993, pp 641-654 (Oxford University Press)
Gajek, O. The METAL System. CACM vol 34 iss 9, Sept 1991
Thurmair, G. METAL: Computer Integrated Translation. Proc of the SALT Workshop 1991, Manchester

I'm sure Carsten has a pile of extra references he could point you to if you contact him (email: Carsten.Hammer@zfe.siemens.de).

There were several attempts at English/Japanese and Japanese/English MT from the mid 60s onwards, by various Japanese companies and academic institutions, but these were never successful. Even in a restricted field like comms protocol definition, the output was beyond post-processing by anyone not fluent in the source language, and post-processing took longer than that person would have needed to translate without the machine output. At least, they were really bad until the late 70s; they may have moved on since then. Also there has been work on English/Russian in the UK, the US, and the USSR, and on English/French and Russian/French in France and in the USSR, that I know about. So there has been plenty going on for at least three decades. If I hadn't scrapped all my MT info when I came to Manchester 8 yrs ago I could have given you references to much of the early work.

PS: METAL covers German -> English, German -> Spanish, English -> German, French -> Dutch, Dutch -> French in commercially available form (i.e. what you can buy from SNI today) and has other language pairs in development. It is a real commercial translator requiring very little human post-processing, not a research toy or one of those awful products that delivers so many options that the post-processing reduces length by 50%. So I think it is maybe more interesting than a lot of other MT systems.

=====================================================================
From: iat@cl.cam.ac.uk

Survey of systems in the market: BYTE 18(1), January 1993
Journals: Machine Translation (Kluwer), Computational Linguistics (MIT).
Books: An Introduction to Machine Translation, Hutchins & Somers, 1992.

=====================================================================
From: "Caoimhin P. ODonnaile" <caoimhin@smo.ac.uk>

You probably know this already, but Machine Translation is a huge subject, much more difficult than first imagined, with large corporations currently spending millions of pounds per year on it, large conferences devoted to the subject and many books constantly appearing.

There is a database (accessible via the Internet) somewhere in Germany (the University of Stuttgart, I think) of computational linguistics software: parsers, generators, etc. I can look out details if you want.

I am sure that most MT systems cost big money and are not openly accessible. The main cheaply available software seems to be "Globallink" for the PC, available for a few hundred pounds at a guess. I haven't seen it, but I get the impression that the quality of the translation is sometimes dire, sometimes useful. The only open-access system I have heard of is a mail-server somewhere in Finland which will return to you, parsed, a copy of a short English text which you send to it in a mail message.

I think that as far as Gaelic goes, full machine translation is not an attainable goal in the short to medium term. Better to go for lesser goals, spell-checkers and the like, which are in any case prerequisites for an MT system. One of the main prerequisites for any system is a good lexical database, and so the work which you and Gearóid Ó Néill are doing on the Foclóir Póca at the University of Limerick is admirable.

I have had a copy of the "Learners' Irish-English Dictionary" online for many years for my own use. I typed it in and verified it myself. It is copyright and I have no permission, so I can't pass copies to other people, but if there is any way in which it can be used to assist your own work without infringing copyright, e.g. for checking a lexical database constructed from the Foclóir Póca, I will be very glad to help in any way I can. Perhaps the Educational Company of Ireland would be willing to give permission for academic use if approached.

Once you have a lexical database it is very simple to construct wordlists for spell checking. WordPerfect allows you to construct spell check lexicons for your own language starting with a wordlist. Version 2 of Word for Windows does not allow this, but Version 6, which is just out (the version numbers jumped!), may allow it. After this there are all sorts of further intermediate goals: online dictionaries and terminology databases accessible from word processors, lemmatisers, parsers, thesauruses.

From what I have read, the large corporations who have attempted MT have generally found that machine *assisted* human translation, aided by tools such as those I have mentioned above, is more successful than MT. Pure MT is not currently possible except in extremely limited semantic fields (the translation of weather forecasts from English to French in Canada is the classic example); a large amount of human pre-editing and post-editing is required to achieve a presentable result.

On a more positive note, even if the quality of MT is still pretty miserable, even after massive investment, I think that there may be an increasing market for very poor quality translations. Since most new writing is on computer anyway, and computers are powerful enough to produce a translation in a few seconds, many people may feel that even a lousy translation is better than no translation at all. Imagine if at the press of a key you could obtain an interlinear translation, shown in a different font or colour, of GAELIC-L messages. I think that many of the American subscribers to GAELIC-L who know very little Gaelic would be delighted with this. I would be delighted if such a facility was available for WELSH-L.

Irish Gaelic to Scottish Gaelic translation (or vice-versa) would be very much easier than Gaelic to English translation, since Irish Gaelic and Scottish Gaelic are so similar syntactically. (They are "cognate languages", in the jargon.)
It would also be a worthwhile goal since Irish Gaelic is not intelligible to most speakers of Scottish Gaelic, and vice-versa. GAELIC-L would be an ideal testing ground for such a system.

=====================================================================
From: Vasu Renganathan <vasu@u.washington.edu>

I am sending you a summary of responses I recently got on MT, from an NLP newsgroup for South East Asian languages. I hope it helps.

The general consensus seems to be that translation software is probably still not very "smart" and by itself will not do the job of an experienced translator fluent in both languages. It could be an aid to the translator, however, and make that person's job a lot easier and faster.

I would like to thank everyone who responded to me. If you know of any other vendors who make/market translation software, please let me know via email. I'd be glad to update this and send it back out.

As a disclaimer, I am not associated with any of these vendors. The following information is given as a public service; use it at your own risk.

1. Translation system by MRJ, Inc., 10455 White Granite Drive - Oakton, VA 22124, 703-385-0830 (voice) - 703-385-4637 (FAX). This is a commercial bilingual English <-> Japanese translation system, including OCR and MT (Machine Translation) components.

2. Language Engineering Corporation, product name: LOGOVISTA. Develops English-to-Japanese translation software.

Tokyo-based software developer LOGOVISTA has developed a software package which supports the translation of English-language business letters and technical essays into Japanese. "LogoVista E to J" translates more rapidly than other packages and the finished text requires less rewriting, according to the developers. Versions which run on SUN, HEWLETT PACKARD, and SONY workstations, as well as APPLE "Macintosh" computers, will be released in October. NEC "PC-9801" and IBM DOS/V PC versions will be released next spring. The software, which will be sold through PC dealer KATENA, is expected to be priced under 200,000 yen ($1,600).

This is an English-to-Japanese translation system called LogoVista E to J. Macintosh and Japanese Windows versions are available; both can print to a PostScript printer.
LogoVista E to J includes a main dictionary with over 100,000 entries; this dictionary can be supplemented both by any of our nineteen technical dictionaries and by user dictionaries that you create. The following technical dictionaries are available: aerospace engineering, agricultural science, applied chemistry, applied physics, architecture, biology, chemistry, civil engineering, earth science, electrical engineering and electronic communications, general business, general science and technology, information science, materials science, mechanical engineering, naval architecture, physics, urban engineering, and zoology. The technical dictionaries contain a total of over 415,000 terms.

The Macintosh version of LogoVista E to J requires either KanjiTalk 7.1 or US System 7.1 and the Japanese Language Kit. The Windows version requires DOS/V 5.0 or later and Japanese Windows 3.1. Both versions require at least 6MB of RAM and 30MB of hard disk space.

The price of the basic system (with the 100,000-entry dictionary) is $1,995. The four largest technical dictionaries (general business, general science and technology, electrical engineering and electronic communications, and mechanical engineering) cost $995 each. The other fifteen technical dictionaries cost $495 each. Call John Richards (johnr@lec.com), (617) 489-4000, ext. 727 for more information.

3. IBM JAPAN has developed and released for sale translation support software which simultaneously displays the source text and the in-process translation on the same screen, showing synonyms and dictionary definitions in separate windows. The new "Translation Manager/2," the first translation support tool of its type, makes it possible to share the same data on two different PCs and boasts other features which double productivity compared to manual translation, according to IBM JAPAN. The price is 787,500 yen ($7,429).

4. Someone mentioned Duet from JustSystem (the company that made Ichitaro). "But as far as I know, it works only with the PC9801 series from NEC, a DOS machine but not quite an IBM-PC/AT, and it's really dumb. And there are "The Translator" and "Logovista" from Katena. These are for Macintoshes (Logovista is also available for Windows, I think) and significantly smarter, especially Logovista, which can handle nested clauses such as "I don't think you think your boss thinks computers can think". Remember, though, that machine translation is still at a primitive level. It's just as smart as cpp (perhaps a little smarter). And you need to make a lot of investment, besides money for software and hardware, to cultivate your own set of dictionaries for your own needs (this is the reason Duet is still strong: many companies have spent significant man-hours growing dictionaries). And even with that, it will not eradicate the need for human translators. It helps professionals a lot by preparing a draft, but it's no good for people who don't know English and Japanese at all...."

=====================================================================
From: mbm@mtl.mit.edu

For information on MIT efforts ask Robert Berwick (berwick@ai.mit.edu).

=====================================================================
From: Francis Bond <bond@nttkb.ntt.jp>

I am working on a Japanese-English MT system. I would be happy to send you a copy of our demonstration pamphlet, which gives a brief description of the system and lists further references. If you can print Japanese characters I can send you the .ps file, or a LaTeX or DVI file. If not I can send you a hard copy by snail mail. Which would you prefer?

=====================================================================
From: "J.HUTCHINS" <L101@CPCMB.EAST-ANGLIA.AC.UK>

There is in fact a vast literature on the subject, there are numerous commercially available MT systems, and many MT projects, involving a very wide range of languages and different approaches.

As introductions to the subject I would suggest my own books:

Hutchins, W.J. (1986) Machine translation: past, present, future. [A history of MT research and systems up to 1984.]
Hutchins, W.J. and Somers, H.L. (1992) An introduction to machine translation (Academic Press) [An introductory textbook for masters and Ph.D students, covering the basic approaches and details of 'typical' systems.]

Books by others include:

Arnold, D. et al. (1994) Machine translation: an introductory guide (Blackwell). [Just published basic introduction for non-linguists and translators.]
Newton, John (ed.) (1992) Computers and translation (Routledge) [A collection of introductory papers covering a wide range of MT topics.]
Slocum, John (ed.) (1988) Machine translation systems (CUP) [A collection of papers on the major MT systems.]

Then there are the proceedings of conferences:

MT Summit Conferences, held in 1987, 1989, 1991, and 1993
Theoretical and Methodological Issues in MT, held in 1985, 1988, 1990, 1992, and 1993.
Coling conferences in recent years have contained many MT papers.

For keeping up to date there is the newsletter of the International Association for Machine Translation, entitled MT News International. This is available free to all members of IAMT and its regional associations, e.g. the European Association for Machine Translation.

=====================================================================
From: Patrick Jost <jost@itd.nrl.navy.mil>

Two books I'd recommend are by Nirenburg (Machine Translation, Camb. U. Press) and Carbonell et al. (Machine Translation, a Knowledge Based Approach, Morgan Kaufmann). John Hutchins's book from Academic Press is supposedly quite good as well, but I have been unable to get a complete copy; there were printing production problems.

There's a very interesting MT project called Pangloss going on at ISI... contact Ed Hovy (hovy@isi.edu) for details.

There are really two approaches: going directly from language A to language B, which is "transfer" MT, and using an "interlingua", so you go from language A to the IL and then to language B.

Commercial systems... the leader is Systran, in La Jolla, CA. You can contact them on 619-459-6700. Siemens is just getting ready to release their "METAL" system; I am waiting for sample translations.

=====================================================================
From: Walter van den Heever <WVANDENH@dos-lan.cs.up.ac.za>

We (Unit for Software Engineering in collaboration with the University of Pretoria) are developing an MT system. The project is currently in its 5th year and a commercial system (Lexica) is presently being sold to select clients. Lexica is a syntactic transfer system, presently being extended to incorporate semantic information (basically still 80's technology). The languages attempted include both European and African languages (such as English, Afrikaans, French, Swahili, Tswana, Zulu).

Based on my experience so far, my impressions are as follows:

* I don't think that FAHQT of unrestricted text is possible.
* MT can offer useful results in restricted domains (such as technical texts).
* Users don't understand the complexity involved and often try to use the system outside its limitations.
* The translation between European languages is much simpler than translation between European and African languages. Similar observations have been made concerning the translation between European and Asian languages. This is due to differences in culture and the way these languages work.
* A problem we have (similar problems may or may not exist elsewhere) is to get the right people for the job. Linguists have to undergo considerable training before they are able to write a grammar suitable for computation. Computer Scientists can do that, but don't really have the necessary language skills.
* The quality of MT depends greatly on the input. The old Garbage-In-Garbage-Out saying contains an element of truth in the case of MT. We have analysed text that didn't translate well and found that even we were not able to understand exactly what the author meant. After rewriting the text more plainly the translation improved considerably and we understood the original better.
* The building of dictionaries is i) time-consuming, ii) costly and iii) error-prone.
* In order to do translation in anything more than a toy domain, one requires dictionaries on the order of 50 000 words.

These are some very general (and by no means original) observations.

=====================================================================
From: Gaelle.Recource@linguist.jussieu.fr

Your question in Linguist involves a huge area: here is a short and partial answer.

Many European projects were devoted to MT in official Community languages. I took part in the EUROTRA research project, which was the biggest one. Its main quality was to provide at the end (December 1992) a good summary of the linguistic specifications needed to build an MT system. You can obtain them by asking the EC for a copy of the so-called EUROTRA Reference Manual. If you are really interested, don't hesitate to contact me for more information. Note that the software itself is obsolete and of no interest, but that all the specifications were actually implemented in the nine languages. Finally, you should know that several smaller projects are now continuing, with which you could get in contact (EUROLANG, ET-10 projects).

=====================================================================
From: Meyer S <meyes@essex.ac.uk>

Firstly, here is a brief description of some MT systems that you might be interested in:

* METAL, one of the most advanced operational systems (transfer based, making use of deep linguistic analysis), which has been developed by Siemens, Germany. You may find it easier to contact Siemens here in Britain: Siemens Group Services Limited, 83 Guildford Street, Chertsey, Surrey KT16 9AS (Tel: 0932 566791).
* The Globalink Translation System (GTS) could be classified as a 'direct' system. The quality may not be as high as some of the other systems mentioned, but it is cheap and fast. It has several British distributors, but unfortunately we only have their American address: Globalink Inc., 9302 Lee Highway, Fairfax, Virginia 22031, USA (Tel: 703 273-5600).
* The Tovna Machine Translation System (Tovna MTS) is a transfer based system that 'learns' from previous input. The UK address is: Tovna Translation Machines Ltd., EUROSOFT (UK) Ltd., Cottons Centre, Cottons Lane, Tooley St., LONDON SE1 2QL (Tel: 234 6635).
* Systran is an amended version of what they call a 'direct' translation system, which only performs a shallow analysis of the input. The main distributor of Systran is the Gachot company in Soisy-sous-Montmorency (near Paris), France. A new English company is negotiating the right to distribute Systran in Britain. The main user of Systran in Britain is: Rank Xerox Ltd., Parkway, Marlow, Bucks SL7 1YL (Tel: 0628 890000).
* The Logos system is (as far as we know) a transfer based system that makes use of a deeper linguistic analysis of input. The address of Logos is: Logos Corporation, 45 Park Place So, Suite 214, Morristown, NJ 07960, USA. We do not know of an English distributor, nor of any main users.
* Weidner's MicroCAT is an interactive system. The European subsidiary of Weidner is: WTE (Weidner Translation (Europe) Limited), Fryern House, 125 Winchester Road, Chandler's Ford, Eastleigh, SO5 2DR.
One of the main users of Weidner's MicroCAT is Perkins Engines, Peterborough.
* DLT is an interlingual system which uses an interlingua based on Esperanto as a 'bridge' between languages. This package is developed by the Utrecht software company: Buro voor Systeemontwikkeling (BSO), The Netherlands.

Secondly, the following books may be of interest:

"Machine Translation: An Introductory Guide", by Siety Meijer, Lorna Balkan, Doug Arnold, Louisa Sadler and R Lee Humphreys. NCC Blackwell.
Machine Translation, John Hutchins and Harold Somers. (also discusses non-commercial systems)

=====================================================================
From: Niek van der Donk <N.J.M.vdrDonk@kub.nl>

Machine translation : a view from the lexicon / Bonnie Jean Dorr. - Cambridge, Mass [etc.] : MIT Press, cop. 1993. - XX, 432 p. : ill. ; 24 cm. - (Artificial intelligence)

Linguistic issues in machine translation / edited by Frank Van Eynde. - London [etc.] : Pinter, 1993. - viii, 239 p. : ill. ; 24 cm. - (Communication in artificial intelligence series)

Progress in machine translation / ed. by Sergei Nirenburg. - Amsterdam [etc.] : IOS Press ; Tokyo [etc.] : Ohmsha, 1993. - X, 320 p. : ill. ; 24 cm. Bibliogr.: p. [297]-318. - Index.

Machine translation : a knowledge-based approach / Sergei Nirenburg ... [et al.]. - San Mateo, Cal. : Morgan Kaufmann, cop. 1992. - XIV, 258 p. : ill. ; 24 cm

An introduction to machine translation / W.John Hutchins and Harold L. Somers. - London [etc.] : Academic Press, 1992. - XXI, 362 p. : fig. ; 25 cm. Bibliogr.: p. 335-350. - Index.

Towards high-precision machine translation : based on contrastive textology / John Laffling. - Berlin [etc.] : Foris, 1991. - VII, 178 p. : ill. ; 25 cm. - (Distributed language translation ; 7)

Machine translation summit / editor-in-Chief M. Nagao ; editors H. Tanaka ... [et al.]. - Tokyo : Ohmsha, cop. 1989. - XIV, 224 p. : ill. ; 27 cm. Proceedings of the three-day Machine Translation Summit held at Japan's Hakone Prince Hotel from September 16, 1987

Machine translation : how far can it go? / Makoto Nagao ; transl. by Norman D. Cook. - Oxford [etc.] : Oxford University Press, 1989. - xii, 150 p. : ill. ; 23 cm

New directions in machine translation : conference proceedings, Budapest 18-19 August, 1988 / Dan Maxwell, Klaus Schubert, Toon Witkam (eds.). - Dordrecht [etc.] : Foris, 1988. - IV, 259 p. ; 24 cm. - (Distributed language translation ; 4)

=====================================================================
From: caffrey@MIT.EDU

Do a literature search for JONATHAN SLOCUM, who has done reviews of MT systems. Also write to the Center for Machine Translation at Carnegie Mellon U. in Pittsburgh.

=====================================================================
From: Eduard Hovy <hovy@ISI.EDU>

Oi, this is a big question, more than I have time or patience to answer. I suggest you read the following, in order:

- BYTE magazine, January 1993, special issue on MT, 3 main articles.
- Machine Translation, John Hutchins, approx. 1985.
- Computational Linguistics special issue on MT, 11(1 and 2-3), 1986.

Then please ask again about the types of systems you're interested in.

=====================================================================
From: R Chandrasekar <mickey@saathi.ncst.ernet.in>

I work in Machine Translation (MT). In my PhD thesis, I am arguing that one should try to use all sorts of methods (including heuristic simplification) to attack the formidable problems of MT. I work at an R&D Centre in Bombay, where we are looking at translation from English to Hindi.

BTW, I spent some time as a visiting researcher at the Center for Machine Translation at Carnegie-Mellon Univ, Pittsburgh, USA. Do you know about this Center?

If you are interested, I could send you a list of books on Machine Translation.

If you want to know some place in the UK where there is considerable MT activity, try contacting:

Dr Harold L Somers
Centre for Computational Linguistics
UMIST
PO Box 88, Manchester UK
Email: harold@ccl.umist.ac.uk

=====================================================================
From: Dan Maxwell <100101.2276@CompuServe.COM>

In response to your request for information, there are several books which survey several projects. One of these is by Hutchins, W.J. 1986, "Machine Translation: Past, Present, Future", Chichester: Ellis Horwood. Another is a more recent one (about 1989) by Jonathan Slocum, I believe, but I don't know the title.

There is a series of six books about the DLT (Distributed Language Translation) project, of which I was a part, published by Foris Publications, Dordrecht, NL. One of these, "New Directions in Machine Translation", is actually the articles from a conference on MT organized by the company sponsoring DLT. It covers various topics and projects within MT, including an update of Hutchins' book.

Hutchins' work in particular shows that there are/have been quite a lot of projects, but I have the impression that most of them have rather little published work written about them. And a lot of the articles that I have seen are oriented more toward the computational side of MT than the linguistic side.

I recommend Hutchins' book as a starter and then particularly #5 of the DLT series, "Working with analogical semantics", by Victor Sadler. It was one of the first treatments of corpus-based approaches, which now seem to be widely used, judging from recent issues of "Computational Linguistics".

=====================================================================
From: Merrill=Kashiwabara%HQ%Rational@Vines1.ratsys.com

I read your request for information on MT systems, but have very little to offer you except a few companies which we looked into as part of our software localization efforts.

The company with the longest record seems to be SYSTRAN, which is a descendant of the old DARPA machine translation efforts. They have a remote facility which allows you to send text and certain types of formatted information over the wire to their facility for translation and re-transmission back to the client. Their translation engines seem to be hand-crafted and pragmatically oriented rather than based on a particular theory or philosophy. Their heuristics are empirically derived. I don't have a contact at Systran, but since they've been around since the '60s, I think that that information is probably readily available.

Another machine translation system is the PC-based Global-link software product suite, which has a limited vocabulary and subject base and covers 5 major European languages. The engine seems to be based on exception-driven lookup tables. We had a lot of fun translating to and from several languages, with sometimes bizarre results.

We examined several products, and I have the literature in hardcopy somewhere, but I'd have to dig it out of the high entropy field which surrounds my desk, so it might take a couple of days. Are you interested in finding an MT system, or in a general survey of the players and the existing techniques being used?

__________________________________________________________________
Annette McElligott, CSIS Dept., University of Limerick, Ireland.
Tel: +353 61 333644 ext. 5024; Fax: +353 61 330876
Email: mcelligo@itdsrv1.ul.ie or mcelligotta@ul.ie