Editor for this issue: Dina Kapetangianni <dina@linguistlist.org>
6th EAMT Workshop: Teaching Machine Translation

Date: 14 - 15 November 2002
Venue: UMIST, Manchester, England
Web-site: http://www.ccl.umist.ac.uk/events/eamt-bcs/cfp.html

-------------------------------------------------------------

The deadline for the submission of extended abstracts expired on Wednesday, 31 July 2002. You may for some reason have missed that deadline. Late submissions received by (and preferably before, please) Thursday 8th August will be welcome. The Call for Papers is appended below.

With kind regards,
Roger Harris.

-------------------------------------------------------------

Call for Papers

The sixth EAMT Workshop will take place on 14-15 November 2002 hosted by the Centre for Computational Linguistics, UMIST, Manchester, England.

Organised by the European Association for Machine Translation, in association with the Natural Language Translation Specialist Group of the British Computer Society, the Workshop will focus on the topic of:

Teaching Machine Translation

The following topics are of interest:

* why and to whom should MT be taught?
* teaching the theoretical background of MT: linguistics, computer science, translation theory
* addressing preconceptions about MT in the classroom
* the use of commercial MT programs in hands-on teaching
* teaching computational aspects of MT to non-computational students
* web-based distance learning of MT
* MT education and industry: bridging the gap between academia and the real world
* teaching pre- and post-editing skills to MT users
* teaching MT evaluation
* building modules or `toy' MT systems in the laboratory
* experiences of the evaluation of MT instruction
* the role of MT in language learning
* translation studies and MT
* etc.

We invite submissions of an extended abstract of your proposed paper, up to two pages, summarizing the main points that will be made in the actual paper. Submissions will be reviewed by members of the Programme Committee.
Authors of accepted papers will be asked to submit a full version of the paper, maximum 12 pages, which will be included in the proceedings. A stylefile for accepted submissions will be available in due course.

Initially, an extended abstract should be sent, preferably by email as an attachment in any of the standard formats (doc, html, pdf, ps) or as plain text, to Harold.Somers@umist.ac.uk. Otherwise, hardcopy can be sent to: Harold Somers, Centre for Computational Linguistics, UMIST, PO Box 88, Manchester M60 1QD, England, or by fax to +44 161 200 3091.

Programme Committee

Harold Somers, UMIST, Manchester
Derek Lewis, University of Exeter
Ruslan Mitkov, University of Wolverhampton
Mikel Forcada, Universitat d'Alacant
Karl-Heinz Freigang, Universität des Saarlandes
David Wigg, South Bank University, London
John Hutchins, EAMT
Roger Harris, BCS

Important dates:

Deadline for extended abstract: 31 July 2002: EXPIRED
Acceptance notification: 6 September 2002
Final copies due: 14 October 2002
Conference dates: 14-15 November 2002

-------------------------------------------------------------
Call for papers (TAL journal): http://www.atala.org/tal/

Automated Learning of Language Models
============================

Deadline for submission: October 7, 2002

Issue coordinated by Michèle Jardino (CNRS, LIMSI) and Marc El-Beze (LIA, University of Avignon).

Language Models (LM) play a crucial role in Automated Natural Language Processing systems that deal with real-life problems, which are often very large. Examples are Speech Recognition, Machine Translation and Information Retrieval. If we want these systems to adapt to new applications, or to follow changes in user behaviour, we need to automate the learning of the parameters of the models we use. Adaptation should occur in advance or in real time.

Some applications do not allow us to build an adequate corpus, either from a quantitative or a qualitative point of view. The gathering of learning data is made easier by the richness of Web resources, but in that huge mass we have to separate the wheat from the chaff effectively.

When asked about the optimal size for a learning corpus, are we satisfied to answer "The bigger, the better"? Rather than training one LM on a gigantic learning corpus, would it not be advisable to split this corpus into linguistically coherent segments and learn several language models, whose scores might be combined at test time (model mixture)?

In the case of n-gram models, what is the optimal value for n? Should it be fixed or variable? A larger value allows us to capture linguistic constraints over a context that goes beyond the two preceding words of the classic trigram. However, increasing n brings serious coverage problems. What is the best trade-off between these two opposing constraints?

How can we smooth models in order to approximate phenomena that have not been learned? Which alternatives are to be chosen, and using which more general information (lower-order n-grams, n-classes)?
Beyond the traditional opposition between numerical and knowledge-based approaches, there is a consensus about introducing rules into stochastic models, or probabilities into grammars, in the hope of getting the best of both strategies. Hybrid models can be conceived in several ways, depending on the choices made for each of their two sides and on the place where the coupling occurs.

Because of discrepancies between the language a grammar generates and the syntagms actually observed, some researchers decided to reverse the situation and derive the grammar from observed facts. However, this method yields disappointing results, since it does not perform any better than n-gram methods, and is perhaps inferior. Shouldn't we introduce a good deal of supervision here, if we want to reach this goal?

Topics (non-exhaustive list):
====================

In this special issue, we would like to publish either innovative papers, or surveys and prospective essays, dealing with Language Models (LM) and the Automated Learning of their parameters, and covering one of the following subtopics:

* Language Models and Resources:
  - determination of the adequate lexicon
  - determination of the adequate corpus
* Topical Models
* LM with fixed or variable history
* Probabilistic Grammars
* Grammatical Inference
* Hybrid Language Models
* Static and dynamic adaptation of LMs
* Dealing with the Unknown
  - Modelling words which do not belong to the vocabulary
  - Methods for smoothing LMs
* Supervised and unsupervised learning of LMs
  - Automated classification of basic units
  - Introducing linguistic knowledge into LMs
* Methods for LM learning
  - EM, MMI, others?
* Evaluation of Language Models
* Complexity and LM theory
* Applications:
  - Speech Recognition
  - Machine Translation
  - Information Retrieval

Format:
=====

Papers (25 pages maximum) are to be submitted in Word or LaTeX.
Style sheets are available at HERMES: http://www.hermes-science.com/

Language:
=======

Articles can be written either in French or in English, but English will be accepted from non-French speaking authors only.

Deadlines:
=======

Submission deadline is October 7, 2002. Authors who plan to submit a paper are invited to contact Michèle Jardino and / or Marc El-Beze ( mailto:tal.ml@limsi.fr ) before September 15, 2002.

Articles will be reviewed by a member of the editorial board and two external reviewers designated by the editors of this issue. Decisions of the editorial board and the referees' reports will be transmitted to the authors before November 20, 2002.

The final version of the accepted papers will be required by February 20, 2003. Publication is planned for the spring of 2003.

Submission:
========

Submissions must be sent electronically to:

Michèle Jardino ( mailto:jardino@limsi.fr )
Marc El-Bèze ( mailto:marc.elbeze@lia.univ-avignon.fr )

or, in paper version (four copies), posted to:

Marc El-Beze
Laboratoire d'Informatique
LIA - CERI
BP 1228
84 911 AVIGNON CEDEX 9
FRANCE