LINGUIST List 13.2021

Sat Aug 3 2002

Calls: Computational Linguistics

Editor for this issue: Dina Kapetangianni <>

As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text.


  1. Roger Harris, EAMT Workshop: last minute submission
  2. Marc El-Beze, Call for Papers : special issue of TAL on Language Models

Message 1: EAMT Workshop: last minute submission

Date: Fri, 2 Aug 2002 11:31:45 +0100
From: Roger Harris <>
Subject: EAMT Workshop: last minute submission

6th EAMT Workshop: Teaching Machine Translation
Date: 14 - 15 November 2002
Venue: UMIST, Manchester, England
- -------------------------------------------------------------

The deadline for the submission of extended abstracts expired 
on Wednesday, 31 July 2002. You may for some reason have missed 
that deadline. 

Late submissions received by (and preferably before, please) 
Thursday 8th August will be welcome. The Call for Papers is 
appended below.

With kind regards,

Roger Harris.

- --------------------------------------------------------------------------

Call for Papers

The sixth EAMT Workshop will take place on 14-15 November 2002
hosted by the Centre for Computational Linguistics, UMIST,
Manchester, England.

Organised by the European Association for Machine Translation,
in association with the Natural Language Translation Specialist Group
of the British Computer Society, the Workshop will focus on the topic

 Teaching Machine Translation

The following topics are of interest:

 why and to whom should MT be taught?
 teaching the theoretical background of MT: linguistics, computer
 science, translation theory
 addressing preconceptions about MT in the classroom
 the use of commercial MT programs in hands-on teaching
 teaching computational aspects of MT to non-computational students
 web-based distance learning of MT
 MT education and industry:
 bridging the gap between academia and the real world
 teaching pre- and post-editing skills to MT users
 teaching MT evaluation
 building modules or `toy' MT systems in the laboratory
 experiences of the evaluation of MT instruction
 the role of MT in language learning
 translation studies and MT

We invite submissions of an extended abstract of your proposed paper,
up to two pages, summarizing the main points that will be made in
the actual paper.

Submissions will be reviewed by members of the Programme Committee.
Authors of accepted papers will be asked to submit a full version of
the paper, maximum 12 pages, which will be included in the workshop
proceedings.

A stylefile for accepted submissions will be available in due course.

Initially, an extended abstract should be sent, preferably by email
as an attachment in any of the standard formats (doc, html, pdf, ps)
or as plain text, to

Otherwise, hardcopy can be sent to:
Harold Somers, Centre for Computational Linguistics, UMIST, PO Box
88, Manchester M60 1QD, England, or by fax to +44 161 200 3091.

Programme Committee

 Harold Somers, UMIST, Manchester
 Derek Lewis, University of Exeter
 Ruslan Mitkov, University of Wolverhampton
 Mikel Forcada, Universitat d'Alacant
 Karl-Heinz Freigang, Universität des Saarlandes
 David Wigg, South Bank University, London
 John Hutchins, EAMT
 Roger Harris, BCS

Important dates:

Deadline for extended abstract: 31 July 2002: EXPIRED
Acceptance notification: 6 September 2002
Final copies due: 14 October 2002
Conference dates: 14-15 November 2002

- -----------------------------------------


Message 2: Call for Papers : special issue of TAL on Language Models

Date: Fri, 02 Aug 2002 17:53:20 +0200
From: Marc El-Beze <>
Subject: Call for Papers : special issue of TAL on Language Models

 Call for papers (TAL journal):

 Automated Learning of Language Models

Deadline for submission: October 7, 2002
Issue coordinated by
 Michèle Jardino (CNRS, LIMSI), and
 Marc El-Beze (LIA, University of Avignon).

Language Models (LM) play a crucial role in the working of Automated
Natural Language Processing systems, when real-life problems (often
very large ones) are being dealt with. Examples include Speech
Recognition, Machine Translation and Information Retrieval. If we want
these systems to adapt to new applications, or to follow the evolution
in user behaviour, we need to automate the learning of parameters in
the models we use. Adaptation should occur in advance or in real
time. Some applications do not allow us to build an adequate corpus,
either from a quantitative or qualitative point of view. The gathering
of learning data is made easier by the richness of Web resources, but
in that huge mass, we have to effectively separate the wheat from the
chaff. When asked about the optimal size for a learning corpus, are we
satisfied to answer "The bigger, the better"?
Rather than training one LM on a gigantic learning corpus, would it not 
be advisable to fragment this corpus into linguistically coherent 
segments, and learn several language models, whose scores might be 
combined when doing the test (model mixture)?
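The mixture idea above can be sketched in a few lines. This is a minimal illustration, not any particular system: the two sub-corpora, the unigram models, and the mixture weight are all invented for the example.

```python
from collections import Counter

# Hypothetical "linguistically coherent" sub-corpora, one per domain.
corpus_news = "the market fell the market rose".split()
corpus_sport = "the team won the team lost".split()

def unigram_model(tokens):
    """Maximum-likelihood unigram model over one sub-corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return lambda w: counts[w] / total

p_news = unigram_model(corpus_news)
p_sport = unigram_model(corpus_sport)

def p_mix(w, lam=0.5):
    # Model mixture: combine the scores of the per-segment models
    # at test time with a weight lam.
    return lam * p_news(w) + (1 - lam) * p_sport(w)

# "team" never occurs in the news corpus, yet the mixture still
# assigns it probability mass via the sport model.
print(p_mix("team"))
```

In practice the weights themselves would be estimated (e.g. by EM on held-out data) rather than fixed by hand.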
In the case of n-gram models, what is the optimal value for n? Should 
it be fixed or variable?
A larger value allows us to capture linguistic constraints over a 
context which goes beyond the mere two preceding words of the classic 
trigram. However, increasing n threatens us with serious coverage 
problems. What is the best trade-off between these two opposing
requirements?

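The coverage problem is easy to see on a toy example (the training and test sentences here are invented): as n grows, an increasing share of test n-grams was simply never observed in training.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

train = "the cat sat on the mat the dog sat on the rug".split()
test = "the cat sat on the dog".split()

for n in (1, 2, 3, 4):
    seen = set(ngrams(train, n))
    grams = ngrams(test, n)
    covered = sum(g in seen for g in grams)
    print(f"n={n}: {covered}/{len(grams)} test n-grams seen in training")
```

All test unigrams and bigrams are covered, but coverage drops at n=3 and n=4, even though every individual word was seen in training.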
How can we smooth models in order to approximate phenomena that have
not been learned? Which alternatives should be chosen, using which
more general information (lower-order n-grams, n-classes)?
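One standard way to fall back on more general information is linear (Jelinek-Mercer) interpolation of a bigram model with a lower-order unigram model. The corpus and the weight below are invented for illustration.

```python
from collections import Counter

train = "the cat sat on the mat".split()
uni = Counter(train)
bi = Counter(zip(train, train[1:]))
N = len(train)          # token count
V = len(uni)            # vocabulary size

def p_uni(w):
    # Add-one smoothed unigram: the most general fallback estimate.
    return (uni[w] + 1) / (N + V)

def p_bi(w, prev):
    # Maximum-likelihood bigram; zero for unseen bigrams.
    return bi[(prev, w)] / uni[prev] if uni[prev] else 0.0

def p_interp(w, prev, lam=0.7):
    # Interpolated model: smooth the bigram with the unigram so that
    # unseen bigrams still receive probability mass.
    return lam * p_bi(w, prev) + (1 - lam) * p_uni(w)

print(p_interp("dog", "the"))   # unseen bigram, but non-zero probability
print(p_interp("cat", "the"))   # seen bigram, higher probability
```

Backoff schemes (use the bigram when it was observed, otherwise retreat to the unigram) are the main alternative to this kind of interpolation.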

Beyond the traditional opposition between numerical and
knowledge-based approaches, there is a consensus about the
introduction of rules into stochastic models, or probability into
grammars, hoping to get the best of both strategies. Hybrid models can
be conceived in several ways, depending on which choices are made
regarding both of their sides, and also, the place where coupling
occurs. Because of discrepancies between the language a grammar
generates, and actually observed syntagms, some researchers decided to
reverse the situation and derive the grammar from observed
facts. However, this method yields disappointing results, since it
does not perform any better than n-gram methods, and is perhaps
inferior. Shouldn't we introduce a good deal of supervision here, if
we want to reach this goal?
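"Probability into grammars" can be made concrete with a toy probabilistic grammar: each rule carries a probability, and a derivation's probability is the product of the probabilities of the rules it uses. The grammar and its weights here are invented, not taken from any cited system.

```python
# Toy probabilistic grammar: rule -> probability.  The probabilities of
# all rules expanding the same left-hand side sum to 1.
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("the", "cat")): 0.6,
    ("NP", ("the", "dog")): 0.4,
    ("VP", ("sleeps",)): 1.0,
}

def derivation_prob(derivation):
    """Probability of a derivation = product of its rule probabilities."""
    p = 1.0
    for rule in derivation:
        p *= rules[rule]
    return p

d = [("S", ("NP", "VP")), ("NP", ("the", "cat")), ("VP", ("sleeps",))]
print(derivation_prob(d))
```

Inferring such rule probabilities (or the rules themselves) from observed data is exactly the grammatical-inference problem whose plain unsupervised form, as noted above, has not outperformed n-gram methods.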

Topics (non-exhaustive list):
In this special issue, we would like to publish either innovative 
papers, or surveys and prospective essays dealing with Language Models 
(LM), Automated Learning of their parameters, and covering one of the 
following subtopics:

 * Language Models and Resources:
 - determination of the adequate lexicon
 - determination of the adequate corpus
 * Topical Models
 * LM with fixed or variable history
 * Probabilistic Grammars
 * Grammatical Inference
 * Hybrid Language Models
 * Static and dynamic adaptation of LMs
 * Dealing with the Unknown
 - Modelling words which do not belong to the vocabulary
 - Methods for smoothing LMs
 * Supervised and unsupervised learning of LMs
 - Automated classification of basic units
 - Introducing linguistic knowledge into LMs
 * Methods for LM learning
 - EM, MMI, others?
 * Evaluation of Language Models
 * Complexity and LM theory
 * Applications:
 - Speech Recognition
 - Machine Translation
 - Information Retrieval

Papers (25 pages maximum) are to be submitted in Word or LaTeX.
Style sheets are available from HERMES:

Articles can be written either in French or in English, but English will 
be accepted from non-French speaking authors only.

Submission deadline is October 7, 2002. Authors who plan to submit a 
paper are invited to contact
Michèle Jardino and/or Marc El-Beze ( ) before 
September 15, 2002.
Articles will be reviewed by a member of the editorial board and two 
external reviewers designated by the editors of this issue. Decisions of 
the editorial board and the referees' reports will be transmitted to the 
authors before November 20, 2002.
The final versions of the accepted papers will be due by February 
20, 2003. Publication is planned for the spring of 2003.

Submissions must be sent electronically to:
Michèle Jardino ( )
Marc El-Bèze ( )
or, in paper version (four copies), posted to: 
Marc El-Beze Laboratoire d'Informatique
LIA - CERI BP 1228