LINGUIST List 16.800

Wed Mar 16 2005

Support: Comp Ling: PhD Student, University of Edinburgh

Editor for this issue: Jessica Boynton


        1.    Mirella Lapata, Computational Linguistics: PhD Student, University of Edinburgh, United Kingdom

Message 1: Computational Linguistics: PhD Student, University of Edinburgh, United Kingdom

Date: 16-Mar-2005
From: Mirella Lapata
Subject: Computational Linguistics: PhD Student, University of Edinburgh, United Kingdom

University or Organization: University of Edinburgh
Job Rank: PhD
School of Informatics, University of Edinburgh

The Institute for Communicating and Collaborative Systems (ICCS) within
the Division of Informatics and the Human Communication Research
Centre (HCRC) invite applications for a three-year EPSRC studentship
award to commence in September 2005. The successful applicant will work
on a project aiming to devise unsupervised models for word sense
disambiguation. A brief summary of the aims of this project is given
below.

Graphical Models for Word Sense Disambiguation

The most accurate techniques for word sense disambiguation (WSD) to
date are those which are trained on text in which each word has been
manually annotated with its intended sense. A major shortcoming of
these methods, though, is that accuracy is strongly correlated with
the quantity of training data available, and this is in short supply
because its production is very labour intensive. For many words the
distribution of their senses is highly skewed and WSD systems work
best when they take the most frequent sense into account. However, the
most frequent sense of a word is often not known, particularly in
domains (subject areas) in which no text has ever been manually
annotated.

This project is concerned with developing novel algorithms that
reduce the data requirements of large-scale WSD. More
specifically, the project will involve:

o Exploring the use of probabilistic graphical models for word sense
disambiguation. Graphical models are a powerful modeling framework
that is well suited to characterizing and studying the interactions
among varied information sources, allowing many aspects of the WSD
problem to be represented concurrently.

o Devising sense ranking models for structured (e.g., WordNet) and
unstructured (e.g., dictionary definitions) sense inventories.

o Demonstrating the benefit of unsupervised WSD when applied to
Question Answering.

The EPSRC baseline rate of maintenance is currently approx. £12,000
and the studentship will also pay the three years' tuition fees at
home/EU rates. Applicants should have a good honours degree or
equivalent in Computer Science or Computational
Linguistics. Programming skills, preferably in Perl, Java, C or C++,
are essential. Familiarity with statistical NLP, machine learning
methods and corpus processing is an advantage.

The project will be conducted in collaboration with the Natural
Language and Computational Linguistics (NLCL) group at the University
of Sussex. ICCS
and HCRC have close research links with a number of other academic
institutions (e.g., Saarland University, DFKI, Stanford University)
and companies from which the student will benefit.

For further information about the project please e-mail Dr. Mirella
Lapata. Application forms and details of how to
apply are on-line.
PLEASE MARK 'Graphical Models for Word Sense Disambiguation' ON THE

Address for Applications:
College of Science & Engineering, The University of Edinburgh
The Weir Building, The King's Buildings
Edinburgh EH9 3JY
United Kingdom

Contact Information:
Dr. Mirella Lapata
