LINGUIST List 15.1931

Sun Jun 27 2004

Jobs: Info Extraction: Research Associate, Cambridge

Editor for this issue: Sarah Murray


  1. sht25, English & Information Extraction: Research Associate, Cambridge

Message 1: English & Information Extraction: Research Associate, Cambridge

Date: Fri, 25 Jun 2004 07:01:42 -0400 (EDT)
From: sht25
Subject: English & Information Extraction: Research Associate, Cambridge

University or Organization: University of Cambridge
Department: Computer Laboratory
Rank of Job: Research Associate
Specialty Areas: Computational Linguistics, Information Extraction
Required Language(s): English (Code = ENG)


University of Cambridge
Computer Laboratory

Salary: £18,893 - £28,279 pa
Limit of tenure: up to 36 months

Applications are invited for two Research Associates to develop
automatic information extraction (IE) from the biomedical
literature. The project will study adaptive textual information
extraction, initially tuned to literature about the important model
organism Drosophila, in order to extract and classify information
concerning gene function, gene expression patterns and gene regulatory
terms from Medline abstracts and relevant databases. The project is a
collaboration between the NLIP group in the Computer Laboratory
and the FlyBase and FlyMine groups in the Department of Genetics.

This project will build on existing technology in Cambridge for the
analysis of natural language text and for computer-based curation
(curation is the process of extracting and linking gene information to
the literature). Proposed start date: 1 October 2004 or as soon as
possible thereafter.


Post 1:
Research in adaptive IE tools, parsing of free biomedical text, and
development of an IE system building on an existing code base
implemented mostly in C or Common Lisp.

Post 2:
Integration of the IE system into the data curation process,
development of interfaces between the IE system and existing web-based
automated work flows for curation and literature access, ensuring
compatibility with grid standards and protocols. This post will serve
as a liaison between the Computer Laboratory and Genetics.

Desirable qualifications and experience include:

Post 1: A PhD or equivalent experience in computational linguistics/
natural language engineering, particularly statistical NLP and
parsing, and information extraction/named entity recognition.
Programming: C(++) / Common Lisp, in a Unix/Linux environment.

Post 2: A PhD, MSc or equivalent experience in computer
science. Programming: C(++), Unix/Linux, Java, XML/XSLT, Web
programming. An interest in one or more of: Web / Grid services,
Semantic Web Technologies (RDF, OWL), computational linguistics/NLP,
information extraction, e-learning.

Applicants should send a cover letter, a completed PD18 form, a full CV,
and the names and addresses of three academic/professional referees to
Simone Teufel, Computer Laboratory, JJ Thomson Avenue, Cambridge CB3 0FD.

Closing date: 21 July 2004.

For further information or a job description, please contact
Dr Simone Teufel.

Address for Applications:

	Attn: Dr Simone Teufel
	Computer Laboratory
	JJ Thomson Avenue
	Cambridge, Cambridgeshire CB3 0FD
	United Kingdom 

	Applications are due by 21-Jul-2004

Contact Information:
	Dr Simone Teufel

This announcement was accompanied by a donation to the LINGUIST List! 