LINGUIST List 10.1337

Sat Sep 11 1999

Jobs: Software Dev/Medical Text, Sp/Lang Technologies

Editor for this issue: Karen Milligan <>


  1. Berman, Jules (NCI), Software Development/Medical Text
  2. Pfaraud, Speech and Language Technologies

Message 1: Software Development/Medical Text

Date: Thu, 9 Sep 1999 13:24:18 -0400
From: Berman, Jules (NCI) <>
Subject: Software Development/Medical Text

New NIH Small Business Innovation Research (SBIR) Program contract proposals
have just been announced and are available at:

Attached is a contract announcement that may be of interest to you or your
company. If you have any questions regarding this contract proposal, they
must be referred through NCI's SBIR Contracting Officer:
 Mr. Joseph Bowe
 Phone: (301) 435-3810
 Fax: (301) 480-0309

179 Encoding Surgical Pathology Data into Standard Nomenclature within XML

The Resources Development Branch of the Cancer Diagnosis Program
is seeking software development proposals to convert surgical
pathology report data into XML documents containing tagged encrypted
identifiers, tagged demographics, and tagged medical codes (UMLS or
SNOMED, or SNOMED + LOINC). The long-term goal is to convert pathology
data into a structure and format that will support queries from
standard network protocols. Applicants will develop algorithms to
parse pathology reports into fields, then sentences, then text phrases
and will match the text phrases with a coded standard
nomenclature. The parsing/matching algorithms may use term frequency
tables, grammatical rules, context sensitivity, or other algorithmic
approaches, but the proposal must explain and justify the proposed
algorithms. The proposed algorithms should provide a way of dealing
with misspellings, negations, run-on sentences, and improperly or
inconsistently delimited sentences. The applicant will propose and
describe methodology to assess the accuracy of coding. Phase 1
(feasibility) will consist of the development and testing (for
accuracy) of implementations that convert the text of actual pathology
reports into medical codes. Phase 2 involves the preparation of
software to produce XML files that encapsulate the tagged coded data
as described in the following requirements. The applicant will design
a DTD (Document Type Definition) for the XML file to include the fields
and subfields of data contained in pathology reports (e.g. surgical
pathology case number, coded patient identifier(s), date of biopsy,
demographics, clinical history, specimen type, specimen number,
diagnosis, microscopic description, comment). The software application
will be developed using a collection of pathology reports with
authentic text, totaling at least 5,000 surgical pathology reports
acquired from at least two institutions with which the applicant has
formed appropriate collaborations. The identifying data in the reports
may be falsified or anonymized by the collaborating institution so that
the software developer receives the report files in a format from
which patients cannot be identified (e.g. patient names can be encoded
via a one-way hash, and each piece of demographic information can be
translated to a false data element). The software implementation will
convert electronic files of surgical pathology reports into XML
documents wherein the original text has been converted to coded
medical terms. The accuracy of the software implementation should be
tested by comparison with a separate set of reports (at least 200)
that have been manually encoded and manually converted to XML
files. The implementation of the algorithms will be written in Perl or
Java, with well-annotated source code, and must be designed to have a
practical GUI and to permit software users to upload pathology reports
over the Internet and to view the XML output on an XML-capable web browser.
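
As a rough illustration of the pipeline described above (split text into sentences, match phrases against a coded nomenclature, flag negations, hash the patient identifier one-way, and emit tagged XML), here is a minimal Python sketch. It is not the method requested by the announcement, which calls for Perl or Java and leaves the algorithms to the applicant. The phrase table, the codes (X0001 and so on), the element names, and the function names are all hypothetical placeholders, not real UMLS, SNOMED, or LOINC content, and misspellings are not handled.

```python
import hashlib
import re
import xml.etree.ElementTree as ET

# Hypothetical phrase-to-code table. These are placeholder codes,
# not real UMLS, SNOMED, or LOINC identifiers.
NOMENCLATURE = {
    "adenocarcinoma": "X0001",
    "chronic inflammation": "X0002",
    "lymph node": "X0003",
}

# Simple negation cues, looked for in the text preceding a matched phrase.
NEGATION_RE = re.compile(r"\b(?:no|not|without|negative for)\b")


def hash_identifier(name):
    """One-way hash of a patient identifier (SHA-256 hex digest)."""
    return hashlib.sha256(name.strip().lower().encode("utf-8")).hexdigest()


def split_sentences(text):
    """Split on periods, semicolons, or newlines (a crude stand-in for parsing)."""
    return [p.strip() for p in re.split(r"[.;\n]+", text) if p.strip()]


def code_sentence(sentence):
    """Return (phrase, code, negated) for each nomenclature phrase found."""
    lowered = sentence.lower()
    found = []
    for phrase, code in NOMENCLATURE.items():
        pos = lowered.find(phrase)
        if pos == -1:
            continue
        # A phrase counts as negated if a cue word appears before it.
        negated = NEGATION_RE.search(lowered[:pos]) is not None
        found.append((phrase, code, negated))
    return found


def report_to_xml(case_number, patient_name, diagnosis_text):
    """Build an XML string with a hashed identifier and coded diagnosis terms."""
    root = ET.Element("report", {"case": case_number})
    ET.SubElement(root, "patient_id").text = hash_identifier(patient_name)
    diagnosis = ET.SubElement(root, "diagnosis")
    for sentence in split_sentences(diagnosis_text):
        for phrase, code, negated in code_sentence(sentence):
            attrs = {"code": code, "negated": str(negated).lower()}
            ET.SubElement(diagnosis, "term", attrs).text = phrase
    return ET.tostring(root, encoding="unicode")


# Example with an invented report fragment and invented patient name.
print(report_to_xml(
    "S99-0001",
    "Jane Doe",
    "Adenocarcinoma of colon. No chronic inflammation; "
    "lymph node negative for adenocarcinoma.",
))
```

Accuracy against a manually encoded set, as the announcement requires, could then be estimated by comparing the (code, negated) pairs produced here with the manual ones, report by report.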

Message 2: Speech and Language Technologies

Date: Thu, 9 Sep 1999 11:30:12 EDT
From: Pfaraud <>
Subject: Speech and Language Technologies


At the center of a rapidly expanding worldwide market

Lernout & Hauspie (L&H™) is a world leader in the field of advanced speech
and language technologies, products, solutions and services.
Its core technologies include automatic speech recognition, text-to-speech, 
digital speech compression, text-to-text language translation, and linguistic 
With more than 1,700 employees worldwide, primarily speech linguists, 
scientists, and engineers, the company's research and development operations 
comprise one of the largest commercial speech and linguistics
laboratories in the world.
The company is committed to the idea that the human voice is the most natural
interface for use with virtually any device or software program, and that
globalization and the growth of the Internet require real-time translation
for multiple languages.
Our goal is to provide easy-to-use technologies and products that enable
people to interact by voice with the machines that surround them, regardless
of the language they speak.

- In 1998, revenues increased by more than 100% for the fourth straight year.
- In early 1999, Microsoft increased its investment in L&H, adding $15
million and raising its ownership position to 7%.
- Also in early 1999, Intel invested $30 million in L&H.

For more information, please visit Lernout & Hauspie on the World Wide Web.


This division provides solutions to OEMs and corporate customers, based on 
L&H speech and language technologies. These can be software development kits, 
user interface modules, or entire applications.

The division is organized into departments for the following markets:
Embedded solutions, PC/Multimedia, Automobile, and Telecommunication.


The telecom department is part of the TECHNOLOGIES & SOLUTIONS Division.

The telecom department sells dialog applications (i.e. Interactive Speech 
Systems) through specific customer projects, that allow end-users to 
communicate with the computer using ordinary speech devices such as 
These applications are built using L&H advanced ASR and TTS engines, and
can be deployed within call center solutions. They greatly reduce costs,
increase the availability of the call centers, raise productivity and, last
but not least, offer a powerful service to the end user.
Application examples: banking services, flight reservation systems, stock
market, e-commerce, ...

--------------------------------------------------------------
THE JOB: Linguist, Telecom Department
--------------------------------------------------------------


As a member of our technical team, you will help build:
- Interactive Voice Response systems (Serveurs Vocaux Interactifs, SVI) as
part of our call center projects,
- using the advanced speech recognition and speech synthesis technologies of
the world leader,
- on behalf of our customers and partners,
- and under the supervision of a project manager.

You will assist the project manager with the overall design of the
interactive dialogue system and with the detailed definition of the dialogue
tree and the ergonomics of the dialogue.
You will be responsible for configuring the speech recognition engine. This
engine requires a specific formalism (recognition grammars) to recognize and
understand user utterances at each step of the dialogue.
You will take an active part in tuning the system (Wizard of Oz) and in the
various stages of testing.
You will conduct surveys to measure user satisfaction with the system, and
you will analyze the results.

Position based in Marne La Vallee (77), FRANCE


Education:
- University degree with a specialization in linguistics

Professional experience:
- Ideally, you have experience with language technologies.
- You know how to build "Recognition Grammars", or you have experience in
computer programming.
- Applications from entry-level candidates will be considered.

You are rigorous, you enjoy teamwork, and you have a good sense of customer
relations. Functional expertise in a vertical market would be a highly
valued plus.

Languages:
- Your native language is French
- Good command of English required

Please send your application (cover letter, CV, photo) to:
Attention: Patrice Faraud
44, avenue Georges Pompidou
92300 Levallois-Perret FRANCE
Tel : 01 41 49 98 70
Fax : 01 41 49 98 71