Dissertation Information

Title: Contraintes sur la sélection des informations dans les définitions terminographiques : vers des modèles relationnels génériques pertinents
Author: Selja Seppälä
Homepage: http://seljaseppala.wordpress.com/
Degree Awarded: University of Geneva, Département de Traitement Informatique Multilingue
Completed in:
Linguistic Subfield(s): Computational Linguistics; Semantics; Text/Corpus Linguistics; Lexicography
Director(s): Bruno de Bessé

Abstract: Definitions are included in terminological resources so that these
resources can convey information about the meaning and usage of terms
in a domain; they facilitate and enhance communication. Definition
writing is still mostly done manually. Terminologists would, however,
benefit greatly from the assistance of (semi-)automatic definition
writing tools. Such tools would
not only accelerate the process of writing definitions but also enhance
the consistency and thus the overall quality of the definitions produced.

The general objective of my work is thus to conceive and implement
generic tools to assist in definition writing, whatever the terminographic
context, the domain or the language. In my thesis, I explore more
specifically the nature of dictionary definitions and of the activity of
definition writing in terminology. The main research topic of my thesis
relates to the selection of defining information.

Typically, terminologists construct definitions using information in texts
written by domain experts. However, not all the pieces of information
found in these texts can be considered as defining and, when they are,
not all of them are considered relevant to be included in a definition.
One of the most challenging tasks of definition writing is therefore the
selection of defining information. Thus, the two main questions raised
by definition writing, which ought to be addressed in order to
conceive and implement generic definition writing tools, are the
following:

- What determines or influences information selection?
- What types of information are relevant to defining?

Considering the different factors that are acknowledged to constrain
the selection of defining information, the one constraint that is, prima
facie, the most independent of any domain and language is the level
of reality. I therefore hypothesize that information selection is
partly a function of the type of entity defined. If this hypothesis is
verified, it is possible to propose defining models based on the
properties and relations characterizing each type of entity.

To test this hypothesis, I propose to adopt the categories of an existing
realist upper-level ontology, the Basic Formal Ontology (BFO), and
their specifications. This ontology aims to represent the types of
things that exist in the world, their properties and their relations to
other types of entities. In BFO, entity types are organized according to
philosophical distinctions and are consistent with scientific
knowledge of the world. I propose to adapt these categories to create
relational models, and to use these models to describe the internal
structure of existing definitions. The idea is that large-scale
multi-domain and multilingual corpus analyses can be used to test the
hypothesis and, if it is verified, to implement these models in a
(semi-)automatic definition writing tool.
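
As a rough illustration of how such a test could be set up, the sketch below maps entity types to the relations their model expects and computes the share of annotated relations that fall within the model. It is a minimal sketch under stated assumptions: the category names echo BFO, but the relation sets, the sample annotations and the name model_coverage are hypothetical and are not taken from the thesis.

```python
# Hypothetical relational models: entity type -> relations expected in its
# definitions. The relation sets are illustrative assumptions, not the
# models developed in the thesis.
MODELS = {
    "material entity": {"is_a", "has_part", "has_function"},
    "process": {"is_a", "has_participant", "occurs_in"},
}


def model_coverage(annotated_definitions):
    """Share of annotated relations that belong to the model of the
    defined entity's type (0.0 when there are no relations at all)."""
    total = 0
    covered = 0
    for entity_type, relations in annotated_definitions:
        expected = MODELS.get(entity_type, set())
        for relation in relations:
            total += 1
            if relation in expected:
                covered += 1
    return covered / total if total else 0.0


# Invented sample: each item is (entity type, relations found in one definition).
sample = [
    ("material entity", ["is_a", "has_part", "located_in"]),
    ("process", ["is_a", "has_participant"]),
]
coverage = model_coverage(sample)
```

A high coverage value across domains and languages would count in favor of the hypothesis; a low one would suggest that the type of entity alone does not predict which relations a definition expresses.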

A pilot experiment based on a corpus analysis of a sample of 240
terminological definitions extracted from 15 domains yielded
encouraging results, with almost 75% of the relations expressed in the
analyzed definitions pertaining to the models associated with each
entity type. This empirical study shows, moreover, which relations in
these generic models are most relevant in terminological definitions.
These results tend to confirm the tested hypothesis. The theoretical
considerations underlying this methodological proposition also
contribute to the foundations of an integrated theory of definitions in