Dissertation Information

Title: The Semantic Structure of Roget's: A whole-language Thesaurus
Author: Leonard Old
Institution: Indiana University Bloomington, School of Library and Information Science
Completed in: 2003
Linguistic Subfield(s): Semantics; Lexicography;
Director(s): Charles Davis
Ralf Shaw

Abstract: This study analyzed a database version of Roget’s Thesaurus (Roget’s International Thesaurus, 3rd Edition, 1962) for frequency and connectivity patterns among the words, senses, and cross-references in order to identify the implicit conceptual structure. Descriptive statistics, lattices, and information maps were used to identify semantic patterns implicit in the data at both the local and global levels of the structure.

The explicit organizational structure of the thesaurus is, at the local level, sets of synonyms; and at the global level, a hierarchy of concepts. In contrast, the implicit organization at the local level has the characteristics of dictionary sense definitions (genus and differentiae), and at the global level has the characteristics of a small-world social network. The concept of genus and differentiae provides a model that can be seen to account for the distribution of polysemy within senses and across the Thesaurus. The small-world network model can be seen to account for the incidence of semantic hubs and authorities among cross-references, and conceptual and semantic switching centers among senses and words in the Thesaurus.

Previous work on Roget’s Thesaurus calculated chains and equivalence relations algorithmically from senses and words. That research found an inner semantic core of the most densely connected words and senses. This study expanded on that research by identifying the semantic structure of the inner core and relating it to the most polysemous words in Roget’s.

While the largest thesaurus Categories relate to concrete objects such as plants, animals, food, clothing and technology, the most-connected words (in terms of numbers of senses and synonyms) were found to relate to abstract concepts such as motion, agitation and what appear to be concepts related to survival. This observation was supported by frequency counts, and global cross-reference and word connectivity patterns.
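
As a rough illustration of what "most-connected in terms of numbers of senses and synonyms" could mean in practice, the sketch below counts, for each word, how many senses it appears in and how many distinct synonyms it shares those senses with. The data, the sense identifiers, and the `connectivity` function are hypothetical and invented for this example; they are not taken from the dissertation or from Roget's Thesaurus, and the study's actual measures may differ.

```python
from collections import defaultdict

# Hypothetical toy data: each sense (a synonym set) maps to the words it contains.
# These entries are invented for illustration, not drawn from Roget's Thesaurus.
senses = {
    "s1": {"run", "move", "go"},
    "s2": {"run", "flow", "stream"},
    "s3": {"run", "operate", "work"},
    "s4": {"move", "stir", "agitate"},
    "s5": {"tree", "oak"},
}

def connectivity(senses):
    """Return {word: (number of senses, number of distinct synonyms)}."""
    word_senses = defaultdict(set)
    word_synonyms = defaultdict(set)
    for sense_id, words in senses.items():
        for word in words:
            word_senses[word].add(sense_id)
            # Every other word in the same sense counts as a synonym.
            word_synonyms[word] |= words - {word}
    return {w: (len(word_senses[w]), len(word_synonyms[w])) for w in word_senses}

table = connectivity(senses)

# Rank words by sense count first, then by synonym count.
ranked = sorted(table.items(), key=lambda kv: kv[1], reverse=True)
```

In this toy data the polysemous word "run" ranks first, while a concrete noun such as "tree" sits near the bottom, which loosely mirrors the contrast the abstract draws between abstract and concrete vocabulary.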