|Title:||The Semantic Structure of Roget's: A whole-language Thesaurus|
|Author:||Leonard Old|
|Institution:||Indiana University Bloomington, School of Library and Information Science|
|Linguistic Subfield(s):||Semantics; Lexicography|
|Abstract:||This study analyzed a database version of Roget’s Thesaurus (Roget’s International Thesaurus, 3rd Edition, 1962) for frequency and connectivity patterns among the words, senses, and cross-references in order to identify the implicit conceptual structure. Descriptive statistics, lattices, and information maps were used to identify semantic patterns implicit in the data at both the local and global levels of the structure.
The explicit organizational structure of the thesaurus is, at the local level, sets of synonyms; and at the global level, a hierarchy of concepts. In contrast, the implicit organization at the local level has the characteristics of dictionary sense definitions (genus and differentiae), and at the global level has the characteristics of a small-world social network. The concept of genus and differentiae provides a model that can be seen to account for the distribution of polysemy within senses and across the Thesaurus. The small-world network model can be seen to account for the incidence of semantic hubs and authorities among cross-references, and conceptual and semantic switching centers among senses and words in the Thesaurus.
Previous work on Roget’s Thesaurus calculated chains and equivalence relations algorithmically from senses and words. That research found an inner semantic core of most-densely-connected words and senses. This study expanded on that research by identifying the semantic structure of the inner core and relating it to the most polysemous words in Roget’s.
While the largest thesaurus Categories relate to concrete objects such as plants, animals, food, clothing and technology, the most-connected words (in terms of numbers of senses and synonyms) were found to relate to abstract concepts such as motion, agitation and what appear to be concepts related to survival. This observation was supported by frequency counts, and global cross-reference and word connectivity patterns.
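As a rough illustration of what "most-connected words (in terms of numbers of senses and synonyms)" could mean in practice, the sketch below counts, for each word, how many senses it appears in and how many distinct other words share at least one of those senses. The data, the sense labels, and the function name `connectivity` are all hypothetical and invented for this example; they are not taken from Roget's or from the study's database, and the study's actual measures may differ.

```python
from collections import defaultdict

# Hypothetical toy data: (word, sense) pairs invented for illustration only.
ENTRIES = [
    ("run", "motion"), ("run", "flow"), ("run", "operate"),
    ("move", "motion"), ("stir", "motion"), ("stir", "agitation"),
    ("shake", "agitation"), ("stream", "flow"), ("oak", "plants"),
]

def connectivity(entries):
    """Return {word: (number of senses, number of distinct synonyms)}."""
    senses_of = defaultdict(set)  # word -> senses the word appears in
    words_in = defaultdict(set)   # sense -> words listed under that sense
    for word, sense in entries:
        senses_of[word].add(sense)
        words_in[sense].add(word)

    result = {}
    for word, senses in senses_of.items():
        # Synonyms: every other word that shares at least one sense.
        synonyms = set().union(*(words_in[s] for s in senses)) - {word}
        result[word] = (len(senses), len(synonyms))
    return result

table = connectivity(ENTRIES)
# Rank by sense count first, then by synonym count.
ranked = sorted(table, key=lambda w: table[w], reverse=True)
```

In this toy set the abstract-sounding word "run" ranks first, while "oak" has no synonyms at all, which loosely mirrors the contrast the abstract draws between category size and word connectivity.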