Dissertation Information

Title: Czech Syntactic Lexicon
Author: Hana Skoumalová
Institution: Charles University in Prague, Institute of Theoretical and Computational Linguistics
Completed in: 2001
Linguistic Subfield(s): Computational Linguistics; Syntax; Lexicography
Subject Language(s): Czech
Director(s): Jarmila Panevova

Abstract: This work presents an electronic lexicon of Czech verbs. The lexicon contains valency frames of approximately 15,000 Czech verbs, and its purpose is to enrich the information contained in other electronic dictionaries. The trend of recent years has been to build large-scale reusable resources that can be combined with other resources. This work shows how the lexicon cooperates with an existing morphological lexicon and how it can be used in various NLP systems.

Chapter 2 discusses several theoretical approaches in comparison with Functional Generative Description (FGD), which is used for the dictionary. The exposition concentrates especially on the structure of the lexicon in the individual theories. A lexicon usually conforms to certain preconditions that result from the chosen theoretical framework, and so the possibility of creating a lexicon that would be transferable to another theoretical framework is explored.

Chapter 3 discusses the possibility of using existing sources, with respect to the desired result and the theoretical framework adopted for the work. Several Czech syntactic lexicons have been created in the past, but unfortunately their reuse would be rather difficult. This chapter mentions several such attempts and describes in detail the lexicon that is used as the source.

Chapter 4 describes the verb frame. First, the format of the lexical entry is described; then various types of reflexive constructions in Czech and their encoding in the lexicon are discussed. The next section shows possible diatheses of the basic (active) frame and discusses which of these diatheses can be added to the dictionary on a regular basis and which have to be treated as exceptions. The last section describes so-called equi and raising verbs.

Chapter 5 shows the procedure for automatically converting the source dictionary to the proposed format. For this conversion, an algorithm was created that assigns functors (semantic roles) to the individual members of a frame. The output of this procedure will serve as input for an editor. The chapter discusses how much of the source data can be completed by this procedure and how much needs post-editing. It also shows how the resulting lexicon can be used in NLP systems.
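
The abstract does not spell out the conversion algorithm, so the following Python fragment is only a rough sketch of what rule-based functor assignment with a post-editing flag might look like, not the author's actual procedure. The functor labels ACT, PAT and ADDR are standard FGD labels; everything else is an assumption made for illustration: the case-to-functor table, the names `Slot`, `Frame` and `assign_functors`, the form codes such as "o+loc", and the example verbs.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed default mapping from surface form to functor (illustrative only;
# the dissertation's real rules are not given in the abstract).
DEFAULT_FUNCTOR_BY_FORM = {
    "nom": "ACT",   # nominative -> actor
    "acc": "PAT",   # accusative -> patient
    "dat": "ADDR",  # dative -> addressee
}


@dataclass
class Slot:
    """One member of a valency frame (hypothetical representation)."""
    form: str                      # surface form, e.g. "nom", "acc", "o+loc"
    functor: Optional[str] = None  # filled in by assign_functors
    needs_review: bool = False     # True if a human editor must decide


@dataclass
class Frame:
    """A verb lemma with its frame members (hypothetical representation)."""
    lemma: str
    slots: List[Slot] = field(default_factory=list)


def assign_functors(frame: Frame) -> Frame:
    """Assign a default functor to each slot, flagging unclear cases.

    A slot is flagged for post-editing when its form has no default
    functor or when that functor is already used in the same frame.
    """
    used = set()
    for slot in frame.slots:
        functor = DEFAULT_FUNCTOR_BY_FORM.get(slot.form)
        if functor is None or functor in used:
            slot.needs_review = True
        else:
            slot.functor = functor
            used.add(functor)
    return frame


def share_needing_review(frames: List[Frame]) -> float:
    """Fraction of frames with at least one slot left for the editor."""
    if not frames:
        return 0.0
    flagged = sum(any(s.needs_review for s in f.slots) for f in frames)
    return flagged / len(frames)


# Illustrative input: "dát" (to give) and "mluvit" (to speak about something).
give = assign_functors(Frame("dát", [Slot("nom"), Slot("acc"), Slot("dat")]))
speak = assign_functors(Frame("mluvit", [Slot("nom"), Slot("o+loc")]))
```

In this sketch the frame for "dát" is completed automatically, while the prepositional slot of "mluvit" is left for the editor, which mirrors the split between automatically completed data and data needing post-editing that the chapter discusses.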

Chapter 6 sums up. In Section 6.1, verbs are sorted into groups according to their frames, and the results are compared with the results of other researchers. Section 6.2 discusses the prospects of language processing based on symbolic methods and the possible use of the lexicon in corpus linguistics.