LINGUIST List 13.967

Mon Apr 8 2002

Diss: Computational Ling: Skoumalova "Czech..."

Editor for this issue: Karolina Owczarzak <karolina@linguistlist.org>


Directory

  1. Hana.Skoumalova, Computational Ling: Skoumalova "Czech syntactic lexicon"

Message 1: Computational Ling: Skoumalova "Czech syntactic lexicon"

Date: Mon, 08 Apr 2002 12:33:49 +0000
From: Hana.Skoumalova <Hana.Skoumalova@ff.cuni.cz>
Subject: Computational Ling: Skoumalova "Czech syntactic lexicon"


New Dissertation Abstract

Institution: Charles University
Program: Institute of Theoretical and Computational Linguistics
Dissertation Status: Completed
Degree Date: 2001

Author: Hana Skoumalova 
Dissertation Title: 
Czech syntactic lexicon

Dissertation URL: http://utkl.ff.cuni.cz/~skoumal/dissertation

Linguistic Field: Syntax, Lexicography, Computational Linguistics

Dissertation Director 1: Jarmila Panevova


Dissertation Abstract: 

In this work, an electronic lexicon of Czech verbs is presented. The
lexicon contains valency frames of ca. 15,000 Czech verbs, and its
purpose is to enrich the information contained in other electronic
dictionaries. The trend in recent years has been to build large-scale,
reusable resources which can be combined with other resources. This
work shows how the lexicon cooperates with an existing morphological
lexicon and how it can be used in various NLP systems.

Chapter 2 discusses several theoretical approaches in comparison with
Functional Generative Description (FGD), which is used for the
dictionary. The exposition concentrates especially on the structure
of lexicons in the individual theories. A lexicon usually conforms to
certain preconditions resulting from the theoretical framework used,
and so the possibility of creating a lexicon which would be
transferable to another theoretical framework is explored.

Chapter 3 discusses the possibility of using existing sources, with
respect to the desired result and the theoretical framework adopted
for the work. Several Czech syntactic lexicons have been created in
the past, but unfortunately their reuse would be rather difficult.
This chapter mentions several such attempts, and describes in detail
the lexicon that is used as the source.

Chapter 4 describes the verb frame. First, the format of the lexical
entry is described; then various types of reflexive constructions in
Czech and their encoding in the lexicon are discussed. The next
section shows possible diatheses of the basic (active) frame and
discusses which of these diatheses can be added to the dictionary on
a regular basis and which have to be treated as exceptions. The last
section describes so-called equi and raising verbs.

In Chapter 5, the procedure for automatic conversion of the source
dictionary to the proposed format is shown. For this conversion, an
algorithm was created which assigns functors (semantic roles) to the
individual members of a frame. The output of this procedure will serve
as input for an editor. The chapter discusses how much of the source
data can be completed by this procedure and how much needs
post-editing. It is also shown how the resulting lexicon can be used
in NLP systems.
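
The idea of assigning functors automatically and leaving the rest for
post-editing can be illustrated with a minimal sketch. It is not the
algorithm from the dissertation, which the abstract does not specify.
The Slot class, the form labels ("nom", "acc", "dat", "o+loc") and the
default mapping are all assumptions made for illustration; only the
functor names ACT, PAT and ADDR are taken from FGD.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    """One member of a verb frame (hypothetical representation)."""
    form: str                      # surface form label, e.g. "nom", "acc"
    functor: Optional[str] = None  # FGD functor, filled in by the conversion


# Assumed default mapping from surface form to functor. A real
# conversion would need many more rules and verb-specific exceptions.
DEFAULT_FUNCTOR: Dict[str, str] = {
    "nom": "ACT",   # actor
    "acc": "PAT",   # patient
    "dat": "ADDR",  # addressee
}


def assign_functors(frame: List[Slot]) -> List[Slot]:
    """Fill in functors where a default exists.

    Returns the slots that could not be resolved automatically,
    i.e. the ones left for manual post-editing.
    """
    unresolved = []
    for slot in frame:
        slot.functor = DEFAULT_FUNCTOR.get(slot.form)
        if slot.functor is None:
            unresolved.append(slot)
    return unresolved


# A made-up three-member frame: two members resolve, one does not.
frame = [Slot("nom"), Slot("acc"), Slot("o+loc")]
todo = assign_functors(frame)
```

In this sketch the prepositional slot has no default functor, so it is
returned in the list of members that would go to the editor.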

Chapter 6 sums up the work. In Section 6.1, verbs are sorted into
groups according to their frames, and the results are compared with
the results of other researchers. In Section 6.2, the prospects of
language processing based on symbolic methods are discussed, along
with the possible use of the lexicon in corpus linguistics.