LINGUIST List 15.1670

Thu May 27 2004

Diss: Semantics/Syntax: Sofronie: 'Categorial...'

Editor for this issue: Takako Matsui <takolinguistlist.org>


Directory

  1. dsofronie, Categorial Grammars acquisition to simulate natural language learning...

Message 1: Categorial Grammars acquisition to simulate natural language learning...

Date: Thu, 27 May 2004 12:46:15 -0400 (EDT)
From: dsofronie <dsofroniefree.fr>
Subject: Categorial Grammars acquisition to simulate natural language learning...

Institution: University of Lille, France
Program: PhD
Dissertation Status: Completed
Degree Date: 2004

Author: Daniela Sofronie

Dissertation Title: Categorial Grammars acquisition to simulate
natural language learning with semantic help

Linguistic Field: Computational Linguistics, Semantics, Syntax,
Text/Corpus Linguistics, Language Acquisition 

Subject Language: French (code: FRN)

Dissertation Director 1: Remi Gilleron
Dissertation Director 2: Isabelle Tellier
Dissertation Director 3: Marc Tommasi

Dissertation Abstract:

Natural language acquisition remains a challenge for modern research,
all the more so because the task requires a multidisciplinary approach
spanning cognitive science, linguistics and computer science. This
thesis addresses one part of this vast field: the acquisition of the
syntax of a language with the help of semantics, formalized as a
process of grammatical inference. Formal language theory, logic and
formal learning theory each contribute a formal model: categorial
grammars to represent syntax, Montague's logic, from which a
simplified semantics is extracted, and Gold's model of identification
in the limit from positive examples, which supports the inference
process. The choice of these models
results from a survey of psycholinguistic and cognitive studies of
child language acquisition, which support the following assumptions:
acquisition takes place in the presence of positive examples only;
and some knowledge of a semantic nature is innate or can be extracted
directly from the environment. Our
research concentrated on the class of AB, or classical, categorial
grammars, which in recent years have given rise to some interesting
learnability results within Gold's model (mainly due to Kanazawa).
This class deserves study because its members can generate all
context-free languages and because the interface it offers with
semantic interpretation makes it suitable for modeling certain
characteristics of natural languages. But the
known learnability results concern only certain subclasses (the class
of rigid grammars) or lead to computationally prohibitive algorithms
(the classes of k-valued grammars with k > 1). We define a new
subclass of classical categorial grammars that is interesting both
from a language-theoretic point of view (since its members can
generate all the structured languages of classical categorial
grammars) and from a machine learning point of view (since it is
learnable in Gold's model if suitable data are provided). To test the
validity and
the effectiveness of our proposal, we built a corpus of French texts
with semantic annotations. The experimental results are promising,
especially with regard to the influence of certain factors such as
the order of the sentences (from shortest to longest) and the
redundancy of the vocabulary, which proves beneficial, confirming our
assumptions.
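
For readers unfamiliar with AB (classical) categorial grammars, the
following is a minimal illustrative sketch, not taken from the thesis.
It shows the two application rules (A/B followed by B yields A; B
followed by B\A yields A) and a simple chart-based check that a word
sequence derives the category S. The toy lexicon, the tuple encoding
of categories and the function names (fwd, bwd, combine, derives,
valued) are all assumptions made for this example.

```python
from itertools import product

# Categories (encoding assumed for this sketch): atomic categories are
# strings; ("/", A, B) stands for A/B and ("\\", B, A) stands for B\A.
def fwd(a, b):
    return ("/", a, b)

def bwd(b, a):
    return ("\\", b, a)

def combine(left, right):
    """Return the categories derivable from two adjacent categories."""
    out = set()
    # Forward application: A/B followed by B yields A.
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        out.add(left[1])
    # Backward application: B followed by B\A yields A.
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        out.add(right[2])
    return out

def derives(lexicon, words, goal="S"):
    """Chart-based check that `words` reduces to the category `goal`."""
    n = len(words)
    if n == 0:
        return False
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexicon.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = set()
            for k in range(i + 1, j):
                for l, r in product(chart[(i, k)], chart[(k, j)]):
                    cell |= combine(l, r)
            chart[(i, j)] = cell
    return goal in chart[(0, n)]

def valued(lexicon):
    """Return k such that the lexicon is k-valued (k = 1 means rigid)."""
    return max(len(cats) for cats in lexicon.values())

# Hypothetical toy French lexicon; each word has exactly one category,
# so this grammar is rigid.
LEXICON = {
    "Marie": {"NP"},
    "Paul": {"NP"},
    "dort": {bwd("NP", "S")},              # NP\S, intransitive verb
    "aime": {fwd(bwd("NP", "S"), "NP")},   # (NP\S)/NP, transitive verb
}

if __name__ == "__main__":
    print(derives(LEXICON, ["Marie", "aime", "Paul"]))
    print(derives(LEXICON, ["dort", "Marie"]))
    print(valued(LEXICON))
```

In this sketch a grammar is simply a lexicon mapping words to sets of
categories, so "rigid" corresponds to every set having one element and
"k-valued" to every set having at most k elements.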