LINGUIST List 17.740

Fri Mar 10 2006

Diss: Computational Ling: Sofkova Hashemi: 'Automatic..'

Editor for this issue: Takako Matsui <takolinguistlist.org>


To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.html.
Directory
        1.    Sylvana Sofkova Hashemi, Automatic Detection of Grammar Errors in Primary School Children's Texts. A Finite State Approach.


Message 1: Automatic Detection of Grammar Errors in Primary School Children's Texts. A Finite State Approach.
Date: 08-Mar-2006
From: Sylvana Sofkova Hashemi <sylvanaling.gu.se>
Subject: Automatic Detection of Grammar Errors in Primary School Children's Texts. A Finite State Approach.


Institution: Göteborg University
Program: Department of Linguistics
Dissertation Status: Completed
Degree Date: 2003

Author: Sylvana Sofkova Hashemi

Dissertation Title: Automatic Detection of Grammar Errors in Primary School
Children's Texts. A Finite State Approach.

Dissertation URL: http://www.ling.gu.se/~sylvana/

Linguistic Field(s): Computational Linguistics; Language Acquisition; Syntax

Subject Language(s): Swedish (swe)


Dissertation Director(s):
Robin Cooper

Dissertation Abstract:

This thesis concerns the analysis of grammar errors in Swedish texts written by
primary school children and the development of a finite state system for finding
such errors. Grammar errors are more frequent for this group of writers than for
adults and the distribution of the error types is different in children's texts.
In addition, other writing errors above the word level are discussed, including
punctuation errors and spelling errors that result in existing words.

The method used in the implemented tool FiniteCheck involves subtraction of
finite state automata that represent grammars with varying degrees of detail,
creating a machine that classifies phrases in a text containing certain kinds of
errors. The current version of the system handles errors concerning agreement in
noun phrases, and verb selection of finite and non-finite forms. At the lexical
level, we attach all lexical tags to words and do not use a tagger, which could
eliminate information in incorrect text that might be needed later to find the
error. At higher levels, structural ambiguity is treated by parsing order,
grammar extension and some other heuristics.
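
The subtraction idea can be illustrated with a minimal sketch. It is not
FiniteCheck's code: the DFA class, the subtract function, the tag names and the
two toy noun phrase grammars are all hypothetical, and the sketch ignores the
lexical ambiguity the thesis keeps by attaching all tags. A broad grammar
accepts any determiner followed by any noun, a narrow grammar accepts only
gender-agreeing pairs, and the difference of the two accepts exactly the
agreement errors.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    start: object
    accepting: frozenset
    delta: dict  # (state, symbol) -> state; a missing entry means "reject"

    def accepts(self, symbols):
        state = self.start
        for sym in symbols:
            state = self.delta.get((state, sym))
            if state is None:
                return False
        return state in self.accepting


def subtract(broad, narrow, alphabet):
    """Product construction for L(broad) minus L(narrow)."""
    start = (broad.start, narrow.start)
    delta, accepting = {}, set()
    todo, seen = [start], {start}
    while todo:
        pair = todo.pop()
        p, q = pair
        # Accept when broad accepts and narrow does not (q is None once
        # narrow has fallen off its transition table).
        if p in broad.accepting and q not in narrow.accepting:
            accepting.add(pair)
        for sym in alphabet:
            p2 = broad.delta.get((p, sym))
            if p2 is None:
                continue  # broad rejects, so the difference rejects too
            q2 = narrow.delta.get((q, sym))
            nxt = (p2, q2)
            delta[(pair, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return DFA(start, frozenset(accepting), delta)


# Hypothetical tags: determiners and nouns in common (UTR) or neuter (NEU) gender.
ALPHABET = ["DET_UTR", "DET_NEU", "N_UTR", "N_NEU"]

# Broad grammar: any determiner followed by any noun.
broad_np = DFA(0, frozenset({2}), {
    (0, "DET_UTR"): 1, (0, "DET_NEU"): 1,
    (1, "N_UTR"): 2, (1, "N_NEU"): 2,
})

# Narrow grammar: determiner and noun must agree in gender.
narrow_np = DFA(0, frozenset({3}), {
    (0, "DET_UTR"): 1, (0, "DET_NEU"): 2,
    (1, "N_UTR"): 3, (2, "N_NEU"): 3,
})

# Only positive grammars were written; the error language falls out of the difference.
np_errors = subtract(broad_np, narrow_np, ALPHABET)

print(np_errors.accepts(["DET_UTR", "N_NEU"]))  # e.g. "en hus" instead of "ett hus"
print(np_errors.accepts(["DET_NEU", "N_NEU"]))  # e.g. "ett hus"
```

Neither toy grammar describes what an error looks like. The mismatching
sequences are simply whatever the broad grammar admits and the narrow one
does not.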

The simple finite state technique of subtraction has the advantage that the
grammars one needs to write to find errors are always positive, describing the
valid rules of Swedish rather than grammars describing the structure of errors.
The rule sets remain quite small and practically no prediction of errors is
necessary.

The linguistic performance of the system is promising: when tested on a small
adult text not previously analyzed by the system, it shows results comparable to
those of other Swedish grammar checking tools for the error types implemented.
The performance of the other Swedish tools was also tested on the children's
data collected for this study, revealing quite low recall rates. This motivates
adapting grammar checking techniques to children, whose errors differ from those
of adult writers and pose a greater challenge to current grammar checkers, which
are oriented towards texts written by adults.

The robustness and modularity of FiniteCheck make it possible to perform both
error detection and diagnostics. Moreover, the grammars can in principle be
reused for other applications that do not necessarily have anything to do with
error detection, such as extracting information in a given text or even parsing.

Key Words: grammar errors, spelling errors, punctuation, children's writing,
Swedish, language checking, light parsing, finite state technology

