LINGUIST List 17.740
Fri Mar 10 2006
Diss: Computational Ling: Sofkova Hashemi: 'Automatic..'
Editor for this issue: Takako Matsui
Automatic Detection of Grammar Errors in Primary School Children's Texts. A Finite State Approach.
From: Sylvana Sofkova Hashemi <sylvanaling.gu.se>
Subject: Automatic Detection of Grammar Errors in Primary School Children's Texts. A Finite State Approach.
Institution: Göteborg University
Program: Department of Linguistics
Dissertation Status: Completed
Degree Date: 2003
Author: Sylvana Sofkova Hashemi
Dissertation Title: Automatic Detection of Grammar Errors in Primary School
Children's Texts. A Finite State Approach.
Dissertation URL: http://www.ling.gu.se/~sylvana/
Linguistic Field(s): Computational Linguistics
Subject Language(s): Swedish (swe)
This thesis concerns the analysis of grammar errors in Swedish texts written by
primary school children and the development of a finite state system for finding
such errors. Grammar errors are more frequent for this group of writers than for
adults and the distribution of the error types is different in children's texts.
In addition, other writing errors above word-level are discussed here, including
punctuation and spelling errors resulting in existing words.
The method used in the implemented tool FiniteCheck involves subtraction of
finite state automata that represent grammars with varying degrees of detail,
creating a machine that classifies phrases in a text containing certain kinds of
errors. The current version of the system handles agreement errors in noun
phrases and errors in the selection of finite and non-finite verb forms. At the
lexical level, we attach all lexical tags to words rather than using a tagger,
which could eliminate information in incorrect text that might be needed later
to find the error. At higher levels, structural ambiguity is treated by parsing
order, grammar extension and some other heuristics.
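The subtraction idea can be illustrated with a minimal sketch in Python. This is
not FiniteCheck's implementation: the DFA class, the subtract function, the tag
names (DET_UTR, DET_NEU, N_UTR, N_NEU) and the two toy grammars are all
hypothetical, and the input is assumed to be a single unambiguous tag sequence,
whereas the thesis keeps all lexical tags for each word. A broad automaton
accepts any determiner followed by a noun, a narrow one accepts only
gender-agreeing pairs, and their difference accepts exactly the phrases with an
agreement error.

```python
class DFA:
    """Partial deterministic automaton; a missing transition means rejection."""

    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = dict(transitions)  # (state, symbol) -> state

    def step(self, state, symbol):
        # None plays the role of an implicit dead state.
        if state is None:
            return None
        return self.transitions.get((state, symbol))

    def accepts(self, symbols):
        state = self.start
        for symbol in symbols:
            state = self.step(state, symbol)
        return state in self.accepting


def subtract(broad, narrow, alphabet):
    """Product construction for the language of `broad` minus that of `narrow`."""
    start = (broad.start, narrow.start)
    transitions, accepting = {}, set()
    seen, todo = {start}, [start]
    while todo:
        pair = todo.pop()
        b, n = pair
        # Accept where the broad grammar accepts and the narrow one does not.
        if b in broad.accepting and n not in narrow.accepting:
            accepting.add(pair)
        for symbol in alphabet:
            next_b = broad.step(b, symbol)
            if next_b is None:
                continue  # the broad grammar rejects, so the difference does too
            nxt = (next_b, narrow.step(n, symbol))
            transitions[(pair, symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return DFA(start, accepting, transitions)


# Hypothetical tags: UTR = common gender, NEU = neuter gender.
ALPHABET = ["DET_UTR", "DET_NEU", "N_UTR", "N_NEU"]

# Broad grammar: any determiner followed by any noun.
broad_np = DFA(0, {2}, {
    (0, "DET_UTR"): 1, (0, "DET_NEU"): 1,
    (1, "N_UTR"): 2, (1, "N_NEU"): 2,
})

# Narrow grammar: determiner and noun must agree in gender.
narrow_np = DFA(0, {3}, {
    (0, "DET_UTR"): 1, (1, "N_UTR"): 3,
    (0, "DET_NEU"): 2, (2, "N_NEU"): 3,
})

np_agreement_errors = subtract(broad_np, narrow_np, ALPHABET)
```

With these toy grammars, np_agreement_errors.accepts(["DET_NEU", "N_UTR"]) holds
for a phrase such as "ett bil" (correct Swedish is "en bil"), while the agreeing
sequence ["DET_UTR", "N_UTR"] is rejected. Both grammars describe only valid
structure at different levels of detail; no error pattern is written by hand.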
The simple finite state technique of subtraction has the advantage that the
grammars one needs to write to find errors are always positive, describing the
valid rules of Swedish rather than grammars describing the structure of errors.
The rule sets remain quite small and practically no prediction of errors is
necessary.
The linguistic performance of the system is promising: for the error types
implemented, its results are comparable to those of other Swedish grammar
checking tools when tested on a small adult text not previously analyzed by the
system. The performance of the other Swedish tools was also tested on the
children's data collected for this study, revealing quite low recall rates. This
motivates adapting grammar checking techniques to children, whose errors differ
from those of adult writers and pose a greater challenge to current grammar
checkers, which are oriented towards texts written by adults.
The robustness and modularity of FiniteCheck make it possible to perform both
error detection and diagnostics. Moreover, the grammars can in principle be
reused for other applications that do not necessarily have anything to do with
error detection, such as extracting information in a given text or even parsing.
Key Words: grammar errors, spelling errors, punctuation, children's writing,
Swedish, language checking, light parsing, finite state technology