LINGUIST List 15.2577

Thu Sep 16 2004

FYI: Assessing Well-formedness Using Google Script

Editor for this issue: Ann Sawyer


  1. Danko Sipka, Assessing well-formedness using number of hits in Google

Message 1: Assessing well-formedness using number of hits in Google

Date: Tue, 14 Sep 2004 00:30:33 -0700
From: Danko Sipka
Subject: Assessing well-formedness using number of hits in Google

Dear Linguists,

I frequently use Google to judge which of two options is lexically or
morphosyntactically well-formed in various languages, and I advise my
students to do the same. To save the time needed to query Google twice
for a single inquiry, I have created a simple script at:

which lets you enter two options, choose the target language, and then
see the number of hits for both options in one window. For example, if
a student of English enters "take the liberty" as the first option and
"take a liberty" as the second, the hit counts will show that the first
option is well-formed while the second is not. Similarly, if a student
of Russian enters "v vuz" as one option and "na vuz" as the other and
selects Russian as the target language, the counts will show that the
first option is well-formed while the second is not. A student of
German can check the gender of a noun, a student of Polish can check
the masculine inanimate genitive singular ending, etc.
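
As a rough illustration of the comparison described above (a sketch,
not the script itself), the logic might look like the following in
Python. The names quote_phrase, compare_options, and count_hits are
hypothetical. In particular, count_hits stands for whatever routine
returns a hit count for a query and a target language; how it obtains
that number is deliberately left open, and the counter used in the
example returns placeholder values rather than real Google counts.

```python
from typing import Callable, Optional


def quote_phrase(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is searched as an exact string."""
    return '"' + phrase.strip().replace('"', "") + '"'


def compare_options(first: str, second: str, language: str,
                    count_hits: Callable[[str, str], int]) -> dict:
    """Look up both options once and report which one has more hits.

    count_hits is a hypothetical callable (query, language) -> hit count;
    how it talks to a search engine is an assumption left open here.
    """
    hits_first = count_hits(quote_phrase(first), language)
    hits_second = count_hits(quote_phrase(second), language)
    preferred: Optional[str]
    if hits_first == hits_second:
        preferred = None  # a tie gives no evidence either way
    else:
        preferred = first if hits_first > hits_second else second
    return {"hits": [(first, hits_first), (second, hits_second)],
            "preferred": preferred}


# Stand-in counter: the numbers are placeholders, not real hit counts.
def placeholder_counter(query: str, language: str) -> int:
    return {'"take the liberty"': 2, '"take a liberty"': 1}.get(query, 0)


result = compare_options("take the liberty", "take a liberty", "en",
                         placeholder_counter)
print(result["preferred"])
```

Passing the counter in as a parameter keeps the comparison independent
of any particular search service, so the same function can be tried
with a stub, as here, or with a real lookup.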

I plan to add lemmatizers for several Slavic languages, which would
make it possible to search for words in all their inflectional forms,
but even in its current form the script may be of interest.
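
One way such a search over inflectional forms could work, offered only
as a sketch under stated assumptions, is to join the forms of a word
into a single query with the OR operator. The function name
build_inflection_query is hypothetical, and the forms are typed in by
hand here; producing them automatically would be the job of a
lemmatizer, which this example does not attempt.

```python
from typing import Iterable


def build_inflection_query(forms: Iterable[str]) -> str:
    """Join inflectional forms into one query: "form1" OR "form2" ..."""
    unique = []
    for form in forms:
        form = form.strip()
        # Skip empty entries and repeated (syncretic) forms.
        if form and form not in unique:
            unique.append(form)
    return " OR ".join('"' + form + '"' for form in unique)


# Forms are supplied by hand here; a lemmatizer would generate them.
print(build_inflection_query(["vuz", "vuza", "vuzu", "vuzom", "vuze"]))
```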



Danko Sipka
Research Associate Professor and Acting Director
Critical Languages Institute (
Arizona State University