
Software Details

Title: NooJ: Finite-State Language Processing
Submitter: Chris Humphrey
Description: http://www.nooj4nlp.net/pages/nooj.html

NooJ is both a corpus processing tool and a linguistic development
environment. It allows linguists to formalize several levels of linguistic
phenomena: orthography and spelling; lexicons for simple words, multiword
units and frozen expressions; inflectional, derivational and productive
morphology; and local, structural and transformational syntax. For each
of these levels, NooJ provides linguists with one or more formal tools
specifically designed to facilitate the description of that phenomenon, as
well as parsing tools designed to be as computationally efficient as
possible. This approach distinguishes NooJ from most computational
linguistic tools, which provide a single formalism that is meant to
describe everything. As a corpus processing tool, NooJ allows users to apply
sophisticated linguistic queries to large corpora in order to build indices
and concordances, annotate texts automatically, perform statistical
analyses, etc.

NooJ is freely available, and linguistic modules can already be downloaded
for Acadian, Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian,
French, English, German, Hebrew, Greek, Hungarian, Italian, Polish,
Portuguese, Spanish and Turkish.
Linguistic Field(s): Morphology
Syntax
Text/Corpus Linguistics

Language Specialty: Armenian
Bulgarian
Chinese, Mandarin
Catalan-Valencian-Balear
English
French
German
Greek, Modern
Hebrew
Hungarian
Italian
Portuguese
Polish
Spanish
Turkish
Croatian

LL Issue: 21.4878
Date Posted: 03-Dec-2010
