Software Details

Title: CLAIRLIB release
Submitter: Mark Joseph
Description: Clairlib, the Clair Library, is now available


The University of Michigan's CLAIR (Computational Linguistics And
Information Retrieval) group is happy to present the second release of
Clairlib, the Clair library.

The Clair library is written in Perl and is intended to simplify a number
of generic tasks in Natural Language Processing (NLP) and Information
Retrieval (IR). Its architecture also allows for external software to be
plugged in with very little effort.

Clairlib features a tiered architecture with a core shared by all
applications and subject-specific libraries (currently in political science
and bioinformatics).


Native: Tokenization, Summarization, LexRank, Biased LexRank, Document
Clustering, Document Indexing, PageRank, Biased PageRank, Web Graph
Analysis, Bioinformatics Text Analysis, Political Science Text Analysis,
Network Building, Power Law Distribution Analysis, Network Analysis and
Computation (Watts-Strogatz Clustering Coefficient, Cosines, Random Walks),
TF, IDF

Imported: Stemming, Sentence Segmentation, Web Page Download, Web Crawling,
XML Parsing, XML Tree Building, XML Writing
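
As an illustration of two of the items listed above (term frequency /
inverse document frequency weighting and cosine similarity), here is a
minimal sketch in Python. It is not Clairlib code and does not use the
Clairlib API, which is written in Perl. The function names tokenize,
tf_idf_vectors and cosine, the whitespace tokenizer, the raw
tf * log(N/df) weighting, and the three sample sentences are all
assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only for illustration.
docs = [
    "the senate passed the bill",
    "the house passed the budget bill",
    "gene regulation in yeast",
]
vecs = tf_idf_vectors(docs)
```

With these sample sentences, the first two documents share several
weighted terms and so have a positive cosine, while the first and third
share none and have a cosine of zero. Pairwise cosines of this kind are
the usual starting point for similarity graphs such as those used in
LexRank-style summarization.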


This work has been supported in part by grants R01 LM008106 'Representing
and Acquiring Knowledge of Genome Regulation' and U54 DA021519 'National
Center for Integrative Bioinformatics', both from the National Institutes
of Health, as well as grants IDM 0329043 'Probabilistic and Link-Based
Methods for Exploiting Very Large Textual Repositories' and DHB 0527513
'The Dynamics of Political Representation and Political Rhetoric', both from
the National Science Foundation.


The Clair Library is developed by the Clair group at the University of
Michigan. It encompasses the functionality of MEAD and perltree, two of
CLAIR's earlier releases.

Project design: Dragomir R. Radev

Main implementers: Anthony Fader, Mark Hodges, and Dragomir R. Radev

Additional code by: Timothy Allison, Michael Dagitses, Aaron Elkiss, Gunes
Erkan, Scott Gifford, Mark Joseph, Samuela Pollack, and Adam Winkel

Linguistic Field(s): Computational Linguistics

LL Issue: 17.3080
Date Posted: 19-Oct-2006
