LINGUIST List 15.976

Tue Mar 23 2004

Software: Request for Feedback

Editor for this issue: Neil Salmond <neillinguistlist.org>


Directory

  • Andrew Kehoe, Request for Feedback

    Message 1: Request for Feedback

    Date: Mon, 22 Mar 2004 10:09:38 -0500 (EST)
    From: Andrew Kehoe <andrewrdues.liv.ac.uk>
    Subject: Request for Feedback


    Dear Colleague

    The Research and Development Unit for English Studies is made up of a small team of corpus linguists, software engineers and statisticians. Our aim is to carry out fundamental and applied research in corpus linguistics, with a view to developing new descriptions of the English language in use, and tools for the extraction and management of knowledge in electronic databases.

    For the past three years we have been working on a government-funded project called SHARES (System of Hypermatrix Analysis, Retrieval, Evaluation and Summarisation). Its aim is to test the hypothesis that documents on similar topics, even when written by different authors, share patterns of lexical repetition consistently enough to support a high-performance retrieval engine.

    We have developed an intertextual mechanism for the identification and ranking of documents in terms of their relatedness to one or more exemplar texts. The SHARES approach is novel in taking the degree of Lexical Cohesion between texts as the primary criterion for document similarity.
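    To make the idea of ranking by relatedness to an exemplar concrete, here is a minimal Python sketch. It is not the SHARES mechanism, whose details are not given in this message; it simply counts content words repeated between an exemplar and each candidate text. The function names, the tiny stopword list and the sample texts are all illustrative assumptions.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption, not taken from SHARES).
STOPWORDS = {"the", "and", "a", "for", "of", "to", "in"}

def content_words(text):
    """Lowercase a text and count its alphabetic tokens, minus stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def repetition_score(exemplar, candidate):
    """Number of content-word tokens the two texts share (multiset overlap)."""
    shared = content_words(exemplar) & content_words(candidate)
    return sum(shared.values())

def rank_by_relatedness(exemplar, candidates):
    """Return (name, score) pairs, most related candidate first."""
    scored = [(name, repetition_score(exemplar, text))
              for name, text in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample texts, invented for this sketch.
exemplar = "The storm flooded the coastal town and closed the harbour."
candidates = {
    "doc_a": "Flooded streets closed the harbour after the storm.",
    "doc_b": "The council approved a new budget for schools.",
}
ranking = rank_by_relatedness(exemplar, candidates)
print(ranking)  # [('doc_a', 4), ('doc_b', 0)]
```

    A raw overlap count favours longer texts, so a realistic measure would normalise for length; the sketch leaves that out for brevity.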

    We have produced an online demo system and user guide, and would appreciate your feedback:

    http://www.rdues.liv.ac.uk/sharesguide/

    This demo system uses a small subset of the US TDT2 (Topic Detection and Tracking) corpus: 11 topics, with 3 English articles on each. It allows you to compare a pair of articles, or one article with all the other articles in the test corpus. Stemming and weighting options are available. The demo is a cut-down version of our full SHARES software, designed for faster online access.
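    The two comparison modes and the stemming and weighting options can be pictured with the Python sketch below. It is an assumption-laden illustration, not the demo's implementation: the suffix-stripping stemmer, the IDF-style weighting, the cosine measure and the three sample articles are all stand-ins chosen for this example.

```python
import math
import re
from collections import Counter

def crude_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def term_counts(text, stem=False):
    """Count lowercase alphabetic tokens, optionally stemmed."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if stem:
        tokens = [crude_stem(t) for t in tokens]
    return Counter(tokens)

def idf_weights(all_counts):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(all_counts)
    df = Counter(term for counts in all_counts for term in counts)
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def cosine(a, b, weights=None):
    """Cosine similarity of two term-count vectors, optionally term-weighted."""
    weights = weights or {}
    wa = {t: f * weights.get(t, 1.0) for t, f in a.items()}
    wb = {t: f * weights.get(t, 1.0) for t, f in b.items()}
    dot = sum(v * wb.get(t, 0.0) for t, v in wa.items())
    norm = (math.sqrt(sum(v * v for v in wa.values()))
            * math.sqrt(sum(v * v for v in wb.values())))
    return dot / norm if norm else 0.0

def compare_one_to_all(query_name, corpus, stem=False, weight=False):
    """Score one article against every other article, best match first."""
    counts = {name: term_counts(text, stem) for name, text in corpus.items()}
    weights = idf_weights(list(counts.values())) if weight else None
    scores = {name: cosine(counts[query_name], c, weights)
              for name, c in counts.items() if name != query_name}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical three-article corpus, invented for this sketch.
corpus = {
    "a1": "Floods closed the harbour",
    "a2": "The flood is closing harbours",
    "b1": "Council approves school budget",
}
pair_score = cosine(term_counts(corpus["a1"]), term_counts(corpus["a2"]))
plain = compare_one_to_all("a1", corpus)
stemmed = compare_one_to_all("a1", corpus, stem=True)
weighted = compare_one_to_all("a1", corpus, stem=True, weight=True)
```

    In this toy corpus, stemming raises the score between "a1" and "a2" because "floods"/"flood", "closed"/"closing" and "harbour"/"harbours" collapse to the same stems.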

    A feedback form is provided on our website (http://www.rdues.liv.ac.uk/sfeedback.shtml). If you prefer, you may instead send comments by email to webmaster rdues.liv.ac.uk.

    Thank you in advance

    Andrew Kehoe
    Research and Development Unit for English Studies
    University of Liverpool, UK

    Subject Language: English (Language Code: ENG)