LINGUIST List 13.3157

Mon Dec 2 2002

Software: MEAD - Multidocument Summarization Environment

Editor for this issue: James Yuells <jameslinguistlist.org>

Directory

Dragomir Radev, MEAD - Multidocument Summarization Environment

Message 1: MEAD - Multidocument Summarization Environment

Date: Tue, 26 Nov 2002 18:16:38 +0000
From: Dragomir Radev <radevUmich.edu>
Subject: MEAD - Multidocument Summarization Environment

MEAD v3.07 released
-------------------

http://www.summarization.com/mead

MEAD is a multi-document summarization system with multi-lingual capabilities. The MEAD system implements extractive summarization, whereby summaries are produced by selecting a subset of highly relevant sentences from the cluster's overall set of sentences. MEAD can summarize clusters of English documents on most POSIX-conforming operating systems and can summarize clusters of Mandarin Chinese documents on a subset of these operating systems.
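
The extractive approach described above can be illustrated with a minimal sketch. It is not MEAD's code (MEAD itself is written in Perl), and the scoring function here, average word frequency across the cluster, is only a stand-in assumption for whatever relevance features a real system uses. The names tokenize and extractive_summary are hypothetical.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extractive_summary(cluster, num_sentences):
    """Select num_sentences sentences from a cluster of documents.

    cluster is a list of documents, each a list of sentence strings.
    Scoring is a toy assumption: the mean cluster-wide frequency of a
    sentence's words. It is not MEAD's actual scoring.
    """
    sentences = [s for doc in cluster for s in doc]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence positions by score, keep the best ones, then restore
    # the original order so the extract reads in document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return [sentences[i] for i in chosen]
```

Calling extractive_summary(cluster, 2) on a small cluster returns the two highest-scoring sentences in their original order; the output is always a subset of the input sentences, which is what distinguishes extractive from generated summaries.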

The MEAD system has been under development since 2000. Versions 1.0 and 2.0 were developed at the University of Michigan. Version 3.0 was developed at a six-week workshop at Johns Hopkins University. Versions 3.01 through 3.06 were incremental improvements by the JHU workshop team members. With version 3.07, development of MEAD has moved back to the University of Michigan.

MEAD 3.07 represents a major refactoring of previous MEAD versions. The current version supports all the functionality of previous versions, but also has many new features. Some of these are:

- Version 3.07 is much more configurable than previous versions of MEAD. It allows for both system-wide and user-specific configuration files.

- It has a simplified user interface. Previous versions required the user to manually edit a mead.config file and use a combination of Unix shell commands to produce summaries. While the current version still supports this interface, MEAD 3.07 has a single script interface that essentially eliminates the need for manual editing of mead.config files.

- MEAD Eval, a previously free-standing tool for evaluating summarizers, has been incorporated into the current version of MEAD. This allows users to evaluate existing summarizers, as well as the performance of the base MEAD system and any user modifications. MEAD Eval supports co-selection metrics (percent agreement, precision, recall, Kappa) and content-based evaluation metrics (such as word overlap and longest common subsequence), as well as relative utility.

- MEAD uses an extensive collection of custom Perl modules that may be suitable for use in many natural language applications, such as question answering or novelty detection.

- Random and lead-based summarizers have been incorporated into the MEAD framework. These summarizers provide useful examples of how to create new MEAD modules.

- The documentation for the current version has been expanded, and now includes a significant number of example use cases and tutorials for customizing each of MEAD's modules.
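
To make the baseline summarizers and the evaluation metrics in the list above more concrete, here is a rough Python sketch. It does not reproduce MEAD or MEAD Eval; all function names are hypothetical, summaries are represented as sets of sentence indices, and Kappa is computed as Cohen's kappa for two judges, which is an assumption (MEAD Eval's exact formulation may differ).

```python
import random


def lead_summary(sentences, n):
    """Lead-based baseline: indices of the first n sentences."""
    return set(range(min(n, len(sentences))))


def random_summary(sentences, n, seed=0):
    """Random baseline: n sentence indices drawn without replacement."""
    rng = random.Random(seed)
    return set(rng.sample(range(len(sentences)), min(n, len(sentences))))


def co_selection(system, reference, total):
    """Co-selection metrics for two extracts over `total` sentences.

    system and reference are sets of selected sentence indices.
    Kappa here is Cohen's kappa (an assumption, see the lead-in).
    """
    both = len(system & reference)
    neither = total - len(system | reference)
    precision = both / len(system) if system else 0.0
    recall = both / len(reference) if reference else 0.0
    agreement = (both + neither) / total
    # Chance agreement from each judge's selection rate.
    p_sys = len(system) / total
    p_ref = len(reference) / total
    expected = p_sys * p_ref + (1 - p_sys) * (1 - p_ref)
    kappa = (agreement - expected) / (1 - expected) if expected < 1 else 1.0
    return {"precision": precision, "recall": recall,
            "agreement": agreement, "kappa": kappa}


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

A typical use would be to build a lead or random extract, pass it with a human reference extract to co_selection, and compare the summary texts with lcs_length on their word tokens as a simple content-based measure.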

To download MEAD, including documentation, and view online docs, visit: http://www.summarization.com/mead

People who have worked on MEAD include:

Dragomir Radev, Sasha Blair-Goldensohn, John Blitzer, Arda Celebi, Elliott Drabek, Wai Lam, Danyu Liu, Hong Qi, Horacio Saggion, Simone Teufel, Michael Topper, Adam Winkel