LINGUIST List 25.4930

Fri Dec 05 2014

Software: Computational Linguistics: DKPro Core (1.7.0)

Editor for this issue: Andrew Lamont <alamont@linguistlist.org>


Date: 05-Dec-2014
From: Tristan Miller <miller@ukp.informatik.tu-darmstadt.de>
Subject: Computational Linguistics: DKPro Core (1.7.0)

We are pleased to announce the release of DKPro Core, version 1.7.0 (ASL & GPL), a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

- http://code.google.com/p/dkpro-core-asl
- http://code.google.com/p/dkpro-core-gpl

Analysis components

- hunpos - wrapper for hunpos, an HMM POS tagger including models for many languages;
- langdetect - wrapper for language-detection, a language detection tool for Java;
- mallet - wrapper for topic modelling using MALLET;
- textnormalizer - original components for text normalization, e.g. spelling correction, umlaut normalization, expressive lengthening normalization.

Data formats

- io.conll - support for CoNLL 2000, 2002, 2009 and 2012 formats;
- io.ditop - support for DiTop topic model visualization format;
- io.penntree - support for combined and chunked formats;
- io.tueppdz - support for TüPP-D/Z format.

Further highlights in this release include:

- Upgrade to Apache UIMA 2.6.0;
- Upgrade LanguageTool to version 2.7;
- Upgrade MaltParser to version 1.8;
- Upgrade Stanford CoreNLP to version 3.4.1;
- Support additional MaltParser models: Bengali, Farsi, Polish;
- Support additional MSTParser models: Croatian;
- Support additional OpenNLP models: Spanish;
- Support additional Stanford CoreNLP models: Spanish, English caseless, shift-reduce parser models.

A more detailed overview of the changes in this release can be found here:
https://code.google.com/p/dkpro-core-asl/issues/list?can=1&q=milestone%3D1.7.0&colspec=ID+Type+Status+Priority+DKPro+Module+Milestone+Owner+Summary&cells=tiles

When upgrading, please note that you should not mix different versions of DKPro Core components in the same project, as they may not be compatible with each other.
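
One way to catch mixed versions is to scan a project's dependency coordinates. The following is a minimal sketch, not part of DKPro Core. It assumes the dependencies are available as plain `groupId:artifactId:version` strings and that DKPro Core artifacts share the groupId `de.tudarmstadt.ukp.dkpro.core`. The artifact names and the older version number in the sample list are illustrative only.

```python
from collections import defaultdict

# Assumed groupId for DKPro Core artifacts; check it against your own build file.
DKPRO_GROUP = "de.tudarmstadt.ukp.dkpro.core"


def dkpro_versions(coordinates):
    """Group DKPro Core artifact names by the version they are pinned to.

    Each coordinate is expected to look like 'groupId:artifactId:version'.
    Dependencies from other groups are ignored.
    """
    by_version = defaultdict(list)
    for coord in coordinates:
        group, artifact, version = coord.split(":")
        if group == DKPRO_GROUP:
            by_version[version].append(artifact)
    return dict(by_version)


def has_mixed_versions(coordinates):
    """Return True if more than one DKPro Core version is in use."""
    return len(dkpro_versions(coordinates)) > 1


# Hypothetical dependency list for illustration.
deps = [
    "de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.opennlp-asl:1.7.0",
    "de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.conll-asl:1.6.2",
    "org.apache.uima:uimaj-core:2.6.0",
]

if has_mixed_versions(deps):
    for version, artifacts in sorted(dkpro_versions(deps).items()):
        print(version, "->", ", ".join(artifacts))
```

In a Maven project, the same effect is usually achieved by defining the DKPro Core version once (for example, in a single property) and referring to it from every DKPro Core dependency.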

Linguistic Field(s): Computational Linguistics

Page Updated: 05-Dec-2014