LINGUIST List 31.3638

Wed Nov 25 2020

Software: KLPT - the Kurdish Language Processing Toolkit

Editor for this issue: Everett Green <everett@linguistlist.org>



Date: 19-Nov-2020
From: Sina Ahmadi <ahmadi.sina@outlook.com>
Subject: KLPT - the Kurdish Language Processing Toolkit

[with apologies for cross-posting]

I am thrilled to be releasing the Kurdish Language Processing Toolkit (https://github.com/sinaahmadi/klpt).

KLPT is a natural language processing (NLP) toolkit in Python for Kurdish, a less-resourced Indo-European language spoken by 20-30 million people. This initial version comes with four core modules for the Sorani and Kurmanji dialects of Kurdish, namely preprocess, stem, transliterate and tokenize, and addresses basic language processing tasks such as:

- text preprocessing
- stemming
- tokenization
- spelling error detection and correction
- morphological analysis

More importantly, it is an open-source project!

I hope that this toolkit will pave the way for further advances in Kurdish language processing and that it receives more attention in the NLP field.

Best regards,
Sina Ahmadi
http://sinaahmadi.github.io/

Linguistic Field(s): Computational Linguistics

Subject Language(s): Kurdish, Central (ckb)
                            Kurdish, Northern (kmr)
                            Kurdish, Southern (sdh)


Page Updated: 25-Nov-2020