LINGUIST List 31.3638
Wed Nov 25 2020
Software: KLPT - the Kurdish Language Processing Toolkit
Editor for this issue: Everett Green <everettlinguistlist.org>
Date: 19-Nov-2020
From: Sina Ahmadi <ahmadi.sina outlook.com>
Subject: KLPT - the Kurdish Language Processing Toolkit
[with apologies for cross-posting]
I am thrilled to be releasing the Kurdish Language Processing Toolkit (https://github.com/sinaahmadi/klpt).
KLPT is a natural language processing (NLP) toolkit in Python for Kurdish, a less-resourced Indo-European language spoken by 20-30 million people. This initial version comes with four core modules for the Sorani and Kurmanji dialects of Kurdish, namely preprocess, stem, transliterate and tokenize, and addresses basic language processing tasks such as:
- text preprocessing
- stemming
- tokenization
- spell error detection and correction
- morphological analysis
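To make the first and third tasks in the list above concrete, here is a minimal standard-library sketch of text preprocessing followed by tokenization. It does not use KLPT and does not reproduce its API: the function names `normalize` and `tokenize`, the two-entry character map, and the sample sentence are all assumptions made for illustration only.

```python
import re
import unicodedata
from typing import List

# Illustrative mapping only (an assumption, not KLPT's normalization table):
# unify two Arabic-script letters that are often typed with look-alike code points.
CHAR_MAP = {
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
}


def normalize(text: str) -> str:
    """Compose characters (NFC), unify look-alike letters, collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(str.maketrans(CHAR_MAP))
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> List[str]:
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


if __name__ == "__main__":
    sample = "Ez   diçim malê.  Tu çawa yî?"
    print(tokenize(normalize(sample)))
```

Normalizing before tokenizing matters here: a letter typed as a base character plus a combining mark would otherwise be split by the `\w+` pattern, because combining marks are not word characters in Python's `re` module.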
More importantly, it is an open-source project!
I hope that this toolkit will pave the way for further advances in Kurdish language processing and that it receives more attention in the NLP field.
Best regards,
Sina Ahmadi
http://sinaahmadi.github.io/
Linguistic Field(s): Computational Linguistics
Subject Language(s): Kurdish, Central (ckb); Kurdish, Northern (kmr); Kurdish, Southern (sdh)
Page Updated: 25-Nov-2020