


Academic Paper


Title: A new PPM variant for Chinese text compression
Author: Peiliang Wu
Institution: University of Wales, Bangor
Author: W. J. Teahan
Institution: University of Wales, Bangor
Linguistic Field: Computational Linguistics; Writing Systems
Subject Language: Chinese, Mandarin
Abstract: Large-alphabet languages such as Chinese are very different from English and therefore present different problems for text compression. In this article, we first examine the characteristics of Chinese, then introduce a new variant of the Prediction by Partial Match (PPM) model designed specifically for Chinese characters. Unlike traditional PPM coding schemes, which encode an escape probability if a novel character occurs in the context, the new coding scheme encodes the order directly before encoding a symbol, without having to output an escape probability. This scheme achieves excellent compression rates in comparison with other schemes on a variety of Chinese text files.
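
The contrast drawn in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it ignores probabilities, arithmetic coding, and exclusions, and only lists the events each style of coder would have to emit for one symbol. The function names (build_counts, find_order, escape_events, order_first_events) and the event labels are hypothetical, and the sketch assumes that a symbol seen in no context falls through to an order -1 model.

```python
from collections import defaultdict


def build_counts(text, max_order):
    """Count which symbols follow each context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, sym in enumerate(text):
        for order in range(max_order + 1):
            if i - order < 0:
                break
            context = text[i - order:i]
            counts[context][sym] += 1
    return counts


def find_order(counts, history, sym, max_order):
    """Return the highest order whose context has already seen sym, else -1."""
    for order in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - order:]
        if sym in counts.get(context, {}):
            return order
    return -1  # assumption: novel symbols are handled by an order -1 model


def escape_events(counts, history, sym, max_order):
    """Escape-style coding: one ESC per order skipped, then the symbol."""
    start = min(max_order, len(history))
    order = find_order(counts, history, sym, max_order)
    return ["ESC"] * (start - order) + [("SYM", order, sym)]


def order_first_events(counts, history, sym, max_order):
    """Order-first coding: state the order once, then the symbol."""
    order = find_order(counts, history, sym, max_order)
    return [("ORDER", order), ("SYM", order, sym)]


# Toy training text; any Chinese string would do.
counts = build_counts("中国人中国话", max_order=2)
print(escape_events(counts, "中国", "中", 2))
print(order_first_events(counts, "中国", "中", 2))
```

In this toy setting, a character that is only predicted at order 0 costs two escape events under the escape-style scheme but a single order event under the order-first scheme. Whether that yields better compression depends on how the order itself is modelled, which the abstract does not specify.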


This article appears in Natural Language Engineering, Vol. 14, Issue 3.


