Review of A Frequency Dictionary of Czech

Reviewer: Michael Grosvald
Book Title: A Frequency Dictionary of Czech
Book Author: František Čermák, Michal Křen
Publisher: Routledge (Taylor and Francis)
Linguistic Field(s): Applied Linguistics
General Linguistics
Language Documentation
Text/Corpus Linguistics
Subject Language(s): Czech
Issue Number: 22.2825

AUTHORS: František Čermák & Michal Křen
TITLE: A Frequency Dictionary of Czech
SUBTITLE: Core Vocabulary for Learners
SERIES TITLE: Routledge Frequency Dictionaries
PUBLISHER: Routledge (Taylor and Francis)
YEAR: 2010

Michael Grosvald, Department of Neurology, University of California at Irvine


This is the latest in the Routledge series of frequency-based learning
dictionaries, which already includes similar titles for Spanish, German,
Portuguese, French, American English and Mandarin Chinese, with another for
Arabic on its way. The aim of these books is to help language students develop
their vocabulary efficiently by enabling them to focus on the most frequently
used words. The benefit of such an approach to vocabulary learning is made clear
in the preface to the series, where it is noted that as much as 95 percent of a
typical text in English consists of the four to five thousand most frequent
words. Similarly, as much as 85 percent of spoken English consists of just one
thousand common words (Nation, 1990). Therefore, learners of a language can in
principle make rapid early progress in written and spoken communication by
focusing selectively on a relatively small set of words, as long as they know
which words to target.

The main body of this frequency dictionary is a list of the 5000 most common
Czech words, as determined by statistical analysis of a 100-million-word corpus.
As discussed in the book's introduction, this corpus was drawn from material
representing both written and spoken sources. Although this material was much
more heavily weighted toward written than spoken language sources, frequency
values drawn from different source categories have been normalized and weighted
to create generalized frequency rankings for the main word list that reflect a
more even balance.
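
The idea of normalizing and weighting can be illustrated with a minimal
sketch. It does not reproduce the authors' actual procedure: the corpus sizes,
counts, equal weights and function names below are all hypothetical, chosen
only to show how per-category normalization can offset a corpus that is
dominated by written text.

```python
def per_million(count, corpus_size):
    """Convert a raw count into occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def balanced_frequency(counts, sizes, weights):
    """Weighted mean of per-million frequencies across source categories."""
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weights[cat] * per_million(counts[cat], sizes[cat]) for cat in counts
    )
    return weighted_sum / total_weight


# Hypothetical figures: a colloquial word in a corpus that is 90% written text.
sizes = {"spoken": 10_000_000, "written": 90_000_000}
counts = {"spoken": 500, "written": 900}
weights = {"spoken": 0.5, "written": 0.5}  # assumed equal weighting

# Pooling all tokens lets the larger written component dominate.
pooled = per_million(sum(counts.values()), sum(sizes.values()))

# Normalizing each category first gives the spoken data equal influence.
balanced = balanced_frequency(counts, sizes, weights)
```

With these invented numbers the pooled figure is 14 occurrences per million,
while the balanced figure is 30, so a word that is common in speech would rank
higher than its raw corpus count suggests.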

The introduction also discusses some of the particular challenges brought to
this project by the Czech language itself. These include the fact that, as is
typical for Slavic languages, Czech is highly inflected, so that the procedure
used to determine frequency has had to take into account the potentially
numerous inflectional forms of a given word. In addition, there is a very
substantial gap between the official and colloquial versions of the language, a
situation that has been referred to elsewhere as ''semi-diglossia'' (Wilson,
2010). As a result, many highly frequent words occur exclusively, or almost
exclusively, in either written-language or spoken-language contexts. In such
cases, the word is included in the frequency-sorted list along with a notation,
described in detail below, that informs the learner in which contexts the word
is most likely to appear in normal usage.

Each word's entry in the main frequency-sorted list includes the word's ''rank
order'' (a generalized frequency measure, according to which the entire set of
words is listed from 1, most frequent, to 5000), the word's part of speech, its
English translation(s) and an example of the word used in a Czech sentence, also
translated into English. Where appropriate, one or more ''register codes'' are
given, each with a plus or minus sign, indicating that a word was particularly
likely or unlikely to occur within corpus sources belonging to one of four
categories (spoken, fiction, non-fiction and newspapers). Also appearing in each
word's entry is a second frequency-based measure called the ''overall normalized
averaged reduced frequency.''
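
The dictionary's exact normalization of this second measure is not described
here. As a rough illustration only, the sketch below assumes it builds on the
average reduced frequency as that measure is commonly defined in corpus
linguistics, which discounts occurrences that are bunched together in one
part of the corpus. The function name and the toy positions are hypothetical.

```python
def average_reduced_frequency(positions, corpus_size):
    """Average reduced frequency of a word, given its token positions.

    Assumed definition: with f occurrences in a corpus of N tokens, let
    v = N / f. Treat the corpus as circular, take the gap before each
    occurrence, cap every gap at v, sum the capped gaps and divide by v.
    """
    f = len(positions)
    if f == 0:
        return 0.0
    ordered = sorted(positions)
    v = corpus_size / f
    # Gap that wraps around from the last occurrence back to the first.
    gaps = [ordered[0] + corpus_size - ordered[-1]]
    # Gaps between consecutive occurrences.
    gaps += [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(min(gap, v) for gap in gaps) / v


# Toy corpus of 1000 tokens; both words occur four times.
evenly_spread = average_reduced_frequency([0, 250, 500, 750], 1000)
clustered = average_reduced_frequency([10, 11, 12, 13], 1000)
```

Under this definition the evenly spread word keeps its full frequency of 4,
while the clustered word is reduced to just over 1, which is why such a
measure favors words that are used throughout a corpus over words that are
concentrated in a few texts.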

Following the main frequency-sorted list are an alphabetically sorted list of
all 5000 words, together with their frequency rankings and English glosses, and
another arrangement of the same 5000 words, this time sorted into sub-lists by
part of speech, with the words in each sub-list given in order of
frequency. Also distributed throughout the book are 20 thematically organized,
frequency-ranked word lists, each list containing the most common Czech words
related to a particular topic such as family, professions, and verbs of motion.


As a language teacher and student, I have often found frequency-based
dictionaries to be extremely useful, and I have no doubt that this work will
prove similarly valuable for learners of Czech. As noted by the authors, the
book is also likely to be helpful for educators, including teachers of Czech as
well as individuals involved in curriculum design and materials development. The
statistical information that is presented may also be of substantial interest to

This book is particularly welcome given that Czech is spoken by relatively few
people (12 million or so), which has meant that learners and teachers of the
language have had access to far fewer educational resources than are available
for more widely spoken languages. In fact, while living in Prague and studying
Czech, I searched in vain for exactly this kind of dictionary -- having made
successful use of frequency-based dictionaries available for other European
languages -- and was disappointed to find that none apparently existed for
Czech. It is
fortunate that this situation has now been remedied, and for learners who wish
to accelerate the progress of their vocabulary development, this is probably the
best single resource that can be recommended. It should, however, be noted that
the dictionary is not intended as a first introduction to the language for
beginners, who will need to acquaint themselves elsewhere with basic grammar and
pronunciation (although with respect to the latter, Czech orthography generally
agrees quite well with the standard International Phonetic Alphabet).

The layout of this dictionary follows the pattern established by earlier entries
in this series, and is straightforward and, for the most part, logical. Two
potential weak points can be noted, though I believe they can be considered
inconveniences rather than major flaws. First is the lack of an English-to-Czech
glossary. Second, the dictionary does not include proper names, which for the
authors' purposes are defined as those beginning with a capital letter --
unfortunately, this includes a fair number of names for nearby places and other
items whose Czech names are likely to be unintuitive for many learners (e.g.
Německo for ''Germany,'' Rakousko for ''Austria''). Interestingly, the adjectival
forms corresponding to such names are not capitalized in Czech (e.g. německý for
''German'') and hence are eligible for inclusion in the word list. A similar
situation holds for a number of other sets of words which in English are
considered proper nouns but are not capitalized in Czech and therefore are
included in the dictionary; these include the names of months (e.g. duben,
''April'') and days of the week (e.g. čtvrtek, ''Thursday''). Despite the absence of
(capitalized) proper names, the dictionary appears to be quite complete
otherwise; even interjections (e.g. fuj, ''yuck'') are included.

Overall, there is much more to praise here than to criticize. The issues I have
mentioned above are quite minor when one takes into account the obvious effort
and care that have gone into creating the main frequency-ranked word list along
with the other tools this dictionary provides. The example sentences are clear
and illustrative; learners will no doubt find it a helpful exercise to practice
translating both the vocabulary words and their example sentences from Czech to
English and vice versa. The part-of-speech index and the ''thematic vocabulary''
sections permit the learner to focus selectively on particular grammatical
classes or other groups of words.

In my generally favorable review of a previous entry in this dictionary series
(the one for Mandarin Chinese), I stated that despite a few weaknesses I had
noted, my main reaction upon examining the dictionary was a sense of regret that
it had not been available to me years earlier when I was first studying Chinese.
My reaction to this Czech frequency dictionary is much the same. I would have
greatly valued such a resource when first studying Czech, and believe that
future learners will find it an extremely effective learning tool. I have no
hesitation in giving this dictionary a strong positive recommendation.


Nation, I. S. P. (1990). Teaching and Learning Vocabulary. Boston: Heinle & Heinle.

Wilson, J. (2010). Moravians in Prague: A Sociolinguistic Study of Dialect
Contact in the Czech Republic. Frankfurt am Main: Peter Lang AG.

Michael Grosvald earned his doctorate in Linguistics in 2009 at the University of California at Davis. His background includes over a decade as a language instructor in Prague, Berlin, Taipei and the US; his interests include the phonetics and phonology of signed and spoken languages, second language acquisition, computational linguistics, psycholinguistics and the neuroscience of language. He is currently working as a post-doctoral scholar in the Department of Neurology at the University of California, Irvine.

