LINGUIST List 22.2332

Thu Jun 02 2011

Disc: Error in the Fernandez Huerta Readability Formula

Editor for this issue: Elyssa Winzeler <elyssalinguistlist.org>


        1. Gwillim Law, Error in the Fernandez Huerta Readability Formula

Message 1: Error in the Fernandez Huerta Readability Formula
Date: 27-May-2011
From: Gwillim Law <glawmeasinc.com>
Subject: Error in the Fernandez Huerta Readability Formula

José Fernández Huerta published his formula for calculating the readability of text in Spanish in 1959. It is still widely used. I found six web pages that contain the actual formula, and many more where it is cited. I have two strong reasons for believing that the formula contains an error. I would be interested in getting additional feedback on the matter. Also, if there is a consensus that the formula does contain an error, where would I go to report it?

The Huerta score* was an adaptation of the Flesch Reading Ease score into Spanish. The Flesch formula for English text, first published in 1948, is:

Flesch = 206.835 - 84.6 * syllables/words - 1.015 * words/sentences

Scores run roughly from 30 to 100, with higher scores being easier to read. This makes sense. Sentences with more words in them will produce a lower score; words with more syllables in them will produce a lower score.
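The sanity check just described can be sketched in a few lines of Python. This is only an illustration: the function name and the sample counts (a 100-word passage with 150 or 180 syllables) are made up for the example, not taken from any real text.

```python
def flesch_reading_ease(syllables: int, words: int, sentences: int) -> float:
    """Flesch Reading Ease for English text, using the formula quoted above."""
    return 206.835 - 84.6 * syllables / words - 1.015 * words / sentences


# Hypothetical 100-word samples (counts invented for illustration).
short_sentences = flesch_reading_ease(150, 100, 10)  # 10 words per sentence
long_sentences = flesch_reading_ease(150, 100, 5)    # 20 words per sentence
longer_words = flesch_reading_ease(180, 100, 10)     # more syllables per word

print(short_sentences > long_sentences)  # True: longer sentences lower the score
print(short_sentences > longer_words)    # True: longer words lower the score
```

Both comparisons go the way one would expect of a readability score.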

The Huerta formula is usually presented as 206.84 - (0.60 * P) - (1.02 * F), where P = number of syllables and F = number of sentences, as counted in a sample containing 100 words. Applying the same sanity check, we see that if the number of syllables per word increases, the score decreases, as expected; but if the number of sentences increases, the score also decreases. Now, if the number of sentences in a 100-word sample increases, each sentence must be getting shorter. That should make the readability of the passage increase, not decrease. (Reason 1.)
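Reason 1 can be reproduced numerically. The sketch below implements the formula exactly as it is usually presented; the function name and the sample values (200 syllables, 5 or 10 sentences per 100 words) are assumptions chosen only for illustration.

```python
def huerta_as_presented(p_syllables: float, f_sentences: float) -> float:
    """Huerta score as usually presented, for a 100-word sample.

    p_syllables: syllables counted in the 100-word sample (P)
    f_sentences: sentences counted in the 100-word sample (F)
    """
    return 206.84 - 0.60 * p_syllables - 1.02 * f_sentences


# Same hypothetical syllable count, different sentence counts.
fewer_sentences = huerta_as_presented(200, 5)   # about 20 words per sentence
more_sentences = huerta_as_presented(200, 10)   # about 10 words per sentence

# Shorter sentences score LOWER here, the opposite of the Flesch behavior.
print(more_sentences < fewer_sentences)  # True
```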

In its original form, the Huerta formula is not scalable. To compare it to the Flesch formula, one would have to convert it to a formula that works for a passage containing any number of words. When I do that, I come up with the formula

Huerta = 206.84 - 60 * syllables/words - 102 * sentences/words

Note that if words = 100, this works out the same as the original Huerta formula. Note also that it matches the Flesch formula almost term for term. The coefficients of (syllables/words) in the two formulas differ by about 40%, but that's understandable, because the average number of syllables in a Spanish word is greater than the corresponding ratio in English. It's the last term that looks wrong. The coefficients are very different (1.015 and 102), but that's because I converted the Huerta formula to make it scalable.
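The equivalence between the 100-word form and the scalable form can be checked with a short sketch. The function names and the sample counts are hypothetical, chosen only so that the two passages have the same syllable and sentence rates.

```python
import math


def huerta_per_100_words(p_syllables: float, f_sentences: float) -> float:
    """Huerta score as usually presented (counts taken from a 100-word sample)."""
    return 206.84 - 0.60 * p_syllables - 1.02 * f_sentences


def huerta_scalable(syllables: float, words: float, sentences: float) -> float:
    """The same formula rewritten for a passage of any length."""
    return 206.84 - 60 * syllables / words - 102 * sentences / words


# With exactly 100 words the two forms agree.
assert math.isclose(huerta_per_100_words(200, 8), huerta_scalable(200, 100, 8))

# A 250-word passage with the same rates (200 syllables and 8 sentences
# per 100 words) gets the same score from the scalable form.
assert math.isclose(huerta_per_100_words(200, 8), huerta_scalable(500, 250, 20))
```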

In its original form, the coefficient was 1.02. But the real difference is that the fraction is inverted. (Reason 2.) Since Fernández Huerta avowedly based his work on that of Flesch, it seems to me that the obvious conclusion is that he made a mistake. When he decided to stipulate a sample of 100 words, he got confused and didn't realize that he had inverted the fraction. Perhaps he tested his formula using a sample with 10 sentences, in which case the two formulas give the same result: 1.02 * 10 = 102 * 10/100.
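The 10-sentence coincidence can be illustrated by comparing only the sentence term of each form. Note that the "Flesch-style" term below is an assumption: it is what the term would look like if the fraction were words/sentences, as in Flesch, and is not a formula confirmed by the original publication. Both function names are made up for this sketch.

```python
def sentence_term_as_presented(f_sentences: float) -> float:
    """Sentence term of the Huerta formula as usually presented (100-word sample)."""
    return 1.02 * f_sentences


def sentence_term_flesch_style(f_sentences: float) -> float:
    """Hypothetical term with the fraction the other way up: words/sentences,
    for a 100-word sample. An assumption for illustration only."""
    return 1.02 * 100 / f_sentences


for f in (5, 10, 20):
    print(f, round(sentence_term_as_presented(f), 2),
          round(sentence_term_flesch_style(f), 2))
# 5 5.1 20.4
# 10 10.2 10.2
# 20 20.4 5.1
```

The two terms agree only at 10 sentences per 100 words and move in opposite directions on either side of it, so a test sample with exactly 10 sentences would not reveal the inversion.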

Gwillim Law

References:

Original publication of the Huerta formula: Fernández Huerta, José. Medidas sencillas de lecturabilidad. Consigna 1959;(214): 29-32.

Some web pages describing or using the Huerta formula:

http://www.ideosity.com/SEO/SEO-Readability-Tests.aspx

http://www.standards-schmandards.com/exhibits/rix/

http://www.utexas.edu/disability/ai/resource/readability/manual/huerta-calculate-English.html

http://scielo.isciii.es/scielo.php?script=sci_arttext&pid=S1135-57272002000400007&lng=en&nrm=iso

http://www.faculty.de.gcsu.edu/~cbader/5210/fryforeign.htm


Linguistic Field(s): Applied Linguistics



Page Updated: 02-Jun-2011