Editor for this issue: Dan Parker
We'd like to remind readers that responses to queries are usually best sent directly to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST, so we would appreciate your cooperation whenever it seems appropriate.
We'd also like to remind people that, in addition to posting a summary, it is usually a good idea to thank personally those individuals who have taken the trouble to respond to the query.
Does anyone know where I can find the proportion of English lemmas that are nouns?
More precisely, I'm looking for counts of lemmas in some large dictionary or corpus, classified by word class (part of speech) and, if possible, also by token frequency. Ideally I'd like a table which shows nouns (and maybe other word classes) as a percentage of the lemmas in each frequency range. My assumption is that nouns make up a higher percentage of rare vocabulary than of common vocabulary, but I'd like to know whether this is true.
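To make the request concrete, here is a rough sketch of the calculation I have in mind. It assumes a lemma list is available in which each lemma carries a word-class tag and a token frequency. The function name, the band boundaries and the toy entries are all invented for illustration; the frequencies are placeholders, not real corpus figures.

```python
from collections import defaultdict

def noun_share_by_band(entries, band_edges):
    """Percentage of lemmas that are nouns within each frequency band.

    entries: iterable of (lemma, word_class, token_frequency) tuples.
    band_edges: lower bounds of the frequency bands (hypothetical cut-offs).
    Returns a dict mapping each band's lower bound to the noun percentage.
    """
    totals = defaultdict(int)  # lemmas per band
    nouns = defaultdict(int)   # noun lemmas per band
    for lemma, word_class, freq in entries:
        # Put the lemma in the highest band whose lower bound it reaches.
        eligible = [edge for edge in band_edges if freq >= edge]
        if not eligible:
            continue  # rarer than the lowest band: not counted
        band = max(eligible)
        totals[band] += 1
        if word_class == "noun":
            nouns[band] += 1
    return {band: 100.0 * nouns[band] / totals[band] for band in sorted(totals)}

# Invented toy data: (lemma, word class, made-up token frequency).
toy = [
    ("the", "determiner", 60000),
    ("be", "verb", 40000),
    ("time", "noun", 1800),
    ("go", "verb", 1500),
    ("lemma", "noun", 12),
    ("sporran", "noun", 3),
    ("abseil", "verb", 2),
    ("quokka", "noun", 1),
]

table = noun_share_by_band(toy, band_edges=[1, 100, 10000])
for band, pct in table.items():
    print(f"band starting at frequency {band}: {pct:.1f}% nouns")
```

With a real tagged lemma list in place of the toy entries, the same table would show whether the noun percentage actually rises as frequency falls.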
If I learn anything significant I'll summarise back to the list.