LINGUIST List 27.937
Mon Feb 22 2016
Disc: Significance testing for corpus comparison
Editor for this issue: Anna White <awhite@linguistlist.org>
Bettina Eiber <bettina.eiber
Significance testing for corpus comparison
I am working on a corpus of Wikipedia articles and articles from printed encyclopedias, and I would like to study stylistic differences between computer-mediated discourse and written discourse. The corpus covers articles from 4 disciplines and is thematically comparable, because I always chose articles on the same lemma for both subcorpora.
I have already calculated relative frequencies, and I am now wondering how to identify the words most typical of each subcorpus (Wikipedia vs. printed encyclopedias). Could statistical methods such as significance testing help here? I have read about the log-likelihood (LL) test, the chi-square test, and non-parametric tests.
Which test should I apply to my research question, or should I rely on other measures?
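To make the two frequency-based options concrete, here is a minimal Python sketch of how a single word could be compared across the two subcorpora with the log-likelihood (G2) statistic and Pearson's chi-square, both computed from a 2x2 contingency table. The function names and all counts are hypothetical and only for illustration; they are not taken from my corpus. The value 3.84 is the usual critical value for one degree of freedom at p < 0.05.

```python
import math


def contingency(freq1, freq2, size1, size2):
    """2x2 table: word vs. all other tokens, subcorpus 1 vs. subcorpus 2."""
    return [[freq1, freq2], [size1 - freq1, size2 - freq2]]


def expected(table):
    """Expected cell counts under the null hypothesis of equal relative frequency."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def log_likelihood(freq1, freq2, size1, size2):
    """G2 = 2 * sum(O * ln(O / E)); cells with O == 0 contribute nothing."""
    obs = contingency(freq1, freq2, size1, size2)
    exp = expected(obs)
    return 2 * sum(
        o * math.log(o / e)
        for o_row, e_row in zip(obs, exp)
        for o, e in zip(o_row, e_row)
        if o > 0
    )


def chi_square(freq1, freq2, size1, size2):
    """Pearson's chi-square = sum((O - E)^2 / E) over the four cells."""
    obs = contingency(freq1, freq2, size1, size2)
    exp = expected(obs)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(obs, exp)
        for o, e in zip(o_row, e_row)
    )


# Hypothetical counts: a word occurring 120 times in 100,000 Wikipedia tokens
# and 40 times in 100,000 tokens from printed encyclopedias.
g2 = log_likelihood(120, 40, 100_000, 100_000)
x2 = chi_square(120, 40, 100_000, 100_000)
print(f"G2 = {g2:.2f}, chi-square = {x2:.2f}, critical value (df=1, p<0.05) = 3.84")
```

Running this over every word type and ranking by G2 would give one candidate list of "typical" words per subcorpus. Note that both statistics treat each token as an independent observation and ignore how a word is distributed across individual articles, which is one reason why non-parametric tests computed over per-article frequencies are sometimes considered instead.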
Thank you for your answers,
Linguistic Field(s): Text/Corpus Linguistics
Page Updated: 22-Feb-2016