LINGUIST List 29.913
Mon Feb 26 2018
FYI: Announcing the SFU Opinion and Comments Corpus
Editor for this issue: Kenneth Steimel <kenlinguistlist.org>
Maite Taboada <mtaboada
Announcing the SFU Opinion and Comments Corpus
The Discourse Processing Lab at Simon Fraser University
is pleased to announce the release of the SFU Opinion and Comments Corpus.
The SFU Opinion and Comments Corpus (SOCC) is a corpus for the analysis of online news
comments. Our corpus contains comments and the articles from which the comments
originated. The articles are all opinion articles, not hard news articles. The
corpus is larger than any other currently available comments corpus, and has been
collected with attention to preserving reply structures and other metadata. In
addition to the raw corpus, we also present annotations for four different
phenomena: constructiveness, toxicity, negation and its scope, and appraisal.
Full details and a download link are available from our GitHub project page: https://github.com/sfu-discourse-lab/SOCC
For more information about this work, please see our papers.
Kolhatkar, V., H. Wu, L. Cavasso, E. Francis, K. Shukla and M. Taboada (2018) The SFU
Opinion and Comments Corpus: A corpus for the analysis of online news comments.
Journal paper under review.
Kolhatkar, V. and M. Taboada (2017) Using New York Times Picks to identify
constructive comments. Proceedings of the Workshop Natural Language Processing
Meets Journalism, Conference on Empirical Methods in Natural Language Processing.
Copenhagen. September 2017.
Kolhatkar, V. and M. Taboada (2017) Constructive language in news comments.
Proceedings of the 1st Abusive Language Online Workshop, 55th Annual Meeting of the
Association for Computational Linguistics. Vancouver. August 2017, pp. 11-17.
Varada Kolhatkar (vkolhatk
Linguistic Field(s): Computational Linguistics; Discourse Analysis
Subject Language(s): English (eng)
Page Updated: 26-Feb-2018