Development and Use of a Gold-Standard Data Set for Subjectivity Classifications

Janyce M. Wiebe† and Rebecca F. Bruce‡ and Thomas P. O'Hara†

†Department of Computer Science and Computing Research Laboratory, New Mexico State University, Las Cruces, NM 88003
‡Department of Computer Science, University of North Carolina at Asheville, Asheville, NC 28804-8511

wiebe, tomohara@cs.nmsu.edu, bruce@cs.unca.edu

Abstract

This paper presents a case study of analyzing and improving intercoder reliability in discourse tagging using statistical techniques. Bias-corrected tags are formulated and successfully used to guide a revision of the coding manual and develop an automatic classifier.

1 Introduction

This paper presents a case study of analyzing and improving intercoder reliability in discourse tagging using the statistical techniques presented in (Bruce and Wiebe, 1998; Bruce and Wiebe, to appear). Our approach is data driven: we refine our understanding and presentation of the classification scheme guided by the results of the intercoder analysis. We also present the results of a probabilistic classifier developed on the resulting annotations.

Much research in discourse processing has focused on task-oriented and instructional dialogs. The task addressed here comes to the fore in other genres, especially news reporting. The task is to distinguish sentences used to objectively present factual information from sentences used to present opinions and evaluations. There are many applications for which this distinction promises to be important, including text categorization and summarization. This research takes a large step toward developing a reliably annotated gold standard to support experimenting with such applications.
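
As a point of reference for the intercoder reliability analysis, agreement between two coders on such a tagging task is commonly summarized with Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each coder's tag frequencies. The following is a minimal Python sketch of that computation only, not the statistical analysis used in this paper. The function name cohen_kappa, the tag labels "subj" and "obj", and the ten example tags are illustrative assumptions and are not data from the study.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two coders who tagged the same items."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two equal-length, non-empty tag sequences")
    n = len(tags_a)
    # Observed agreement: fraction of items given the same tag by both coders.
    p_o = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement: sum over tags of the product of the coders' marginal proportions.
    count_a, count_b = Counter(tags_a), Counter(tags_b)
    p_e = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both coders use one identical tag")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical tags for ten sentences (illustrative only, not from the paper).
coder_1 = ["subj", "subj", "obj", "obj", "subj", "obj", "obj", "subj", "obj", "obj"]
coder_2 = ["subj", "obj", "obj", "obj", "subj", "obj", "subj", "subj", "obj", "obj"]

kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the coders agree on 8 of 10 sentences, but because both use "obj" more often than "subj", part of that agreement is expected by chance, which is why kappa comes out lower than the raw agreement rate.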

This research is also a case study of analyzing and improving manual tagging that is applicable to any tagging task. We perform a statistical analysis that provides information that complements the information provided by Cohen's Kappa (Cohen, 1960; Carletta, 1996). In .