Do Automatic Annotation Techniques Have Any Impact on Supervised Complex Question Answering?

Yllias Chali
University of Lethbridge, Lethbridge, AB, Canada
chali@

Sadid A. Hasan
University of Lethbridge, Lethbridge, AB, Canada
hasan@

Shafiq R. Joty
University of British Columbia, Vancouver, BC, Canada
rjoty@

Abstract

In this paper, we analyze the impact of different automatic annotation methods on the performance of supervised approaches to the complex question answering problem (defined in the DUC-2007 main task). A huge amount of annotated or labeled data is a prerequisite for supervised training. The task of labeling can be accomplished either by humans or by computer programs. When humans are employed, the whole process becomes time-consuming and expensive. So, in order to produce a large set of labeled data, we prefer the automatic annotation strategy. We apply five different automatic annotation techniques to produce labeled data, using the ROUGE similarity measure, Basic Element (BE) overlap, a syntactic similarity measure, a semantic similarity measure, and the Extended String Subsequence Kernel (ESSK). The representative supervised methods we use are Support Vector Machines (SVM), Conditional Random Fields (CRF), Hidden Markov Models (HMM), and Maximum Entropy (MaxEnt). Evaluation results are presented to show the impact.

1 Introduction

In this paper, we consider the complex question answering problem defined in the DUC-2007 main task [1]. We focus on an extractive approach of summarization to answer complex questions, where a subset of the sentences in the original documents is chosen. Supervised learning methods require a huge amount of annotated or labeled data as a precondition. The decision as to whether a sentence is important enough to be annotated can be taken either by humans or by computer programs. When humans are employed in the process, producing such a large labeled corpus becomes time-consuming

[1] http projects duc duc2007
