Scientific report: "A Machine Learning Approach to German Pronoun Resolution"

Beata Kouchnir
Department of Computational Linguistics
Tübingen University
72074 Tübingen, Germany
kouchnir@

Abstract

This paper presents a novel ensemble learning approach to resolving German pronouns. Boosting, the method in question, combines the moderately accurate hypotheses of several classifiers to form a highly accurate one. Experiments show that this approach is superior to a single decision-tree classifier. Furthermore, we present a standalone system that resolves pronouns in unannotated text by using a fully automatic sequence of preprocessing modules that mimics the manual annotation process. Although the system performs well within a limited textual domain, further research is needed to make it effective for open-domain question answering and text summarisation.

1 Introduction

Automatic coreference resolution, pronominal and otherwise, has been a popular research area in Natural Language Processing for more than two decades, with extensive documentation of both the rule-based and the machine learning approach. For the latter, good results have been achieved with large feature sets (including syntactic, semantic, grammatical and morphological information) derived from hand-annotated corpora. However, for applications that work with plain text (e.g. question answering, text summarisation), this approach is not practical.

The system presented in this paper resolves German pronouns in free text by imitating the manual annotation process with off-the-shelf language software. As the availability and reliability of such software is limited, the system can use only a small number of features. The fact that most German pronouns are morphologically ambiguous poses an additional challenge.

The choice of boosting as the underlying machine learning algorithm is motivated both by its theoretical concept and by its performance on other NLP tasks. The fact that boosting uses the method of ensemble learning ...
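The idea of combining several moderately accurate hypotheses into a highly accurate one can be illustrated with a minimal sketch. The excerpt does not say which boosting variant or base learner the system uses, so everything here is an assumption: the code implements plain AdaBoost over one-feature threshold rules ("decision stumps"), the function names (train_stump, adaboost, predict, count_errors) are hypothetical, and the ten-point numeric data set is a toy example that has nothing to do with pronoun features.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold rule.

    The rule predicts `sign` when x[feature] <= threshold and `-sign` otherwise.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                preds = [sign if x[j] <= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] <= thr else -sign

def adaboost(X, y, rounds):
    """Train `rounds` weak hypotheses; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote of this hypothesis
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted majority vote of all weak hypotheses."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def count_errors(ensemble, X, y):
    return sum(1 for xi, yi in zip(X, y) if predict(ensemble, xi) != yi)

# Toy data (assumption): no single threshold separates the two classes.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost(X, y, rounds=1)    # one weak hypothesis on its own
boosted = adaboost(X, y, rounds=3)   # weighted vote of three hypotheses
print(count_errors(single, X, y), count_errors(boosted, X, y))
```

On this toy set a single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all of them correctly. Each round reweights the training examples so that the next weak hypothesis concentrates on the cases the previous ones got wrong, which is the mechanism the abstract summarises.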
