TAILIEUCHUNG - Scientific report: "Automatic Creation of Domain Templates"

Automatic Creation of Domain Templates

Elena Filatova, Vasileios Hatzivassiloglou, and Kathleen McKeown
Department of Computer Science, Columbia University
Department of Computer Science, The University of Texas at Dallas
filatova kathy @ vh@

Abstract

Recently, many Natural Language Processing (NLP) applications have improved the quality of their output by using various machine learning techniques to mine Information Extraction (IE) patterns for capturing information from the input text. Currently, to mine IE patterns one should know in advance the type of the information that should be captured by these patterns. In this work we propose a novel methodology for corpus analysis based on cross-examination of several document collections representing different instances of the same domain. We show that this methodology can be used for automatic domain template creation. As the problem of automatic domain template creation is rather new, there is no well-defined procedure for the evaluation of the domain template quality. Thus, we propose a methodology for identifying what information should be present in the template. Using this information, we evaluate the automatically created domain templates through the text snippets retrieved according to the created templates.
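The idea of cross-examining several document collections that describe different instances of one domain can be illustrated with a small sketch. This is not the authors' algorithm, only a simplified stand-in under stated assumptions: each collection is a list of plain-text documents, tokens are whitespace-separated words, and a term counts as domain-relevant when it is frequent inside several collections. The function name `shared_terms`, its thresholds, and the toy documents are all hypothetical.

```python
from collections import Counter


def shared_terms(collections, min_doc_share=0.5, min_collections=2):
    """Return terms that appear in at least `min_doc_share` of the documents
    of at least `min_collections` collections (hypothetical criterion)."""
    support = Counter()  # term -> number of collections where it is frequent
    for docs in collections:
        if not docs:
            continue  # skip empty collections to avoid dividing by zero
        doc_freq = Counter()
        for doc in docs:
            # Count each term once per document.
            doc_freq.update(set(doc.lower().split()))
        for term, n_docs in doc_freq.items():
            if n_docs / len(docs) >= min_doc_share:
                support[term] += 1
    return {term for term, n in support.items() if n >= min_collections}


# Toy, made-up collections standing in for two instances of one domain.
quake_a = ["earthquake killed 20 people in city A",
           "the earthquake damaged roads"]
quake_b = ["an earthquake injured 5 and killed 2",
           "rescuers searched after the earthquake"]

print(sorted(shared_terms([quake_a, quake_b], min_doc_share=1.0)))
```

Terms tied to a single instance (a city name, a casualty count) drop out because they are not frequent across collections, while terms typical of the domain survive. A real system would need linguistic analysis well beyond word counts.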
1 Introduction

Open-ended question-answering (QA) systems typically produce a response containing a variety of specific facts prescribed by the question type. A biography, for example, might contain the date of birth, occupation, or nationality of the person in question (Duboue and McKeown, 2003; Zhou et al., 2004; Weischedel et al., 2004; Filatova and Prager, 2005). A definition may contain the genus of the term and characteristic attributes (Blair-Goldensohn et al., 2004). A response to a question about a terrorist attack might include the event, victims, perpetrator, and date, as the templates designed for the Message Understanding Conferences (Radev and McKeown, 1998; White et al., 2001) predicted.
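A domain template of the kind mentioned for terrorist attacks can be pictured as a record with one slot per expected fact. The sketch below is only an illustration: the class name `AttackTemplate`, the slot types, and the helper `missing_slots` are assumptions, and the filled-in values are placeholders rather than data from the paper or from the Message Understanding Conferences.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AttackTemplate:
    """Hypothetical template with the four slots named in the text."""
    event: Optional[str] = None
    victims: List[str] = field(default_factory=list)
    perpetrator: Optional[str] = None
    date: Optional[str] = None

    def missing_slots(self) -> List[str]:
        # A slot counts as missing when it is None or an empty list.
        return [name for name, value in vars(self).items() if not value]


# Placeholder values, for illustration only.
template = AttackTemplate(event="bombing", date="placeholder date")
print(template.missing_slots())
```

Listing the unfilled slots makes explicit which facts a response still lacks, which is the role a template plays when deciding what information to extract.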
