Organizing Encyclopedic Knowledge based on the Web and its Application to Question Answering

Atsushi Fujii
University of Library and Information Science, 1-2 Kasuga, Tsukuba 305-8550, Japan
CREST, Japan Science and Technology Corporation
fujii@

Tetsuya Ishikawa
University of Library and Information Science, 1-2 Kasuga, Tsukuba 305-8550, Japan
ishikawa@

Abstract

We propose a method to generate large-scale encyclopedic knowledge, which is valuable for much NLP research, based on the Web. We first search the Web for pages containing a term in question. Then we use linguistic patterns and HTML structures to extract text fragments describing the term. Finally, we organize extracted term descriptions based on word senses and domains. In addition, we apply an automatically generated encyclopedia to a question answering system targeting the Japanese Information Technology Engineers Examination.

1 Introduction

Reflecting the growth in utilization of the World Wide Web, a number of Web-based language processing methods have been proposed within the natural language processing (NLP), information retrieval (IR) and artificial intelligence (AI) communities. A sample of these includes methods to extract linguistic resources (Fujii and Ishikawa, 2000; Resnik, 1999; Soderland, 1997), retrieve useful information in response to user queries (Etzioni, 1997; McCallum et al., 1999), and mine (discover) knowledge latent in the Web (Inokuchi et al., 1999).
In this paper, mainly from an NLP point of view, we explore a method to produce linguistic resources. Specifically, we enhance the method proposed by Fujii and Ishikawa (2000), which extracts encyclopedic knowledge (i.e., term descriptions) from the Web. In brief, their method searches the Web for pages containing a term in question, and uses linguistic expressions and HTML layouts to extract fragments describing the term. They also use a language model to discard non-linguistic fragments. In addition, a clustering method is used to divide descriptions into a .
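
The extraction step described above, which combines linguistic expressions with HTML layout, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the English pattern ("<term> is/are/refers to/means ..."), the use of `<dt>`/`<dd>` definition lists as the HTML cue, and the names `DefinitionListParser` and `extract_descriptions` are all assumptions made for illustration. The Web search, language-model filtering, and clustering steps are not covered.

```python
import re
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Collects (term, description) pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []    # completed (term, description) pairs
        self._tag = None   # "dt" or "dd" while inside one, else None
        self._term = ""    # text of the most recent <dt>
        self._buf = []     # text chunks of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of inline markup such as <b>
        text = " ".join("".join(self._buf).split())
        if tag == "dt":
            self._term = text
        elif self._term:
            self.pairs.append((self._term, text))
        self._tag = None


def extract_descriptions(term, html):
    """Return text fragments in `html` that appear to describe `term`."""
    found = []

    # HTML structure cue (assumed): the term is a definition-list heading.
    parser = DefinitionListParser()
    parser.feed(html)
    parser.close()
    found += [d for head, d in parser.pairs if head.lower() == term.lower()]

    # Linguistic cue (assumed): a sentence of the form "<term> is ...".
    text = " ".join(re.sub(r"<[^>]+>", " ", html).split())
    pattern = re.compile(
        r"\b" + re.escape(term) + r"\s+(?:is|are|refers to|means)\s+[^.!?]*[.!?]",
        re.IGNORECASE,
    )
    for match in pattern.finditer(text):
        if match.group(0) not in found:
            found.append(match.group(0))
    return found


if __name__ == "__main__":
    page = (
        "<p>A hub is a device that joins cables. It is cheap.</p>"
        "<dl><dt>Hub</dt><dd>The central <b>node</b> of a star network.</dd></dl>"
    )
    for fragment in extract_descriptions("hub", page):
        print(fragment)
```

In this sketch the HTML cue is checked first, since explicit markup is likely a more reliable signal than a surface pattern. A real system would need patterns suited to the target language and a much wider range of layouts.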
