Scientific report: "Unsupervised Ontology Induction from Text"

Hoifung Poon and Pedro Domingos
Department of Computer Science and Engineering
University of Washington
hoifung pedrod@

Abstract

Extracting knowledge from unstructured text is a long-standing goal of NLP. Although learning approaches to many of its subtasks have been developed (e.g., parsing, taxonomy induction, information extraction), all end-to-end solutions to date require heavy supervision and/or manual engineering, limiting their scope and scalability. We present OntoUSP, a system that induces and populates a probabilistic ontology using only dependency-parsed text as input. OntoUSP builds on the USP unsupervised semantic parser by jointly forming ISA and IS-PART hierarchies of lambda-form clusters. The ISA hierarchy allows more general knowledge to be learned, and the use of smoothing for parameter estimation. We evaluate OntoUSP by using it to extract a knowledge base from biomedical abstracts and answer questions. OntoUSP improves on the recall of USP by 47% and greatly outperforms previous state-of-the-art approaches.

1 Introduction

Knowledge acquisition has been a major goal of NLP since its early days. We would like computers to be able to read text and express the knowledge it contains in a formal representation, suitable for answering questions and solving problems. However, progress has been difficult. The earliest approaches were manual, but the sheer amount of coding and knowledge engineering needed makes them very costly and limits them to well-circumscribed domains.
More recently, machine learning approaches to a number of key subproblems have been developed (e.g., Snow et al., 2006), but to date there is no sufficiently automatic end-to-end solution. Most saliently, supervised learning requires labeled data, which itself is costly and infeasible for large-scale, open-domain knowledge acquisition. Ideally, we would like to have an end-to-end unsupervised or lightly supervised solution to the problem of knowledge
