Scientific report: "A Discriminative Hierarchical Model for Fast Coreference at Large Scale"

Michael Wick, University of Massachusetts, 140 Governor's Drive, Amherst, MA, mwick@
Sameer Singh, University of Massachusetts, 140 Governor's Drive, Amherst, MA, sameer@
Andrew McCallum, University of Massachusetts, 140 Governor's Drive, Amherst, MA, mccallum@

Abstract

Methods that measure compatibility between mention pairs are currently the dominant approach to coreference. However, they suffer from a number of drawbacks, including difficulties scaling to large numbers of mentions and limited representational power. As these drawbacks become increasingly restrictive, the need to replace the pairwise approaches with a more expressive, highly scalable alternative is becoming urgent. In this paper we propose a novel discriminative hierarchical model that recursively partitions entities into trees of latent sub-entities. These trees succinctly summarize the mentions, providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales. We demonstrate that the hierarchical model is several orders of magnitude faster than the pairwise approach, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.

1 Introduction

Coreference resolution, the task of clustering mentions into partitions representing their underlying real-world entities, is fundamental for high-level information extraction and data integration, including semantic search, question answering, and knowledge base construction.

For example, coreference is vital for determining author publication lists in bibliographic knowledge bases such as CiteSeer and Google Scholar, where the repository must know if the R. Hamming who authored "Error detecting and error correcting codes" is the same R. Hamming who authored "The unreasonable effectiveness of mathematics". Features of the mentions, e.g. bags-of-words in titles, contextual snippets and ...
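
The idea of summarizing mentions in a tree of sub-entities can be illustrated with a minimal Python sketch. This is not the paper's model: the `Node` class, the `mention` helper, the cosine score, and the extra citation strings are all hypothetical choices made for illustration. The sketch only shows how a node can carry an aggregated bag-of-words for everything beneath it, so that a new mention is compared against a handful of summaries instead of against every other mention.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    """A tree node: a leaf is one mention, an internal node a (sub-)entity."""
    bag: Counter = field(default_factory=Counter)  # aggregated bag-of-words
    children: list = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        # Fold the child's summary into this node only. Ancestors are not
        # updated, so trees in this sketch must be built bottom-up.
        self.children.append(child)
        self.bag.update(child.bag)


def mention(text: str) -> Node:
    """Hypothetical helper: turn a title string into a leaf node."""
    return Node(bag=Counter(text.lower().split()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags; an assumed compatibility score."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pairwise_comparisons(n: int) -> int:
    """Number of mention pairs an exhaustive pairwise approach would score."""
    return n * (n - 1) // 2


# Toy data: the two titles come from the text; the shorter citation string
# and the new mention are made up for the example.
codes = Node()
codes.add_child(mention("error detecting and error correcting codes"))
codes.add_child(mention("error correcting codes"))

maths = Node()
maths.add_child(mention("the unreasonable effectiveness of mathematics"))

hamming = Node()  # entity root summarizing both sub-entities
hamming.add_child(codes)
hamming.add_child(maths)

new = mention("error correcting codes revisited")
best = max([codes, maths], key=lambda node: cosine(new.bag, node.bag))
print(best is codes)
print(pairwise_comparisons(6_000_000))
```

With six million mentions, the exhaustive pair count is on the order of 10^13, which gives a rough sense of why comparing against compact summaries is attractive. How the actual model scores and rearranges trees is not covered by this sketch.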
