Scientific report: "Unsupervised Coreference Resolution in a Nonparametric Bayesian Model"

Aria Haghighi and Dan Klein
Computer Science Division, UC Berkeley
aria42 klein @

Abstract

We present an unsupervised, nonparametric Bayesian approach to coreference resolution which models both global entity identity across a corpus and the sequential anaphoric structure within each document. While most existing coreference work is driven by pairwise decisions, our model is fully generative, producing each mention from a combination of global entity properties and local attentional state. Despite being unsupervised, our system achieves a MUC F1 measure on the MUC-6 test set, broadly in the range of some recent supervised results.

1 Introduction

Referring to an entity in natural language can broadly be decomposed into two processes. First, speakers directly introduce new entities into discourse, entities which may be shared across discourses. This initial reference is typically accomplished with proper or nominal expressions. Second, speakers refer back to entities already introduced. This anaphoric reference is canonically, though of course not always, accomplished with pronouns, and is governed by linguistic and cognitive constraints. In this paper, we present a nonparametric generative model of a document corpus which naturally connects these two processes.
Most recent coreference resolution work has focused on the task of deciding which mentions (noun phrases) in a document are coreferent. The dominant approach is to decompose the task into a collection of pairwise coreference decisions. One then applies discriminative learning methods to pairs of mentions, using features which encode properties such as distance, syntactic environment, and so on (Soon et al., 2001; Ng and Cardie, 2002). Although such approaches have been successful, they have several liabilities. First, rich features require plentiful labeled data, which we do not have for coreference tasks in most domains and ...
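To make the pairwise decomposition described above concrete, here is a minimal sketch in Python. It illustrates the general mention-pair approach, not the system of any cited paper and not the generative model proposed here. The toy mentions, the hand-written `pair_score` function (standing in for a learned discriminative classifier), and the `threshold` value are all assumptions introduced for illustration. Positive pairwise decisions are merged into entity clusters by transitive closure using union-find.

```python
from itertools import combinations

PRONOUNS = {"he", "she", "it", "they"}

def pair_score(a, b):
    """Toy stand-in for a learned pairwise classifier (an assumption).

    Each mention is (token_position, head_word); `a` precedes `b`.
    """
    (pos_a, head_a), (pos_b, head_b) = a, b
    ha, hb = head_a.lower(), head_b.lower()
    # Feature 1: identical non-pronominal heads are treated as coreferent.
    if ha == hb and ha not in PRONOUNS:
        return 1.0
    # Feature 2: a pronoun prefers nearby non-pronominal antecedents.
    if hb in PRONOUNS and ha not in PRONOUNS:
        return 1.0 / (1.0 + abs(pos_b - pos_a))
    return 0.0

def cluster(mentions, threshold=0.1):
    """Merge every pair scored at or above `threshold` (transitive closure)."""
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each pair is decided independently of all other pairs.
    for i, j in combinations(range(len(mentions)), 2):
        if pair_score(mentions[i], mentions[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i, (_, head) in enumerate(mentions):
        groups.setdefault(find(i), []).append(head)
    return list(groups.values())

# Hypothetical mentions, not taken from the MUC-6 data.
mentions = [(0, "Obama"), (5, "he"), (12, "Clinton"), (14, "she"), (30, "Obama")]
clusters = cluster(mentions)
print(clusters)
```

Because every pair is scored in isolation, nothing in this sketch enforces consistency across a whole entity cluster; merging by transitive closure is one common workaround among several.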
