Scientific report: "Computationally Efficient M-Estimation of Log-Linear Structure Models∗"

Noah A. Smith, Douglas L. Vail, and John D. Lafferty
School of Computer Science, Carnegie Mellon University
Pittsburgh, PA 15213, USA
{nasmith,dvail2,lafferty}@

Abstract

We describe a new loss function, due to Jeon and Lin (2006), for estimating structured log-linear models on arbitrary features. The loss function can be seen as a (generative) alternative to maximum likelihood estimation with an interesting information-theoretic interpretation, and it is statistically consistent. It is substantially faster than maximum (conditional) likelihood estimation of conditional random fields (Lafferty et al., 2001), by an order of magnitude or more. We compare its performance and training time to an HMM, a CRF, an MEMM, and pseudolikelihood on a shallow parsing task. These experiments help tease apart the contributions of rich features and discriminative training, which are shown to be more than additive.

1 Introduction

Log-linear models are a very popular tool in natural language processing, and are often lauded for permitting the use of arbitrary and correlated features of the data by a model.
Users of log-linear models know, however, that this claim requires some qualification: any feature is permitted in principle, but training log-linear models (and decoding under them) is tractable only when the model's independence assumptions permit efficient inference procedures. For example, in the original conditional random fields (Lafferty et al., 2001), features were confined to locally-factored indicators on label bigrams and label unigrams (with any of the observation). Even in cases where inference in log-linear models is tractable, it requires the computation of a partition function. More formally, a log-linear ...

∗ This work was supported by NSF grant IIS-0427206 and the DARPA CALO project. The authors are grateful for feedback from David Smith and from three anonymous ACL reviewers, and helpful discussions with Charles Sutton.
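To make the cost of the partition function concrete, the following is a minimal sketch, not taken from the paper. It assumes a chain-structured model in which a label sequence is scored by per-position scores (`unary`) plus label-bigram scores (`trans`); both tables, their values, and all function names are hypothetical. It compares a brute-force sum over every label sequence with the standard forward recursion, which is feasible only because the score factors locally.

```python
import itertools
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(labels, unary, trans):
    """Unnormalized log-score of one label sequence (hypothetical factorization)."""
    score = unary[0][labels[0]]
    for t in range(1, len(labels)):
        score += trans[labels[t - 1]][labels[t]] + unary[t][labels[t]]
    return score


def log_partition_brute_force(unary, trans):
    """Sum over all k**n label sequences: exponential in sequence length."""
    n, k = len(unary), len(trans)
    scores = [
        sequence_score(labels, unary, trans)
        for labels in itertools.product(range(k), repeat=n)
    ]
    return log_sum_exp(scores)


def log_partition_forward(unary, trans):
    """Forward recursion: O(n * k**2), relying on the local factorization."""
    k = len(trans)
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            log_sum_exp([alpha[i] + trans[i][j] for i in range(k)]) + unary[t][j]
            for j in range(k)
        ]
    return log_sum_exp(alpha)


# Made-up scores for 3 positions and 2 labels (illustration only).
unary = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
trans = [[0.2, -0.1], [0.0, 0.4]]

log_z_brute = log_partition_brute_force(unary, trans)
log_z_forward = log_partition_forward(unary, trans)
print(log_z_brute, log_z_forward)
```

The two values agree up to floating-point error. With features that do not factor over adjacent labels, only the brute-force sum remains available, which is the tractability qualification the introduction raises.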
