
Contrastive Estimation: Training Log-Linear Models on Unlabeled Data*

Noah A. Smith and Jason Eisner
Department of Computer Science / Center for Language and Speech Processing
Johns Hopkins University, Baltimore, MD 21218 USA
nasmith jason @

Abstract

Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe a novel approach, contrastive estimation. We show that the new technique can be intuitively understood as exploiting implicit negative evidence and is computationally efficient. Applied to a sequence labeling problem (POS tagging given a tagging dictionary and unlabeled text), contrastive estimation outperforms EM with the same feature set, is more robust to degradations of the dictionary, and can largely recover by modeling additional features.

1 Introduction

Finding linguistic structure in raw text is not easy. The classical forward-backward and inside-outside algorithms try to guide probabilistic models to discover structure in text, but they tend to get stuck in local maxima (Charniak, 1993). Even when they avoid local maxima, e.g., through clever initialization, they typically deviate from human ideas of what the right structure is (Merialdo, 1994).

One strategy is to incorporate domain knowledge into the model's structure. Instead of blind HMMs or PCFGs, one could use models whose features

* This work was supported by a Fannie and John Hertz Foundation fellowship to the first author and NSF ITR grant IIS-0313193 to the second author. The views expressed are not necessarily endorsed by the sponsors. The authors also thank three anonymous ACL reviewers for helpful comments, colleagues at JHU CLSP (especially David Smith and Roy Tromble), and Miles .
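To make the "implicit negative evidence" intuition concrete, the following is a minimal, brute-force sketch of a contrastive-estimation-style objective for a toy log-linear tagging model. It assumes a neighborhood of adjacent-word transpositions (in the spirit of the paper's transposition neighborhoods); the feature templates, data, and exhaustive enumeration over taggings are illustrative inventions, not the paper's efficient lattice-based implementation.

```python
import itertools
import math

def features(x, y):
    """Emission and transition indicator features for a tagged sentence."""
    feats = {}
    for word, tag in zip(x, y):
        feats[("emit", tag, word)] = feats.get(("emit", tag, word), 0) + 1
    for t1, t2 in zip(y, y[1:]):
        feats[("trans", t1, t2)] = feats.get(("trans", t1, t2), 0) + 1
    return feats

def score(theta, x, y):
    """Linear score theta . f(x, y) of a sentence/tagging pair."""
    return sum(theta.get(f, 0.0) * v for f, v in features(x, y).items())

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

def neighborhood(x):
    """x itself plus every adjacent-word transposition of x."""
    hood = [tuple(x)]
    for i in range(len(x) - 1):
        xp = list(x)
        xp[i], xp[i + 1] = xp[i + 1], xp[i]
        hood.append(tuple(xp))
    return hood

def ce_objective(theta, x, tags):
    """log [ sum_y exp s(x,y)  /  sum_{x' in N(x)} sum_y exp s(x',y) ].

    The observed sentence competes against its perturbed neighbors,
    which act as implicit negative evidence.
    """
    all_y = list(itertools.product(tags, repeat=len(x)))
    num = logsumexp([score(theta, x, y) for y in all_y])
    den = logsumexp([score(theta, xp, y)
                     for xp in neighborhood(x) for y in all_y])
    return num - den
```

Because the denominator sums over the numerator's terms plus the neighbors', the objective is always at most zero; training would push probability mass from the corrupted neighbors onto the observed word order. With all-zero weights, every neighbor scores equally, so the objective is exactly -log |N(x)|.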
