TAILIEUCHUNG - Báo cáo khoa học: "Generalized Expectation Criteria for Semi-Supervised Learning of Conditional Random Fields"

This paper presents a semi-supervised training method for linear-chain conditional random fields that makes use of labeled features rather than labeled instances. This is accomplished by using generalized expectation criteria to express a preference for parameter settings in which the model’s distribution on unlabeled data matches a target distribution. We induce target conditional probability distributions of labels given features from both annotated feature occurrences in context and adhoc feature majority label assignment. . | Generalized Expectation Criteria for Semi-Supervised Learning of Conditional Random Fields Gideon S. Mann Google Inc. 76 Ninth Avenue New York NY 10011 Andrew McCallum Department of Computer Science University of Massachusetts 140 Governors Drive Amherst MA 01003 Abstract This paper presents a semi-supervised training method for linear-chain conditional random fields that makes use of labeled features rather than labeled instances. This is accomplished by using generalized expectation criteria to express a preference for parameter settings in which the model s distribution on unlabeled data matches a target distribution. We induce target conditional probability distributions of labels given features from both annotated feature occurrences in context and ad-hoc feature majority label assignment. The use of generalized expectation criteria allows for a dramatic reduction in annotation time by shifting from traditional instance-labeling to feature-labeling and the methods presented outperform traditional CRF training and other semi-supervised methods when limited human effort is available. 1 Introduction A significant barrier to applying machine learning to new real world domains is the cost of obtaining the necessary training data. To address this problem work over the past several years has explored semi-supervised or unsupervised approaches to the same problems seeking to improve accuracy with the addition of lower cost unlabeled data. Traditional approaches to semi-supervised learning are applied to cases in which there is a small amount of fully labeled data and a much larger amount of unlabeled data presumably from the same data source. For example EM Nigam et al. 1998 transduc-tive SVMs Joachims 1999 entropy regularization Grandvalet and Bengio 2004 and graph-based Traditional Full Instance Labeling address number oak avenue rent I I I I I I I ADDRESS ADDRESS ADDRESS ADDRESS ADDRESS RENT RENT Feature Labeling address number oak avenue rent . ADDRESS ADDRESS . .

TAILIEUCHUNG - Chia sẻ tài liệu không giới hạn
Địa chỉ : 444 Hoang Hoa Tham, Hanoi, Viet Nam
Website : tailieuchung.com
Email : tailieuchung20@gmail.com
Tailieuchung.com là thư viện tài liệu trực tuyến, nơi chia sẽ trao đổi hàng triệu tài liệu như luận văn đồ án, sách, giáo trình, đề thi.
Chúng tôi không chịu trách nhiệm liên quan đến các vấn đề bản quyền nội dung tài liệu được thành viên tự nguyện đăng tải lên, nếu phát hiện thấy tài liệu xấu hoặc tài liệu có bản quyền xin hãy email cho chúng tôi.
Đã phát hiện trình chặn quảng cáo AdBlock
Trang web này phụ thuộc vào doanh thu từ số lần hiển thị quảng cáo để tồn tại. Vui lòng tắt trình chặn quảng cáo của bạn hoặc tạm dừng tính năng chặn quảng cáo cho trang web này.