Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 360-367.

Bootstrapping

Steven Abney
AT&T Laboratories - Research
180 Park Avenue
Florham Park, NJ, USA 07932

Abstract

This paper refines the analysis of co-training, defines and evaluates a new co-training algorithm that has theoretical justification, gives a theoretical justification for the Yarowsky algorithm, and shows that co-training and the Yarowsky algorithm are based on different independence assumptions.

1 Overview

The term bootstrapping here refers to a problem setting in which one is given a small set of labeled data and a large set of unlabeled data, and the task is to induce a classifier. The plenitude of unlabeled natural language data and the paucity of labeled data have made bootstrapping a topic of interest in computational linguistics. Current work has been spurred by two papers, Yarowsky (1995) and Blum and Mitchell (1998).

Blum and Mitchell propose a conditional independence assumption to account for the efficacy of their algorithm, called co-training, and they give a proof based on that conditional independence assumption. They also give an intuitive explanation of why co-training works, in terms of maximizing agreement on unlabeled data between classifiers based on different views of the data. Finally, they suggest that the Yarowsky algorithm is a special case of the co-training algorithm.

The Blum and Mitchell paper has been very influential, but it has some shortcomings. The proof they give does not actually apply directly to the co-training algorithm, nor does it directly justify the intuitive account in terms of classifier agreement on unlabeled data, nor, for that matter, does the co-training algorithm directly seek to find classifiers that agree on unlabeled data. Moreover, the suggestion that the Yarowsky algorithm is a special case of co-training is based on an incidental detail of the particular application that Yarowsky considers, not on the properties of the core algorithm.
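To make the procedure under discussion concrete, the following is a minimal sketch of a Blum-and-Mitchell-style co-training loop, assuming two feature views of each example and a naive Bayes base learner. The function name, the pool sizes (n_rounds, n_per_round), and the confidence-based selection rule are illustrative assumptions, not details fixed by the papers cited above.

```python
# Hypothetical sketch of a co-training loop in the style of Blum and
# Mitchell (1998); parameter values and base learner are assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def cotrain(X1_lab, X2_lab, y_lab, X1_unl, X2_unl,
            n_rounds=10, n_per_round=5):
    # Two classifiers, one per view; each round, each classifier adds
    # its most confident unlabeled examples to the shared labeled pool.
    X1_lab, X2_lab = np.asarray(X1_lab, float), np.asarray(X2_lab, float)
    X1_unl, X2_unl = np.asarray(X1_unl, float), np.asarray(X2_unl, float)
    y_lab = np.asarray(y_lab)
    for _ in range(n_rounds):
        h1 = GaussianNB().fit(X1_lab, y_lab)
        h2 = GaussianNB().fit(X2_lab, y_lab)
        if len(X1_unl) == 0:
            break
        newly = {}  # unlabeled index -> label assigned by a confident view
        for h, X_view in ((h1, X1_unl), (h2, X2_unl)):
            proba = h.predict_proba(X_view)
            preds = h.classes_[proba.argmax(axis=1)]
            conf = proba.max(axis=1)
            for i in np.argsort(conf)[::-1][:n_per_round]:
                newly.setdefault(int(i), preds[i])
        idx = sorted(newly)
        # Move the chosen examples, in both views, into the labeled pool.
        X1_lab = np.vstack([X1_lab, X1_unl[idx]])
        X2_lab = np.vstack([X2_lab, X2_unl[idx]])
        y_lab = np.concatenate([y_lab, [newly[i] for i in idx]])
        keep = [i for i in range(len(X1_unl)) if i not in newly]
        X1_unl, X2_unl = X1_unl[keep], X2_unl[keep]
    return h1, h2
```

Note that the two classifiers never directly optimize agreement; each simply feeds its most confident labels into the shared pool. This is precisely the gap, noted above, between the algorithm itself and the agreement-based intuition offered for it.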
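For contrast, here is a similarly hedged sketch of a Yarowsky-style self-training loop: a single view and a single classifier that relabels the unlabeled pool on each round. The confidence threshold and the choice of base learner are again assumptions made for illustration, not details taken from Yarowsky (1995).

```python
# Hypothetical sketch of a Yarowsky-style self-training loop;
# threshold value and base learner are illustrative assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def yarowsky(X_seed, y_seed, X_unl, threshold=0.95, n_rounds=10):
    # One view, one classifier: train on the seed, then repeatedly
    # relabel the unlabeled pool and retrain on the seed plus whatever
    # the current classifier labels with high confidence.
    X_seed, X_unl = np.asarray(X_seed, float), np.asarray(X_unl, float)
    y_seed = np.asarray(y_seed)
    h = GaussianNB().fit(X_seed, y_seed)
    for _ in range(n_rounds):
        proba = h.predict_proba(X_unl)
        conf = proba.max(axis=1)
        mask = conf >= threshold
        if not mask.any():
            break
        y_pseudo = h.classes_[proba.argmax(axis=1)]
        # Labels are reassigned from scratch each round, so earlier
        # self-labels can be revised as the classifier improves.
        h = GaussianNB().fit(np.vstack([X_seed, X_unl[mask]]),
                             np.concatenate([y_seed, y_pseudo[mask]]))
    return h
```

The structural contrast with the previous sketch is the point: nothing here requires two views of the data, which is why treating the Yarowsky algorithm as a special case of co-training turns on an incidental feature of Yarowsky's application rather than on the core algorithm.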