Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 360-367.

Bootstrapping

Steven Abney
AT&T Laboratories - Research
180 Park Avenue
Florham Park, NJ, USA 07932

Abstract

This paper refines the analysis of co-training, defines and evaluates a new co-training algorithm that has theoretical justification, gives a theoretical justification for the Yarowsky algorithm, and shows that co-training and the Yarowsky algorithm are based on different independence assumptions.

1 Overview

The term bootstrapping here refers to a problem setting in which one is given a small set of labeled data and a large set of unlabeled data, and the task is to induce a classifier. The plenitude of unlabeled natural language data and the paucity of labeled data have made bootstrapping a topic of interest in computational linguistics. Current work has been spurred by two papers, Yarowsky (1995) and Blum and Mitchell (1998).

Blum and Mitchell propose a conditional independence assumption to account for the efficacy of their algorithm, called co-training, and they give a proof based on that conditional independence assumption. They also give an intuitive explanation of why co-training works, in terms of maximizing agreement on unlabeled data between classifiers based on different views of the data. Finally, they suggest that the Yarowsky algorithm is a special case of the co-training algorithm.

The Blum and Mitchell paper has been very influential, but it has some shortcomings. The proof they give does not actually apply directly to the co-training algorithm, nor does it directly justify the intuitive account in terms of classifier agreement on unlabeled data, nor, for that matter, does the co-training algorithm directly seek to find classifiers that agree on unlabeled data. Moreover, the suggestion that the Yarowsky algorithm is a special case of co-training is based on an incidental detail of the particular application that Yarowsky considers, not on the properties of the core algorithm.
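To make the procedure under discussion concrete, the following is a minimal sketch of a Blum-and-Mitchell-style co-training loop, assuming two feature views of each example and a naive Bayes base learner. The function name, the pool sizes (n_rounds, n_per_round), and the confidence-based selection rule are illustrative assumptions, not details fixed by the papers cited above.

```python
# Hypothetical sketch of a co-training loop in the style of Blum and
# Mitchell (1998); parameter values and base learner are assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def cotrain(X1_lab, X2_lab, y_lab, X1_unl, X2_unl,
            n_rounds=10, n_per_round=5):
    # Two classifiers, one per view; each round, each classifier adds
    # its most confident unlabeled examples to the shared labeled pool.
    X1_lab, X2_lab = np.asarray(X1_lab, float), np.asarray(X2_lab, float)
    X1_unl, X2_unl = np.asarray(X1_unl, float), np.asarray(X2_unl, float)
    y_lab = np.asarray(y_lab)
    for _ in range(n_rounds):
        h1 = GaussianNB().fit(X1_lab, y_lab)
        h2 = GaussianNB().fit(X2_lab, y_lab)
        if len(X1_unl) == 0:
            break
        newly = {}  # unlabeled index -> label assigned by a confident view
        for h, X_view in ((h1, X1_unl), (h2, X2_unl)):
            proba = h.predict_proba(X_view)
            preds = h.classes_[proba.argmax(axis=1)]
            conf = proba.max(axis=1)
            for i in np.argsort(conf)[::-1][:n_per_round]:
                newly.setdefault(int(i), preds[i])
        idx = sorted(newly)
        # Move the chosen examples, in both views, into the labeled pool.
        X1_lab = np.vstack([X1_lab, X1_unl[idx]])
        X2_lab = np.vstack([X2_lab, X2_unl[idx]])
        y_lab = np.concatenate([y_lab, [newly[i] for i in idx]])
        keep = [i for i in range(len(X1_unl)) if i not in newly]
        X1_unl, X2_unl = X1_unl[keep], X2_unl[keep]
    return h1, h2
```

Note that the two classifiers never directly optimize agreement; each simply feeds its most confident labels into the shared pool. This is precisely the gap, noted above, between the algorithm itself and the agreement-based intuition offered for it.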
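For contrast, here is a similarly hedged sketch of a Yarowsky-style self-training loop: a single view and a single classifier that relabels the unlabeled pool on each round. The confidence threshold and the choice of base learner are again assumptions made for illustration, not details taken from Yarowsky (1995).

```python
# Hypothetical sketch of a Yarowsky-style self-training loop;
# threshold value and base learner are illustrative assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def yarowsky(X_seed, y_seed, X_unl, threshold=0.95, n_rounds=10):
    # One view, one classifier: train on the seed, then repeatedly
    # relabel the unlabeled pool and retrain on the seed plus whatever
    # the current classifier labels with high confidence.
    X_seed, X_unl = np.asarray(X_seed, float), np.asarray(X_unl, float)
    y_seed = np.asarray(y_seed)
    h = GaussianNB().fit(X_seed, y_seed)
    for _ in range(n_rounds):
        proba = h.predict_proba(X_unl)
        conf = proba.max(axis=1)
        mask = conf >= threshold
        if not mask.any():
            break
        y_pseudo = h.classes_[proba.argmax(axis=1)]
        # Labels are reassigned from scratch each round, so earlier
        # self-labels can be revised as the classifier improves.
        h = GaussianNB().fit(np.vstack([X_seed, X_unl[mask]]),
                             np.concatenate([y_seed, y_pseudo[mask]]))
    return h
```

The structural contrast with the previous sketch is the point: nothing here requires two views of the data, which is why treating the Yarowsky algorithm as a special case of co-training turns on an incidental feature of Yarowsky's application rather than on the core algorithm.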