Scientific report: "Supervised Domain Adaption for WSD"

Eneko Agirre and Oier Lopez de Lacalle
IXA NLP Group, University of the Basque Country
Donostia, Basque Country
{e.agirre, oier.lopezdelacalle}@ehu.es

Abstract

The lack of positive results on supervised domain adaptation for WSD has cast some doubts on the utility of hand-tagging general corpora and thus developing generic supervised WSD systems. In this paper we show for the first time that our WSD system, trained on a general source corpus (BNC) and the target corpus, obtains up to 22% error reduction when compared to a system trained on the target corpus alone. In addition, we show that as little as 40% of the target corpus (when supplemented with the source corpus) is sufficient to obtain the same results as training on the full target data. The key to success is the use of unlabeled data with SVD, a combination of kernels, and SVM.

1 Introduction

In many Natural Language Processing (NLP) tasks we find that a large collection of manually-annotated text is used to train and test supervised machine learning models. While these models have been shown to perform very well when tested on the text collection related to the training data (what we call the source domain), the performance drops considerably when testing on text from other domains (called target domains). In order to build models that perform well in new target domains, we usually find two settings (Daumé III, 2007).
In the semi-supervised setting, the hand-annotated training text from the source domain is supplemented with unlabeled data from the target domain. In the supervised setting, we use training data from both the source and target domains to test on the target domain. In (Agirre and Lopez de Lacalle, 2008) we studied semi-supervised Word Sense Disambiguation (WSD) adaptation, and in this paper we focus on supervised WSD adaptation. We compare the performance of similar supervised WSD systems in three different scenarios. In the source-to-target scenario, the WSD system is trained on the ...
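The difference between the two adaptation settings can be pictured in terms of which data each one is allowed to train on. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Example` record, the `training_data` helper, and the sense labels are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Example:
    """Hypothetical record for one occurrence of an ambiguous word."""
    context: str
    sense: Optional[str]  # None when the occurrence is not hand-tagged
    domain: str           # "source" or "target"


def training_data(
    examples: List[Example], setting: str
) -> Tuple[List[Example], List[Example]]:
    """Return (labeled, unlabeled) training examples for a setting.

    Illustrative assumption: the two settings differ only in which
    target-domain data is available at training time.
    """
    if setting == "semi-supervised":
        # Labeled source data plus unlabeled target data.
        labeled = [e for e in examples
                   if e.domain == "source" and e.sense is not None]
        unlabeled = [e for e in examples
                     if e.domain == "target" and e.sense is None]
    elif setting == "supervised":
        # Labeled data from both the source and the target domain.
        labeled = [e for e in examples if e.sense is not None]
        unlabeled = []
    else:
        raise ValueError(f"unknown setting: {setting}")
    return labeled, unlabeled


# Toy data with made-up sense labels.
data = [
    Example("the bank of the river", "bank_shore", "source"),
    Example("the bank raised its rates", None, "target"),
    Example("she opened a bank account", "bank_finance", "target"),
]
labeled, unlabeled = training_data(data, "supervised")
```

In both settings the resulting system is evaluated on target-domain text; only the training material changes.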
