TAILIEUCHUNG - Scientific report: "Domain Adaptation with Active Learning for Word Sense Disambiguation"

Domain Adaptation with Active Learning for Word Sense Disambiguation

Yee Seng Chan and Hwee Tou Ng
Department of Computer Science, National University of Singapore
3 Science Drive 2, Singapore 117543
chanys nght @

Abstract

When a word sense disambiguation (WSD) system is trained on one domain but applied to a different domain, a drop in accuracy is frequently observed. This highlights the importance of domain adaptation for word sense disambiguation. In this paper, we first show that an active learning approach can be successfully used to perform domain adaptation of WSD systems. Then, by using the predominant sense predicted by expectation-maximization (EM) and adopting a count-merging technique, we improve the effectiveness of the original adaptation process achieved by the basic active learning approach.

1 Introduction

In natural language, a word often assumes different meanings, and the task of determining the correct meaning, or sense, of a word in different contexts is known as word sense disambiguation (WSD). To date, the best performing systems in WSD use a corpus-based supervised learning approach. With this approach, one would need to collect a text corpus in which each ambiguous word occurrence is first tagged with its correct sense to serve as training data. The reliance of supervised WSD systems on annotated corpus raises the important issue of domain dependence.
To investigate this, Escudero et al. (2000) and Martinez and Agirre (2000) conducted experiments using the DSO corpus, which contains sentences from two different corpora, namely Brown Corpus (BC) and Wall Street Journal (WSJ). They found that training a WSD system on one part (BC or WSJ) of the DSO corpus and applying it to the other can result in an accuracy drop of more than 10%, highlighting the need to perform domain adaptation of WSD systems to new domains. Escudero et al. (2000) pointed out that one of the reasons for the drop in accuracy is the difference in sense priors, i.e., the proportions of the ...
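
To make the notion of sense priors concrete, the following minimal sketch treats them as the relative frequency of each sense of one ambiguous word within a corpus, and shows how the predominant sense can flip between two domains. The function names, sense labels, and tag lists are hypothetical toy values chosen for illustration; they are not taken from the DSO corpus or from the paper's experiments.

```python
from collections import Counter


def sense_priors(sense_tags):
    """Return the proportion of each sense among the tagged occurrences."""
    counts = Counter(sense_tags)
    total = sum(counts.values())
    return {sense: n / total for sense, n in counts.items()}


def predominant_sense(priors):
    """Return the sense with the highest prior."""
    return max(priors, key=priors.get)


# Hypothetical sense tags for one ambiguous word in two domains (toy data).
source_tags = ["sense_1", "sense_1", "sense_1", "sense_2"]
target_tags = ["sense_1", "sense_2", "sense_2", "sense_2"]

source_priors = sense_priors(source_tags)
target_priors = sense_priors(target_tags)

print("source priors:", source_priors)
print("target priors:", target_priors)
print("predominant in source:", predominant_sense(source_priors))
print("predominant in target:", predominant_sense(target_priors))
```

In this toy setting, a classifier that has absorbed the source-domain priors would tend to favor `sense_1`, while the target domain mostly uses `sense_2`, which is one way a difference in sense priors could contribute to the accuracy drop described above.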
