Scientific report: "Determining Word Sense Dominance Using a Thesaurus"


Saif Mohammad and Graeme Hirst
Department of Computer Science, University of Toronto
Toronto, ON M5S 3G4, Canada
{smm,gh}@cs.toronto.edu

Abstract

The degree of dominance of a sense of a word is the proportion of occurrences of that sense in text. We propose four new methods to accurately determine word sense dominance using raw text and a published thesaurus. Unlike the McCarthy et al. (2004) system, these methods can be used on relatively small target texts, without the need for a similarly-sense-distributed auxiliary text. We perform an extensive evaluation using artificially generated thesaurus-sense-tagged data. In the process, we create a word-category co-occurrence matrix, which can be used for unsupervised word sense disambiguation and for estimating distributional similarity of word senses as well.

1 Introduction

The occurrences of the senses of a word usually have a skewed distribution in text. Further, the distribution varies in accordance with the domain or topic of discussion. For example, the "assertion of illegality" sense of charge is more frequent in the judicial domain, while in the domain of economics, the "expense/cost" sense occurs more often.
Formally, the degree of dominance of a particular sense of a word (target word) in a given text (target text) may be defined as the ratio of the occurrences of the sense to the total occurrences of the target word. The sense with the highest dominance in the target text is called the predominant sense of the target word.

Determination of word sense dominance has many uses. An unsupervised system will benefit by backing off to the predominant sense in case of insufficient evidence. The dominance values may be used as prior probabilities for the different senses, obviating the need for labeled training data in a sense disambiguation task. Natural language systems can choose to ignore infrequent senses of words or consider only the most dominant senses (McCarthy et al., 2004).
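The ratio that defines dominance can be illustrated with a minimal Python sketch. It assumes that every occurrence of the target word has already been labeled with a sense, which is not how the paper's methods work (they estimate dominance from raw text and a thesaurus); the function names and the sense labels for "charge" are hypothetical and serve only to show the definition.

```python
from collections import Counter


def sense_dominance(sense_tags):
    """Map each sense to its share of the target word's occurrences.

    sense_tags: one sense label per occurrence of the target word
    in the target text (assumed to be given; hypothetical input).
    """
    counts = Counter(sense_tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sense: n / total for sense, n in counts.items()}


def predominant_sense(dominance):
    """Return the sense with the highest dominance, or None if empty."""
    if not dominance:
        return None
    return max(dominance, key=dominance.get)


# Hypothetical labels for four occurrences of "charge" in a judicial text.
tags = ["accusation", "accusation", "expense", "accusation"]
dominance = sense_dominance(tags)
print(dominance)
print(predominant_sense(dominance))
```

The resulting dominance values sum to 1 over the senses observed, which is what would allow them to serve as prior probabilities in a disambiguation task.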
