AN ASSESSMENT OF SEMANTIC INFORMATION AUTOMATICALLY EXTRACTED FROM MACHINE READABLE DICTIONARIES

Jean Véronis1,2 and Nancy Ide1

1 Department of Computer Science, Vassar College, Poughkeepsie, New York 12601
2 Groupe Représentation et Traitement des Connaissances, Centre National de la Recherche Scientifique, 31 Ch. Joseph Aiguier, 13402 Marseille Cedex 09, France

ABSTRACT

In this paper we provide a quantitative evaluation of information automatically extracted from machine readable dictionaries. Our results show that for any one dictionary, 55-70% of the extracted information is garbled in some way. However, we show that this proportion can be dramatically reduced to about 6% by combining the information extracted from five dictionaries. It therefore appears that even if individual dictionaries are an unreliable source of semantic information, multiple dictionaries can play an important role in building large lexical-semantic databases.

I. INTRODUCTION

In recent years it has become increasingly clear that the limited size of existing computational lexicons and the poverty of the semantic information they contain represent one of the primary bottlenecks in the development of realistic natural language processing (NLP) systems.
The need for extensive lexical and semantic databases is evident in the recent initiation of a number of projects to construct massive generic lexicons for NLP (for example, the GENELEX project in Europe or EDR in Japan). The manual construction of large lexical-semantic databases demands enormous human resources, and there is a growing body of research into the possibility of automatically extracting at least a part of the required lexical and semantic information from everyday dictionaries. Everyday dictionaries are obviously not structured in a way that enables their immediate use in NLP systems, but several studies have shown that relatively simple procedures can be used to extract taxonomies and various other semantic relations (for example, Amsler, 1980; Calzolari, 1984).
