Scientific report: "Automatic construction of a hypernym-labeled noun hierarchy from text"

Sharon A. Caraballo
Dept. of Computer Science
Brown University
Providence, RI 02912
sc@

Abstract

Previous work has shown that automatic methods can be used in building semantic lexicons. This work goes a step further by automatically creating not just clusters of related words, but a hierarchy of nouns and their hypernyms, akin to the hand-built hierarchy in WordNet.

1 Introduction

The purpose of this work is to build something like the hypernym-labeled noun hierarchy of WordNet (Fellbaum 1998) automatically from text, using no other lexical resources. WordNet has been an important research tool, but it is insufficient for domain-specific text, such as that encountered in the MUCs (Message Understanding Conferences). Our work develops a labeled hierarchy based on a text corpus.

In this project, nouns are clustered into a hierarchy using data on conjunctions and appositives appearing in the Wall Street Journal. The internal nodes of the resulting tree are then labeled with hypernyms for the nouns clustered underneath them, also based on data extracted from the Wall Street Journal. The resulting hierarchy is evaluated by human judges, and future research directions are discussed.

2 Building the noun hierarchy

The first stage in constructing our hierarchy is to build an unlabeled hierarchy of nouns using bottom-up clustering methods (see, e.g., Brown et al. 1992). Nouns are clustered based on conjunction and appositive data collected from the Wall Street Journal corpus. Some of the data comes from the parsed files 2-21 of the Wall Street Journal Penn Treebank corpus (Marcus et al. 1993), and additional parsed text was obtained by parsing the 1987 Wall Street Journal text using the parser described in Charniak et al. (1998). From this parsed text we identified all conjunctions of noun phrases (e.g., "executive vice-president and treasurer" or "scientific equipment, apparatus and disposables") and all appositives.
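The conjunction-extraction step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the parses are available as Penn Treebank-style bracketed strings, and it uses a deliberately simple heuristic (the last noun in each comma- or conjunction-separated segment of a noun phrase) to pick the conjoined nouns. The function names parse_tree, head_noun, conjoined_nouns and conjunction_counts are hypothetical, and appositives are not handled here.

```python
import re
from collections import Counter
from itertools import combinations


def parse_tree(text):
    """Parse one bracketed parse string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])   # a leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read()


def head_noun(node):
    """Assumed heuristic: the last NN* word directly under this node, or None."""
    nouns = [c[1][0].lower() for c in node[1]
             if isinstance(c, tuple) and c[0].startswith("NN") and c[1]]
    return nouns[-1] if nouns else None


def conjoined_nouns(node):
    """Return the nouns conjoined inside an NP that contains a CC, else []."""
    label, children = node
    if label != "NP" or not any(isinstance(c, tuple) and c[0] == "CC" for c in children):
        return []
    heads, current = [], None
    for child in children:
        if not isinstance(child, tuple):
            continue
        tag = child[0]
        if tag in ("CC", ","):        # a separator closes the current conjunct
            if current:
                heads.append(current)
            current = None
        elif tag.startswith("NN"):    # flat NP: keep the last noun of the segment
            current = child[1][0].lower()
        elif tag == "NP":             # nested NP conjunct: use its head noun
            current = head_noun(child) or current
    if current:
        heads.append(current)
    return heads if len(heads) >= 2 else []


def iter_nodes(node):
    """Yield every non-leaf node of the tree, top down."""
    yield node
    for child in node[1]:
        if isinstance(child, tuple):
            yield from iter_nodes(child)


def conjunction_counts(trees):
    """Count how often each unordered pair of nouns appears in the same conjunction."""
    counts = Counter()
    for tree in trees:
        for node in iter_nodes(tree):
            for pair in combinations(sorted(set(conjoined_nouns(node))), 2):
                counts[pair] += 1
    return counts


# Hand-written example parses (illustrative, not taken from the corpus files).
examples = [
    "(NP (JJ executive) (NN vice-president) (CC and) (NN treasurer))",
    "(NP (JJ scientific) (NN equipment) (, ,) (NN apparatus) (CC and) (NNS disposables))",
    "(S (NP (NP (NNS stocks)) (CC and) (NP (NNS bonds))) (VP (VBD fell)))",
]
counts = conjunction_counts(parse_tree(t) for t in examples)
```

Pair counts of this kind are one plausible input to a bottom-up clustering step, since nouns that are frequently conjoined with the same other nouns can be treated as similar. How the counts are turned into a similarity measure is left open in this sketch.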
