Scientific report: "Combining Association Measures for Collocation Extraction"

Pavel Pecina and Pavel Schlesinger
Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic
{pecina,schlesinger}@

Abstract

We introduce the possibility of combining lexical association measures and present empirical results of several methods employed in automatic collocation extraction. First, we present a comprehensive overview of association measures and their performance on manually annotated data, evaluated by precision-recall graphs and mean average precision. Second, we describe several classification methods for combining association measures, followed by their evaluation and comparison with individual measures. Finally, we propose a feature selection algorithm that significantly reduces the number of combined measures with only a small performance degradation.

1 Introduction

Lexical association measures are mathematical formulas that determine the strength of association between two or more words based on their occurrences and co-occurrences in a text corpus. They have a wide spectrum of applications in natural language processing and computational linguistics, such as automatic collocation extraction (Manning and Schütze, 1999), bilingual word alignment (Mihalcea and Pedersen, 2003), or dependency parsing.

Numerous association measures have been introduced in recent decades. An overview of the most widely used techniques is given in Manning and Schütze (1999) or Pearce (2002). Several researchers have also attempted to compare existing methods and suggest different evaluation schemes, e.g. Kita (1994) and Evert (2001). A comprehensive study of statistical aspects of word co-occurrences can be found in Evert (2004) or Krenn (2000).

In this paper we present a novel approach to automatic collocation extraction based on combining multiple lexical association measures. We also address the issue of the evaluation of association measures by precision-recall graphs and
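
To make the notion of an association measure concrete, the following is a minimal sketch of one widely known measure, pointwise mutual information (PMI), computed from occurrence and co-occurrence counts of adjacent word pairs. The choice of PMI, the function name `pmi_scores`, and the toy sentence are assumptions made for illustration only; the excerpt above does not single out this measure or this way of counting.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Score each adjacent word pair by pointwise mutual information (log base 2).

    Illustrative only: probabilities are plain relative frequencies,
    with no smoothing and no frequency threshold.
    """
    unigrams = Counter(tokens)                  # occurrence counts
    bigrams = Counter(zip(tokens, tokens[1:]))  # co-occurrence counts
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (x, y), f_xy in bigrams.items():
        p_xy = f_xy / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy corpus; a real experiment would use a large annotated corpus.
tokens = "new york is a big city and new york is busy".split()
ranked = sorted(pmi_scores(tokens).items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 3))
```

On such a tiny sample the one-off pair ("a", "big") outscores the repeated pair ("new", "york"), which reflects the known tendency of PMI to favor low-frequency pairs. A single measure therefore gives only one view of association strength.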
