Scientific report: "Dependency Parsing of Hungarian: Baseline Results and Challenges"

Richard Farkas (1), Veronika Vincze (2), Helmut Schmid (1)
(1) Institute for Natural Language Processing, University of Stuttgart; farkas schmid @
(2) Research Group on Artificial Intelligence, Hungarian Academy of Sciences; vinczev@

Abstract

Hungarian is a stereotype of morphologically rich and non-configurational languages. Here, we introduce results on dependency parsing of Hungarian that employ an 80K, multi-domain, fully manually annotated corpus, the Szeged Dependency Treebank. We show that the results achieved by state-of-the-art data-driven parsers on Hungarian and English (which is at the other end of the configurational-non-configurational spectrum) are quite similar to each other in terms of attachment scores. We reveal the reasons for this and present a systematic and comparative linguistically motivated error analysis on both languages. This analysis highlights that addressing the language-specific phenomena is required for a further remarkable error reduction.

1 Introduction

From the viewpoint of syntactic parsing, the languages of the world are usually categorized according to their level of configurationality. At one end there is English, a strongly configurational language, while Hungarian is at the other end of the spectrum: it has very few fixed structures at the sentence level. Leaving aside the issue of the internal structure of NPs, most sentence-level syntactic information in Hungarian is conveyed by morphology, not by configuration (É. Kiss, 2002).

A large part of the methodology for syntactic parsing has been developed for English. However, parsing non-configurational and less configurational languages requires different techniques. In this study, we present results on Hungarian dependency parsing and investigate this general issue in the case of English and Hungarian. We employed three state-of-the-art data-driven parsers (Nivre et al., 2004; McDonald et al., 2005; Bohnet, 2010).
