Scientific report: "Lexicalization in Crosslinguistic Probabilistic Parsing: The Case of French"

Abhishek Arun and Frank Keller
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, UK
keller@

Abstract

This paper presents the first probabilistic parsing results for French, using the recently released French Treebank. We start with an unlexicalized PCFG as a baseline model, which is enriched to the level of Collins' Model 2 by adding lexicalization and subcategorization. The lexicalized sister-head model and a bigram model are also tested, to deal with the flatness of the French Treebank. The bigram model achieves the best performance: 81% constituency F-score and 84% dependency accuracy. All lexicalized models outperform the unlexicalized baseline, consistent with probabilistic parsing results for English, but contrary to results for German, where lexicalization has only a limited effect on parsing performance.

1 Introduction

This paper brings together two strands of research that have recently emerged in the field of probabilistic parsing: crosslinguistic parsing and lexicalized parsing. Interest in parsing models for languages other than English has been growing, starting with work on Czech (Collins et al., 1999) and Chinese (Bikel and Chiang, 2000; Levy and Manning, 2003).
Probabilistic parsing for German has also been explored by a range of authors (Dubey and Keller, 2003; Schiehlen, 2004). In general, these authors have found that existing lexicalized parsing models for English (e.g., Collins, 1997) do not straightforwardly generalize to new languages; this typically manifests itself in a severe reduction in parsing performance compared to the results for English. A second recent strand in parsing research has dealt with the role of lexicalization. The conventional wisdom since Magerman (1995) has been that lexicalization substantially improves performance compared to an unlexicalized baseline model (e.g., a probabilistic context-free grammar, PCFG).
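To make the notion of an unlexicalized PCFG baseline concrete, the following is a minimal sketch of how such a grammar scores a parse tree: the probability of the tree is the product of the probabilities of the rules used to build it. The grammar, its rule probabilities, the example sentence, and the names `RULE_PROBS`, `tree_probability`, and `TREE` are all illustrative assumptions; none of them come from the paper or from the French Treebank.

```python
# Toy unlexicalized PCFG. All rules and probabilities are invented for
# illustration; they are NOT estimated from the French Treebank.
RULE_PROBS = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("D", "N")): 0.7,
    ("NP", ("N",)): 0.3,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V",)): 0.4,
    ("D", ("le",)): 1.0,
    ("N", ("chat",)): 0.5,
    ("N", ("poisson",)): 0.5,
    ("V", ("mange",)): 1.0,
}


def tree_probability(tree):
    """Return the product of the probabilities of all rules in the tree.

    A tree is a tuple (label, child, child, ...); a leaf is a plain string.
    """
    if isinstance(tree, str):
        # A word by itself applies no rule, so it contributes a factor of 1.
        return 1.0
    label, children = tree[0], tree[1:]
    # The right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    # Rules missing from the grammar get probability 0.
    p = RULE_PROBS.get((label, rhs), 0.0)
    for child in children:
        p *= tree_probability(child)
    return p


# Hypothetical parse of "le chat mange le poisson".
TREE = (
    "S",
    ("NP", ("D", "le"), ("N", "chat")),
    ("VP", ("V", "mange"), ("NP", ("D", "le"), ("N", "poisson"))),
)

print(tree_probability(TREE))
```

In this sketch each rule probability depends only on the category being expanded. A lexicalized model would additionally condition the rules on head words, which is the kind of enrichment the comparison between lexicalized models and the unlexicalized baseline is about.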
