Scientific report: "Probabilistic Parsing for German using Sister-Head Dependencies"

Amit Dubey
Department of Computational Linguistics, Saarland University
PO Box 15 11 50, 66041 Saarbrücken, Germany
adubey@

Frank Keller
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, UK
keller@

Abstract

We present a probabilistic parsing model for German trained on the Negra treebank. We observe that existing lexicalized parsing models using head-head dependencies, while successful for English, fail to outperform an unlexicalized baseline model for German. Learning curves show that this effect is not due to lack of training data. We propose an alternative model that uses sister-head dependencies instead of head-head dependencies. This model outperforms the baseline, achieving a labeled precision and recall of up to 74%. This indicates that sister-head dependencies are more appropriate for treebanks with very flat structures such as Negra.

1 Introduction

Treebank-based probabilistic parsing has been the subject of intensive research over the past few years, resulting in parsing models that achieve both broad coverage and high parsing accuracy (Collins 1997; Charniak 2000). However, most of the existing models have been developed for English and trained on the Penn Treebank (Marcus et al. 1993), which raises the question whether these models generalize to other languages and to annotation schemes that differ from the Penn Treebank markup.

The present paper addresses this question by proposing a probabilistic parsing model trained on Negra (Skut et al. 1997), a syntactically annotated corpus for German. German has a number of syntactic properties that set it apart from English, and the Negra annotation scheme differs in important respects from the Penn Treebank markup. While Negra has been used to build probabilistic chunkers (Becker and Frank 2002; Skut and Brants 1998), the research reported in this paper is the first attempt to develop a probabilistic full
