Scientific report: "Dependency Grammar Induction via Bitext Projection Constraints"

Dependency Grammar Induction via Bitext Projection Constraints

Kuzman Ganchev, Jennifer Gillenwater, and Ben Taskar
Department of Computer and Information Science
University of Pennsylvania, Philadelphia, PA, USA
{kuzman,jengi,taskar}@seas.upenn.edu

Abstract

Broad-coverage annotated treebanks necessary to train parsers do not exist for many resource-poor languages. The wide availability of parallel text and accurate parsers in English has opened up the possibility of grammar induction through partial transfer across bitext. We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source language parser (English) to constrain the space of possible target trees. Unlike previous approaches, our framework does not require full projected parses, allowing partial, approximate transfer through linear expectation constraints on the space of distributions over trees. We consider several types of constraints that range from generic dependency conservation to language-specific annotation rules for auxiliary verb analysis. We evaluate our approach on Bulgarian and Spanish CoNLL shared task data and show that we consistently outperform unsupervised methods and can outperform supervised learning for limited training data.

1 Introduction

For English and a handful of other languages, there are large, well-annotated corpora with a variety of linguistic information ranging from named entity to discourse structure.
Unfortunately, for the vast majority of languages, very few linguistic resources are available. This situation is likely to persist because of the expense of creating annotated corpora that require linguistic expertise (Abeillé, 2003). On the other hand, parallel corpora between many resource-poor languages and resource-rich languages are ample, motivating recent interest in transferring linguistic resources from one language to another via parallel text. For example, several early works (Yarowsky and Ngai, 2001; Yarowsky et al.
