Scientific report: "Better Automatic Treebank Conversion Using A Feature-Based Approach"

Muhua Zhu, Jingbo Zhu, Minghan Hu
Natural Language Processing Lab., Northeastern University, China
zhumuhua@ zhujingbo@ huminghan@

Abstract

For the task of automatic treebank conversion, this paper presents a feature-based approach which encodes bracketing structures in a treebank into features to guide the conversion of this treebank to a different standard. Experiments on two Chinese treebanks show that our approach improves conversion accuracy over a strong baseline.

1 Introduction

In the field of syntactic parsing, research effort has been devoted to the task of automatically converting a treebank (the source treebank) to fit a different standard, which is exhibited by another treebank (the target treebank). Treebank conversion is desirable primarily because source-style and target-style annotations exist for non-overlapping text samples, so that a larger target-style treebank can be obtained through such conversion. Hereafter, source and target treebanks are referred to as heterogeneous treebanks due to their different annotation standards. In this paper we focus on the scenario of conversion between phrase-structure heterogeneous treebanks (Wang et al., 1994; Zhu and Zhu, 2010).

Due to the availability of annotation in a source treebank, it is natural to use such annotation to guide treebank conversion. The motivating idea is illustrated in Fig. 1, which depicts a sentence annotated with the standards of the Tsinghua Chinese Treebank (TCT) (Zhou, 1996) and the Penn Chinese Treebank (CTB) (Xue et al., 2002), respectively.
Suppose that the conversion is in the direction from the TCT-style parse (left side) to the CTB-style parse (right side). The constituents vp [will surrender], dj [enemy will surrender], and np [intelligence experts] in the TCT-style parse strongly suggest that the resulting CTB-style parse should also bracket these words as constituents. Zhu and Zhu (2010) show the effectiveness of using ...
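
The idea of letting source-side brackets guide the target parse can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the feature set used in the paper: the tree encoding, the function names `constituent_spans` and `bracket_features`, the feature names, the root label `zj`, and the verb `believe` linking the two phrases are all hypothetical. Only the vp, dj, and np constituents come from the example above.

```python
from typing import Dict, List, Optional, Tuple, Union

Tree = Union[str, tuple]  # a leaf word, or (label, child, child, ...)
Span = Tuple[int, int]    # half-open word-index interval [start, end)


def constituent_spans(tree: Tree, start: int = 0,
                      spans: Optional[Dict[Span, str]] = None) -> Tuple[int, Dict[Span, str]]:
    """Collect {span: label} for every constituent; returns (end_index, spans)."""
    if spans is None:
        spans = {}
    if isinstance(tree, str):  # a leaf covers exactly one word
        return start + 1, spans
    label, children = tree[0], tree[1:]
    end = start
    for child in children:
        end, _ = constituent_spans(child, end, spans)
    spans[(start, end)] = label
    return end, spans


def bracket_features(candidate: Span, source_spans: Dict[Span, str]) -> List[str]:
    """Describe how a candidate target span relates to the source brackets."""
    start, end = candidate
    # Feature 1: does the source parse bracket exactly these words, and as what?
    label = source_spans.get(candidate)
    feats = [f"src_bracket={label if label else 'NONE'}"]
    # Feature 2: does the candidate partially overlap (cross) any source bracket?
    crosses = any(s < start < e < end or start < s < end < e
                  for (s, e) in source_spans)
    feats.append(f"src_crossing={crosses}")
    return feats


# Hypothetical TCT-style parse over English glosses (word indices 0..5).
# The root label "zj" and the verb "believe" are assumed for illustration.
TCT_TREE = ("zj",
            ("np", "intelligence", "experts"),
            ("vp", "believe",
             ("dj", "enemy",
              ("vp", "will", "surrender"))))

_, SOURCE_SPANS = constituent_spans(TCT_TREE)

print(bracket_features((4, 6), SOURCE_SPANS))  # "will surrender"
print(bracket_features((1, 3), SOURCE_SPANS))  # "experts believe"
```

A candidate span that coincides with a source bracket, such as "will surrender", receives a positive matching feature, while a span that cuts across a source bracket, such as "experts believe", is flagged as crossing. A target-side parser could weight such features when scoring candidate constituents.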
