TAILIEUCHUNG - Báo cáo khoa học: "Parsing the WSJ using CCG and Log-Linear Models"

This paper describes and evaluates log-linear parsing models for Combinatory Categorial Grammar (CCG). A parallel implementation of the L - BFGS optimisation algorithm is described, which runs on a Beowulf cluster allowing the complete Penn Treebank to be used for estimation. We also develop a new efficient parsing algorithm for CCG which maximises expected recall of dependencies. We compare models which use all CCG derivations, including nonstandard derivations, with normal-form models. The performances of the two models are comparable and the results are competitive with existing wide-coverage CCG parsers. . | Parsing the WSJ using CCG and Log-Linear Models Stephen Clark School of Informatics University of Edinburgh 2 Buccleuch Place Edinburgh UK James R. Curran School of Information Technologies University of Sydney NSW 2006 Australia james@ Abstract This paper describes and evaluates log-linear parsing models for Combinatory Categorial Grammar CCG . A parallel implementation of the L-BFGS optimisation algorithm is described which runs on a Beowulf cluster allowing the complete Penn Treebank to be used for estimation. We also develop a new efficient parsing algorithm for CCG which maximises expected recall of dependencies. We compare models which use all CCG derivations including nonstandard derivations with normal-form models. The performances of the two models are comparable and the results are competitive with existing wide-coverage CCG parsers. 1 Introduction A number of statistical parsing models have recently been developed for Combinatory Categorial Grammar ccg Steedman 2000 and used in parsers applied to the WSJ Penn Treebank Clark et al. 2002 Hockenmaier and Steedman 2002 Hockenmaier 2003b . In Clark and Curran 2003 we argued for the use of log-linear parsing models for CCG. However estimating a log-linear model for a wide-coverage CCG grammar is very computationally expensive. Following Miyao and Tsujii 2002 we showed how the estimation can be performed efficiently by applying the inside-outside algorithm to a packed chart. We also showed how the complete WSJ Penn Treebank can be used for training by developing a parallel version of Generalised Iterative Scaling gis to perform the estimation. This paper significantly extends our earlier work in a number of ways. First we evaluate a number of log-linear models obtaining results which are competitive with the state-of-the-art for CCG parsing. We also compare log-linear models which use all CCG derivations including non-standard derivations with normal-form models. Second we .

TAILIEUCHUNG - Chia sẻ tài liệu không giới hạn
Địa chỉ : 444 Hoang Hoa Tham, Hanoi, Viet Nam
Website : tailieuchung.com
Email : tailieuchung20@gmail.com
Tailieuchung.com là thư viện tài liệu trực tuyến, nơi chia sẽ trao đổi hàng triệu tài liệu như luận văn đồ án, sách, giáo trình, đề thi.
Chúng tôi không chịu trách nhiệm liên quan đến các vấn đề bản quyền nội dung tài liệu được thành viên tự nguyện đăng tải lên, nếu phát hiện thấy tài liệu xấu hoặc tài liệu có bản quyền xin hãy email cho chúng tôi.
Đã phát hiện trình chặn quảng cáo AdBlock
Trang web này phụ thuộc vào doanh thu từ số lần hiển thị quảng cáo để tồn tại. Vui lòng tắt trình chặn quảng cáo của bạn hoặc tạm dừng tính năng chặn quảng cáo cho trang web này.