Scientific report: "Rebanking CCGbank for improved NP interpretation"

Matthew Honnibal and James R. Curran
School of Information Technologies, University of Sydney, NSW 2006, Australia
mhonn james @

Johan Bos
University of Groningen, The Netherlands
bos@

Abstract

Once released, treebanks tend to remain unchanged despite any shortcomings in their depth of linguistic analysis or coverage of specific phenomena. Instead, separate resources are created to address such problems. In this paper we show how to improve the quality of a treebank by integrating resources and implementing improved analyses for specific constructions. We demonstrate this rebanking process by creating an updated version of CCGbank that includes the predicate-argument structure of both verbs and nouns, baseNP brackets, verb-particle constructions, and restrictive and non-restrictive nominal modifiers; and evaluate the impact of these changes on a statistical parser.

1 Introduction

Progress in natural language processing relies on direct comparison on shared data, discouraging improvements to the evaluation data. This means that we often spend years competing to reproduce partially incorrect annotations. It also encourages us to approach related problems as discrete tasks when a new data set that adds deeper information establishes a new, incompatible evaluation.

Direct comparison has been central to progress in statistical parsing, but it has also caused problems. Treebanking is a difficult engineering task: coverage, cost, consistency and granularity are all competing concerns that must be balanced against each other when the annotation scheme is developed. The difficulty of the task means that we ought to view treebanking as an ongoing process akin to grammar development, such as the many years of work on the ERG (Flickinger, 2000).

This paper demonstrates how a treebank can be rebanked to incorporate novel analyses and information from existing resources. We chose to work on CCGbank Hockenmaier and
