Open-Domain Semantic Role Labeling by Modeling Word Spans

Fei Huang
Temple University
1805 N. Broad St., Wachman Hall 318

Alexander Yates
Temple University
1805 N. Broad St., Wachman Hall 303A
yates@

Abstract

Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domain of the training data. Semantic role labeling techniques are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than their performance on newswire text. We investigate techniques for building open-domain semantic role labeling systems that approach the ideal of a train-once, use-anywhere system. We leverage recently developed techniques for learning representations of text using latent-variable language models, and extend these techniques to ones that provide the kinds of features that are useful for semantic role labeling. In experiments, our novel system reduces error by 16% relative to the previous state of the art on out-of-domain text.

1 Introduction

In recent semantic role labeling (SRL) competitions, such as the shared tasks of CoNLL 2005 and CoNLL 2008, supervised SRL systems have been trained on newswire text and then tested on both an in-domain test set (Wall Street Journal text) and an out-of-domain test set (fiction).
All systems tested on these datasets to date have exhibited a significant drop-off in performance on the out-of-domain tests, often performing 15% worse or more on the fiction test sets. Yet the baseline from CoNLL 2005 suggests that the fiction texts are actually easier than the newswire texts. Such observations expose a weakness of current supervised natural language processing (NLP) technology for SRL: systems learn to identify semantic roles for the subset of language contained in the training data, but are not yet good at generalizing to language that has not been seen before. We aim to build
