Scientific report: "Analysis and Repair of Name Tagger Errors"

Heng Ji, Ralph Grishman
Department of Computer Science, New York University, New York, NY 10003, USA
hengji@ grishman@

Abstract

Name tagging is a critical early stage in many natural language processing pipelines. In this paper we analyze the types of errors produced by a tagger, distinguishing name classification and various types of name identification errors. We present a joint inference model to improve Chinese name tagging by incorporating feedback from subsequent stages in an information extraction pipeline: name structure parsing, cross-document coreference, semantic relation extraction and event extraction. We show through examples and performance measurement how different stages can correct different types of errors. The resulting accuracy approaches that of individual human annotators.

1 Introduction

High-performance named entity (NE) tagging is crucial in many natural language processing tasks, such as information extraction and machine translation. In traditional pipelined system architectures, NE tagging is one of the first steps in the pipeline. NE errors adversely affect subsequent stages, and error rates are often compounded by later stages. However, Roth and Yi (2002, 2004) and our recent work have focused on incorporating richer linguistic analysis, using the feedback from later stages to improve name taggers.
We expanded our last year's model (Ji and Grishman, 2005), which used the results of coreference analysis and relation extraction, by adding feedback from more information extraction components (name structure parsing, cross-document coreference and event extraction) to incrementally re-rank the multiple hypotheses from a baseline name tagger. While together these components produced a further improvement on last year's model, our goal in this paper is to look behind the overall performance figures in order to understand how these varied components contribute to the improvement and compare the remaining ...
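
The idea of incrementally re-ranking a baseline tagger's multiple hypotheses with feedback from later stages can be illustrated with a minimal Python sketch. This is not the authors' model: the `Hypothesis` class, the `rerank` function, the additive scoring with a single `weight`, and the toy consistency stage are all hypothetical names and simplifications introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# One candidate tagging of a sentence: a tuple of (name string, entity type) pairs.
Tagging = Tuple[Tuple[str, str], ...]


@dataclass
class Hypothesis:
    tagging: Tagging
    score: float  # baseline tagger confidence (hypothetical scale)


# Hypothetical interface: a later pipeline stage maps a tagging to a bonus or penalty.
FeedbackStage = Callable[[Tagging], float]


def rerank(
    hypotheses: Sequence[Hypothesis],
    stages: Sequence[FeedbackStage],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Apply feedback stages one at a time, re-sorting after each stage."""
    ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    for stage in stages:
        # Assumption: feedback is simply added to the running score.
        ranked = [
            Hypothesis(h.tagging, h.score + weight * stage(h.tagging))
            for h in ranked
        ]
        ranked.sort(key=lambda h: h.score, reverse=True)
    return ranked


def toy_consistency_stage(known_types: Dict[str, str]) -> FeedbackStage:
    """Toy stand-in for a later stage: reward names whose type agrees with
    evidence gathered elsewhere (for example, other mentions), penalize the rest."""
    def stage(tagging: Tagging) -> float:
        return sum(
            1.0 if known_types[name] == etype else -1.0
            for name, etype in tagging
            if name in known_types
        )
    return stage


# Two competing hypotheses for the same mention; the baseline prefers PER.
candidates = [
    Hypothesis((("Washington", "PER"),), 0.6),
    Hypothesis((("Washington", "GPE"),), 0.4),
]
stage = toy_consistency_stage({"Washington": "GPE"})
best = rerank(candidates, [stage], weight=0.5)[0]
print(best.tagging)
```

In this toy run the feedback stage overturns the baseline's preference, which mirrors the kind of name classification error a later stage might repair. A real system would learn how to combine the stages rather than use a fixed additive weight.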
