Scientific report: "Fast Full Parsing by Linear-Chain Conditional Random Fields"

Yoshimasa Tsuruoka, Jun'ichi Tsujii, Sophia Ananiadou
School of Computer Science, University of Manchester, UK
National Centre for Text Mining (NaCTeM), UK
Department of Computer Science, University of Tokyo, Japan

Abstract

This paper presents a chunking-based discriminative approach to full parsing. We convert the task of full parsing into a series of chunking tasks and apply a conditional random field (CRF) model to each level of chunking. The probability of an entire parse tree is computed as the product of the probabilities of individual chunking results. The parsing is performed in a bottom-up manner and the best derivation is efficiently obtained by using a depth-first search algorithm. Experimental results demonstrate that this simple parsing framework produces a fast and reasonably accurate parser.

1 Introduction

Full parsing analyzes the phrase structure of a sentence and provides useful input for many kinds of high-level natural language processing such as summarization (Knight and Marcu, 2000), pronoun resolution (Yang et al., 2006), and information extraction (Miyao et al., 2008). One of the major obstacles that discourage the use of full parsing in large-scale natural language processing applications is its computational cost. For example, the MEDLINE corpus, a collection of abstracts of biomedical papers, consists of 70 million sentences and would require more than two years of processing time if the parser needs one second to process a sentence.

Generative models based on lexicalized PCFGs enjoyed great success as the machine learning framework for full parsing (Collins, 1999; Charniak, 2000), but recently discriminative models have attracted more attention due to their superior accuracy (Charniak and Johnson, 2005; Huang, 2008) and adaptability to new grammars and languages (Buchholz and Marsi, 2006). A traditional approach to discriminative full parsing is to ...
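
The scoring and search described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the names `TOY_TABLE`, `toy_chunker`, and `best_derivation` are hypothetical, the probabilities are invented, and a lookup table stands in for the per-level CRF chunker. The sketch assumes that every chunking step shortens the sequence, so the search always terminates. It shows only the two ideas stated in the abstract: a derivation's probability is the product of its chunking probabilities, and a depth-first search finds the best derivation.

```python
from typing import Dict, List, Tuple

Sequence = Tuple[str, ...]
# One chunking hypothesis: the reduced sequence and its probability.
Hypothesis = Tuple[Sequence, float]

# Hypothetical stand-in for a trained chunker: for each input sequence,
# a list of candidate chunkings with invented probabilities.
TOY_TABLE: Dict[Sequence, List[Hypothesis]] = {
    ("DT", "NN", "VBD", "DT", "NN"): [
        (("NP", "VBD", "DT", "NN"), 0.6),
        (("NP", "VBD", "NP"), 0.4),
    ],
    ("NP", "VBD", "DT", "NN"): [(("NP", "VBD", "NP"), 0.5)],
    ("NP", "VBD", "NP"): [(("NP", "VP"), 0.9)],
    ("NP", "VP"): [(("S",), 1.0)],
}


def toy_chunker(seq: Sequence) -> List[Hypothesis]:
    """Return candidate chunkings for seq (empty list if none are known)."""
    return TOY_TABLE.get(seq, [])


def best_derivation(seq, chunker, root="S"):
    """Depth-first search for the most probable series of chunking steps."""
    best_prob = 0.0
    best_steps: List[Sequence] = []

    def dfs(current: Sequence, prob: float, steps: List[Sequence]) -> None:
        nonlocal best_prob, best_steps
        if current == (root,):
            if prob > best_prob:
                best_prob, best_steps = prob, list(steps)
            return
        # Most probable hypotheses first, so a complete derivation is
        # found early and can prune the remaining branches.
        for nxt, p in sorted(chunker(current), key=lambda h: -h[1]):
            new_prob = prob * p  # tree probability = product over levels
            if new_prob <= best_prob:
                break  # later hypotheses are no more probable than this one
            steps.append(nxt)
            dfs(nxt, new_prob, steps)
            steps.pop()

    dfs(tuple(seq), 1.0, [])
    return best_prob, best_steps


prob, steps = best_derivation(["DT", "NN", "VBD", "DT", "NN"], toy_chunker)
print(prob, steps)
```

In this toy table the locally most probable first step (0.6) leads to a derivation with probability 0.6 × 0.5 × 0.9 × 1.0 = 0.27, while the less probable first step (0.4) leads to 0.4 × 0.9 × 1.0 = 0.36. The search therefore returns the second derivation, which is why a search over derivations is used here rather than a greedy choice at each level. Pruning is safe because every probability is at most 1, so extending a partial derivation can never increase its score.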
