Scientific report: "A Fast, Accurate Deterministic Parser for Chinese"

Mengqiu Wang, Kenji Sagae, Teruko Mitamura
Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Abstract

We present a novel classifier-based deterministic parser for Chinese constituency parsing. Our parser computes parse trees bottom-up in one pass and uses classifiers to make shift-reduce decisions. Trained and evaluated on the standard training and test sets, our best model (using stacked classifiers) runs in linear time and has labeled precision and recall above 88% using gold-standard part-of-speech tags, surpassing the best published results. Our SVM parser is 2-13 times faster than state-of-the-art parsers, while producing more accurate results. Our Maxent and DTree parsers run at speeds 40-270 times faster than state-of-the-art parsers, but with 5-6% losses in accuracy.

1 Introduction and Background

Syntactic parsing is one of the most fundamental tasks in Natural Language Processing (NLP). In recent years, Chinese syntactic parsing has also received a lot of attention in the NLP community, especially since the release of large collections of annotated data such as the Penn Chinese Treebank (Xue et al., 2005). Corpus-based parsing techniques that are successful for English have been applied extensively to Chinese.
Traditional statistical approaches build models that assign probabilities to every possible parse tree for a sentence. Techniques such as dynamic programming, beam search, and best-first search are then employed to find the parse tree with the highest probability. The massively ambiguous nature of wide-coverage statistical parsing, coupled with cubic-time or worse algorithms, makes this approach too slow for many practical applications. Deterministic parsing has emerged as an attractive alternative to probabilistic parsing, offering accuracy just below the state of the art in syntactic analysis of English, but running in linear time Sagae and Lavie
