Scientific report: "Statistical Decision-Tree Models for Parsing*"

David M. Magerman
Bolt Beranek and Newman Inc.
70 Fawcett Street, Room 15 148
Cambridge, MA 02138, USA

Abstract

Syntactic natural language parsers have shown themselves to be inadequate for processing highly-ambiguous large-vocabulary text, as is evidenced by their poor performance on domains like the Wall Street Journal, and by the movement away from parsing-based approaches to text processing in general. In this paper, I describe SPATTER, a statistical parser based on decision-tree learning techniques which constructs a complete parse for every sentence and achieves accuracy rates far better than any published result. This work is based on the following premises: (1) grammars are too complex and detailed to develop manually for most interesting domains; (2) parsing models must rely heavily on lexical and contextual information to analyze sentences accurately; and (3) existing n-gram modeling techniques are inadequate for parsing models. In experiments comparing SPATTER with IBM's computer manuals parser, SPATTER significantly outperforms the grammar-based parser. Evaluating SPATTER against the Penn Treebank Wall Street Journal corpus using the PARSEVAL measures, SPATTER achieves 86% precision, 86% recall, and [...] crossing brackets per sentence for sentences of 40 words or less, and 91% precision, 90% recall, and [...] crossing brackets for sentences between 10 and 20 words in length.
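The PARSEVAL measures named in the abstract can be illustrated with a minimal sketch. This is not the scoring code used in the paper. It assumes that a parse is represented as a list of labeled word spans, that labels are compared when matching constituents, and that a predicted constituent counts as one crossing bracket if it overlaps any gold constituent without nesting. The names `Span`, `crosses`, and `parseval` are hypothetical and chosen only for this illustration.

```python
from typing import Iterable, NamedTuple, Tuple


class Span(NamedTuple):
    """A constituent: a label over words [start, end). Hypothetical representation."""
    label: str
    start: int  # index of the first word (inclusive)
    end: int    # index just past the last word (exclusive)


def crosses(a: Span, b: Span) -> bool:
    """True if the two spans overlap but neither contains the other."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


def parseval(predicted: Iterable[Span], gold: Iterable[Span]) -> Tuple[float, float, int]:
    """Return (precision, recall, crossing brackets) for one sentence.

    Assumption: constituents are compared as a set of (label, start, end)
    triples, so duplicates within a parse are ignored.
    """
    pred_set, gold_set = set(predicted), set(gold)
    matched = len(pred_set & gold_set)
    precision = matched / len(pred_set) if pred_set else 0.0
    recall = matched / len(gold_set) if gold_set else 0.0
    # Count predicted constituents that cross at least one gold constituent.
    crossing = sum(1 for p in pred_set if any(crosses(p, g) for g in gold_set))
    return precision, recall, crossing


# Toy five-word sentence; the spans are invented for illustration.
gold = [Span("S", 0, 5), Span("NP", 0, 2), Span("VP", 2, 5), Span("NP", 3, 5)]
predicted = [Span("S", 0, 5), Span("NP", 0, 2), Span("X", 1, 3), Span("NP", 3, 5)]
print(parseval(predicted, gold))
```

In this toy case three of the four predicted constituents match the gold parse, and the span `X` over words 1 to 3 crosses the gold `NP` over words 0 to 2. Corpus-level figures like those in the abstract would aggregate such per-sentence counts. The exact aggregation used in the paper is not stated in this excerpt.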
* This work was sponsored by the Advanced Research Projects Agency, contract DABT63-94-C-0062. It does not reflect the position or the policy of the U.S. Government, and no official endorsement should be inferred. Thanks to the members of the IBM Speech Recognition Group for their significant contributions to this work.

1 Introduction

Parsing a natural language sentence can be viewed as making a sequence of disambiguation decisions: determining the part-of-speech of the words, choosing between possible constituent structures, and selecting labels for the ...
