TAILIEUCHUNG - Scientific report: "A Cascaded Finite-State Parser for Syntactic Analysis of Swedish"

Proceedings of EACL 99

A Cascaded Finite-State Parser for Syntactic Analysis of Swedish

Dimitrios Kokkinakis and Sofie Johansson Kokkinakis
Department of Swedish, Språkdata, Box 200, SE-405 30, Göteborg University, Göteborg, Sweden
svedk svesj @

Abstract

This report describes the development of a parsing system for written Swedish and is focused on a grammar, the main component of the system, semi-automatically extracted from corpora. A cascaded, finite-state algorithm is applied to the grammar, in which the input contains coarse-grained semantic class information, and the output produced reflects not only the syntactic structure of the input but grammatical functions as well. The grammar has been tested on a variety of random samples of different text genres, achieving precision and recall of [...] and [...] respectively, and an average crossing rate of [...], when evaluated against manually disambiguated, annotated texts.

1 Introduction

This report describes a parsing system for fast and accurate analysis of large bodies of written Swedish. The grammar has been implemented in a modular fashion as finite-state cascaded machines, henceforth called Cass-SWE, a name adopted from the parser used, "Cascaded analysis of syntactic structure" (Abney, 1996).
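To make the idea of a finite-state cascade concrete, the following is a minimal sketch in Python. It is not the Cass-SWE grammar or the Cass parser: the tag set (DET, ADJ, NOUN, VERB, PREP), the three levels, the patterns, and the helper names (`apply_level`, `parse`, `bracket`) are all assumptions chosen for illustration, and the sketch ignores semantic classes and grammatical functions. Each level scans the category sequence left to right, replaces the longest match at each position with a phrase node, and passes unmatched elements through unchanged to the next level.

```python
import re

# Hypothetical toy grammar: one list of (label, pattern) pairs per cascade level.
# Every category in a pattern is followed by a single space, so a match always
# ends on a category boundary.
LEVELS = [
    [("NP", re.compile(r"(DET )?(ADJ )*NOUN "))],
    [("PP", re.compile(r"PREP NP "))],
    [("VP", re.compile(r"VERB (NP )?(PP )*"))],
]


def apply_level(nodes, patterns):
    """Run one level: group the longest match at each position into a phrase."""
    out, i = [], 0
    while i < len(nodes):
        # Encode the remaining categories as a string such as "DET ADJ NOUN ".
        rest = "".join(cat + " " for cat, _ in nodes[i:])
        best = None
        for label, rx in patterns:
            m = rx.match(rest)
            if m and m.end() > 0:
                n = m.group(0).count(" ")  # number of nodes covered
                if best is None or n > best[1]:
                    best = (label, n)
        if best:
            label, n = best
            out.append((label, nodes[i:i + n]))
            i += n
        else:
            out.append(nodes[i])  # no match: pass the node through unchanged
            i += 1
    return out


def parse(tagged):
    """Apply every level in order to a list of (word, tag) pairs."""
    nodes = [(tag, word) for word, tag in tagged]
    for patterns in LEVELS:
        nodes = apply_level(nodes, patterns)
    return nodes


def bracket(node):
    """Render a node as a labelled bracketing."""
    cat, body = node
    if isinstance(body, str):
        return body
    return "[" + cat + " " + " ".join(bracket(child) for child in body) + "]"


sentence = [("den", "DET"), ("gamla", "ADJ"), ("hunden", "NOUN"),
            ("sover", "VERB"), ("i", "PREP"), ("parken", "NOUN")]
result = " ".join(bracket(node) for node in parse(sentence))
print(result)  # [NP den gamla hunden] [VP sover [PP i [NP parken]]]
```

Because each level only sees the output of the level below it, the patterns stay small and no level needs recursion; that modularity is the property the cascaded design relies on.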
Cass-SWE operates on part-of-speech annotated texts and is coupled with a pre-processing mechanism which distinguishes thousands of phrasal verbs, idioms, and multi-word expressions. Cass-SWE is designed in such a way that semantic information provided by named-entity (NE) identification software is taken into consideration, and grammatical functions are extracted heuristically using finite-state transducers. The grammar has been manually acquired from open-source texts by observing legitimately adjacent part-of-speech chains, and how and which function words signal boundaries between phrasal constituents and clauses.

2 Background: Cascaded Finite-State Automata

Finite-state technology has had a great impact on a variety of Natural
