TAILIEUCHUNG - Scientific report: "Grammatical Error Correction with Alternating Structure Optimization"

Daniel Dahlmeier¹ and Hwee Tou Ng¹,²
¹NUS Graduate School for Integrative Sciences and Engineering
²Department of Computer Science
National University of Singapore
danielhe nght @

Abstract

We present a novel approach to grammatical error correction based on Alternating Structure Optimization. As part of our work, we introduce the NUS Corpus of Learner English (NUCLE), a fully annotated one-million-word corpus of learner English available for research purposes. We conduct an extensive evaluation for article and preposition errors using various feature sets. Our experiments show that our approach outperforms two baselines trained on non-learner text and learner text, respectively. Our approach also outperforms two commercial grammar checking software packages.

1 Introduction

Grammatical error correction (GEC) has been recognized as an interesting as well as commercially attractive problem in natural language processing (NLP), in particular for learners of English as a foreign or second language (EFL/ESL). Despite the growing interest, research has been hindered by the lack of a large annotated corpus of learner text that is available for research purposes. As a result, the standard approach to GEC has been to train an off-the-shelf classifier to re-predict words in non-learner text.
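To make the "re-predict words in non-learner text" idea concrete, here is a minimal sketch restricted to article choice. It is not the Alternating Structure Optimization method of this paper, only an illustration of the standard approach described above: the article a writer actually used in well-formed text serves as the training label, and at correction time a disagreement between the prediction and the learner's article is reported as a suggestion. The feature set (previous and next word), the count-based classifier, and all names (`article_slots`, `CountClassifier`, `suggest_corrections`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

ARTICLES = {"a", "an", "the"}


def article_slots(tokens):
    """Yield (index, features, written_article) for each article in tokens.

    The article itself is kept out of the features so that a model has to
    re-predict it from the surrounding context.
    """
    for i, tok in enumerate(tokens):
        if tok.lower() in ARTICLES:
            prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
            next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
            yield i, (f"prev={prev_word}", f"next={next_word}"), tok.lower()


class CountClassifier:
    """Toy stand-in for an off-the-shelf classifier (assumption).

    It votes with feature/label co-occurrence counts from training text.
    """

    def __init__(self):
        self.counts = defaultdict(Counter)  # feature -> Counter of labels
        self.prior = Counter()              # label -> frequency

    def train(self, sentences):
        # sentences: tokenized, well-formed (non-learner) text
        for tokens in sentences:
            for _, feats, label in article_slots(tokens):
                self.prior[label] += 1
                for feat in feats:
                    self.counts[feat][label] += 1

    def predict(self, feats):
        scores = Counter()
        for feat in feats:
            scores.update(self.counts.get(feat, {}))
        if scores:
            return scores.most_common(1)[0][0]
        # Unseen context: fall back to the most frequent article, if any.
        return self.prior.most_common(1)[0][0] if self.prior else None


def suggest_corrections(model, tokens):
    """Return (index, written, predicted) wherever the model disagrees."""
    suggestions = []
    for i, feats, written in article_slots(tokens):
        predicted = model.predict(feats)
        if predicted is not None and predicted != written:
            suggestions.append((i, written, predicted))
    return suggestions
```

A model trained on a handful of tokenized sentences can then be applied to a learner sentence with `suggest_corrections(model, tokens)`. Because training of this kind sees only well-formed text, the model never observes which errors learners actually make.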
Learning GEC models directly from annotated learner corpora is not well explored, nor are methods that combine learner and non-learner text. Furthermore, the evaluation of GEC has been problematic. Previous work has evaluated either on artificial test instances as a substitute for real learner errors or on proprietary data that is not available to other researchers. As a consequence, existing methods have not been compared on the same test set, leaving it unclear where the current state of the art really is. In this work, we aim to overcome both problems. First, we present a novel approach to GEC based on Alternating .
