Scientific report: "Combining Trigram-based and Feature-based Methods for Context-Sensitive Spelling Correction"

Combining Trigram-based and Feature-based Methods for Context-Sensitive Spelling Correction

Andrew R. Golding and Yves Schabes
Mitsubishi Electric Research Laboratories
201 Broadway, Cambridge, MA 02139
golding schabes @

Abstract

This paper addresses the problem of correcting spelling errors that result in valid, though unintended words (such as peace and piece, or quiet and quite) and also the problem of correcting particular word usage errors (such as amount and number, or among and between). Such corrections require contextual information and are not handled by conventional spelling programs such as Unix spell. First, we introduce a method called Trigrams that uses part-of-speech trigrams to encode the context. This method uses a small number of parameters compared to previous methods based on word trigrams. However, it is effectively unable to distinguish among words that have the same part of speech. For this case, an alternative feature-based method called Bayes performs better; but Bayes is less effective than Trigrams when the distinction among words depends on syntactic constraints. A hybrid method called Tribayes is then introduced that combines the best of the previous two methods. The improvement in performance of Tribayes over its components is verified experimentally. Tribayes is also compared with the grammar checker in Microsoft Word, and is found to have substantially higher performance.

1 Introduction

Spelling correction has become a very common technology and is often not perceived as a problem where progress can be made. However, conventional spelling checkers, such as Unix spell, are concerned only with spelling errors that result in words that cannot be found in a word list of a given language. One analysis has shown that up to 15% of spelling errors that result from elementary typographical errors (character insertion, deletion, or transposition) yield another valid word in the language (Peterson, 1986). These errors remain undetected by traditional ...
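
The limitation described in the introduction can be illustrated with a minimal sketch. It assumes a toy word list and two hypothetical helpers, `unknown_words` and `transpositions`, none of which come from the paper. It is not the Trigrams, Bayes, or Tribayes method. It only shows why a checker that looks words up in a list cannot see an error that produces another valid word.

```python
# Hypothetical toy word list (an assumption for illustration);
# a real checker would load a full dictionary.
WORD_LIST = {"a", "i", "of", "want", "piece", "peace", "cake", "quiet", "quite"}


def unknown_words(sentence, word_list=WORD_LIST):
    """Return the tokens a pure word-list checker would flag as misspelled."""
    return [w for w in sentence.lower().split() if w not in word_list]


def transpositions(word):
    """Return every string obtained by swapping two adjacent characters."""
    return {
        word[:i] + word[i + 1] + word[i] + word[i + 2:]
        for i in range(len(word) - 1)
    }


# A non-word error is caught by the word list lookup.
print(unknown_words("I want a peice of cake"))

# A real-word error passes unnoticed: "peace" is a valid word.
print(unknown_words("I want a peace of cake"))

# A single transposition of "quiet" that lands on another valid word.
print(sorted(transpositions("quiet") & WORD_LIST))
```

The second call flags nothing even though the sentence is wrong, which is the case the paper targets: deciding between peace and piece requires looking at the surrounding context, not only at the word itself.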
