Scientific report: "Neural Network Recognition of Spelling Errors"

Mark Lewellen
Computational Linguistics
Georgetown University
Washington, DC 20057-1051
m lewel len@

Abstract

One area in which artificial neural networks (ANNs) may strengthen NLP systems is in the identification of words under noisy conditions. In order to achieve this benefit when spelling errors or spelling variants are present, variable-length strings of symbols must be converted to ANN input/output form, that is, fixed-length arrays of numbers. A common view in the neural network community has been that different forms of input/output representations have negligible effect on ANN performance. This paper, however, shows that input/output representations can in fact affect the performance of ANNs in the case of natural language words. Minimum properties for an adequate word representation are proposed, as well as new methods of word representation. To test the hypothesis that word representations significantly affect ANN performance, traditional and new word representations are evaluated for their ability to recognize words in the presence of four types of typographical noise: substitutions, insertions, deletions, and reversals of letters. The results indicate that word representations have a significant effect on ANN performance. Additionally, different types of word representation are shown to perform better on different types of error.
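The two ingredients named in the abstract, a fixed-length numeric encoding of a word and the four kinds of typographical noise, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the positional one-hot scheme, the 26-letter lowercase alphabet, the maximum length of 12, and all function names are hypothetical and are not taken from the paper, which proposes its own representations.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed alphabet: 26 lowercase letters
MAX_LEN = 12                       # assumed maximum word length (hypothetical)


def encode_word(word, max_len=MAX_LEN):
    """Positional one-hot encoding (an illustrative choice, not the paper's).

    Each letter position gets its own block of 26 slots; unused positions
    stay all-zero, so every word maps to a vector of the same length.
    Raises ValueError for over-long words or characters outside ALPHABET.
    """
    if len(word) > max_len:
        raise ValueError("word longer than max_len")
    vec = [0.0] * (max_len * len(ALPHABET))
    for pos, ch in enumerate(word.lower()):
        vec[pos * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
    return vec


def substitute(word, i, ch):
    """Replace the letter at index i with ch."""
    return word[:i] + ch + word[i + 1:]


def insert(word, i, ch):
    """Insert ch before index i."""
    return word[:i] + ch + word[i:]


def delete(word, i):
    """Remove the letter at index i."""
    return word[:i] + word[i + 1:]


def reverse(word, i):
    """Swap the adjacent letters at indices i and i + 1."""
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]
```

Under this particular encoding, a substitution changes only one 26-slot block, while an insertion or deletion shifts every later letter into a different block. That asymmetry is one simple way to see why the choice of representation could matter for different error types.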
Introduction

ANNs are a promising technology for NLP, since a strength of ANNs is their "common sense" ability to make reasonable decisions even when faced with novel data, while a weakness of NLP applications is brittleness in the face of ambiguous situations. One area in which much ambiguity occurs is the identification of words: words may be misspelled, they may have valid spelling variants, and they can be homographic. Robust word recognition capabilities can improve applications which involve text understanding, and are the central component of applications such as spell-checking and name .
