Scientific report: "Machine Aided Error-Correction Environment for Korean Morphological Analysis and Part-of-Speech Tagging"

Junsik Park, Jung-Goo Kang, Wook Hur and Key-Sun Choi
Center for Artificial Intelligence Research
Korea Advanced Institute of Science and Technology
Taejon 305-701, Korea
{jspark, jgkang, hook, kschoi}@world.kaist.ac.kr

Abstract

Statistical methods require very large, high-quality corpora, but building a large and error-free annotated corpus is very difficult. This paper proposes an efficient method for constructing a part-of-speech tagged corpus. A rule-based error correction method is proposed to find and correct errors semi-automatically using user-defined rules. We also use the user's correction log to reflect feedback. Experiments were carried out to show the efficiency of the error correction process of this workbench. The result shows that about % of tagging errors can be corrected.

1 Introduction

A corpus-based natural language processing system needs a large corpus (Choi et al., 1994), and that corpus must also be of high quality. The general process of building an annotated corpus is shown in Figure 1. There are some difficulties in producing an annotated corpus. First, the number of entries in the dictionary is not very large. Second, it is difficult to correct the errors produced by automatic tagging.

Manual error correction is costly, and errors may still remain after the correction process. There has also been research on automatic correction, but it suffered from side effects of the automatic corrections (Lee and Lee, 1996; Lim et al., 1996). In this paper, we integrate morphological analysis and tagging and provide an interactive user interface. The user gives feedback to resolve ambiguities in the analysis. To reduce cost and improve correctness, we have developed an environment that makes it possible to find errors and correct them. In the following section, related work is described. In .
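The semi-automatic correction loop described above (user-defined rules flag likely tagging errors, the user confirms or rejects each proposal, and every decision is logged as feedback) could be sketched roughly as follows. This is only an illustration under stated assumptions: the excerpt does not give the rule format, so the `Rule` fields, the `Workbench` class, the romanized tokens and the tag names (`PRON`, `VERB`, `PART`) are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical user-defined rule: retag `morph` from `wrong` to `right`
    when the previous token carries the tag `prev_tag`."""
    morph: str
    wrong: str
    right: str
    prev_tag: str


@dataclass
class Workbench:
    rules: list
    # Correction log: one (rule, position, accepted) entry per user decision.
    log: list = field(default_factory=list)

    def candidates(self, sent):
        """Yield (position, rule) for every token that a rule flags."""
        for i, (morph, tag) in enumerate(sent):
            prev = sent[i - 1][1] if i > 0 else "<s>"
            for r in self.rules:
                if (morph, tag, prev) == (r.morph, r.wrong, r.prev_tag):
                    yield i, r

    def correct(self, sent, confirm):
        """Apply a flagged correction only if `confirm` (the user) accepts it."""
        out = list(sent)
        for i, r in list(self.candidates(sent)):
            accepted = confirm(out, i, r)
            self.log.append((r, i, accepted))
            if accepted:
                out[i] = (out[i][0], r.right)
        return out

    def acceptance_rate(self, rule):
        """Feedback from the log: fraction of proposals the user accepted."""
        decisions = [ok for r, _, ok in self.log if r == rule]
        return sum(decisions) / len(decisions) if decisions else None


# Invented example: a token wrongly tagged VERB after a pronoun.
rule = Rule(morph="neun", wrong="VERB", right="PART", prev_tag="PRON")
wb = Workbench(rules=[rule])
sent = [("na", "PRON"), ("neun", "VERB")]
fixed = wb.correct(sent, confirm=lambda s, i, r: True)  # user accepts
```

Keeping the user in the loop through `confirm` is one way to avoid the side effects of fully automatic correction mentioned above, and the logged acceptance rate is one simple form the feedback could take, for example to rank or retire rules.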
