Scientific report: "Using Machine Learning to Maintain Rule-based Named-Entity Recognition and Classification Systems"

Georgios Petasis†, Frantz Vichot, Francis Wolinski, Georgios Paliouras†, Vangelis Karkaletsis† and Constantine D. Spyropoulos†

† Institute of Informatics and Telecommunications, National Centre for Scientific Research "Demokritos", 15310 Ag. Paraskevi, Athens, Greece
Informatique-CDC, 4 rue Berthollet, 94114 Arcueil, France
E-mail: petasis, paliourg, vangelis, costass @ ...

Abstract

This paper presents a method that assists in maintaining a rule-based named-entity recognition and classification system. The underlying idea is to use a separate system, constructed with the use of machine learning, to monitor the performance of the rule-based system. The training data for the second system is generated with the use of the rule-based system, thus avoiding the need for manual tagging. The disagreement of the two systems acts as a signal for updating the rule-based system. The generality of the approach is illustrated by applying it to large corpora in two different languages: Greek and French. The results are very encouraging, showing that this alternative use of machine learning can assist significantly in the maintenance of rule-based systems.
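The monitoring loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two hand-written rules, the title and suffix lists, and the per-token majority-vote learner are toy stand-ins, not the rule-based systems or the learning algorithm used by the authors. The sketch only shows the three steps: the rule-based system tags the training text, a second system is trained on those tags, and tokens where the two systems disagree on new text are reported for review.

```python
from collections import Counter, defaultdict

# Hypothetical trigger lists standing in for a real rule base.
PERSON_TITLES = {"Mr.", "Mrs.", "Dr."}
ORG_SUFFIXES = {"Inc.", "Ltd.", "Corp."}


def rule_based_tag(tokens):
    """Toy rule-based tagger: label each token PERSON, ORG or O."""
    labels = []
    for i, tok in enumerate(tokens):
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok[:1].isupper() and prev_tok in PERSON_TITLES:
            labels.append("PERSON")  # capitalised word after a title
        elif tok[:1].isupper() and next_tok in ORG_SUFFIXES:
            labels.append("ORG")     # capitalised word before a company suffix
        else:
            labels.append("O")
    return labels


def train(sentences):
    """Train the monitoring system on labels produced by the rules.

    No manual tagging is involved. The learner here is an assumed
    stand-in: it remembers the most frequent label of each token.
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        for tok, label in zip(tokens, rule_based_tag(tokens)):
            counts[tok][label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def disagreements(model, tokens):
    """Return (position, token, rule_label, learned_label) where the systems differ."""
    flagged = []
    for i, (tok, rule_label) in enumerate(zip(tokens, rule_based_tag(tokens))):
        learned_label = model.get(tok, "O")  # unseen tokens default to O
        if learned_label != rule_label:
            flagged.append((i, tok, rule_label, learned_label))
    return flagged


if __name__ == "__main__":
    training_corpus = [
        ["Mr.", "Smith", "joined", "Acme", "Inc."],
        ["Dr.", "Smith", "left", "Acme", "Inc."],
    ]
    model = train(training_corpus)
    # New text in which the names appear without their usual triggers.
    for item in disagreements(model, ["Smith", "praised", "Acme", "yesterday"]):
        print(item)
```

In this toy run the rules miss "Smith" and "Acme" because the title and the company suffix are absent, while the learned system has memorised both names. The resulting disagreements are the kind of signal that would prompt a maintainer to inspect and possibly update the rules.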
1 Introduction

Machine learning has recently been proposed as a promising solution to a major problem in language engineering: the construction of lexical resources. Most real-world language engineering systems make use of a variety of lexical resources, in particular grammars and lexicons. The use of general-purpose resources is ineffective, since most applications use a specialised vocabulary that is not supported by general-purpose lexicons and grammars. For this reason, significant effort is currently being put into the construction of generic tools that can quickly adapt to a particular thematic domain. The adaptation of these tools mainly involves the adaptation of domain-specific ...
