Scientific report: "A Bootstrapping Approach to Named Entity Classification Using Successive Learners"

Cheng Niu, Wei Li, Jihong Ding, Rohini K. Srihari
Cymfony Inc., 600 Essjay Road, Williamsville, NY 14221, USA
{cniu, wei, jding, rohini}@…

Abstract

This paper presents a new bootstrapping approach to named entity (NE) classification. The approach requires only a few common noun or pronoun seeds that correspond to the concept for the target NE type, e.g. he/she/man/woman for PERSON NE. The entire bootstrapping procedure is implemented as training two successive learners: (i) a decision list is used to learn the parsing-based, high-precision NE rules; (ii) a Hidden Markov Model is then trained to learn string sequence-based NE patterns. The second learner uses the training corpus automatically tagged by the first learner. The resulting NE system approaches supervised NE performance for some NE types. The system also demonstrates intuitive support for tagging user-defined NE types. The differences between this approach and co-training-based NE bootstrapping are also discussed.

1 Introduction

Named Entity (NE) tagging is a fundamental task for natural language processing and information extraction. An NE tagger recognizes and classifies text chunks that represent various proper names, time, or numerical expressions. Seven types of named entities are defined in the Message Understanding Conference (MUC) standards, namely PERSON (PER), ORGANIZATION (ORG), LOCATION (LOC), TIME, DATE, MONEY, and PERCENT [1] (MUC-7 1998).
There is considerable research on NE tagging using different techniques. These include systems based on handcrafted rules (Krupka 1998) as well as systems using supervised machine learning, such as the Hidden Markov Model (HMM) (Bikel 1997) and the Maximum Entropy Model (Borthwick 1998). The state-of-the-art rule-based systems and supervised learning systems can reach near-human performance for NE tagging.

[1] This paper only focuses on classifying proper names. Time and numerical NEs are not yet explored using this method.
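
The first of the two successive learners described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a parser has already reduced an unlabeled corpus to (head word, context) pairs, and it ranks contexts by a smoothed log-odds score, which is one common way to build a decision list. The ORGANIZATION seeds, the context labels such as `subj_of:said`, the corpus, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Seed words per NE type. The PERSON seeds come from the abstract; the
# ORGANIZATION seeds are an assumption added so rules can be compared.
SEEDS = {
    "PERSON": {"he", "she", "man", "woman"},
    "ORGANIZATION": {"company", "firm"},
}

# Hypothetical (head word, parsing-based context) pairs standing in for
# parser output over an unlabeled corpus.
CORPUS = [
    ("he", "subj_of:said"), ("she", "subj_of:said"), ("man", "subj_of:said"),
    ("company", "subj_of:acquired"), ("firm", "subj_of:acquired"),
    ("he", "obj_of:saw"), ("company", "obj_of:saw"),
    ("Smith", "subj_of:said"), ("Acme", "subj_of:acquired"),
    ("Paris", "obj_of:saw"),
]


def seed_type(head):
    """Return the NE type whose seed set contains this head word, else None."""
    for ne_type, words in SEEDS.items():
        if head.lower() in words:
            return ne_type
    return None


def learn_decision_list(corpus, min_count=2, smoothing=0.1):
    """Rank contexts by how reliably they co-occur with seeds of one type."""
    counts = defaultdict(Counter)  # context -> seed hits per NE type
    for head, context in corpus:
        ne_type = seed_type(head)
        if ne_type is not None:
            counts[context][ne_type] += 1
    rules = []
    for context, by_type in counts.items():
        total = sum(by_type.values())
        for ne_type, hits in by_type.items():
            # Smoothed log-odds of this type against all other seed types.
            score = math.log((hits + smoothing) / (total - hits + smoothing))
            if hits >= min_count and score > 0:
                rules.append((score, context, ne_type))
    return sorted(rules, reverse=True)  # most reliable rule first


def tag_proper_names(corpus, rules):
    """Apply the first matching rule to capitalized, non-seed head words."""
    tagged = {}
    for head, context in corpus:
        if seed_type(head) is not None or not head[:1].isupper():
            continue
        for _score, rule_context, ne_type in rules:
            if rule_context == context:
                tagged.setdefault(head, ne_type)
                break
    return tagged


RULES = learn_decision_list(CORPUS)
AUTO_TAGGED = tag_proper_names(CORPUS, RULES)
```

In this sketch `AUTO_TAGGED` plays the role of the automatically tagged corpus that the second learner would train on. The second stage (an HMM over string sequences in the paper) is not sketched here.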
