Scientific report: "Exploiting Named Entity Taggers in a Second Language"

Thamar Solorio
Computer Science Department
National Institute of Astrophysics, Optics and Electronics
Luis Enrique Erro 1, Tonantzintla, Puebla, 72840, Mexico

Abstract

In this work we present a method for Named Entity Recognition (NER). Our method does not rely on complex linguistic resources, and apart from a hand-coded system, we do not use any language-dependent tools. The only information we use is automatically extracted from the documents, without human intervention. Moreover, the method performs well even without the use of the hand-coded system. The experimental results are very encouraging. Our approach even outperformed the hand-coded system on NER in Spanish, and it achieved high accuracies in Portuguese.

1 Introduction

Given the usefulness of Named Entities (NEs) in many natural language processing tasks, there has been a lot of work aimed at developing accurate named entity extractors (Borthwick, 1999; Velardi et al., 2001; Arevalo et al., 2002; Zhou and Su, 2002; Florian, 2002; Zhang and Johnson, 2003). Most approaches, however, have very low portability: they are designed to perform well over a particular collection or type of document, and their accuracies will drop considerably when used in different domains. The reason for this is that many NE extractor systems rely heavily on complex linguistic resources, which are typically hand-coded, for example regular expressions, grammars, gazetteers and the like.
Adapting a system of this nature to a different collection or language requires a lot of human effort, involving tasks such as rewriting the grammars, acquiring new dictionaries, searching trigger words and so on. Even if one has the human resources and the time needed for the adaptation process, there are languages that lack the linguistic resources needed; for instance, dictionaries are available in electronic form for only a handful of languages. We believe that by using machine learning techniques we can adapt an existing hand .
