Scientific report: "Named Entity Recognition without Gazetteers"

Proceedings of EACL 99

Andrei Mikheev, Marc Moens and Claire Grover
HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, UK
mikheev@

Abstract

It is often claimed that Named Entity recognition systems need extensive gazetteers (lists of names of people, organisations, locations, and other named entities). Indeed, the compilation of such gazetteers is sometimes mentioned as a bottleneck in the design of Named Entity recognition systems. We report on a Named Entity recognition system which combines rule-based grammars with statistical (maximum entropy) models. We report on the system's performance with gazetteers of different types and different sizes, using test material from the MUC-7 competition. We show that, for the text type and task of this competition, it is sufficient to use relatively small gazetteers of well-known names, rather than large gazetteers of low-frequency names. We conclude with observations about the domain independence of the competition and of our experiments.

1 Introduction

Named Entity recognition involves processing a text and identifying certain occurrences of words or expressions as belonging to particular categories of Named Entities (NE). NE recognition software serves as an important preprocessing tool for tasks such as information extraction, information retrieval and other text processing applications.
What counts as a Named Entity depends on the application that makes use of the annotations. One such application is document retrieval or automated document forwarding: documents annotated with NE information can be searched more accurately than raw text. For example, NE annotation allows you to search for all texts that mention the company "Philip Morris", ignoring documents about a possibly unrelated person by the same name. Or you can have all documents forwarded to you about a ...

Author footnote from the first page: Now also at Harlequin Ltd. (Edinburgh office).
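The "Philip Morris" retrieval example can be sketched in a few lines of Python. This is only an illustrative sketch, not the system described in the paper: the `Document` class, the `search_raw` and `search_by_entity` helpers, the sample texts and the category labels `ORGANIZATION` and `PERSON` are all assumptions introduced here, and the annotations are written by hand rather than produced by an NE recogniser.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Document:
    """A text plus hypothetical NE annotations (surface string, category)."""
    doc_id: str
    text: str
    entities: List[Tuple[str, str]] = field(default_factory=list)


def search_raw(docs: List[Document], name: str) -> List[str]:
    """Plain string search: cannot tell a company from a person."""
    return [d.doc_id for d in docs if name in d.text]


def search_by_entity(docs: List[Document], name: str, category: str) -> List[str]:
    """Search over annotations: match both the name and its NE category."""
    return [d.doc_id for d in docs if (name, category) in d.entities]


# Hand-annotated sample documents (invented for illustration).
docs = [
    Document(
        "d1",
        "Philip Morris reported higher quarterly earnings.",
        [("Philip Morris", "ORGANIZATION")],
    ),
    Document(
        "d2",
        "Philip Morris, a retired teacher, spoke at the meeting.",
        [("Philip Morris", "PERSON")],
    ),
]

raw_hits = search_raw(docs, "Philip Morris")
company_hits = search_by_entity(docs, "Philip Morris", "ORGANIZATION")
print(raw_hits)      # both documents match the raw string
print(company_hits)  # only the document annotated as a company
```

The raw search returns both documents, while the annotation-based search keeps only the one in which the name was tagged as a company. That is the gain in search accuracy the paragraph describes.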
