Scientific report: "Locating noun phrases with finite state transducers"

Locating noun phrases with finite state transducers

Jean Senellart
LADL (Laboratoire d'automatique documentaire et linguistique)
Université Paris VII, 2 place Jussieu, 75251 PARIS Cedex 05
email: senella@

Abstract

We present a method for constructing, maintaining and consulting a database of proper nouns. We describe noun phrases composed of a proper noun and/or a description of a human occupation. They are formalized by finite state transducers (FST) and large coverage dictionaries and are applied to a corpus of newspapers. We take into account synonymy and hyperonymy. This first stage of our parsing procedure has a high degree of accuracy. We show how we can handle requests such as: 'Find all newspaper articles in a general corpus mentioning the French prime minister', or 'How is Mr. X referred to in the corpus; what have been his different occupations throughout the period over which our corpus extends?' In the first case, non-trivial occurrences of noun phrases are located, that is, phrases not containing words present in the request but either synonyms or proper nouns relevant to the request. The results of the search are far better than those obtained by a keyword-based engine. Most answers are correct, except in some cases of homonymy (where a human reader would also fail without more context). Also, the treatment of people having several different occupations is not fully resolved.
We have built for French a library of about one thousand such FSTs, and English FSTs are under construction. The same method can be used to locate and propose new proper nouns, simply by replacing given proper names in the same FSTs by variables.

1 Introduction

Information Retrieval in full texts is one of the challenges of the coming years. Web engines attempt to select, among the millions of existing Web sites, those corresponding to some input request. Newspaper archives are another example: there are several gigabytes of news on electronic support, and the size is increasing every day.
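As a rough illustration of the approach summarized in the abstract (noun phrases described by finite state transducers over dictionary classes, with proper names replaced by variables), here is a minimal Python sketch. It is not the author's implementation: the transition table, the tiny `NATIONALITIES` dictionary, the `<NAT>` and `<PN>` class symbols, and the `locate` function are all hypothetical, and English toy data stands in for the French resources the paper describes.

```python
# Toy dictionary of nationality adjectives (hypothetical entries).
NATIONALITIES = {"french", "british"}

def symbols(token):
    """Input symbols a token can stand for: its literal form, then its classes."""
    low = token.lower()
    syms = [low]
    if low in NATIONALITIES:
        syms.append("<NAT>")      # dictionary class
    if token[:1].isupper():
        syms.append("<PN>")       # variable: any capitalized token may be a proper noun
    return syms

# (state, input symbol) -> (next state, emitted output or None).
# An output is (key, value); value None means "copy the input token".
TRANSITIONS = {
    (0, "<NAT>"): (1, ("nationality", None)),
    (1, "prime"): (2, None),
    (2, "minister"): (3, ("occupation", "prime_minister")),
    (1, "premier"): (3, ("occupation", "prime_minister")),  # synonym, same output
    (3, "<PN>"): (4, ("name", None)),
    (4, "<PN>"): (4, ("name", None)),
}
FINAL_STATES = {3, 4}

def fold(pairs):
    """Merge emitted (key, value) pairs, joining repeated keys with a space."""
    record = {}
    for key, value in pairs:
        record[key] = f"{record[key]} {value}" if key in record else value
    return record

def locate(tokens):
    """Return (start, end, record) for the longest match at each position."""
    matches, i = [], 0
    while i < len(tokens):
        state, out, best, j = 0, [], None, i
        while j < len(tokens):
            step = next(
                (TRANSITIONS[(state, s)] for s in symbols(tokens[j])
                 if (state, s) in TRANSITIONS),
                None,
            )
            if step is None:
                break
            state, emitted = step
            if emitted is not None:
                key, value = emitted
                out.append((key, tokens[j] if value is None else value))
            j += 1
            if state in FINAL_STATES:
                best = (i, j, fold(out))  # remember the longest accepted span
        if best is not None:
            matches.append(best)
            i = best[1]
        else:
            i += 1
    return matches
```

Calling `locate("The French premier Jean Dupont met the British prime minister".split())` yields two matches. Both carry the same `occupation` value although only one contains the words "prime minister", and the first also proposes "Jean Dupont" (an invented name) as a candidate proper noun through the `<PN>` variable.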
