Scientific report: "Automatically Generated Customizable Online Dictionaries"

Eniko Heja
Dept. of Language Technology
Research Institute for Linguistics, HAS
. 360 H-1394 Budapest
eheja@

David Takacs
Dept. of Language Technology
Research Institute for Linguistics, HAS
. 360 H-1394 Budapest
takdavid@

Abstract

The aim of our software presentation is to demonstrate that corpus-driven bilingual dictionaries generated fully by automatic means are suitable for human use. Previous experiments have proven that bilingual lexicons can be created by applying word alignment to parallel corpora. Such an approach, especially its corpus-driven nature, yields several advantages over more traditional approaches. Most importantly, automatically obtained translation probabilities can guarantee that the most frequently used translations come first within an entry. However, the proposed technique also has to face some difficulties. In particular, the scarce availability of parallel texts for medium-density languages limits the size of the resulting dictionary. Our objective is to design and implement a dictionary-building workflow and a query system that exploit the additional benefits of the method and overcome its disadvantages.

1 Introduction

The work presented here is part of the pilot project EFNILEX¹, launched in 2008.
The project objective was to investigate to what extent language technology (LT) methods are capable of supporting the creation of bilingual dictionaries. The need for such dictionaries arises specifically in the case of lesser-used languages, where low demand means that it does not pay off for publishers to invest in producing dictionaries. The targeted size of the dictionaries is between 15 000 and 25 000 entries. Since completely automatic generation of clean bilingual resources is not possible according to the state of the art, we have decided to provide lexicographers with bilingual ...

¹ EFNILEX is financed by EFNIL.
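
The abstract's claim that translation probabilities put the most frequently used translations first within an entry can be illustrated with a minimal sketch. It assumes that word alignment has already produced a list of (source word, target word) links. The function name `build_entries`, the relative-frequency estimate, and the toy Hungarian-English links are all hypothetical illustrations, not the authors' actual workflow or data.

```python
from collections import Counter, defaultdict


def build_entries(aligned_pairs):
    """Group word-alignment links into dictionary entries.

    For every source word, the target words are ranked by the
    relative frequency P(target | source), estimated from link counts.
    This is an illustrative estimate, not the method of the paper.
    """
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1

    entries = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        # most_common() yields targets from most to least frequent,
        # so the most probable translation comes first in the entry.
        entries[source] = [
            (target, count / total)
            for target, count in target_counts.most_common()
        ]
    return entries


# Hypothetical alignment links (invented toy data).
pairs = [
    ("kutya", "dog"),
    ("kutya", "dog"),
    ("kutya", "hound"),
    ("kutya", "dog"),
    ("ház", "house"),
]

entries = build_entries(pairs)
for headword, translations in entries.items():
    print(headword, translations)
```

In this toy run, "dog" is listed before "hound" under "kutya" because it accounts for three of the four links. With a real parallel corpus, the same ranking would reflect how often each translation is actually used.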
