Scientific report: "Improving Statistical Natural Language Translation with Categories and Rules"

Franz Josef Och and Hans Weber
FAU Erlangen - Computer Science Institute, IMMD VIII - Artificial Intelligence
Am Weichselgarten 9, 91058 Erlangen - Tennenlohe, Germany
{faoch,weber}@immd8.informatik.uni-erlangen.de

Abstract

This paper describes an all-level approach to statistical natural language translation (SNLT). Without any predefined knowledge, the system learns a statistical translation lexicon (STL), word classes (WCs) and translation rules (TRs) from a parallel corpus, thereby producing a generalized form of a word alignment (WA). The translation process itself is realized as a beam search. In our method, example-based techniques enter an overall statistical approach, leading to about 50 percent correctly translated sentences when applied to the very difficult English-German Verbmobil spontaneous speech corpus.

1 Introduction

In SNLT the transfer itself is realized as a maximization process of the form

$$\mathrm{Trans}(d) = \operatorname*{argmax}_{e} P(e \mid d) \qquad (1)$$

Here d is a given source language (SL) sentence which has to be translated into a target language (TL) sentence e. In order to model the distribution P(e | d), all approaches in SNLT use a divide-and-conquer strategy of approximating P(e | d) by a combination of simpler models.
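The maximization in equation 1 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper name `translate`, the explicit candidate list, and the probability values are invented toy data, not taken from the paper. A real system would score candidates with a combination of simpler models and search the candidate space (for example with a beam search) rather than enumerate a lookup table.

```python
from typing import Dict, List

def translate(d: str, candidates: List[str],
              model: Dict[str, Dict[str, float]]) -> str:
    """Return the candidate e with the highest toy probability P(e | d)."""
    scores = model.get(d, {})
    # argmax over e: candidates missing from the table score 0.0
    return max(candidates, key=lambda e: scores.get(e, 0.0))

# Hypothetical toy distribution P(e | d) for one source sentence d.
toy_model = {
    "guten Morgen": {
        "good morning": 0.7,
        "good day": 0.2,
        "morning good": 0.1,
    }
}

candidates = list(toy_model["guten Morgen"])
best = translate("guten Morgen", candidates, toy_model)
print(best)
```

The sketch only shows the decision rule itself; how P(e | d) is approximated is the subject of the rest of the paper.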
The problem is to reduce the number of parameters sufficiently while still ending up with a model able to describe the linguistic facts of natural language translation. The work presented here uses two approximations for P(e | d). One approximation is used to estimate the relevant parameters in training, while a modified formula is used for decoding translations. In detail, we make the following modifications with respect to approaches published in the last decade:

1. A refined distance weight for the STL probabilities is used, which allows for a good modeling of the effects caused by syntactic phrases.
2. In order to account for collocations, a WA technique is used where one-to-n and n-to-one WAs are allowed.
3. For the .
