Scientific report: "A Comparative Study on Reordering Constraints in Statistical Machine Translation"

Richard Zens and Hermann Ney
Chair of Computer Science VI
RWTH Aachen - University of Technology
zens ney @

Abstract

In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible word reorderings in an appropriate way, we obtain a polynomial-time search algorithm. In this paper, we compare two different reordering constraints, namely the ITG constraints and the IBM constraints. This comparison includes a theoretical discussion of the number of reorderings permitted by each of these constraints. We show a connection between the ITG constraints and the Schröder numbers, which have been known since 1870.

We evaluate these constraints on two tasks: the Verbmobil task and the Canadian Hansards task. The evaluation consists of two parts. First, we check how many of the Viterbi alignments of the training corpus satisfy each of these constraints. Second, we restrict the search to each of these constraints and compare the resulting translation hypotheses. The experiments will show that the baseline ITG constraints are not sufficient on the Canadian Hansards task. Therefore, we present an extension to the ITG constraints. These extended ITG constraints increase the alignment coverage from about 87% to 96%.

1 Introduction

In statistical machine translation, we are given a source language ('French') sentence $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into a target language ('English') sentence $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible target language sentences, we will choose the sentence with the highest probability:

\begin{align}
\hat{e}_1^I &= \operatorname*{argmax}_{e_1^I} \left\{ Pr(e_1^I \mid f_1^J) \right\} \tag{1} \\
            &= \operatorname*{argmax}_{e_1^I} \left\{ Pr(e_1^I) \cdot Pr(f_1^J \mid e_1^I) \right\} \tag{2}
\end{align}

The decomposition into two knowledge sources in Eq. 2 is the so-called source-channel approach to statistical machine translation (Brown et al., 1990). It allows an independent modeling of target .
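The connection between the ITG constraints and the Schröder numbers stated in the abstract can be checked by brute force for short sentences. The sketch below is not the paper's derivation, which is not part of this excerpt. It assumes the usual characterization of ITG reorderings: a permutation is permitted if it can be split recursively into two adjacent blocks that are kept in straight or inverted order. The function names `is_itg_permutation`, `count_itg_permutations`, and `large_schroeder` are hypothetical, and the recurrence used is the standard one for the large Schröder numbers.

```python
from itertools import permutations


def is_itg_permutation(perm):
    """Return True if perm can be built by recursive binary splits.

    Assumption: a reordering satisfies the ITG constraints if it can be
    cut into two adjacent blocks whose values do not interleave (straight
    or inverted order), with both blocks satisfying the same condition.
    """
    n = len(perm)
    if n <= 1:
        return True
    for k in range(1, n):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)
        inverted = min(left) > max(right)
        if straight or inverted:
            if is_itg_permutation(left) and is_itg_permutation(right):
                return True
    return False


def count_itg_permutations(n):
    """Count the permutations of n positions that pass the check above."""
    return sum(is_itg_permutation(p) for p in permutations(range(n)))


def large_schroeder(m):
    """Return [S_0, ..., S_m] using (n+1) S_n = 3(2n-1) S_{n-1} - (n-2) S_{n-2}."""
    values = [1, 2]
    for n in range(2, m + 1):
        values.append((3 * (2 * n - 1) * values[n - 1] - (n - 2) * values[n - 2]) // (n + 1))
    return values[: m + 1]


if __name__ == "__main__":
    counts = [count_itg_permutations(n) for n in range(1, 7)]
    print(counts)
    print(large_schroeder(5))
```

Under these assumptions, the number of permitted reorderings of n words equals the large Schröder number with index n - 1. Enumerating all permutations is only feasible for small n, so this is a sanity check rather than a search procedure.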
