Scientific report: "A Polynomial-Time Algorithm for Statistical Machine Translation"

Dekai Wu
HKUST, Department of Computer Science, University of Science and Technology, Clear Water Bay, Hong Kong

Abstract

We introduce a polynomial-time algorithm for statistical machine translation. This algorithm can be used in place of the expensive, slow best-first search strategies in current statistical translation architectures. The approach employs the stochastic bracketing transduction grammar (SBTG) model we recently introduced to replace earlier word alignment channel models, while retaining a bigram language model. The new algorithm, in our experience, yields major speed improvement with no significant loss of accuracy.

1 Motivation

The statistical translation model introduced by IBM (Brown et al., 1990) views translation as a noisy channel process. Assume, as we do throughout this paper, that the input language is Chinese and the task is to translate into English. The underlying generative model, shown in Figure 1, contains a stochastic English sentence generator whose output is "corrupted" by the translation channel to produce Chinese sentences. In the IBM system, the language model employs simple n-grams, while the translation model employs several sets of parameters as discussed below. Estimation of the parameters has been described elsewhere (Brown et al., 1993). Translation is performed in the reverse direction from generation, as usual for recognition under generative models.
For each Chinese sentence c that is to be translated, the system must attempt to find the English sentence e* such that

\begin{align}
e^* &= \arg\max_e \Pr(e \mid c) \tag{1} \\
    &= \arg\max_e \Pr(c \mid e)\,\Pr(e) \tag{2}
\end{align}

In the IBM model, the search for the optimal e* is performed using a best-first heuristic "stack search" similar to A* methods. One of the primary obstacles to making the statistical translation approach practical is the slow speed of translation as performed in A* fashion. This price is paid for the robustness that is obtained by using very flexible language and .
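
The noisy-channel decision rule of equations (1) and (2) can be illustrated with a minimal Python sketch. It is not the SBTG algorithm or the IBM stack search: it simply scores an explicit, hand-written list of candidate English sentences by log Pr(c | e) + log Pr(e). The probability tables, the romanized Chinese tokens, the `FLOOR` smoothing constant, and the function names are all hypothetical and chosen only for illustration. The channel model is a simplified IBM Model 1 style stand-in, and the language model is a bigram model, as in the text.

```python
import math

# Hypothetical toy parameters, invented purely for illustration.
# Bigram language model: Pr(word | previous word); "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "he"): 0.5, ("he", "eats"): 0.4, ("eats", "rice"): 0.3,
    ("<s>", "rice"): 0.1, ("rice", "eats"): 0.05, ("eats", "he"): 0.05,
}
# Word translation table t(chinese_word | english_word), a stand-in channel model.
TRANS = {("ta", "he"): 0.9, ("chi", "eats"): 0.8, ("fan", "rice"): 0.7}
FLOOR = 1e-6  # assumed floor probability for unseen events


def log_lm(english):
    """Return log Pr(e) under the toy bigram language model."""
    total, prev = 0.0, "<s>"
    for word in english:
        total += math.log(BIGRAM.get((prev, word), FLOOR))
        prev = word
    return total


def log_channel(chinese, english):
    """Return log Pr(c | e), up to a constant, in IBM Model 1 style:
    each Chinese word is generated by a uniformly chosen English word."""
    total = 0.0
    for c_word in chinese:
        mass = sum(TRANS.get((c_word, e_word), FLOOR) for e_word in english)
        total += math.log(mass / len(english))
    return total


def decode(chinese, candidates):
    """Equation (2): pick the candidate e maximizing Pr(c | e) * Pr(e)."""
    return max(candidates, key=lambda e: log_channel(chinese, e) + log_lm(e))


chinese = ["ta", "chi", "fan"]
candidates = [["he", "eats", "rice"], ["rice", "eats", "he"], ["he", "eats"]]
best = decode(chinese, candidates)
print(" ".join(best))
```

In this toy setting the first two candidates receive the same channel score, because the stand-in channel model ignores word order, so the bigram language model alone decides between them. Enumerating candidates explicitly is only feasible for a toy example; a real system must search the space of English sentences, which is where the cost of the best-first search arises.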
