Scientific report: "Paraphrase Lattice for Statistical Machine Translation"

Paraphrase Lattice for Statistical Machine Translation

Takashi Onishi and Masao Utiyama and Eiichiro Sumita
Language Translation Group, MASTAR Project
National Institute of Information and Communications Technology
3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, JAPAN
mutiyama @

Abstract

Lattice decoding in statistical machine translation (SMT) is useful in speech translation and in the translation of German because it can handle input ambiguities such as speech recognition ambiguities and German word segmentation ambiguities. We show that lattice decoding is also useful for handling input variations. Given an input sentence, we build a lattice which represents paraphrases of the input sentence. We call this a paraphrase lattice. Then, we give the paraphrase lattice as an input to the lattice decoder. The decoder selects the best path for decoding. Using these paraphrase lattices as inputs, we obtained significant gains in BLEU scores for IWSLT and Europarl datasets.

1 Introduction

Lattice decoding in SMT is useful in speech translation and in the translation of German (Bertoldi et al., 2007; Dyer, 2009). In speech translation, by using lattices that represent not only the 1-best result but also other possibilities of speech recognition, we can take into account the ambiguities of speech recognition. Thus, the translation quality for lattice inputs is better than the quality for 1-best inputs.

In this paper, we show that lattice decoding is also useful for handling input variations. Input variations are differences between input texts that have the same meaning. For example, "Is there a beauty salon?" and "Is there a beauty parlor?" have the same meaning, with the variation lying in "beauty salon" versus "beauty parlor". Since these variations are frequently found in natural language texts, a mismatch between the expressions in source sentences and the expressions in the training corpus leads to a decrease in translation quality. Therefore we propose a novel method that .
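
To make the idea of a paraphrase lattice concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a hypothetical paraphrase table (a plain dictionary mapping a source phrase to alternative phrasings) and represents the lattice as a dictionary of labeled edges between integer nodes. The function names `build_paraphrase_lattice` and `enumerate_paths` are illustrative only. A lattice decoder would typically also need scores on the edges in order to select the best path; those are omitted here.

```python
from collections import defaultdict


def build_paraphrase_lattice(tokens, paraphrases):
    """Build a word lattice over nodes 0..len(tokens).

    Returns {from_node: [(to_node, word), ...]}. The original sentence is
    the path 0 -> 1 -> ... -> len(tokens). Each paraphrase of a source
    span adds an alternative path between the span's start and end nodes.
    `paraphrases` is a hypothetical table: {phrase: [alternative, ...]}.
    """
    edges = defaultdict(list)
    # Original sentence: node i --tokens[i]--> node i + 1.
    for i, tok in enumerate(tokens):
        edges[i].append((i + 1, tok))

    next_node = len(tokens) + 1  # ids for intermediate paraphrase nodes
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            phrase = " ".join(tokens[start:end])
            for alternative in paraphrases.get(phrase, []):
                words = alternative.split()
                if not words:
                    continue  # skip empty paraphrases
                prev = start
                for k, word in enumerate(words):
                    if k == len(words) - 1:
                        nxt = end  # last word rejoins the main path
                    else:
                        nxt = next_node
                        next_node += 1
                    edges[prev].append((nxt, word))
                    prev = nxt
    return dict(edges)


def enumerate_paths(edges, start, goal):
    """Yield every word sequence along a path from start to goal."""
    if start == goal:
        yield []
        return
    for nxt, word in edges.get(start, []):
        for rest in enumerate_paths(edges, nxt, goal):
            yield [word] + rest


if __name__ == "__main__":
    sentence = "is there a beauty salon".split()
    table = {"beauty salon": ["beauty parlor"]}  # assumed toy table
    lattice = build_paraphrase_lattice(sentence, table)
    for path in enumerate_paths(lattice, 0, len(sentence)):
        print(" ".join(path))
```

With the toy table above, the lattice contains two paths, one for the original sentence and one for its "beauty parlor" variant. Enumerating all paths is only practical for small examples; a decoder would search the lattice instead.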
