
Variational Decoding for Statistical Machine Translation
Zhifei Li, Jason Eisner, and Sanjeev Khudanpur
Department of Computer Science and Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, USA

Abstract

Statistical models in machine translation exhibit spurious ambiguity. That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations). In principle, the goodness of a string is measured by the total probability of its many derivations. However, finding the best string (e.g., during decoding) is then computationally intractable. Therefore, most systems use a simple Viterbi approximation that measures the goodness of a string using only its most probable derivation. Instead, we develop a variational approximation, which considers all the derivations but still allows tractable decoding. Our particular variational distributions are parameterized as n-gram models. We also analytically show that interpolating these n-gram models for different n is similar to minimum-risk decoding for BLEU (Tromble et al., 2008). Experiments show that our approach improves the state of the art.

1 Introduction

Ambiguity is a central issue in natural language processing.
Many systems try to resolve ambiguities in the input, for example by tagging words with their senses or choosing a particular syntax tree for a sentence. These systems are designed to recover the values of interesting latent variables, such as word senses, syntax trees, or translations, given the observed input. However, some systems resolve too many ambiguities. They recover additional latent variables, so-called nuisance variables, that are not of interest to the user.¹ For example, though machine translation (MT) seeks to output a string, typical MT systems (Koehn et al., 2003; Chiang, 2007) […]

¹These nuisance variables may be annotated in training data, but it is more common for them to be latent even there.
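The gap between the Viterbi approximation and exact decoding described in the abstract can be seen on a tiny invented example (the strings, derivations, and probabilities below are hypothetical, not from the paper): when one string spreads its probability mass over several derivations, the string with the single best derivation need not be the string with the highest total probability.

```python
# Toy sketch of spurious ambiguity: one output string may have several
# derivations, so the best single derivation can belong to a string that
# is NOT the most probable string overall.
from collections import defaultdict

# Hypothetical (string, derivation, probability) triples: string "a" has one
# strong derivation; string "b" splits its mass over three weaker ones.
derivations = [
    ("a", "tree1", 0.30),
    ("b", "tree2", 0.25),
    ("b", "tree3", 0.25),
    ("b", "tree4", 0.20),
]

# Viterbi approximation: score each string by its single best derivation.
viterbi = defaultdict(float)
for string, _, p in derivations:
    viterbi[string] = max(viterbi[string], p)
viterbi_best = max(viterbi, key=viterbi.get)  # picks "a" (0.30)

# Exact decoding: score each string by the total probability of all its
# derivations (intractable in general, trivial on this toy example).
total = defaultdict(float)
for string, _, p in derivations:
    total[string] += p
exact_best = max(total, key=total.get)  # picks "b" (0.70)
```

Here Viterbi decoding returns "a" even though "b" carries more than twice the total probability, which is exactly the failure mode the variational approach is designed to avoid.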
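The variational idea from the abstract, parameterizing the approximating distribution as an n-gram model, can likewise be sketched on invented data. The sketch below (all strings and probabilities are hypothetical; this is a simplification, not the paper's algorithm) fits a bigram model q to a distribution p over strings by collecting expected bigram counts, then decodes with q, which scores a string directly without enumerating its derivations.

```python
# Simplified sketch of a variational n-gram approximation: match expected
# bigram counts under p, then decode with the fitted bigram model q.
from collections import defaultdict

# Toy distribution p over candidate translations (total derivation mass).
p = {
    ("the", "cat", "sat"): 0.5,
    ("the", "cat", "sits"): 0.3,
    ("a", "cat", "sat"): 0.2,
}

# Expected bigram counts under p, with boundary markers.
counts = defaultdict(float)
context = defaultdict(float)
for words, prob in p.items():
    toks = ("<s>",) + words + ("</s>",)
    for u, v in zip(toks, toks[1:]):
        counts[(u, v)] += prob
        context[u] += prob

def q(words):
    """Score a string under the fitted bigram model q."""
    toks = ("<s>",) + words + ("</s>",)
    score = 1.0
    for u, v in zip(toks, toks[1:]):
        score *= counts[(u, v)] / context[u]
    return score

# Decode with q instead of p: tractable, and it aggregates mass over
# locally similar hypotheses rather than trusting one derivation.
best = max(p, key=q)
```

Because q factors over bigrams, decoding with it is tractable even when p is defined over exponentially many derivations; the paper's further step of interpolating such models for several n connects this to minimum-risk decoding for BLEU.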
