Scientific report: "Discriminative Modeling of Extraction Sets for Machine Translation"

John DeNero and Dan Klein
Computer Science Division, University of California, Berkeley

Abstract

We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as improving BLEU score in Chinese-to-English translation experiments.

1 Introduction

In the last decade, the field of statistical machine translation has shifted from generating sentences word by word to systems that recycle whole fragments of training examples, expressed as translation rules. This general paradigm was first pursued using contiguous phrases (Och et al., 1999; Koehn et al., 2003), and has since been generalized to a wide variety of hierarchical and syntactic formalisms.
The training stage of statistical systems focuses primarily on discovering translation rules in parallel corpora. Most systems discover translation rules via a two-stage pipeline: a parallel corpus is aligned at the word level, and then a second procedure extracts fragment-level rules from word-aligned sentence pairs. This paper offers a model-based alternative to phrasal rule extraction, which merges this two-stage pipeline into a single step. We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair.
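The second stage of the conventional pipeline, extracting phrase pairs from a word-aligned sentence pair, can be illustrated with a minimal sketch. It assumes the standard consistency criterion from phrase-based extraction: a pair of spans is kept if it contains at least one alignment link and no link connects a word inside one span to a word outside the other. The function name, the length limit, and the toy alignment are hypothetical and chosen for illustration only; this is not the model proposed in the paper, and it uses brute-force enumeration rather than an efficient algorithm.

```python
def extract_phrase_pairs(links, src_len, tgt_len, max_len=3):
    """Enumerate span pairs consistent with a word alignment.

    links   : set of (source_index, target_index) alignment links
    src_len : number of source words
    tgt_len : number of target words
    max_len : maximum span length on either side (illustrative limit)

    Returns a list of ((i1, i2), (j1, j2)) with inclusive span bounds.
    """
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            for j1 in range(tgt_len):
                for j2 in range(j1, min(j1 + max_len, tgt_len)):
                    # Links with both ends inside the candidate span pair.
                    inside = any(
                        i1 <= i <= i2 and j1 <= j <= j2 for i, j in links
                    )
                    # Links with exactly one end inside: these violate consistency.
                    crossing = any(
                        (i1 <= i <= i2) != (j1 <= j <= j2) for i, j in links
                    )
                    if inside and not crossing:
                        pairs.append(((i1, i2), (j1, j2)))
    return pairs


if __name__ == "__main__":
    # Toy alignment (hypothetical): word 0 aligns monotonically,
    # words 1 and 2 swap order in the target.
    toy_links = {(0, 0), (1, 2), (2, 1)}
    for src_span, tgt_span in extract_phrase_pairs(toy_links, 3, 3):
        print(src_span, tgt_span)
```

The pairs returned for one sentence overlap and nest inside one another, which is the kind of collection the abstract calls an extraction set; in this sketch the set is determined entirely by the word alignment.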
