Discriminative Modeling of Extraction Sets for Machine Translation

John DeNero and Dan Klein
Computer Science Division
University of California, Berkeley
{denero, klein}@cs.berkeley.edu

Abstract

We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as providing up to a 1.4 improvement in BLEU score in Chinese-to-English translation experiments.

1 Introduction

In the last decade, the field of statistical machine translation has shifted from generating sentences word by word to systems that recycle whole fragments of training examples, expressed as translation rules. This general paradigm was first pursued using contiguous phrases (Och et al., 1999; Koehn et al., 2003), and has since been generalized to a wide variety of hierarchical and syntactic formalisms.
The training stage of statistical systems focuses primarily on discovering translation rules in parallel corpora. Most systems discover translation rules via a two-stage pipeline: a parallel corpus is aligned at the word level, and then a second procedure extracts fragment-level rules from word-aligned sentence pairs. This paper offers a model-based alternative to phrasal rule extraction, which merges this two-stage pipeline into a single step. We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model predicts
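
To make the second stage of the conventional pipeline concrete, the following is a minimal sketch of the standard consistency check used to extract phrase pairs from a word-aligned sentence pair. It is an illustration only, not the discriminative model proposed in the paper. The function name `extract_phrase_pairs`, the half-open span convention, and the `max_len` limit are assumptions made for this example. For brevity the sketch pairs each source span only with its minimal target span and does not extend spans over unaligned target words, which full extractors typically do.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)
Span = Tuple[int, int]  # half-open interval [start, end)


def extract_phrase_pairs(
    src_len: int, links: Set[Link], max_len: int = 3
) -> List[Tuple[Span, Span]]:
    """Return phrase pairs consistent with a word alignment (hypothetical helper).

    A pair of spans is consistent when at least one link falls inside it
    and no link connects a word inside one span to a word outside the other.
    """
    pairs = []
    for s_start in range(src_len):
        last_end = min(src_len, s_start + max_len)
        for s_end in range(s_start + 1, last_end + 1):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in links if s_start <= s < s_end]
            if not linked:
                continue  # spans with no links are skipped in this sketch
            t_start, t_end = min(linked), max(linked) + 1
            if t_end - t_start > max_len:
                continue
            # No word in the target span may link outside the source span.
            inside = all(
                s_start <= s < s_end
                for (s, t) in links
                if t_start <= t < t_end
            )
            if inside:
                pairs.append(((s_start, s_end), (t_start, t_end)))
    return pairs


# Toy alignment: source word 0 links to target 0, and source words 1 and 2
# link to target words 2 and 1 (a local reordering).
example_links = {(0, 0), (1, 2), (2, 1)}
for src_span, tgt_span in extract_phrase_pairs(3, example_links):
    print(src_span, tgt_span)
```

On the toy alignment the extracted pairs overlap and nest inside one another, which is the kind of collection the abstract calls an extraction set. The source span covering words 0 and 1 yields no pair, because its minimal target span would include a word linked to source word 2.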

