Scientific report: "Dealing with Spurious Ambiguity in Learning ITG-based Word Alignment"

Shujian Huang
State Key Laboratory for Novel Software Technology, Nanjing University
huangsj@

Stephan Vogel
Language Technologies Institute, Carnegie Mellon University
vogel@

Jiajun Chen
State Key Laboratory for Novel Software Technology, Nanjing University
henjj@

Abstract

Word alignment has an exponentially large search space, which often makes exact inference infeasible. Recent studies have shown that inversion transduction grammars are reasonable constraints for word alignment, and that the constrained space can be searched efficiently using synchronous parsing algorithms. However, spurious ambiguity may occur in synchronous parsing and cause problems in both search efficiency and accuracy. In this paper, we conduct a detailed study of the causes of spurious ambiguity and how it affects parsing and discriminative learning. We also propose a variant of the grammar which eliminates those ambiguities. Our grammar shows advantages over previous grammars in both synthetic and real-world experiments.

1 Introduction

In statistical machine translation, word alignment attempts to find word correspondences in parallel sentence pairs. The search space of word alignment grows exponentially with the length of the source and target sentences, which makes inference for complex models infeasible (Brown et al., 1993).
Recently, inversion transduction grammars (Wu, 1997), namely ITG, have been used to constrain the search space for word alignment (Zhang and Gildea, 2005; Cherry and Lin, 2007; Haghighi et al., 2009; Liu et al., 2010). ITG is a family of grammars in which the right-hand side of each rule is either two nonterminals or a terminal sequence. The most general case of the ITG family is the bracketing transduction grammar (BTG, Figure 1), which has only one ...

A → [AA] | ⟨AA⟩ | e/f | ε/f | e/ε

Figure 1: BTG rules. [AA] denotes a monotone concatenation and ⟨AA⟩ denotes an inverted concatenation.
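
To make the notion of spurious ambiguity concrete, the following minimal Python sketch enumerates every derivation that the two binary BTG rules can build over a short sentence and compares that number with the number of distinct word orderings actually produced. It is an illustration under simplifying assumptions, not the procedure used in the paper: only one-to-one links are modelled (the null rules ε/f and e/ε are ignored), and the names derivations and ambiguity_table are hypothetical.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def derivations(lo, hi):
    """Return one entry per BTG derivation of the source span [lo, hi).

    Each entry is a tuple of source positions in target-side order.
    Assumption: one-to-one links only, no null-aligned words.
    """
    if hi - lo == 1:
        return ((lo,),)  # terminal rule A -> e/f
    results = []
    for mid in range(lo + 1, hi):  # every binary split point of the span
        for left in derivations(lo, mid):
            for right in derivations(mid, hi):
                results.append(left + right)  # A -> [A A], monotone
                results.append(right + left)  # A -> <A A>, inverted
    return tuple(results)


def ambiguity_table(max_len):
    """List (length, number of derivations, number of distinct orderings)."""
    rows = []
    for n in range(1, max_len + 1):
        counts = Counter(derivations(0, n))
        rows.append((n, sum(counts.values()), len(counts)))
    return rows


if __name__ == "__main__":
    for n, n_derivations, n_orderings in ambiguity_table(5):
        print(n, n_derivations, n_orderings)
```

For a three-word sentence the sketch finds 8 derivations but only 6 distinct orderings: the fully monotone ordering, for instance, is derived both as [[AA]A] and as [A[AA]]. The gap widens quickly with sentence length, which is the kind of redundancy a synchronous parser has to deal with.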
