Learning to Translate with Source and Target Syntax

David Chiang
USC Information Sciences Institute
4676 Admiralty Way, Suite 1001
Marina del Rey, CA 90292, USA
chiang@isi.edu

Abstract

Statistical translation models that try to capture the recursive structure of language have been widely adopted over the last few years. These models make use of varying amounts of information from linguistic theory: some use none at all, some use information about the grammar of the target language, some use information about the grammar of the source language. But progress has been slower on translation models that are able to learn the relationship between the grammars of both the source and target language. We discuss the reasons why this has been a challenge, review existing attempts to meet this challenge, and show how some old and new ideas can be combined into a simple approach that uses both source and target syntax for significant improvements in translation accuracy.

1 Introduction

Statistical translation models that use synchronous context-free grammars (SCFGs) or related formalisms to try to capture the recursive structure of language have been widely adopted over the last few years. The simplest of these (Chiang, 2005) make no use of information from syntactic theories or syntactic annotations, whereas others have successfully incorporated syntactic information on the target side (Galley et al., 2004; Galley et al., 2006) or the source side (Liu et al., 2006; Huang et al., 2006).
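To make the notion of an SCFG concrete, the following is a minimal sketch of how a synchronous rule rewrites linked nonterminals on the source and target sides at once. It is an illustration only, not the system described in this paper: the rule set, the rule names (`r1` to `r3`), the single nonterminal label `X`, the toy romanized Chinese vocabulary, and the `realize` helper are all assumptions introduced here.

```python
from typing import Dict, List, Tuple, Union

# A right-hand-side symbol is either a terminal word (str) or an int i,
# which stands for the i-th linked nonterminal of the rule.
Symbol = Union[str, int]
# (left-hand side, labels of linked nonterminals, source RHS, target RHS)
Rule = Tuple[str, Tuple[str, ...], Tuple[Symbol, ...], Tuple[Symbol, ...]]
# A derivation is a rule name plus one sub-derivation per linked nonterminal.
Derivation = Tuple[str, list]

# Hypothetical toy grammar: one reordering rule and two lexical rules.
RULES: Dict[str, Rule] = {
    "r1": ("X", ("X", "X"), (0, "de", 1), ("the", 1, "of", 0)),
    "r2": ("X", (), ("Aozhou",), ("Australia",)),
    "r3": ("X", (), ("shoudu",), ("capital",)),
}


def realize(derivation: Derivation) -> Tuple[List[str], List[str]]:
    """Expand a derivation into its (source words, target words) pair."""
    name, children = derivation
    _lhs, nonterminals, src_rhs, tgt_rhs = RULES[name]
    if len(children) != len(nonterminals):
        raise ValueError(f"rule {name} expects {len(nonterminals)} children")

    expanded = []
    for child, label in zip(children, nonterminals):
        # The child's left-hand side must match the linked nonterminal label.
        if RULES[child[0]][0] != label:
            raise ValueError(f"rule {child[0]} cannot fill a {label} slot")
        expanded.append(realize(child))

    source: List[str] = []
    for sym in src_rhs:
        source.extend(expanded[sym][0] if isinstance(sym, int) else [sym])
    target: List[str] = []
    for sym in tgt_rhs:
        target.extend(expanded[sym][1] if isinstance(sym, int) else [sym])
    return source, target


simple = ("r1", [("r2", []), ("r3", [])])
nested = ("r1", [simple, ("r3", [])])  # r1 applied inside r1: recursion

if __name__ == "__main__":
    for d in (simple, nested):
        src, tgt = realize(d)
        print(" ".join(src), "->", " ".join(tgt))
```

Because each linked nonterminal is expanded by the same sub-derivation on both sides, a single rule such as `r1` can express a reordering that applies recursively at any depth. In this sketch every nonterminal carries the same label `X`, which loosely mirrors grammars that use no syntactic annotation; syntax-based variants would instead attach linguistic categories to one or both sides.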
The next obvious step is toward models that make full use of syntactic information on both sides. But the natural generalization to this setting has been found to underperform phrase-based models (Liu et al., 2009; Ambati and Lavie, 2008), and researchers have begun to explore solutions (Zhang et al., 2008; Liu et al., 2009). In this paper, we explore the reasons why tree-to-tree translation has been challenging, and how source syntax and target syntax might be used together. Drawing on previous successful attempts to relax ...