Incremental Syntactic Language Models for Phrase-based Translation

Lane Schwartz, Air Force Research Laboratory, Wright-Patterson AFB, OH, USA, lane.schwartz@wpafb.af.mil
William Schuler, Ohio State University, Columbus, OH, USA, schuler@ling.ohio-state.edu
Chris Callison-Burch, Johns Hopkins University, Baltimore, MD, USA, ccb@cs.jhu.edu
Stephen Wu, Mayo Clinic, Rochester, MN, USA, wu.stephen@mayo.edu

Abstract

This paper describes a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing. Bottom-up and top-down parsers typically require a completed string as input. This requirement makes it difficult to incorporate them into phrase-based translation, which generates partial hypothesized translations from left to right. Incremental syntactic language models score sentences in a similar left-to-right fashion, and are therefore a good mechanism for incorporating syntax into phrase-based translation. We give a formal definition of one such linear-time syntactic language model, detail its relation to phrase-based decoding, and integrate the model with the Moses phrase-based translation system. We present empirical results on a constrained Urdu-English translation task that demonstrate a significant BLEU score improvement and a large decrease in perplexity.
This research was supported by NSF CAREER PECASE award 0447685, NSF grant IIS-0713448, and the European Commission through the EuroMatrixPlus project. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the sponsors or the United States Air Force. Cleared for public release Case Number .

1 Introduction

Early work in statistical machine translation viewed translation as a noisy channel process comprised of a translation model, which functioned to posit adequate translations of source language words, and a target language model, which guided the fluency of generated target language strings Brown et al.