Scientific report: "Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation"

Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren N. G. Thornton, Jonathan Weese, and Omar F. Zaidan

Center for Language and Speech Processing, Johns Hopkins University
Computational Linguistics and Information Processing Lab, University of Maryland
Human Language Technology and Pattern Recognition Group, RWTH Aachen University
Natural Language Processing Lab, University of Minnesota

Abstract

We describe Joshua (Li et al., 2009a), an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for translation via synchronous context-free grammars (SCFGs): chart parsing, n-gram language model integration, beam and cube pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We also provide a demonstration outline for illustrating the toolkit's features to potential users, whether they be newcomers to the field or power users interested in extending the toolkit.

1 Introduction

Large-scale parsing-based statistical machine translation (e.g., Chiang (2007), Quirk et al. (2005), Galley et al. (2006), and Liu et al. (2006)) has made remarkable progress in the last few years. However, most of the systems mentioned above employ tailor-made, dedicated software that is not open source. This results in a high barrier to entry for other researchers and makes experiments difficult to duplicate and compare. In this paper, we describe Joshua, a Java-based, general-purpose open source toolkit for parsing-based machine translation, serving the same role as Moses (Koehn et al., 2007) does for regular phrase-based machine translation.

This research was supported in part by the Defense Advanced Research Projects Agency's GALE program under Contract No. HR0011-06-2-0001 and the National Science Foundation under
