Scientific report: "Unsupervised Morphology Rivals Supervised Morphology for Arabic MT"



David Stallard, Jacob Devlin, Michael Kayser
BBN Technologies
{stallard,jdevlin,rzbib}@bbn.com

Yoong Keok Lee, Regina Barzilay
CSAIL, Massachusetts Institute of Technology
{yklee,regina}@csail.mit.edu

Abstract

If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English MT system. We apply maximum marginal decoding to the unsupervised analyzer, and show that this yields the best published segmentation accuracy for Arabic, while also making segmentation output more stable. Our approach gives an 18% relative BLEU gain for Levantine dialectal Arabic. Furthermore, it gives higher gains for Modern Standard Arabic (MSA), as measured on NIST MT-08, than does MADA (Habash and Rambow, 2005), a leading supervised MSA segmenter.

1 Introduction

If unsupervised morphological segmenters could approach the effectiveness of supervised ones, they would be a very attractive choice for improving machine translation (MT) performance in low-resource inflected languages. An example of particular current interest is Arabic, whose various colloquial dialects are sufficiently different from Modern Standard Arabic (MSA) in lexicon, orthography, and morphology as to be low-resource languages themselves.
An additional advantage of Arabic for study is the availability of high-quality supervised segmenters for MSA, such as MADA (Habash and Rambow, 2005), for performance comparison. The MT gain for supervised MSA segmenters on dialect establishes a lower bound, which the unsupervised segmenter must exceed if it is to be useful for dialect. And comparing the gain for supervised and unsupervised segmenters on MSA tells us how useful the unsupervised segmenter is.
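The abstract mentions maximum marginal decoding without spelling it out in this excerpt. The sketch below illustrates one common reading of the idea: given several sampled segmentations of the same vocabulary, keep for each word the segmentation that occurs most often across the samples. This is an assumption for illustration, not a description of the authors' implementation; the function name `max_marginal_decode`, the data layout, and the toy transliterated words are all hypothetical.

```python
from collections import Counter


def max_marginal_decode(samples):
    """Pick, for each word, its most frequent segmentation across samples.

    `samples` is a list of dicts, each mapping a word to a tuple of
    morphemes (one dict per sampled analysis of the vocabulary).
    This layout is an illustrative assumption, not the paper's format.
    """
    counts = {}
    for sample in samples:
        for word, segmentation in sample.items():
            # Tally how often each segmentation was proposed for this word.
            counts.setdefault(word, Counter())[tuple(segmentation)] += 1
    # Keep the segmentation with the highest count for each word.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


# Toy example: three sampled analyses that disagree on some words.
samples = [
    {"wktb": ("w", "ktb"), "ktbhm": ("ktb", "hm")},
    {"wktb": ("w", "ktb"), "ktbhm": ("ktbhm",)},
    {"wktb": ("wktb",), "ktbhm": ("ktb", "hm")},
]
decoded = max_marginal_decode(samples)
```

Voting across samples means a single unlucky sample cannot change a word's final segmentation on its own, which fits the abstract's remark that decoding this way makes the output more stable.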

