Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars

Yuan Ding, Martha Palmer
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104, USA
yding mpalmer @

Abstract

Syntax-based statistical machine translation (MT) aims at applying statistical models to structured data. In this paper, we present a syntax-based statistical machine translation system based on a probabilistic synchronous dependency insertion grammar. Synchronous dependency insertion grammars are a version of synchronous grammars defined on dependency trees. We first introduce our approach to inducing such a grammar from parallel corpora. Second, we describe the graphical model for the machine translation task, which can also be viewed as a stochastic tree-to-tree transducer. We introduce a polynomial-time decoding algorithm for the model. We evaluate the outputs of our MT system using the NIST and Bleu automatic MT evaluation software. The results show that our system outperforms the baseline system based on the IBM models in both translation speed and quality.

1 Introduction

Statistical approaches to machine translation, pioneered by Brown et al. (1993), achieved impressive performance by leveraging large amounts of parallel corpora. Such approaches, which are essentially stochastic string-to-string transducers, do not explicitly model natural language syntax or semantics.
In reality, pure statistical systems sometimes suffer from ungrammatical outputs, which are understandable at the phrasal level but sometimes hard to comprehend as a coherent sentence. In recent years, syntax-based statistical machine translation, which aims at applying statistical models to structural data, has begun to emerge. Research advances in natural language parsing, especially broad-coverage parsers trained from treebanks (for example, Collins, 1999), have made it possible to use structural analyses of different languages. Ideally by .
