Improving Statistical Natural Language Translation with Categories and Rules

Franz Josef Och and Hans Weber
FAU Erlangen - Computer Science Institute
IMMD VIII - Artificial Intelligence
Am Weichselgarten 9
91058 Erlangen - Tennenlohe, Germany
{faoch,weber}@immd8.informatik.uni-erlangen.de

Abstract

This paper describes an all-level approach to statistical natural language translation (SNLT). Without any predefined knowledge, the system learns a statistical translation lexicon (STL), word classes (WCs), and translation rules (TRs) from a parallel corpus, thereby producing a generalized form of a word alignment (WA). The translation process itself is realized as a beam search. In our method, example-based techniques enter an overall statistical approach, leading to about 50 percent correctly translated sentences on the very difficult English-German Verbmobil spontaneous speech corpus.

1 Introduction

In SNLT, the transfer itself is realized as a maximization process of the form

Trans(d) = argmax_e P(e | d)    (1)

Here d is a given source language (SL) sentence which has to be translated into a target language (TL) sentence e. In order to model the distribution P(e | d), all approaches in SNLT use a divide-and-conquer strategy of approximating P(e | d) by a combination of simpler models.
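The shape of the argmax in equation (1) can be illustrated with a minimal word-by-word sketch. Note that the lexicon, its probabilities, and the exhaustive enumeration of candidates below are purely hypothetical illustrations, not the paper's model, which decomposes P(e | d) into richer component models and searches with a beam:

```python
import itertools
import math

# Hypothetical toy translation lexicon P(e_word | d_word); the entries and
# probabilities are invented for illustration only.
lexicon = {
    "haus": {"house": 0.8, "home": 0.2},
    "gross": {"big": 0.6, "large": 0.4},
}

def candidates(source_words):
    """Enumerate every word-by-word target candidate e with its log P(e | d)."""
    options = [list(lexicon[w].items()) for w in source_words]
    for combo in itertools.product(*options):
        words, probs = zip(*combo)
        yield list(words), sum(math.log(p) for p in probs)

def translate(source_words):
    """Trans(d) = argmax_e P(e | d), approximated by independent word choices."""
    best_words, _ = max(candidates(source_words), key=lambda c: c[1])
    return best_words

print(translate(["haus", "gross"]))  # -> ['house', 'big']
```

In the paper the maximization is of course not carried out by exhaustive enumeration: the search over target sentences is organized as a beam search, and the sketch only shows the form of the optimization problem being solved.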
The problem is to reduce the number of parameters sufficiently while still ending up with a model able to describe the linguistic facts of natural language translation. The work presented here uses two approximations for P(e | d): one approximation is used to obtain the relevant parameters in training, while a modified formula is used for decoding translations. In detail, we impose the following modifications with respect to approaches published in the last decade:

1. A refined distance weight for the STL probabilities is used, which allows for a good modeling of the effects caused by syntactic phrases.

2. In order to account for collocations, a WA technique is used where one-to-n and n-to-one WAs are allowed.

3. For the .