TAILIEUCHUNG - Scientific report: "Discriminative Training and Maximum Entropy Models for Statistical Machine Translation"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 295-302.

Discriminative Training and Maximum Entropy Models for Statistical Machine Translation

Franz Josef Och and Hermann Ney
Lehrstuhl für Informatik VI, Computer Science Department
RWTH Aachen - University of Technology
D-52056 Aachen, Germany
och ney @

Abstract

We present a framework for statistical machine translation of natural languages based on direct maximum entropy models, which contains the widely used source-channel approach as a special case. All knowledge sources are treated as feature functions, which depend on the source language sentence, the target language sentence and possible hidden variables. This approach allows a baseline machine translation system to be extended easily by adding new feature functions. We show that a baseline statistical machine translation system is significantly improved using this approach.

1 Introduction

We are given a source (French) sentence $f_1^J = f_1, \ldots, f_j, \ldots, f_J$, which is to be translated into a target (English) sentence $e_1^I = e_1, \ldots, e_i, \ldots, e_I$. Among all possible target sentences, we will choose the sentence with the highest probability:¹

$$\hat{e}_1^I = \operatorname*{argmax}_{e_1^I} \left\{ Pr(e_1^I \mid f_1^J) \right\} \qquad (1)$$

The argmax operation denotes the search problem, i.e. the generation of the output sentence in the target language.

¹ The notational convention will be as follows. We use the symbol $Pr(\cdot)$ to denote general probability distributions with (nearly) no specific assumptions. In contrast, for model-based probability distributions, we use the generic symbol $p(\cdot)$.

1.1 Source-Channel Model

According to Bayes' decision rule, we can equivalently to Eq. 1 perform the following maximization:

$$\hat{e}_1^I = \operatorname*{argmax}_{e_1^I} \left\{ Pr(e_1^I) \cdot Pr(f_1^J \mid e_1^I) \right\} \qquad (2)$$

This approach is referred to as the source-channel approach to statistical MT. Sometimes it is also referred to as the "fundamental equation of statistical MT" (Brown et al., 1993). Here, $Pr(e_1^I)$ is the language model of the target language, whereas $Pr(f_1^J \mid e_1^I)$
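The equivalence of Eq. 1 and Eq. 2 can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the three candidate translations, their probabilities, and the names `CANDIDATES`, `score`, `decode`, and `posterior` are all invented here and do not come from the paper. It scores each candidate by log Pr(e) + log Pr(f|e), then confirms that normalizing by Pr(f) (approximated as the sum over this small candidate set) leaves the argmax unchanged.

```python
import math

# Hypothetical candidate translations for one source sentence. The values of
# "lm" (standing in for Pr(e)) and "tm" (standing in for Pr(f|e)) are invented.
CANDIDATES = {
    "the house is small": {"lm": 0.020, "tm": 0.30},
    "small is the house": {"lm": 0.002, "tm": 0.35},
    "the home is little": {"lm": 0.010, "tm": 0.10},
}


def score(features, weights):
    """Weighted sum of log feature values for one candidate."""
    return sum(weights[name] * math.log(p) for name, p in features.items())


def decode(candidates, weights):
    """Search step: return the candidate with the highest score."""
    return max(candidates, key=lambda e: score(candidates[e], weights))


def posterior(candidates):
    """Pr(e|f) by Bayes' rule; the sum stands in for Pr(f) over this toy set."""
    joint = {e: f["lm"] * f["tm"] for e, f in candidates.items()}
    total = sum(joint.values())
    return {e: p / total for e, p in joint.items()}


# Eq. 2: both weights equal to 1 gives log Pr(e) + log Pr(f|e).
SOURCE_CHANNEL = {"lm": 1.0, "tm": 1.0}
best_eq2 = decode(CANDIDATES, SOURCE_CHANNEL)

# Eq. 1: maximize the posterior directly; the normalizer does not move the argmax.
post = posterior(CANDIDATES)
best_eq1 = max(post, key=post.get)

print(best_eq1, best_eq2)
```

The `weights` argument is a design choice of this sketch rather than something taken from the excerpt. It treats the two probabilities as weighted log-features, so that setting both weights to 1 reproduces Eq. 2, in the spirit of the abstract's remark that the source-channel approach is a special case of the feature-function framework. Calling `decode(CANDIDATES, {"lm": 0.0, "tm": 1.0})` ignores the language model and selects a different candidate.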
