Scientific report: "Confidence Measure for Word Alignment"

Fei Huang
IBM Research Center, Yorktown Heights, NY 10598, USA
huangfe@

Abstract

In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links. We introduce a sentence alignment confidence measure and an alignment link confidence measure. Based on these measures, we improve the alignment quality by selecting high-confidence sentence alignments and alignment links from multiple word alignments of the same sentence pair. Additionally, we remove low-confidence alignment links from the word alignment of a bilingual training corpus, which increases the alignment F-score, improves Chinese-English and Arabic-English translation quality, and significantly reduces the phrase translation table size.

1 Introduction

Data-driven approaches have been quite active in recent machine translation (MT) research. Many MT systems, such as statistical phrase-based and syntax-based systems, learn phrase translation pairs or translation rules from large amounts of bilingual data with word alignment. The quality of the parallel data and of the word alignment has a significant impact on the learned translation models and, ultimately, on the quality of the translation output.

Due to the high cost of commissioned translation, many parallel sentences are automatically extracted from comparable corpora, which inevitably introduces considerable noise, such as inaccurate or non-literal translations. Given the huge amount of bilingual training data, word alignments are automatically generated using various algorithms (Brown et al., 1994; Vogel et al., 1996; Ittycheriah and Roukos, 2005), which also introduce many word alignment errors.

Figure 1: An example of inaccurate translation and word alignment.

The example in Figure 1 shows the word alignment of the given Chinese and English sentence pair, where the English words following each Chinese word are its literal translation. We find untranslated Chinese and English words marked with
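
As a rough illustration of the two uses of confidence named in the abstract (choosing among several word alignments of one sentence pair, and dropping low-confidence links), the following Python sketch may help. It is not the paper's method: the excerpt does not give the actual definitions, so the sketch assumes that link posteriors are already available, and it uses the arithmetic mean of those posteriors as a stand-in sentence-level score. The names `filter_links`, `sentence_confidence`, `select_best`, the example posteriors, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a link is (source word index, target word index),
# and an alignment maps each link to its posterior probability (assumed given).
Link = Tuple[int, int]
Alignment = Dict[Link, float]


def filter_links(alignment: Alignment, threshold: float) -> Alignment:
    """Keep only the links whose posterior is at least `threshold`."""
    return {link: p for link, p in alignment.items() if p >= threshold}


def sentence_confidence(alignment: Alignment) -> float:
    """Stand-in sentence-level score: arithmetic mean of the link posteriors.

    This is an assumption made for illustration, not the paper's definition.
    """
    if not alignment:
        return 0.0
    return sum(alignment.values()) / len(alignment)


def select_best(candidates: List[Alignment]) -> Alignment:
    """Pick the candidate alignment with the highest sentence-level score."""
    return max(candidates, key=sentence_confidence)


# Two made-up alignments of the same three-word sentence pair.
aligner_a: Alignment = {(0, 0): 0.95, (1, 2): 0.40, (2, 1): 0.85}
aligner_b: Alignment = {(0, 0): 0.90, (1, 1): 0.30, (2, 2): 0.20}

best = select_best([aligner_a, aligner_b])
kept = filter_links(best, threshold=0.5)
print(kept)
```

In this toy input, the first alignment has the higher mean posterior, and filtering it at 0.5 removes only the link (1, 2). A real system would need to tune the threshold on held-out data, since a stricter cutoff trades alignment recall for precision.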
