Scientific report: "Using POS Information for Statistical Machine Translation into Morphologically Rich Languages"

Nicola Ueffing and Hermann Ney
Lehrstuhl für Informatik VI - Computer Science Department
RWTH Aachen - University of Technology
ueffing ney @

Abstract

When translating from languages with hardly any inflectional morphology, such as English, into morphologically rich languages, the English word forms often do not contain enough information to produce the correct full form in the target language. We investigate methods for improving the quality of such translations by making use of part-of-speech information and maximum entropy modeling. Results for translations from English into Spanish and Catalan are presented on the LC-STAR corpus, which consists of spontaneously spoken dialogues in the domain of appointment scheduling and travel planning.

1 Introduction

In this paper, we address the question of how part-of-speech (POS) information can help improve the quality of Statistical Machine Translation (SMT). One of the main problems when translating from a language with hardly any inflectional morphology (English in our experiments) into one with richer morphology (here, Spanish and Catalan) is producing the correct inflected form in the target language.
We introduce transformations to the English string that are based on part-of-speech information and show how this knowledge source can help SMT. Systematic evaluations will show that the quality of the generated translations is improved. The transformations we apply are the following:

Treatment of verbs: In Catalan and Spanish, the pronoun before a verb is often omitted, and the person is instead expressed via the ending of the verb. The same holds for the future tense and for the moods expressed through "would" and "should" in English. Since this makes it hard to generate the correct translation of a given English verb, we propose a method resulting in English word forms that contain sufficient information.
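The excerpt does not spell out the verb transformation, so the following is only a minimal sketch of one way such a POS-based preprocessing step might look. It assumes that a personal pronoun, an optional modal ("will", "would", "should"), and the following verb are joined into a single English token, so that the person and tense information sits on one word form. The function name `splice_pronoun_verb`, the underscore separator, and the Penn Treebank style tags are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

# Assumed Penn Treebank style tags; the excerpt does not name a tagger or tag set.
PRONOUN_TAGS = {"PRP"}
MODAL_TAGS = {"MD"}
VERB_TAGS = {"VB", "VBD", "VBP", "VBZ"}


def splice_pronoun_verb(tagged: List[Tuple[str, str]]) -> List[str]:
    """Join pronoun (+ optional modal) + verb into one token (hypothetical helper)."""
    out: List[str] = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag in PRONOUN_TAGS:
            parts = [word]
            j = i + 1
            # Absorb a modal such as "will", "would", or "should" if present.
            if j < len(tagged) and tagged[j][1] in MODAL_TAGS:
                parts.append(tagged[j][0])
                j += 1
            # Absorb the verb that follows the pronoun or the modal.
            if j < len(tagged) and tagged[j][1] in VERB_TAGS:
                parts.append(tagged[j][0])
                j += 1
            if len(parts) > 1:
                out.append("_".join(parts))
                i = j
                continue
        # Any other word is copied through unchanged.
        out.append(word)
        i += 1
    return out


sentence = [("I", "PRP"), ("will", "MD"), ("go", "VB"), ("tomorrow", "NN")]
print(splice_pronoun_verb(sentence))
```

Under these assumptions, the spliced token carries the same information that a single inflected Spanish or Catalan verb form expresses, which is the motivation the text gives for enriching the English word forms.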
