Scientific report: "Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment"

We describe an approach to improve the bilingual cooccurrence dictionary that is used for word alignment, and evaluate the improved dictionary using a version of the Competitive Linking algorithm. We demonstrate a problem faced by the Competitive Linking algorithm and present an approach to ameliorate it. In particular, we rebuild the bilingual dictionary by clustering similar words in a language and assigning them a higher cooccurrence score with a given word in the other language than each single word would have otherwise. Experimental results show a significant improvement in precision and recall for word alignment when the improved dictionary is used.

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 409-416.

Katharina Probst, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA 15213, kathrin@
Ralf Brown, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA 15213, ralf@

1 Introduction and Related Work

Word alignment is a well-studied problem in Natural Language Computing. This is hardly surprising given its significance in many applications: word-aligned data is crucial for example-based machine translation and statistical machine translation, but also for other applications such as cross-lingual information retrieval. Since it is a hard and time-consuming task to hand-align bilingual data, the automation of this task receives a fair amount of attention. In this paper, we present an approach to improve the bilingual dictionary that is used by word alignment algorithms. Our method is based on similarity scores between words, which in effect results in the clustering of morphological variants. One line of related work is research in clustering based on word similarities. This problem is an area of active research in the Information Retrieval community. For .
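The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the excerpt does not give their similarity measure or scoring formula, so the clusters here are supplied by hand, the pooled score is assumed to be a simple sum over cluster members, and all names (`competitive_linking`, `pool_cluster_scores`, the toy scores) are hypothetical. The linking step follows the usual greedy, one-to-one description of Competitive Linking: link the highest-scoring free word pair, then repeat.

```python
from collections import defaultdict


def competitive_linking(src_words, tgt_words, score):
    """Greedy one-to-one linking: repeatedly link the best-scoring free pair."""
    candidates = [
        (score.get((s, t), 0.0), i, j)
        for i, s in enumerate(src_words)
        for j, t in enumerate(tgt_words)
    ]
    # Highest score first; ties are broken by position so the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    used_src, used_tgt, links = set(), set(), []
    for sc, i, j in candidates:
        if sc <= 0.0:
            break  # remaining pairs have no cooccurrence evidence
        if i in used_src or j in used_tgt:
            continue  # each word may take part in at most one link
        links.append((src_words[i], tgt_words[j]))
        used_src.add(i)
        used_tgt.add(j)
    return links


def pool_cluster_scores(score, clusters):
    """Assumed pooling rule: every cluster member gets the cluster's summed score."""
    pooled = dict(score)
    for cluster in clusters:
        totals = defaultdict(float)
        for (s, t), value in score.items():
            if t in cluster:
                totals[s] += value
        for s, total in totals.items():
            for t in cluster:
                pooled[(s, t)] = total
    return pooled


# Toy cooccurrence scores (invented for illustration). The evidence for
# "think" is split between two morphological variants of the French verb.
raw_scores = {
    ("we", "nous"): 3.0,
    ("we", "pensons"): 4.0,
    ("think", "pensons"): 3.0,
    ("think", "pense"): 3.0,
    ("think", "nous"): 1.0,
}
clusters = [{"pense", "pensons"}]  # hand-made cluster of morphological variants

src, tgt = ["we", "think"], ["nous", "pensons"]
raw_links = competitive_linking(src, tgt, raw_scores)
pooled_links = competitive_linking(src, tgt, pool_cluster_scores(raw_scores, clusters))
print(raw_links)
print(pooled_links)
```

With the raw scores, the greedy step links "we" to "pensons" first, which forces "think" onto "nous". After pooling, the evidence for "think" is no longer split across variants, so "think" and "pensons" are linked first and "we" falls to "nous".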




