TAILIEUCHUNG - Scientific report: "DECISION LISTS FOR LEXICAL AMBIGUITY RESOLUTION"

The particular domain chosen here as a case study is the problem of restoring missing accents to Spanish and French text. Because it requires the resolution of both semantic and syntactic ambiguity, and offers an objective ground truth for automatic evaluation, it is particularly well suited for demonstrating and testing the capabilities of the given algorithm. It is also a practical problem with immediate application.

PROBLEM DESCRIPTION

The general problem considered here is the resolution of lexical ambiguity, both syntactic and semantic, based on properties of the surrounding context.

DECISION LISTS FOR LEXICAL AMBIGUITY RESOLUTION
Application to Accent Restoration in Spanish and French

David Yarowsky
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104

Abstract

This paper presents a statistical decision procedure for lexical ambiguity resolution. The algorithm exploits both local syntactic patterns and more distant collocational evidence, generating an efficient, effective, and highly perspicuous recipe for resolving a given ambiguity. By identifying and utilizing only the single best disambiguating evidence in a target context, the algorithm avoids the problematic complex modeling of statistical dependencies. Although directly applicable to a wide class of ambiguities, the algorithm is described and evaluated in a realistic case study: the problem of restoring missing accents in Spanish and French text. Current accuracy exceeds 99% on the full task, and typically is over 90% for even the most difficult ambiguities.

INTRODUCTION

This paper presents a general-purpose statistical decision procedure for lexical ambiguity resolution based on decision lists (Rivest, 1987). The algorithm considers multiple types of evidence in the context of an ambiguous word, exploiting differences in collocational distribution as measured by log-likelihoods. Unlike standard Bayesian approaches, however, it does not combine the log-likelihoods of all available pieces of contextual evidence, but bases its classifications solely on the single most reliable piece of evidence identified in the target context. Perhaps surprisingly, this strategy appears to yield the same or even slightly better precision than the combination-of-evidence approach when trained on the same features. It also brings with it several additional advantages, the greatest of which is the ability to include multiple, highly non-independent sources of evidence without complex modeling of dependencies. Some other advantages are significant .
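The strategy described in the introduction, classifying on the single most reliable piece of evidence rather than combining all of it, can be illustrated with a minimal sketch. This is not the paper's implementation. The feature strings (such as "-1:du" for "previous word is du"), the toy training examples, the smoothing constant `alpha`, and the choice to rank rules by the absolute log-likelihood ratio of smoothed counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_decision_list(examples, alpha=0.1):
    """Build a decision list from (feature_set, label) pairs with two labels.

    Each rule is (strength, feature, label). Strength is the absolute log
    ratio of smoothed counts (alpha is an assumed smoothing constant).
    """
    a, b = sorted({label for _, label in examples})
    counts = {a: Counter(), b: Counter()}
    for features, label in examples:
        counts[label].update(features)

    rules = []
    for feat in set(counts[a]) | set(counts[b]):
        llr = math.log((counts[a][feat] + alpha) / (counts[b][feat] + alpha))
        if llr == 0:
            continue  # feature is equally frequent with both labels: no evidence
        rules.append((abs(llr), feat, a if llr > 0 else b))

    # Most reliable evidence first; ties broken by feature name for determinism.
    rules.sort(key=lambda r: (-r[0], r[1]))
    return rules


def classify(rules, features, default):
    """Return the label of the first (strongest) rule whose feature is present."""
    for _, feat, label in rules:
        if feat in features:
            return label
    return default


# Hypothetical toy data: restoring the accent on de-accented French "cote".
examples = [
    ({"-1:la", "ctx:mer"}, "côte"),
    ({"-1:la", "ctx:plage"}, "côte"),
    ({"-1:du", "ctx:mer"}, "côté"),
    ({"-1:du", "ctx:autre"}, "côté"),
    ({"-1:du", "ctx:gauche"}, "côté"),
]
rules = train_decision_list(examples)

# Conflicting evidence: "-1:du" points to côté, "ctx:plage" points to côte.
# Only the higher-ranked rule is consulted; the evidence is not combined.
print(classify(rules, {"-1:du", "ctx:plage"}, default="côté"))
```

Because only the first matching rule fires, strongly correlated features (for example, two overlapping context windows) can coexist in the list without any explicit model of their dependence, which is the advantage the introduction emphasizes.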
