TAILIEUCHUNG - Scientific report: "Text Chunking using Regularized Winnow"

Text Chunking using Regularized Winnow

Tong Zhang and Fred Damerau and David Johnson
IBM T.J. Watson Research Center
Yorktown Heights, New York 10598, USA
tzhang@ damerau@ dejohns@

Abstract

Many machine learning methods have recently been applied to natural language processing tasks. Among them, the Winnow algorithm has been argued to be particularly suitable for NLP problems, due to its robustness to irrelevant features. However, in theory, Winnow may not converge for non-separable data. To remedy this problem, a modification called regularized Winnow has been proposed. In this paper, we apply this new method to text chunking. We show that this method achieves state-of-the-art performance with significantly less computation than previous approaches.

1 Introduction

Recently there has been considerable interest in applying machine learning techniques to problems in natural language processing. One method that has been quite successful in many applications is the SNoW architecture (Dagan et al., 1997; Khardon et al., 1999). This architecture is based on the Winnow algorithm (Littlestone, 1988; Grove and Roth, 2001), which in theory is suitable for problems with many irrelevant attributes. In natural language processing, one often encounters a very high dimensional feature space, although most of the features are irrelevant. Therefore the robustness of Winnow to high dimensional feature space is considered an important reason why it is suitable for NLP tasks.

However, the convergence of the Winnow algorithm is only guaranteed for linearly separable data. In practical NLP applications, data are often linearly non-separable. Consequently, a direct application of Winnow may lead to numerical instability. A remedy for this, called regularized Winnow, has been recently proposed in (Zhang, 2001). This method modifies the original Winnow algorithm so that it solves a regularized optimization problem. It converges both in the linearly separable case and in the ...
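For readers unfamiliar with the original Winnow algorithm (Littlestone, 1988) mentioned above, the following is a minimal sketch of its mistake-driven multiplicative update on binary features. It illustrates the classic algorithm only, not the regularized variant applied in this paper. The function names `winnow_train` and `winnow_predict`, the promotion factor `alpha = 2`, and the threshold set to the number of features are illustrative assumptions rather than settings taken from the paper.

```python
from typing import List, Sequence, Tuple


def winnow_predict(weights: Sequence[float], threshold: float,
                   x: Sequence[int]) -> int:
    """Predict 1 if the summed weight of the active features reaches the threshold."""
    score = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if score >= threshold else 0


def winnow_train(examples: Sequence[Tuple[Sequence[int], int]],
                 n_features: int,
                 alpha: float = 2.0,
                 epochs: int = 50) -> Tuple[List[float], float]:
    """Classic Winnow with binary features and labels in {0, 1}.

    Assumed settings (not from the paper): all weights start at 1 and the
    threshold equals the number of features.
    """
    weights = [1.0] * n_features
    threshold = float(n_features)
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if winnow_predict(weights, threshold, x) != y:
                mistakes += 1
                # Promote active features on a missed positive,
                # demote them on a false positive.
                factor = alpha if y == 1 else 1.0 / alpha
                weights = [w * factor if xi else w
                           for w, xi in zip(weights, x)]
        if mistakes == 0:
            # A full pass without mistakes: the data were separated.
            break
    return weights, threshold
```

Only the weights of active features change on a mistake, which is why irrelevant features have limited influence. On non-separable data the loop above never reaches a mistake-free pass and simply stops at the `epochs` limit, which is the situation the regularized variant is meant to address.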


