Scientific report: "Text Chunking by Combining Hand-Crafted Rules and Memory-Based Learning"

Seong-Bae Park, Byoung-Tak Zhang
School of Computer Science and Engineering, Seoul National University, Seoul 151-744, Korea
sbpark btzhang @

Abstract

This paper proposes a hybrid of hand-crafted rules and a machine learning method for chunking Korean. In partially free word-order languages such as Korean and Japanese, a small number of rules dominate the performance because of the languages' well-developed postpositions and endings. The proposed method is therefore based primarily on the rules, and the residual errors are then corrected by a memory-based machine learning method. Since memory-based learning handles exceptions in natural language processing efficiently, it is good at checking whether the estimates are exceptional cases of the rules and revising them. An evaluation of the method shows an improvement in F-score over the rules or various machine learning methods alone.

1 Introduction

Text chunking has been one of the most interesting problems in the natural language learning community since the first work of Ramshaw and Marcus (1995) using a machine learning method. The main purpose of the machine learning methods applied to this task is to capture the hypothesis that best determines the chunk type of a word, and such methods have shown relatively high performance in English (Kudo and Matsumoto, 2000; Zhang et al., 2001).
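The hybrid outlined in the abstract (rules first, then memory-based correction of the residual errors) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy POS tags, the single rule function `rule_chunk`, the feature tuple, the overlap distance, and the `threshold` parameter are all hypothetical. The memory stores only the training instances on which the rules were wrong, and a rule estimate is revised when a stored exception is close enough.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]        # (word, POS tag); toy tags, not the paper's tagset
Features = Tuple[str, ...]


def rule_chunk(tokens: Sequence[Token], i: int) -> str:
    """Hypothetical hand-crafted rule: guess a chunk tag from POS tags only."""
    pos = tokens[i][1]
    prev_pos = tokens[i - 1][1] if i > 0 else None
    if pos in ("noun", "postposition"):
        return "I-NP" if prev_pos == "noun" else "B-NP"
    if pos in ("verb", "ending"):
        return "I-VP" if prev_pos == "verb" else "B-VP"
    return "O"


def features(tokens: Sequence[Token], i: int, rule_tag: str) -> Features:
    """Local context plus the rule's own estimate (an assumed feature set)."""
    word, pos = tokens[i]
    prev_pos = tokens[i - 1][1] if i > 0 else "<s>"
    next_pos = tokens[i + 1][1] if i + 1 < len(tokens) else "</s>"
    return (word, pos, prev_pos, next_pos, rule_tag)


def overlap_distance(a: Features, b: Features) -> int:
    """Number of feature positions whose values differ."""
    return sum(x != y for x, y in zip(a, b))


class HybridChunker:
    def __init__(self, threshold: int = 0) -> None:
        self.memory: List[Tuple[Features, str]] = []  # stored rule exceptions
        self.threshold = threshold                    # max distance to override

    def fit(self, sentences: Sequence[Sequence[Token]],
            gold_tags: Sequence[Sequence[str]]) -> None:
        """Memorize every training instance where the rule estimate is wrong."""
        for tokens, gold in zip(sentences, gold_tags):
            for i, tag in enumerate(gold):
                guess = rule_chunk(tokens, i)
                if guess != tag:
                    self.memory.append((features(tokens, i, guess), tag))

    def predict(self, tokens: Sequence[Token]) -> List[str]:
        """Apply the rule, then revise it if a stored exception is near."""
        result = []
        for i in range(len(tokens)):
            guess = rule_chunk(tokens, i)
            feats = features(tokens, i, guess)
            nearest = min(self.memory,
                          key=lambda m: overlap_distance(m[0], feats),
                          default=None)
            if (nearest is not None
                    and overlap_distance(nearest[0], feats) <= self.threshold):
                guess = nearest[1]
            result.append(guess)
        return result
```

With an empty memory the chunker falls back to the rules alone, so the rules remain the primary decision maker and the memory only handles the cases they get wrong. A larger `threshold` lets stored exceptions generalize to unseen but similar contexts, at the risk of overriding rule estimates that were correct.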
To determine the chunk type of a word, these methods use various kinds of information, such as lexical information, part-of-speech tags, and the grammatical relations of neighboring words. Since the position of a word plays an important role as a syntactic constraint in English, the methods are successful even with local information. However, they are not appropriate for chunking Korean and Japanese, because these languages have partially free word order. That is, positional constraints in these languages are very weak. Instead of positional constraints they have overt
