Scientific report: "SVM Model Tampering and Anchored Learning: A Case Study in Hebrew NP Chunking"


Yoav Goldberg and Michael Elhadad
Computer Science Department, Ben Gurion University of the Negev
P.O.B 653, Be'er Sheva 84105, Israel
{yoavg,elhadad}@cs.bgu.ac.il

Abstract

We study the issue of porting a known NLP method to a language with few existing NLP resources, specifically Hebrew SVM-based chunking. We introduce two SVM-based methods: Model Tampering and Anchored Learning. These allow fine-grained analysis of the learned SVM models, which provides guidance to identify errors in the training corpus, distinguish the role and interaction of lexical features, and eventually construct a model with ∼10% error reduction. The resulting chunker is shown to be robust in the presence of noise in the training corpus, relies on fewer lexical features than was previously understood, and achieves an F-measure of 92.2 on automatically PoS-tagged text. The SVM analysis methods also provide general insight into SVM-based chunking.

1 Introduction

While high-quality NLP corpora and tools are available in English, such resources are difficult to obtain in most other languages. Three challenges must be met when adapting results established in English to another language: (1) acquiring high-quality annotated data; (2) adapting the English task definition to the nature of a different language; and (3) adapting the algorithm to the new language. This paper presents a case study in the adaptation of a well-known task to a language with few NLP resources available.

Specifically, we deal with SVM-based Hebrew NP chunking. In (Goldberg et al., 2006), we established that the task is not trivially transferable to Hebrew, but reported that SVM-based chunking (Kudo and Matsumoto, 2000) performs well. We extend that work and study the problem from 3 angles: (1) how to deal with a corpus that is smaller and with a higher level of noise than is available in English; we propose techniques that help identify suspicious data points in t...
