Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging"

Wenbin Jiang 1, Liang Huang, Qun Liu 1, Yajuan Lu 1

1 Key Lab. of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Box 2704, Beijing 100190, China (jiangwenbin@)
Department of Computer and Information Science, University of Pennsylvania, Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104, USA (lhuang3@)

Abstract

We propose a cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. With a character-based perceptron as the core, combined with real-valued features such as language models, the cascaded model is able to efficiently utilize knowledge sources that are inconvenient to incorporate into the perceptron directly. Experiments show that the cascaded model achieves improved accuracy on both segmentation alone and joint segmentation and part-of-speech tagging. On the Penn Chinese Treebank, we obtain an error reduction on segmentation and a 12% error reduction on joint segmentation and part-of-speech tagging over the perceptron-only baseline.

1 Introduction

Word segmentation and part-of-speech (POS) tagging are important tasks in computer processing of Chinese and other Asian languages. Several models have been introduced for these problems, for example the Hidden Markov Model (HMM) (Rabiner, 1989), the Maximum Entropy Model (ME) (Ratnaparkhi and Adwait, 1996), and Conditional Random Fields (CRFs) (Lafferty et al., 2001). CRFs have the advantage of flexibility in representing features compared to generative models such as the HMM, and usually perform best on the two tasks. Another widely used discriminative method is the perceptron algorithm (Collins, 2002), which achieves performance comparable to CRFs with much faster training, so we base this work on the perceptron.

To segment and tag a character sequence, there are two strategies to choose from: performing POS tagging after segmentation, or joint segmentation and POS tagging (Joint S&T).
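The cascaded idea described in the abstract can be pictured as an outer linear model that treats the perceptron's score as one real-valued feature among others, such as a language model log-probability. The following is only a minimal sketch under stated assumptions: the feature names, the candidate analyses, the scores, and the weights are all hypothetical values chosen for illustration, not the authors' implementation or results.

```python
from typing import Dict, List, Tuple

# A candidate analysis: a list of (word, POS tag) pairs for one sentence.
Candidate = List[Tuple[str, str]]


def cascaded_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Outer linear model: weighted sum of real-valued feature scores."""
    return sum(weights[name] * value for name, value in features.items())


def pick_best(
    scored: List[Tuple[Candidate, Dict[str, float]]],
    weights: Dict[str, float],
) -> Candidate:
    """Return the candidate with the highest cascaded score."""
    best_candidate, _ = max(scored, key=lambda item: cascaded_score(item[1], weights))
    return best_candidate


# Hypothetical candidates with made-up feature values (assumptions):
# "perceptron" stands for the score of the character-based perceptron,
# "word_lm_logprob" for the log-probability from a word language model.
candidate_a: Candidate = [("中国", "NR"), ("人民", "NN")]
candidate_b: Candidate = [("中", "NR"), ("国人", "NN"), ("民", "NN")]
scored = [
    (candidate_a, {"perceptron": 12.0, "word_lm_logprob": -8.5}),
    (candidate_b, {"perceptron": 12.5, "word_lm_logprob": -14.0}),
]

# Hand-set illustrative weights (assumption, not tuned on any data).
weights = {"perceptron": 1.0, "word_lm_logprob": 0.5}
perceptron_only = {"perceptron": 1.0, "word_lm_logprob": 0.0}

best_cascaded = pick_best(scored, weights)
best_perceptron_only = pick_best(scored, perceptron_only)
```

With these made-up numbers the perceptron score alone slightly prefers the second candidate, while adding the language model feature shifts the choice to the first. This is the kind of effect the abstract attributes to combining the perceptron with knowledge sources that are hard to build into it directly.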




