A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

Wenbin Jiang 1, Liang Huang 2, Qun Liu 1, Yajuan Lu 1

1 Key Lab. of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, P.O. Box 2704, Beijing 100190, China
jiangwenbin@ict.ac.cn

2 Department of Computer and Information Science, University of Pennsylvania, Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104, USA
lhuang3@cis.upenn.edu

Abstract

We propose a cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. With a character-based perceptron as the core, combined with real-valued features such as language models, the cascaded model is able to efficiently utilize knowledge sources that are inconvenient to incorporate into the perceptron directly. Experiments show that the cascaded model achieves improved accuracies on both segmentation alone and joint segmentation and part-of-speech tagging. On the Penn Chinese Treebank 5.0, we obtain an error reduction of 18.5% on segmentation and 12% on joint segmentation and part-of-speech tagging over the perceptron-only baseline.

1 Introduction

Word segmentation and part-of-speech (POS) tagging are important tasks in computer processing of Chinese and other Asian languages.
Several models have been introduced for these problems, for example the Hidden Markov Model (HMM) (Rabiner, 1989), the Maximum Entropy Model (ME) (Ratnaparkhi and Adwait, 1996), and Conditional Random Fields (CRFs) (Lafferty et al., 2001). Compared to generative models such as the HMM, CRFs have the advantage of flexibility in representing features, and they usually perform best on the two tasks. Another widely used discriminative method is the perceptron algorithm (Collins, 2002), which achieves performance comparable to CRFs with much faster training, so we base this work on the perceptron. To segment and tag a character sequence, there are two strategies to choose from: performing POS tagging after segmentation, or performing joint segmentation and POS tagging (Joint S&T).
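
To make the joint strategy concrete, the following minimal sketch shows one way a character-level tag sequence can encode segmentation and POS tags at the same time. The tag format `<boundary>_<POS>` with boundary labels b, m, e, s (begin, middle, end, single-character word), the function name `decode_joint_tags`, and the example sentence are assumptions made for illustration; this excerpt does not specify the encoding.

```python
from typing import List, Tuple


def decode_joint_tags(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Turn per-character joint tags into (word, POS) pairs.

    Assumed tag format (hypothetical): "<boundary>_<POS>", where boundary is
    b (begin), m (middle), e (end), or s (single-character word).
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[Tuple[str, str]] = []
    buffer = ""  # characters of the word currently being assembled
    pos = ""     # POS label carried by the most recent tag
    for ch, tag in zip(chars, tags):
        boundary, pos = tag.split("_", 1)
        buffer += ch
        # A word is complete when we see its end tag or a single-character tag.
        if boundary in ("e", "s"):
            words.append((buffer, pos))
            buffer = ""
    if buffer:
        # Tag sequence ended mid-word; emit what was collected with the last POS.
        words.append((buffer, pos))
    return words


# Hypothetical example: one tag per character yields both word boundaries and POS.
chars = list("我喜欢北京")
tags = ["s_PN", "b_VV", "e_VV", "b_NR", "e_NR"]
result = decode_joint_tags(chars, tags)
```

In the pipeline strategy, by contrast, a segmenter would first produce the words and a separate tagger would then assign one POS label per word, so segmentation errors cannot be corrected by POS evidence.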