Training Conditional Random Fields with Multivariate Evaluation Measures

Jun Suzuki, Erik McDermott and Hideki Isozaki
NTT Communication Science Laboratories, NTT Corp.
2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0237 Japan
{jun, mcd, isozaki}@cslab.kecl.ntt.co.jp

Abstract

This paper proposes a framework for training Conditional Random Fields (CRFs) to optimize multivariate evaluation measures, including non-linear measures such as F-score. Our proposed framework is derived from an error minimization approach that provides a simple solution for directly optimizing any evaluation measure. Specifically focusing on sequential segmentation tasks, i.e. text chunking and named entity recognition, we introduce a loss function that closely reflects the target evaluation measure for these tasks, namely, segmentation F-score. Our experiments show that our method performs better than standard CRF training.

1 Introduction

Conditional random fields (CRFs) are a recently introduced formalism (Lafferty et al., 2001) for representing a conditional model p(y|x), where both a set of inputs, x, and a set of outputs, y, display non-trivial interdependency. CRFs are basically defined as a discriminative model of Markov random fields conditioned on inputs (observations) x. Unlike generative models, CRFs model only the output y's distribution over x. This allows CRFs to use flexible features such as complicated functions of multiple observations.
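To make the conditional model p(y|x) concrete, the following is a minimal sketch of how such a probability can be computed for a sequence, assuming a linear-chain structure. The names `emit`, `trans`, `sequence_score`, `log_partition` and `log_prob` are hypothetical and chosen only for illustration, and the score tables are made-up numbers rather than learned parameters. It is not the training method proposed in this paper.

```python
import math
from itertools import product


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(emit, trans, labels):
    """Unnormalized log-score of one label sequence y for a fixed input x.

    emit[t][k]  : assumed score of label k at position t (depends on x)
    trans[j][k] : assumed score of moving from label j to label k
    """
    score = emit[0][labels[0]]
    for t in range(1, len(labels)):
        score += trans[labels[t - 1]][labels[t]] + emit[t][labels[t]]
    return score


def log_partition(emit, trans):
    """log Z(x): log-sum of the scores of all label sequences (forward pass)."""
    n_labels = len(emit[0])
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][k]
            + logsumexp([alpha[j] + trans[j][k] for j in range(n_labels)])
            for k in range(n_labels)
        ]
    return logsumexp(alpha)


def log_prob(emit, trans, labels):
    """log p(y|x) = score(y, x) - log Z(x)."""
    return sequence_score(emit, trans, labels) - log_partition(emit, trans)


# Illustrative scores for 3 positions and 2 labels (invented values).
emit = [[1.0, 0.2], [0.3, 0.8], [0.5, 0.5]]
trans = [[0.6, -0.4], [-0.2, 0.9]]

# Summing p(y|x) over every possible y should give 1 up to rounding.
total = sum(
    math.exp(log_prob(emit, trans, y)) for y in product(range(2), repeat=3)
)
print(total)
```

Only the distribution over label sequences is normalized here; nothing models the inputs themselves, which is why the per-position scores can be arbitrary functions of the whole observation sequence.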
The modeling power of CRFs has been of great benefit in several applications, such as shallow parsing (Sha and Pereira, 2003) and information extraction (McCallum and Li, 2003). Since the introduction of CRFs, intensive research has been undertaken to boost their effectiveness. The first approach to estimating CRF parameters is the maximum likelihood (ML) criterion over the conditional probability p(y|x) itself (Lafferty et al., 2001). The ML criterion, however, is prone to over-fitting the training data, especially since CRFs are often trained with a very large number of correlated .