Using Machine-Learning to Assign Function Labels to Parser Output for Spanish

Grzegorz Chrupała (1) and Josef van Genabith (1, 2)
(1) National Center for Language Technology, Dublin City University, Glasnevin, Dublin 9, Ireland
(2) IBM Dublin Center for Advanced Studies
grzegorz.chrupala@computing.dcu.ie, josef@computing.dcu.ie

Abstract

Data-driven grammatical function tag assignment has been studied for English using the Penn-II Treebank data. In this paper we address the question of whether such methods can be applied successfully to other languages and treebank resources. In addition to tag assignment accuracy and f-scores, we also present results of a task-based evaluation. We use three machine-learning methods to assign Cast3LB function tags to sentences parsed with Bikel's parser trained on the Cast3LB treebank. The best performing method, SVM, achieves an f-score of 86.87 on gold-standard trees and 66.67 on parser output, a statistically significant improvement of 6.74 over the baseline. In a task-based evaluation we generate LFG functional structures from the function-tag-enriched trees. On this task we achieve an f-score of 75.67, a statistically significant 3.4 improvement over the baseline.
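To make the f-score figures above concrete, the following is a minimal sketch of how precision, recall and f-score could be computed for function tag assignment. It assumes that each tagged constituent is represented as a (start, end, tag) triple and that a prediction counts as correct only if the whole triple matches. This representation, the name `f_score` and the example labels are illustrative assumptions, not the evaluation procedure used in the paper.

```python
from typing import Set, Tuple

# Hypothetical encoding of one function-tagged constituent:
# (start token index, end token index, function tag).
TaggedSpan = Tuple[int, int, str]


def f_score(gold: Set[TaggedSpan],
            predicted: Set[TaggedSpan]) -> Tuple[float, float, float]:
    """Return (precision, recall, f-score) over sets of tagged spans."""
    correct = len(gold & predicted)  # triples that match exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced f-score: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Illustrative labels only: one of three constituents is mis-tagged.
gold = {(0, 2, "SUJ"), (3, 5, "CD"), (5, 8, "CC")}
predicted = {(0, 2, "SUJ"), (3, 5, "CI"), (5, 8, "CC")}

precision, recall, f = f_score(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F={f:.4f}")
```

On parser output, spans themselves may be wrong, so precision and recall can differ. The set-based formulation handles that case without modification.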
1 Introduction

The research presented in this paper forms part of an ongoing effort to develop methods to induce wide-coverage multilingual Lexical-Functional Grammar (LFG) (Bresnan, 2001) resources from treebanks by automatically associating LFG f-structure information with constituency trees produced by probabilistic parsers (Cahill et al., 2004). Inducing deep syntactic analyses from treebank data avoids the cost and time involved in manually creating wide-coverage resources. Lexical-Functional Grammar f-structures provide a level of syntactic representation based on the notion of grammatical functions (e.g. Subject, Object, Oblique, Adjunct, etc.). This level is more abstract and cross-linguistically more uniform than constituency trees. F-structures also include explicit encodings of .