Enhancing Performance of Lexicalised Grammars

Rebecca Dridan†, Valia Kordoni†, Jeremy Nicholson†‡
†Dept of Computational Linguistics, Saarland University and DFKI GmbH, Germany
‡Dept of Computer Science and Software Engineering and NICTA, University of Melbourne, Australia
{rdrid,kordoni}@coli.uni-sb.de, jeremymn@csse.unimelb.edu.au

Abstract

This paper describes how external resources can be used to improve parser performance for heavily lexicalised grammars, looking at both robustness and efficiency. In terms of robustness, we try using different types of external data to increase lexical coverage, and find that simple POS tags have the most effect, increasing coverage on unseen data by up to 45%. We also show that filtering lexical items in a supertagging manner is very effective in increasing efficiency. Even using vanilla POS tags we achieve some efficiency gains, but when using detailed lexical types as supertags we manage to halve parsing time with minimal loss of coverage or precision.

1 Introduction

Heavily lexicalised grammars have been used in applications such as machine translation and information extraction because they can produce semantic structures which provide more information than less informed parsers. In particular, because of the structural and semantic information attached to lexicon items, these grammars do well at describing complex relationships, like non-projectivity and center embedding.
However, the cost of this additional information sometimes makes deep parsers that use these grammars impractical. Firstly, if the information is not available, the parsers may fail to produce an analysis, a failure of robustness. Secondly, the effect of analysing the extra information can slow the parser down, causing efficiency problems. This paper describes experiments aimed at improving parser performance in these two areas by annotating the input given to one such deep parser, the PET parser (Callmeier, 2000), which uses lexicalised grammars developed under the HPSG