THE RECOGNITION CAPACITY OF LOCAL SYNTACTIC CONSTRAINTS

Mori Rimon, Jacky Herz
The Computer Science Department
The Hebrew University of Jerusalem
Givat Ram, Jerusalem 91904, ISRAEL
E-mail: rimon@hujics.BITNET

Abstract

Given a grammar for a language, it is possible to create finite state mechanisms that approximate its recognition capacity. These simple automata consider only short context information, drawn from local syntactic constraints which the grammar imposes. While it is short of providing the strong generative capacity of the grammar, such an approximation is useful for removing most word tagging ambiguities, identifying many cases of ill-formed input, and assisting efficiently in other natural language processing tasks. Our basic approach to the acquisition and usage of local syntactic constraints was presented elsewhere; in this paper we present some formal and empirical results pertaining to properties of the approximating automata.

1. Introduction

Parsing is a process by which an input sentence is not only recognized as belonging to the language, but is also assigned a structure. As [Berwick and Weinberg 84] comment, recognition per se, i.e. a weak generative capacity analysis, is not of much value for a theory of language understanding, but it can be useful as a diagnostic.
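To make the notion of a local syntactic constraint concrete, the following is a minimal sketch, not the authors' procedure. It assumes a hypothetical table `ALLOWED_PAIRS` of tag pairs that may appear next to each other; in the setting described above such a table would be derived from the grammar, whereas here the tags and pairs are invented toy data. A candidate tagging survives only if every adjacent pair, including the sentence boundaries, is permitted.

```python
from itertools import product

# Hypothetical toy data: these tags and permitted adjacent pairs are
# assumptions made for illustration; they are not taken from the paper.
ALLOWED_PAIRS = {
    ("<s>", "DET"), ("<s>", "NOUN"),
    ("DET", "NOUN"),
    ("NOUN", "VERB"), ("NOUN", "</s>"),
    ("VERB", "DET"), ("VERB", "NOUN"), ("VERB", "</s>"),
}


def locally_valid(tags):
    """Return True if every adjacent tag pair, boundaries included, is allowed."""
    padded = ["<s>", *tags, "</s>"]
    return all(pair in ALLOWED_PAIRS for pair in zip(padded, padded[1:]))


def surviving_taggings(candidates):
    """Keep the tag sequences that pass the local check.

    candidates: one list of possible tags per word, in sentence order.
    Enumeration is exhaustive, so this is only suitable for short inputs.
    """
    return [seq for seq in product(*candidates) if locally_valid(seq)]


# "the saw cuts": "saw" and "cuts" are each ambiguous between NOUN and VERB.
sentence = [["DET"], ["NOUN", "VERB"], ["NOUN", "VERB"]]
readings = surviving_taggings(sentence)

# A determiner followed directly by a verb violates the toy constraints,
# so no tagging survives and the input is flagged as locally ill-formed.
ill_formed = surviving_taggings([["DET"], ["VERB"]])
```

With these toy constraints only one of the four candidate taggings of the first input remains, and the second input has none. The exhaustive enumeration is chosen for readability; a finite state implementation would avoid listing every combination.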
We claim that if an efficient recognition procedure is available, it can be most valuable as a pre-parsing reducer of lexical ambiguity (especially, as [Milne 86] points out, for deterministic parsers), and even more useful in applications where full parsing is not absolutely required, e.g. identification of ill-formed inputs in a text critique program. Still weaker than recognition procedures are methods which approximate the recognition capacity. These are the methods that we discuss in this paper. More specifically, we analyze the recognition capacity of automata based on local (short context) considerations. In [Herz and Rimon 91] we presented our approach to the acquisition and usage of local syntactic