Scientific report: "SUBLANGUAGES IN MACHINE TRANSLATION"

Heinz-Dirk Luckhardt
Fachrichtung Informationswissenschaft
Universität des Saarlandes
D-6600 Saarbrücken
Federal Republic of Germany

ABSTRACT

There have been various attempts at using the sublanguage notion for disambiguation and the selection of target language equivalents in machine translation. In this paper a theoretical concept and its implementation in a real MT application are presented. Beyond this, means of linguistic engineering such as weighting mechanisms are proposed.

INTRODUCTION

It has been proposed by a number of authors (cf. Kittredge 1987, Kittredge/Lehrberger 1982, Luckhardt 1984) to use the sublanguage notion for solving some of the notorious problems in machine translation (MT), such as disambiguation and selection of target language equivalents. In the following I shall give a rough summary of what sublanguages can contribute to the solution of concrete MT problems.

A SUBLANGUAGE CONCEPT FOR USE IN MT SYSTEMS

To my knowledge it was Z. Harris who introduced the term "sublanguage" (cf. Harris 1968, 152) for a portion of natural language differing from other portions of the same language syntactically and/or lexically. Definitions are given by Hirschman/Sager 1982, Quinlan 1989, and Lehrberger 1982. In order to be able to use such characterizations in MT, they have to be formalized in a way adequate to the MT system in question. Such formalizable properties were combined in the definition of Luckhardt 1984 of what sublanguage can mean for MT:

Text type represents the syntactic-syntagmatic level of a sublanguage, for which only a rather weak differentiation can be proposed (running text, word list, nominal structures, etc.).

Subject field represents the lexical level of a sublanguage: for every sublanguage a subject field is determined as being characteristic, so that the MT system may choose, on the basis of the sublanguage of a text, those translation equivalents from the lexicon which carry the same subject field code as
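
The lexical selection step described above, in which equivalents are chosen by subject field code, can be illustrated with a minimal sketch. It is not the paper's implementation: the lexicon entries, the subject field codes, the name `select_equivalents`, and the fallback to all candidates when no code matches are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equivalent:
    target: str         # target-language equivalent
    subject_field: str  # subject field code carried by this lexicon entry


# Hypothetical German-English lexicon fragment; entries and codes are invented.
LEXICON = {
    "Speicher": [
        Equivalent("memory", "DATA_PROCESSING"),
        Equivalent("reservoir", "HYDROLOGY"),
        Equivalent("attic", "GENERAL"),
    ],
}


def select_equivalents(source_word, text_subject_field, lexicon=LEXICON):
    """Return the equivalents whose subject field code matches that of the text."""
    candidates = lexicon.get(source_word, [])
    matching = [e.target for e in candidates
                if e.subject_field == text_subject_field]
    # Assumption: if no entry carries the text's code, keep every candidate
    # so that a later component can still decide among them.
    return matching or [e.target for e in candidates]


print(select_equivalents("Speicher", "DATA_PROCESSING"))
```

With a text assigned to the hypothetical field `DATA_PROCESSING`, only the equivalent carrying that code is returned; for a field with no matching entry, all candidates are passed through unchanged.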
