Scientific report: "Practical Glossing by Prioritised Tiling"

Victor Poznanski, Pete Whitelock, Jan Udens, Steffan Corley
Sharp Laboratories of Europe Ltd.
Oxford Science Park, Oxford OX4 4GA, United Kingdom
vp pete jan steffan @

Abstract

We present the design of a practical context-sensitive glosser, incorporating current techniques for lightweight linguistic analysis based on large-scale lexical resources. We outline a general model for ranking the possible translations of the words and expressions that make up a text. This information can be used by a simple resource-bounded algorithm, of complexity O(n log n) in sentence length, that determines a consistent gloss of best translations. We then describe how the results of the general ranking model may be approximated using a simple heuristic prioritisation scheme. Finally, we present a preliminary evaluation of the glosser's performance.

1 Introduction

In a lexicalist MT framework such as Shake-and-Bake (Whitelock, 1994), translation equivalence is defined between collections of suitably constrained lexical material in the two languages. Such an approach has been shown to be effective in the description of many types of complex bilingual equivalence. However, the complexity of the associated parsing and generation phases leaves a system of this type some way from commercial exploitation. The parsing phase that is needed to establish adequate constraints on the words is of cubic complexity, while the most general generation algorithm needed to order the words in the target text is O(n^4) (Poznanski et al., 1996). In this paper we show how a novel application domain, glossing, can be explored within such a framework by omitting generation entirely and replacing syntactic parsing by a simple combination of morphological analysis and tagging. The poverty of constraints established in this way, and the consequent inaccuracy in translation, is mitigated by providing a menu of alternatives for each gloss. The gloss is automatically updated in the light
