Scientific report: "Private Access to Phrase Tables for Statistical Machine Translation"

Nicola Cancedda
Xerox Research Centre Europe
6 chemin de Maupertuis, 38240 Meylan, France

Abstract

Some Statistical Machine Translation systems never see the light because the owner of the appropriate training data cannot release them, and the potential user of the system cannot disclose what should be translated. We propose a simple and practical encryption-based method addressing this barrier.

1 Introduction

It is generally taken for granted that whoever is deploying a Statistical Machine Translation (SMT) system has unrestricted rights to access and use the parallel data required for its training. This is not always the case. The ideal resources for training SMT models are Translation Memories (TMs), especially when they are large, well maintained, coherent in genre and topic, and aligned with the application of interest. Such TMs are cherished as valuable assets by their owners, who rarely agree to give away wholesale rights to their use. At the same time, the prospective user of the SMT system that could be derived from such a TM might be subject to confidentiality constraints on the text stream needing translation, so that sending text out to an SMT system deployed by the owner of the TM is not an option.

We propose an encryption-based method that addresses these conflicting constraints. In this method, the owner of the TM generates a Phrase Table (PT) from it and makes it accessible to the user following a special procedure. An SMT decoder is deployed by the user with all the resources required to operate except the PT.
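In a phrase-based decoder, the only PT entries that can be used for a given input are those whose source side occurs as a contiguous word sequence in that input. The following is a minimal sketch of that selection, with no encryption at all, so it is not the proposed procedure itself. The table layout, the names `source_ngrams` and `entries_needed`, and the maximum phrase length of 3 are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy phrase table: source phrase -> list of (target phrase, score).
PhraseTable = Dict[str, List[Tuple[str, float]]]


def source_ngrams(sentence: str, max_len: int) -> Set[str]:
    """Return every contiguous word sequence of up to max_len words."""
    words = sentence.split()
    return {
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, min(i + max_len, len(words)) + 1)
    }


def entries_needed(
    sentences: List[str], table: PhraseTable, max_len: int = 3
) -> PhraseTable:
    """Select the phrase table entries whose source side occurs in the input."""
    wanted: Set[str] = set()
    for sentence in sentences:
        wanted |= source_ngrams(sentence, max_len)
    return {src: table[src] for src in wanted if src in table}


# Illustrative data only; not taken from the paper.
toy_table: PhraseTable = {
    "the house": [("la maison", 0.8)],
    "house": [("maison", 0.7)],
    "green": [("vert", 0.6)],
}
needed = entries_needed(["the house is small"], toy_table)
print(sorted(needed))  # ['house', 'the house']
```

If the user sent these source phrases to the PT owner in the clear, the owner would see what is being translated. Performing the same selection without revealing the phrases is the part that requires encryption.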
As a result of following the proposed procedure:

- The user acquires all and only the phrase table entries required to perform the decoding of a specific file, thus avoiding complete transfer of the TM to the user;
- The owner of the PT does not learn anything about what is being translated, thus satisfying the user's confidentiality constraints;
- The owner
