Scientific report: "Improvements in Analogical Learning: Application to Translating multi-Terms of the Medical Domain"

Philippe Langlais, DIRO, Univ. of Montreal, Canada, felipe@
Francois Yvon and Pierre Zweigenbaum, LIMSI-CNRS, Univ. Paris-Sud XI, France, {yvon,pz}@

Abstract

Handling terminology is an important matter in a translation workflow. However, current Machine Translation (MT) systems do not yet propose anything proactive upon tools which assist in managing terminological databases. In this work, we investigate several enhancements to analogical learning and test our implementation on translating medical terms. We show that the analogical engine works equally well when translating from and into a morphologically rich language, or when dealing with language pairs written in different scripts. Combining it with a phrase-based statistical engine leads to significant improvements.

1 Introduction

If machine translation is to meet commercial needs, it must offer a sensible approach to translating terms. Currently, MT systems offer at best database management tools which allow a human (typically a translator, a terminologist or even the vendor of the system) to specify bilingual terminological entries.
More advanced tools are meant to identify inconsistencies in terminological translations and might prove useful in controlled-language situations (Itagaki et al., 2007). One approach to translate terms consists in using a domain-specific parallel corpus with standard alignment techniques (Brown et al., 1993) to mine new translations. Massive amounts of parallel data are certainly available in several pairs of languages for domains such as parliament debates or the like. However, having at our disposal a domain-specific (e.g. computer science) bitext with an adequate coverage is another issue. One might argue that domain-specific comparable (or perhaps unrelated) corpora are easier to acquire, in which case context-vector techniques (Rapp, 1995; Fung and McKeown, 1997) can be used to identify the .
