Scientific report: "Improving Machine Translation of Null Subjects in Italian and Spanish"

Lorenza Russo, Sharid Loaiciga, Asheesh Gulati
Language Technology Laboratory (LATL)
Department of Linguistics - University of Geneva
2 rue de Candolle - CH-1211 Geneva 4 - Switzerland

Abstract

Null subjects are non-overtly expressed subject pronouns found in pro-drop languages such as Italian and Spanish. In this study, we quantify and compare the occurrence of this phenomenon in these two languages. Next, we evaluate the translation of null subjects into French, a “non pro-drop” language. We use the Europarl corpus to evaluate two MT systems on their performance regarding null subject translation: Its-2, a rule-based system developed at LATL, and a statistical system built using the Moses toolkit. Then we add a rule-based preprocessor and a statistical post-editor to the Its-2 translation pipeline. A second evaluation of the improved Its-2 system shows an average increase of … in correct pro-drop translations for Italian-French and … for Spanish-French.

1 Introduction

Romance languages are characterized by some morphological and syntactical similarities. Italian and Spanish, the two languages we are interested in here, share the null subject parameter, also called the pro-drop parameter, among other characteristics. The null subject parameter refers to whether the subject of a sentence is overtly expressed or not (Haegeman, 1994).
In other words, due to their rich morphology, Italian and Spanish allow non-lexically-realized subject pronouns, also called null subjects, zero pronouns or pro-drop.1

1 Henceforth, the terms will be used indiscriminately.

From a monolingual point of view, regarding Spanish, previous work by Ferrandez and Peral (2000) has shown that 46% of verbs in their test corpus had their subjects omitted. Continuation of this work by Rello and Ilisei (2009) has found that in a corpus of 2,606 sentences, there were 1,042 sentences without overtly expressed
