Scientific report: "Monolingual Alignment by Edit Rate Computation on Sentential Paraphrase Pairs"

Houda Bouamor, Aurelien Max, Anne Vilnat
LIMSI-CNRS, Univ. Paris Sud, Orsay, France

Abstract

In this paper, we present a novel way of tackling the monolingual alignment problem on pairs of sentential paraphrases by means of edit rate computation. In order to inform the edit rate, information in the form of subsentential paraphrases is provided by a range of techniques built for different purposes. We show that the tunable TER-PLUS metric from Machine Translation evaluation can achieve good performance on this task and that it can effectively exploit information coming from complementary sources.

1 Introduction

The acquisition of subsentential paraphrases has attracted a lot of attention recently (Madnani and Dorr, 2010). Techniques are usually developed for extracting paraphrase candidates from specific types of corpora, including monolingual parallel corpora (Barzilay and McKeown, 2001), monolingual comparable corpora (Deleger and Zweigenbaum, 2009), bilingual parallel corpora (Bannard and Callison-Burch, 2005), and edit histories of multi-authored text (Max and Wisniewski, 2010). These approaches face two main issues, which correspond to the typical measures of precision, or how appropriate the extracted paraphrases are, and of recall, or how many of the paraphrases present in a given corpus can be found effectively.
To start with, both measures are often hard to compute in practice, as (1) the definition of what makes an acceptable paraphrase pair is still a research question, and (2) it is often impractical to extract a complete set of acceptable paraphrases from most resources. Second, as regards the precision of paraphrase acquisition techniques in particular, it is notable that most works on paraphrase acquisition are not based on direct observation of larger paraphrase pairs. Even monolingual corpora obtained by pairing very closely related texts, such as news headlines on the same ...
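
To make the two measures discussed above concrete, the following is a minimal sketch of how precision and recall could be computed for a set of extracted paraphrase pairs. It assumes that a reference set of acceptable pairs is available, which, as noted above, is rarely the case in practice. The function name `precision_recall` and the toy pairs are hypothetical and are not taken from the paper.

```python
def precision_recall(extracted, reference):
    """Score extracted paraphrase pairs against a reference set.

    Both arguments are iterables of (phrase, paraphrase) tuples.
    The reference set is an assumption made for illustration only.
    """
    extracted = set(extracted)
    reference = set(reference)
    correct = extracted & reference  # pairs that are both extracted and acceptable

    # Precision: share of extracted pairs that are acceptable.
    precision = len(correct) / len(extracted) if extracted else 0.0
    # Recall: share of acceptable pairs that were actually found.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data: four acceptable pairs, three extracted candidates.
reference_pairs = {
    ("a lot of attention", "much interest"),
    ("hard to compute", "difficult to calculate"),
    ("typical", "usual"),
    ("closely related", "very similar"),
}
extracted_pairs = {
    ("a lot of attention", "much interest"),
    ("typical", "usual"),
    ("typical", "corpus"),  # a wrong candidate, which lowers precision
}

precision, recall = precision_recall(extracted_pairs, reference_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact matching on pairs is a deliberate simplification: it treats acceptability as a yes/no decision, whereas the text above points out that the definition of an acceptable paraphrase pair is itself an open question.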
