TAILIEUCHUNG - Báo cáo khoa học: "Word Alignment via Submodular Maximization over Matroids"

We cast the word alignment problem as maximizing a submodular function under matroid constraints. Our framework is able to express complex interactions between alignment components while remaining computationally efficient, thanks to the power and generality of submodular functions. We show that submodularity naturally arises when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates compared to conventional matching based approaches. . | Word Alignment via Submodular Maximization over Matroids Hui Lin Dept. of Electrical Engineering University of Washington Seattle WA 98195 USA hlin@ Jeff Bilmes Dept. of Electrical Engineering University of Washington Seattle Wa 98195 USA bilmes@ Abstract We cast the word alignment problem as maximizing a submodular function under matroid constraints. Our framework is able to express complex interactions between alignment components while remaining computationally efficient thanks to the power and generality of submodular functions. We show that submodularity naturally arises when modeling word fertility. Experiments on the English-French Hansards alignment task show that our approach achieves lower alignment error rates compared to conventional matching based approaches. 1 Introduction Word alignment is a key component in most statistical machine translation systems. While classical approaches for word alignment are based on generative models . IBM models Brown et al. 1993 and HMM Vogel et al. 1996 word alignment can also be viewed as a matching problem where each word pair is associated with a score reflecting the desirability of aligning that pair and the alignment is then the highest scored matching under some constraints. Several matching-based approaches have been proposed in the past. Melamed 2000 introduces the competitive linking algorithm which greedily constructs matchings under the one-to-one mapping assumption. In Matusov et al. 2004 matchings are found using an algorithm for constructing a maximum weighted bipartite graph matching Schrijver 2003 where word pair scores come from alignment posteriors of generative models. Similarly Taskar et al. 2005 cast word alignment as a maximum weighted matching problem and propose a 170 framework for learning word pair scores as a function of arbitrary features of that pair. These approaches however have two potentially substantial limitations words have fertility of at most .

TỪ KHÓA LIÊN QUAN
TÀI LIỆU MỚI ĐĂNG
TAILIEUCHUNG - Chia sẻ tài liệu không giới hạn
Địa chỉ : 444 Hoang Hoa Tham, Hanoi, Viet Nam
Website : tailieuchung.com
Email : tailieuchung20@gmail.com
Tailieuchung.com là thư viện tài liệu trực tuyến, nơi chia sẽ trao đổi hàng triệu tài liệu như luận văn đồ án, sách, giáo trình, đề thi.
Chúng tôi không chịu trách nhiệm liên quan đến các vấn đề bản quyền nội dung tài liệu được thành viên tự nguyện đăng tải lên, nếu phát hiện thấy tài liệu xấu hoặc tài liệu có bản quyền xin hãy email cho chúng tôi.
Đã phát hiện trình chặn quảng cáo AdBlock
Trang web này phụ thuộc vào doanh thu từ số lần hiển thị quảng cáo để tồn tại. Vui lòng tắt trình chặn quảng cáo của bạn hoặc tạm dừng tính năng chặn quảng cáo cho trang web này.