Sequential Conditional Generalized Iterative Scaling

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 9-16.

Joshua Goodman
Microsoft Research
One Microsoft Way, Redmond, WA 98052
joshuago@

Abstract

We describe a speedup for training conditional maximum entropy models. The algorithm is a simple variation on Generalized Iterative Scaling, but converges roughly an order of magnitude faster, depending on the number of constraints, and the way speed is measured. Rather than attempting to train all model parameters simultaneously, the algorithm trains them sequentially. The algorithm is easy to implement, typically uses only slightly more memory, and will lead to improvements for most maximum entropy problems.

1 Introduction

Conditional Maximum Entropy models have been used for a variety of natural language tasks, including Language Modeling (Rosenfeld, 1994), part-of-speech tagging, prepositional phrase attachment, and parsing (Ratnaparkhi, 1998), word selection for machine translation (Berger et al., 1996), and finding sentence boundaries (Reynar and Ratnaparkhi, 1997). Unfortunately, although maximum entropy (maxent) models can be applied very generally, the typical training algorithm for maxent, Generalized Iterative Scaling (GIS) (Darroch and Ratcliff, 1972), can be extremely slow. We have personally used up to a month of computer time to train a single model.
There have been several attempts to speed up maxent training (Della Pietra et al., 1997; Wu and Khudanpur, 2000; Goodman, 2001). However, as we describe later, each of these has suffered from applicability to a limited number of applications. Darroch and Ratcliff (1972) describe GIS for joint probabilities, and mention a fast variation, which appears to have been missed by the conditional maxent community. We show that this fast variation can also be used for conditional probabilities, and that it is useful for a larger range of problems than traditional speedup techniques.
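
To make the contrast between simultaneous and sequential training concrete, here is a minimal Python sketch on a toy problem. It is an illustration under stated assumptions, not the paper's implementation. This excerpt does not give the update equations, so the sketch assumes binary features and the usual iterative-scaling step of log(observed / expected), divided by the maximum number of active features in the simultaneous (GIS-style) case. The data, feature definitions, and all function names are hypothetical. Expectations are recomputed naively on every step, so the sketch shows only the order of the updates, not any speed or memory optimization.

```python
import math

# Hypothetical toy data: (context, label) pairs, for illustration only.
LABELS = (0, 1)
DATA = [("a", 0), ("a", 0), ("a", 1), ("b", 1),
        ("b", 1), ("b", 0), ("c", 0), ("c", 1)]

# Hypothetical binary features; each has a nonzero observed count in DATA.
FEATURES = [
    lambda x, y: x == "a" and y == 0,
    lambda x, y: x == "b" and y == 1,
    lambda x, y: y == 1,
    lambda x, y: x in ("a", "c") and y == 0,
]
N = len(FEATURES)


def f(i, x, y):
    """Value (0.0 or 1.0) of feature i on the pair (x, y)."""
    return 1.0 if FEATURES[i](x, y) else 0.0


def prob(lam, x):
    """Conditional distribution P(y | x) under weights lam."""
    scores = [math.exp(sum(lam[i] * f(i, x, y) for i in range(N)))
              for y in LABELS]
    z = sum(scores)
    return [s / z for s in scores]


def log_likelihood(lam):
    return sum(math.log(prob(lam, x)[LABELS.index(y)]) for x, y in DATA)


def observed(i):
    """Count of feature i in the training data."""
    return sum(f(i, x, y) for x, y in DATA)


def expected(lam, i):
    """Count of feature i expected under the current model."""
    return sum(p * f(i, x, y)
               for x, _ in DATA
               for y, p in zip(LABELS, prob(lam, x)))


def gis_pass(lam):
    """Simultaneous update: all expectations use the same old weights."""
    f_sharp = max(sum(f(i, x, y) for i in range(N))
                  for x, _ in DATA for y in LABELS)
    exp_counts = [expected(lam, i) for i in range(N)]
    return [lam[i] + math.log(observed(i) / exp_counts[i]) / f_sharp
            for i in range(N)]


def sequential_pass(lam):
    """Sequential update: each feature sees the weights already updated."""
    lam = list(lam)
    for i in range(N):
        # Assumption: binary features, so the per-feature maximum is 1.
        lam[i] += math.log(observed(i) / expected(lam, i))
    return lam


def train(step, passes):
    """Run `passes` passes of `step`, recording log-likelihood after each."""
    lam = [0.0] * N
    history = [log_likelihood(lam)]
    for _ in range(passes):
        lam = step(lam)
        history.append(log_likelihood(lam))
    return lam, history


_, gis_history = train(gis_pass, 20)
_, seq_history = train(sequential_pass, 20)
print("GIS-style log-likelihood after 1 pass: ", gis_history[1])
print("Sequential log-likelihood after 1 pass:", seq_history[1])
```

In this sketch the simultaneous pass must shrink every step by the maximum number of features active at once, while the sequential pass takes a full step for each feature and immediately reuses the result. The toy problem is far too small to say anything about the speedups reported in the abstract.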
