Scientific report: "Investigating GIS and Smoothing for Maximum Entropy Taggers"

James R. Curran and Stephen Clark
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW
jamesc stephenc @

Abstract

This paper investigates two elements of Maximum Entropy tagging: the use of a correction feature in the Generalised Iterative Scaling (GIS) estimation algorithm, and techniques for model smoothing. We show analytically and empirically that the correction feature, assumed to be required for the correctness of GIS, is unnecessary. We also explore the use of a Gaussian prior and a simple cutoff for smoothing. The experiments are performed with two tagsets: the standard Penn Treebank POS tagset and the larger set of lexical types from Combinatory Categorial Grammar.

1 Introduction

The use of maximum entropy (ME) models has become popular in Statistical NLP; some example applications include part-of-speech (POS) tagging (Ratnaparkhi, 1996), parsing (Ratnaparkhi, 1999; Johnson et al., 1999) and language modelling (Rosenfeld, 1996). Many tagging problems have been successfully modelled in the ME framework, including POS tagging, with state of the art performance (van Halteren et al., 2001), supertagging (Clark, 2002) and chunking (Koeling, 2000). Generalised Iterative Scaling (GIS) is a very simple algorithm for estimating the parameters of a ME model.
The original formulation of GIS (Darroch and Ratcliff, 1972) required the sum of the feature values for each event to be constant. Since this is not the case for many applications, the standard method is to add a correction, or slack, feature to each event. Improved Iterative Scaling (IIS) (Berger et al., 1996; Della Pietra et al., 1997) eliminated the correction feature to improve the convergence rate of the algorithm. However, the extra bookkeeping required for IIS means that GIS is often faster in practice (Malouf, 2002). This paper shows, by a simple adaptation of Berger's proof for the convergence of IIS (Berger, 1997), that GIS does not require a correction
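
To make the role of the constant C concrete, the following is a minimal sketch of GIS for a conditional ME tagger trained without a correction feature. It is an illustration, not the authors' implementation: the toy corpus, the feature templates and the names `active_features`, `tag_distribution`, `train_gis` and `log_likelihood` are all assumptions introduced here. The number of active features differs from event to event (between one and three), and C is simply taken as the maximum of those sums rather than being enforced by padding each event with a slack feature.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus of (word, tag) events. Every word occurs with both
# tags, so all maximum-likelihood weights are finite.
DATA = ([("dogs", "N")] * 3 + [("dogs", "V")]
        + [("runs", "V")] * 3 + [("runs", "N")]
        + [("dog", "N")] * 2 + [("dog", "V")]
        + [("run", "V")] * 2 + [("run", "N")])
TAGS = ["N", "V"]


def active_features(word, tag):
    """Binary features firing on (word, tag); one to three are active."""
    feats = [("word", word, tag)]
    if word.endswith("s"):
        feats.append(("suffix-s", tag))
    if tag == "N":
        feats.append(("bias-N",))
    return feats


def tag_distribution(weights, word):
    """Return p(tag | word) under the current weights."""
    scores = {t: math.exp(sum(weights.get(f, 0.0)
                              for f in active_features(word, t)))
              for t in TAGS}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}


def log_likelihood(weights, data):
    """Conditional log-likelihood of the observed tags."""
    return sum(math.log(tag_distribution(weights, w)[t]) for w, t in data)


def train_gis(data, iterations=200):
    """GIS updates with C as an upper bound and no correction feature."""
    # Empirical feature counts from the observed events.
    empirical = defaultdict(float)
    for word, tag in data:
        for f in active_features(word, tag):
            empirical[f] += 1.0
    # C bounds the feature sum of every (context, tag) pair; the sums are
    # NOT padded up to C with a slack feature.
    C = max(len(active_features(w, t)) for w, _ in data for t in TAGS)
    weights = {f: 0.0 for f in empirical}
    for _ in range(iterations):
        # Expected feature counts under the current model.
        expected = defaultdict(float)
        for word, _ in data:
            probs = tag_distribution(weights, word)
            for t in TAGS:
                for f in active_features(word, t):
                    expected[f] += probs[t]
        # Standard GIS step: scale the log ratio of counts by 1/C.
        for f in weights:
            weights[f] += math.log(empirical[f] / expected[f]) / C
    return weights


if __name__ == "__main__":
    w = train_gis(DATA)
    print(tag_distribution(w, "dogs"))
    print(log_likelihood(w, DATA))
```

On this toy corpus "dogs" is tagged N in three of its four occurrences, so the estimated p(N | dogs) should approach 3/4 even though the feature sums are not constant. The iteration count of 200 is an arbitrary choice for the sketch, not a recommendation.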
