Scientific report: "Investigating GIS and Smoothing for Maximum Entropy Taggers"


James R. Curran and Stephen Clark
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW
{jamesc,stephenc}@cogsci.ed.ac.uk

Abstract

This paper investigates two elements of Maximum Entropy tagging: the use of a correction feature in the Generalised Iterative Scaling (GIS) estimation algorithm, and techniques for model smoothing. We show analytically and empirically that the correction feature, assumed to be required for the correctness of GIS, is unnecessary. We also explore the use of a Gaussian prior and a simple cutoff for smoothing. The experiments are performed with two tagsets: the standard Penn Treebank POS tagset and the larger set of lexical types from Combinatory Categorial Grammar.

1 Introduction

The use of maximum entropy (ME) models has become popular in Statistical NLP; some example applications include part-of-speech (POS) tagging (Ratnaparkhi, 1996), parsing (Ratnaparkhi, 1999; Johnson et al., 1999) and language modelling (Rosenfeld, 1996). Many tagging problems have been successfully modelled in the ME framework, including POS tagging, with state of the art performance (van Halteren et al., 2001), supertagging (Clark, 2002) and chunking (Koeling, 2000).

Generalised Iterative Scaling (GIS) is a very simple algorithm for estimating the parameters of a ME model. The original formulation of GIS (Darroch and Ratcliff, 1972) required the sum of the feature values for each event to be constant. Since this is not the case for many applications, the standard method is to add a correction, or slack, feature to each event. Improved Iterative Scaling (IIS) (Berger et al., 1996; Della Pietra et al., 1997) eliminated the correction feature to improve the convergence rate of the algorithm. However, the extra bookkeeping required for IIS means that GIS is often faster in practice (Malouf, 2002). This paper shows, by a simple adaptation of Berger's proof for the convergence of IIS (Berger, 1997), that GIS does not require a correction
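The idea of running GIS without a slack feature can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tagging data, the feature templates, and the names `active_features`, `tag_probs` and `gis` are all assumptions made for illustration. The sketch assumes binary features and takes the constant C to be the maximum number of active features over all (word, tag) pairs; events with fewer active features are left as they are rather than padded with a correction feature.

```python
import math

TAGS = ["NN", "NNS", "VB", "VBZ"]

# Hypothetical toy training data: (word, tag) pairs.
DATA = [("dog", "NN"), ("dogs", "NNS"), ("dogs", "NNS"),
        ("run", "VB"), ("runs", "VBZ"), ("runs", "NNS")]


def active_features(word, tag):
    """Binary features active for a (word, tag) event (illustrative templates)."""
    feats = [("word", word, tag)]
    if word.endswith("s"):
        # Only some events get this second feature, so feature sums are not constant.
        feats.append(("suffix-s", tag))
    return feats


def tag_probs(weights, word):
    """Conditional distribution p(tag | word) under the current weights."""
    scores = {t: math.exp(sum(weights.get(f, 0.0) for f in active_features(word, t)))
              for t in TAGS}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}


def gis(data, iterations=200):
    """GIS updates with C = max feature sum and no correction feature."""
    n = len(data)
    C = max(len(active_features(w, t)) for w, _ in data for t in TAGS)
    # Empirical expectation of each feature observed in the training data.
    empirical = {}
    for w, t in data:
        for f in active_features(w, t):
            empirical[f] = empirical.get(f, 0.0) + 1.0 / n
    weights = {f: 0.0 for f in empirical}
    for _ in range(iterations):
        # Model expectation of each feature under the current weights.
        expected = {f: 0.0 for f in empirical}
        for w, _ in data:
            for t, p in tag_probs(weights, w).items():
                for f in active_features(w, t):
                    if f in expected:
                        expected[f] += p / n
        # GIS update: move each weight by (1/C) * log(empirical / expected).
        for f in weights:
            weights[f] += math.log(empirical[f] / expected[f]) / C
    return weights


weights = gis(DATA)
probs = tag_probs(weights, "dog")
print(max(probs, key=probs.get), round(probs["NN"], 3))
```

In this toy setup the events for "dog" have one active feature and those for "dogs" have two, so the constant-sum condition of the original formulation does not hold, yet the updates still move the model expectations towards the empirical ones. Features never seen in training are simply left out of the model here, which is a simplification of this sketch rather than anything stated in the text.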

