Feature Lattices for Maximum Entropy Modelling

Andrei Mikheev*
HCRC Language Technology Group
University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, Scotland, UK
e-mail: Andrei.Mikheev@ed.ac.uk

*Now at Harlequin Ltd.

Abstract

The maximum entropy framework has proved to be expressive and powerful for statistical language modelling, but it suffers from the computational expense of model building. The iterative scaling algorithm used for parameter estimation is computationally expensive, and the feature selection process may require the parameters of many candidate features to be estimated many times. In this paper we present a novel approach to building maximum entropy models. Our approach uses the feature collocation lattice and builds complex candidate features without resorting to iterative scaling.
1 Introduction

Maximum entropy modelling has recently been introduced to the NLP community and has proved to be an expressive and powerful framework. A maximum entropy model is a model that fits a set of pre-defined constraints and assumes maximum ignorance about everything that is not subject to those constraints, assigning such cases the most uniform distribution. The most uniform distribution has the maximum entropy.
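The paper takes this formulation as given, so it may help to state it explicitly. In the standard formulation (common notation, not quoted from this paper), the model p* is the maximum entropy distribution among those whose feature expectations match those of the reference distribution p̃:

\[
p^{*} = \arg\max_{p} H(p), \qquad H(p) = -\sum_{x} p(x)\,\log p(x),
\]
\[
\text{subject to } \; E_{p}[f_i] = E_{\tilde{p}}[f_i], \quad i = 1,\dots,n,
\]

and the solution has the log-linear form

\[
p^{*}(x) = \frac{1}{Z}\exp\Bigl(\sum_{i}\lambda_{i} f_{i}(x)\Bigr), \qquad
Z = \sum_{x}\exp\Bigl(\sum_{i}\lambda_{i} f_{i}(x)\Bigr),
\]

where the weights λ_i are exactly what the iterative scaling algorithm discussed below must estimate.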
Because of its ability to handle overlapping features, the maximum entropy framework provides a principled way to incorporate information from multiple knowledge sources. It is superior to the linear interpolation and Katz back-off methods traditionally used for this purpose. Rosenfeld (1996) evaluates in detail a maximum entropy language model that combines unigrams, bigrams, trigrams and long-distance trigger words, and provides a thorough analysis of the merits of the approach.

The iterative scaling algorithm (Darroch and Ratcliff, 1972), applied for the parameter estimation of maximum entropy models, computes a set of feature weights (λs) which ensure that the model fits the reference distribution and does not make spurious …
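To make concrete why this estimation step is expensive, the following is a minimal runnable sketch of Generalized Iterative Scaling (Darroch and Ratcliff, 1972) in Python for a toy joint model. The event space, the two binary features, and the reference distribution are invented for illustration and do not come from the paper; the slack feature is the usual device for meeting the GIS requirement that feature counts sum to a constant C.

import numpy as np

def gis(features, p_ref, n_iters=200):
    """features: (n_events, n_feats) binary matrix of f_i(x);
    p_ref: reference distribution over events (sums to 1)."""
    n_events, n_feats = features.shape
    # GIS needs every event's feature sum to equal a constant C;
    # append a slack feature to pad shorter rows up to C.
    C = features.sum(axis=1).max()
    slack = C - features.sum(axis=1)
    F = np.column_stack([features, slack])   # every row now sums to C
    lam = np.zeros(n_feats + 1)
    target = p_ref @ F                       # reference expectations E_p~[f_i]
    for _ in range(n_iters):
        scores = F @ lam
        p = np.exp(scores - scores.max())
        p /= p.sum()                         # current model distribution p(x)
        expect = p @ F                       # model expectations E_p[f_i]
        lam += np.log(target / expect) / C   # GIS weight update
    return lam[:n_feats], p

# Toy example: 4 events, 2 overlapping binary features (invented data).
features = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
p_ref = np.array([0.4, 0.3, 0.2, 0.1])
lam, p_model = gis(features, p_ref)
print("weights:", lam)
print("model expectations:", p_model @ features)
print("reference expectations:", p_ref @ features)

Each call to gis() re-estimates all weights from scratch, recomputing the full model distribution on every iteration; doing this repeatedly for many candidate features during feature selection is the cost the paper's lattice-based approach aims to avoid.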