Scientific report: "An alternative method of training probabilistic LR parsers"

Mark-Jan Nederhof
Faculty of Arts, University of Groningen
. Box 716, NL-9700 AS Groningen, The Netherlands
markjan@

Abstract

We discuss existing approaches to train LR parsers, which have been used for statistical resolution of structural ambiguity. These approaches are non-optimal, in the sense that a collection of probability distributions cannot be obtained. In particular, some probability distributions expressible in terms of a context-free grammar cannot be expressed in terms of the LR parser constructed from that grammar, under the restrictions of the existing approaches to training of LR parsers. We present an alternative way of training that is provably optimal, and that allows all probability distributions expressible in the context-free grammar to be carried over to the LR parser. We also demonstrate empirically that this kind of training can be effectively applied on a large treebank.

1 Introduction

The LR parsing strategy was originally devised for programming languages (Sippu and Soisalon-Soininen, 1990), but has been used in a wide range of other areas as well, such as for natural language processing (Lavie and Tomita, 1993; Briscoe and Carroll, 1993; Ruland, 2000).
The main difference between the application to programming languages and the application to natural languages is that in the latter case the parsers should be nondeterministic, in order to deal with ambiguous context-free grammars (CFGs). Nondeterminism can be handled in a number of ways, but the most efficient is tabulation, which allows processing in polynomial time. Tabular LR parsing is known from the work by Tomita (1986), but can also be achieved by the generic tabulation technique due to Lang (Lang, 1974; Billot and Lang, 1989), which assumes an input pushdown transducer (PDT). In this context, the LR parsing strategy can be seen as a particular mapping from context-free grammars to PDTs. The acronym LR stands for Left-to-right processing of the ...
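
The remark that tabulation allows ambiguous grammars to be processed in polynomial time can be illustrated with a minimal sketch. It is not the tabular LR algorithm of Tomita, nor the PDT-based technique of Lang. It is a plain CKY-style table over a hypothetical toy grammar (S -> S S | 'a') chosen only for illustration. The names BINARY_RULES, LEXICAL_RULES and count_parses are assumptions introduced here and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not from the paper):
#   S -> S S | 'a'
BINARY_RULES = [("S", "S", "S")]   # (lhs, left child, right child)
LEXICAL_RULES = [("S", "a")]       # (lhs, terminal)


def count_parses(tokens, start="S"):
    """Count the parse trees of `tokens` using a CKY-style table."""
    n = len(tokens)
    # table[(i, j)][A] = number of trees rooted in A that span tokens[i:j]
    table = defaultdict(lambda: defaultdict(int))

    # Spans of width 1 are filled from the lexical rules.
    for i, tok in enumerate(tokens):
        for lhs, terminal in LEXICAL_RULES:
            if terminal == tok:
                table[(i, i + 1)][lhs] += 1

    # Wider spans combine two adjacent, already tabulated sub-spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, left, right in BINARY_RULES:
                    table[(i, j)][lhs] += (
                        table[(i, k)][left] * table[(k, j)][right]
                    )
    return table[(0, n)][start]


if __name__ == "__main__":
    for length in range(1, 8):
        print(length, count_parses(["a"] * length))
```

For this grammar the number of parse trees grows exponentially with the input length, yet the table has only a quadratic number of cells and is filled in cubic time, because each sub-span is analysed once and shared among all trees that contain it.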
