Scientific report: "String Transformation Learning"

Giorgio Satta
Dipartimento di Elettronica e Informatica, Università di Padova, via Gradenigo 6/A, I-35131 Padova, Italy

John C. Henderson
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218-2694

Abstract

String transformation systems have been introduced in (Brill, 1995) and have several applications in natural language processing. In this work we consider the computational problem of automatically learning from a given corpus the set of transformations presenting the best evidence. We introduce an original data structure and efficient algorithms that learn some families of transformations that are relevant for part-of-speech tagging and phonological rule systems. We also show that the same learning problem becomes NP-hard in cases of an unbounded use of don't care symbols in a transformation.

1 Introduction

Ordered sequences of rewriting rules are used in several applications in natural language processing, including phonological and morphological systems (Kaplan and Kay, 1994), morphological disambiguation, part-of-speech tagging and shallow syntactic parsing (Brill, 1995; Karlsson et al., 1995).
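To make the idea of an ordered sequence of rewriting rules concrete, the following is a minimal sketch in the spirit of the part-of-speech tagging rules of (Brill, 1995). It is not the formalism or the algorithm of this paper. The `Rule` class, the function names, and the restriction to a single "previous tag" context are all assumptions made for illustration, and `net_gain` is just one simple way to count how much evidence a corpus gives for a candidate rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: rewrite `source` as `target` when the previous tag is `prev`."""
    source: str
    target: str
    prev: str


def apply_rule(tags: Sequence[str], rule: Rule) -> List[str]:
    # Match against the input and write to a copy, so that a rewrite
    # never changes the context seen by later positions in the same pass.
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == rule.source and tags[i - 1] == rule.prev:
            out[i] = rule.target
    return out


def apply_rules(tags: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    # Rules are ordered: each one sees the output of the previous one.
    current = list(tags)
    for rule in rules:
        current = apply_rule(current, rule)
    return current


def net_gain(current: Sequence[str], gold: Sequence[str], rule: Rule) -> int:
    # Assumed scoring: errors before applying the rule minus errors after.
    after = apply_rule(current, rule)
    errors_before = sum(c != g for c, g in zip(current, gold))
    errors_after = sum(a != g for a, g in zip(after, gold))
    return errors_before - errors_after


# Toy data (invented for this sketch): an initial tagging and its correction.
current_tags = ["TO", "NN", "DT", "NN"]
gold_tags = ["TO", "VB", "DT", "NN"]
rule = Rule(source="NN", target="VB", prev="TO")

print(apply_rule(current_tags, rule))
print(net_gain(current_tags, gold_tags, rule))
```

On this toy input the rule rewrites only the `NN` that follows `TO`, which removes the single tagging error, so its net gain is 1. A learner working this way would compare such scores across many candidate rules, which is where the size of the search space becomes the central computational issue.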
In (Brill, 1995) a learning paradigm called error-driven learning has been introduced for the automatic induction of a specific kind of rewriting rules called transformations, and it has been shown that the accuracy achieved by the resulting transformation systems is competitive with that of existing systems. In this work we further elaborate on the error-driven learning paradigm. Our main contribution is summarized in what follows. We consider some families of transformations and design efficient algorithms for the associated learning problem that improve on existing methods. Our results are achieved by exploiting a data structure originally introduced in this work. This allows us to simultaneously represent and test the search space of all possible transformations. The transformations we investigate make
