Scientific report: "Alternative Approaches for Generating Bodies of Grammar Rules"

Gabriel Infante-Lopez and Maarten de Rijke
Informatics Institute, University of Amsterdam
infante mdr @

Abstract

We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers, rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular languages. We compare two approaches for inducing such languages. One is based on n-grams, the other on minimization of the Kullback-Leibler divergence. The inferred regular languages are used for generating bodies of rules inside a parsing procedure. We compare the two approaches along two dimensions: the quality of the probabilistic regular language they produce, and the performance of the parser they were used to build. The second approach outperforms the first one along both dimensions.

1 Introduction

N-grams have had a big impact on the state of the art in natural language parsing. They are central to many parsing models (Charniak, 1997; Collins, 1997, 2000; Eisner, 1996), and despite their simplicity, n-gram models have been very successful. Modeling with n-grams is an induction task (Gold, 1967). Given a sample set of strings, the task is to guess the grammar that produced that sample. Usually the grammar is not chosen from an arbitrary set of possible grammars, but from a given class. Hence, grammar induction consists of two parts: choosing the class of languages amongst which to search, and designing the procedure for performing the search.

By using n-grams for grammar induction, one addresses the two parts in one go. In particular, the use of n-grams implies that the solution will be searched for in the class of probabilistic regular languages, since n-grams induce probabilistic automata and, consequently, probabilistic regular languages. However, the class of probabilistic regular languages induced using .
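To make the connection between n-grams and probabilistic automata concrete, the following minimal sketch induces a bigram model from a toy sample of rule bodies and reads it as an automaton whose states are the previously seen symbols. It is an illustration only, not the procedure used in the paper: the part-of-speech tags, the boundary markers `<s>` and `</s>`, and the function names are all assumptions introduced here. The `kl_divergence` helper shows the standard definition of the Kullback-Leibler divergence between the empirical distribution of the sample and the induced model; how the paper minimizes this quantity is not described in this excerpt.

```python
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers, not from the paper


def induce_bigram_automaton(samples):
    """Estimate P(next | previous) by relative frequency.

    Each distinct previous symbol acts as a state of a probabilistic
    automaton; each bigram is a weighted transition.
    """
    counts = defaultdict(Counter)
    for body in samples:
        symbols = [START, *body, END]
        for prev, nxt in zip(symbols, symbols[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for prev, followers in counts.items()
    }


def string_probability(automaton, body):
    """Probability that the automaton generates exactly this rule body."""
    symbols = [START, *body, END]
    prob = 1.0
    for prev, nxt in zip(symbols, symbols[1:]):
        prob *= automaton.get(prev, {}).get(nxt, 0.0)
    return prob


def kl_divergence(samples, automaton):
    """D(P || Q) in bits, with P the empirical distribution over the sample
    and Q the distribution defined by the automaton."""
    empirical = Counter(tuple(body) for body in samples)
    total = sum(empirical.values())
    divergence = 0.0
    for body, count in empirical.items():
        p = count / total
        q = string_probability(automaton, body)
        divergence += p * math.log2(p / q)
    return divergence


# Toy sample of rule bodies (sequences of part-of-speech tags).
samples = [["DT", "NN"], ["DT", "JJ", "NN"], ["NN"]]
automaton = induce_bigram_automaton(samples)

for body in samples:
    print(body, string_probability(automaton, body))
print("KL divergence (bits):", kl_divergence(samples, automaton))
```

On this toy sample the bigram automaton assigns probability 1/3 to each of the three observed bodies and zero to a body such as `DT JJ JJ NN`, because the transition from `JJ` to `JJ` was never observed. Unseen transitions of this kind are why practical n-gram models are normally smoothed, which the sketch leaves out.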
