Scientific report: "Corpus-Oriented Development of Japanese HPSG Parsers"

Kazuhiro Yoshida
Department of Computer Science, University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033
kyoshida@

Abstract

This paper reports the corpus-oriented development of a wide-coverage Japanese HPSG parser. We first created an HPSG treebank from the EDR corpus by using heuristic conversion rules, and then extracted lexical entries from the treebank. The grammar developed using this method attained wide coverage that could hardly be obtained by conventional manual development. We also trained a statistical parser for the grammar on the treebank, and evaluated the parser in terms of the accuracy of semantic-role identification and dependency analysis.

1 Introduction

In this study, we report the corpus-oriented development of a Japanese HPSG parser using the EDR Japanese corpus (2002). Although several researchers have attempted to utilize linguistic grammar theories such as LFG (Bresnan and Kaplan, 1982), CCG (Steedman, 2001), and HPSG (Pollard and Sag, 1994) for parsing real-world texts, such attempts could hardly be successful, because manual development of wide-coverage, linguistically motivated grammars involves years of labor-intensive effort.

Corpus-oriented grammar development is a grammar development method that has been proposed as a promising substitute for conventional manual development.
In corpus-oriented methods, a treebank of a target grammar is constructed first, and various grammatical constraints are extracted from the treebank. Previous studies reported that wide-coverage grammars can be obtained at low cost by using this method (Hockenmaier and Steedman, 2002; Miyao et al., 2004). The treebank can also be used for training statistical disambiguation models, and hence we can construct a statistical parser for the extracted grammar. The corpus-oriented method enabled us to develop a Japanese HPSG parser with semantic information whose coverage on real-world sentences is . This high coverage
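
The extraction step described above (build a treebank first, then read lexical entries off it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nested-tuple tree format, the toy category labels, the romanized example words, and the function names are hypothetical and are not taken from the paper or from the EDR corpus. The coverage figure computed here is only lexical coverage (every word has an entry), which is a necessary condition for parsing a sentence, not the parser coverage the paper reports.

```python
from collections import Counter, defaultdict

# Toy "treebank" (hypothetical format): a tree is (label, child, ...),
# and a leaf is (category, word). Labels are invented for illustration.
TREEBANK = [
    ("S", ("PP", ("N", "taro"), ("P", "ga")), ("V", "hashiru")),
    ("S", ("PP", ("N", "hanako"), ("P", "ga")), ("V", "warau")),
]


def leaves(tree):
    """Yield (word, category) pairs from a nested-tuple tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Leaf node: the single child is the word itself.
        yield children[0], label
        return
    for child in children:
        yield from leaves(child)


def extract_lexicon(treebank):
    """Collect, for each word, how often each category was observed."""
    lexicon = defaultdict(Counter)
    for tree in treebank:
        for word, category in leaves(tree):
            lexicon[word][category] += 1
    return lexicon


def lexical_coverage(lexicon, sentences):
    """Fraction of sentences in which every word has a lexical entry."""
    if not sentences:
        return 0.0
    covered = sum(all(word in lexicon for word in s) for s in sentences)
    return covered / len(sentences)


lexicon = extract_lexicon(TREEBANK)
held_out = [["taro", "ga", "warau"], ["jiro", "ga", "hashiru"]]
coverage = lexical_coverage(lexicon, held_out)
```

In this toy setting the second held-out sentence contains a word that never appears in the treebank, so it counts as uncovered. A real extraction procedure would record full HPSG lexical entries rather than bare category labels.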
