Scientific report: "Japanese Dependency Structure Analysis Based on Maximum Entropy Models"

Proceedings of EACL 99

Kiyotaka Uchimoto, Satoshi Sekine, Hitoshi Isahara

Communications Research Laboratory, Ministry of Posts and Telecommunications, 588-2 Iwaoka, Iwaoka-cho, Nishi-ku, Kobe, Hyogo 651-2401, Japan
New York University, 715 Broadway, 7th floor, New York, NY 10003, USA

Abstract

This paper describes a dependency structure analysis of Japanese sentences based on maximum entropy models. Our model is created by learning the weights of features from a training corpus to predict the dependency between bunsetsus, or phrasal units. The dependency accuracy of our system is measured using the Kyoto University corpus. We discuss the contribution of each feature set and the relationship between the amount of training data and the accuracy.

1 Introduction

Dependency structure analysis is one of the basic techniques in Japanese sentence analysis. The Japanese dependency structure is usually represented by the relationship between phrasal units called bunsetsu. The analysis has two conceptual steps. In the first step, a dependency matrix is prepared. Each element of the matrix represents how likely one bunsetsu is to depend on another. In the second step, an optimal set of dependencies for the entire sentence is found. In this paper we mainly discuss the first step: a model for estimating dependency likelihood.

So far there have been two different approaches to estimating the dependency likelihood. One is the rule-based approach, in which the rules are created by experts and the likelihoods are calculated by some means, including semiautomatic corpus-based methods as well as manual assignment of scores to rules. However, hand-crafted rules have the following problems. They have a problem with coverage: because many features are involved in finding correct dependencies, it is difficult to identify them all manually. They also have a problem with consistency.
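The two conceptual steps described in the introduction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the single distance feature, the hand-picked values in `WEIGHTS`, the helper names, and the constraints that each bunsetsu depends on exactly one later bunsetsu and that dependencies do not cross. The exhaustive search in step 2 is only practical for very short sentences and stands in for whatever search procedure a real system would use.

```python
import math
from itertools import product

# Hypothetical feature weights (assumption); a real model learns them from a corpus.
WEIGHTS = {
    ("distance=1", True): 0.8,
    ("distance=1", False): -0.8,
    ("distance>1", True): -0.3,
    ("distance>1", False): 0.3,
}


def features(i, j):
    """Toy feature extractor: only the distance between bunsetsus i and j."""
    return ["distance=1" if j - i == 1 else "distance>1"]


def dependency_probability(i, j):
    """Maximum entropy form: exponentiated weight sums, normalized over outcomes."""
    scores = {
        y: math.exp(sum(WEIGHTS.get((f, y), 0.0) for f in features(i, j)))
        for y in (True, False)
    }
    return scores[True] / (scores[True] + scores[False])


def dependency_matrix(n):
    """Step 1: matrix[i][j] is the likelihood that bunsetsu i depends on j (j > i)."""
    return [
        [dependency_probability(i, j) if j > i else 0.0 for j in range(n)]
        for i in range(n)
    ]


def crosses(heads):
    """True if any two arcs i -> heads[i] and k -> heads[k] cross."""
    for i, head_i in enumerate(heads):
        for k in range(i + 1, len(heads)):
            if k < head_i < heads[k]:
                return True
    return False


def best_structure(matrix):
    """Step 2: exhaustive search for the most probable non-crossing structure."""
    n = len(matrix)
    # Every bunsetsu except the last picks one head to its right (assumption).
    candidates = [range(i + 1, n) for i in range(n - 1)]
    best, best_score = None, -1.0
    for heads in product(*candidates):
        if crosses(heads):
            continue
        score = math.prod(matrix[i][h] for i, h in enumerate(heads))
        if score > best_score:
            best, best_score = list(heads), score
    return best, best_score


if __name__ == "__main__":
    matrix = dependency_matrix(4)
    heads, score = best_structure(matrix)
    print(heads, round(score, 3))
```

Here `heads[i]` is the index of the bunsetsu that bunsetsu `i` depends on. The score of a whole sentence is taken to be the product of the individual dependency likelihoods, which treats the dependencies as independent; that is a simplifying assumption of this sketch.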


