TAILIEUCHUNG - Scientific report: "Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining"

Fan Zhang (2), Shuming Shi (1), Jing Liu (2), Shuqi Sun (3), Chin-Yew Lin (1)
(1) Microsoft Research Asia; (2) Nankai University, China; (3) Harbin Institute of Technology, China
shumings cyl @
This work was performed when Fan Zhang and Shuqi Sun were interns at Microsoft Research Asia.

Abstract

This paper focuses on mining the hyponymy (or is-a) relation from large-scale, open-domain web documents. A nonlinear probabilistic model is exploited to model the correlation between sentences in the aggregation of pattern matching results. Based on the model, we design a set of evidence combination and propagation algorithms. These significantly improve the result quality of existing approaches. Experimental results conducted on 500 million web pages and hypernym labels for 300 terms show over 20% performance improvement in terms of P@5, MAP, and R-Precision.

1 Introduction

An important task in text mining is the automatic extraction of entities and their lexical relations; this has wide applications in natural language processing and web search. This paper focuses on mining the hyponymy (or is-a) relation from large-scale, open-domain web documents. From the viewpoint of entity classification, the problem is to automatically assign fine-grained class labels to terms.

There have been a number of approaches (Hearst, 1992; Pantel & Ravichandran, 2004; Snow et al., 2005; Durme & Pasca, 2008; Talukdar et al., 2008) to address the problem. These methods typically exploited manually-designed or automatically-learned patterns (e.g., "NP such as NP", "NP like NP", "NP is a NP").
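The three patterns named above can be illustrated with a small sketch. It is not the paper's method: it assumes a noun phrase is a single word, uses hypothetical helper names (`extract_pairs`, `count_evidence`), and simply counts matches per (hyponym, hypernym) pair, a naive aggregation rather than the nonlinear model the paper proposes.

```python
import re
from collections import Counter

# Assumption for illustration: a "noun phrase" is a single word token.
NP = r"([A-Za-z][A-Za-z-]*)"

# Each entry: (compiled regex, hypernym group index, hyponym group index).
PATTERNS = [
    (re.compile(NP + r" such as " + NP), 1, 2),  # "cities such as Beijing"
    (re.compile(NP + r" like " + NP), 1, 2),     # "cities like Beijing"
    (re.compile(NP + r" is an? " + NP), 2, 1),   # "Beijing is a city"
]


def extract_pairs(sentence):
    """Return lowercase (hyponym, hypernym) pairs matched in one sentence."""
    pairs = []
    for regex, hyper_idx, hypo_idx in PATTERNS:
        for match in regex.finditer(sentence):
            hyponym = match.group(hypo_idx).lower()
            hypernym = match.group(hyper_idx).lower()
            pairs.append((hyponym, hypernym))
    return pairs


def count_evidence(sentences):
    """Count how many pattern matches support each (hyponym, hypernym) pair."""
    counts = Counter()
    for sentence in sentences:
        counts.update(extract_pairs(sentence))
    return counts


sentences = [
    "We visited cities such as Beijing last year.",
    "Beijing is a city in China.",
    "He likes cities like Beijing and Paris.",
]
print(count_evidence(sentences))
```

Raw counts of this kind treat every matching sentence as independent evidence and do not normalize labels ("city" and "cities" are counted separately here), which gives a sense of why the aggregation step is a design question in its own right.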
Although some degree of success has been achieved with these efforts, the results are still far from perfect in terms of both recall and precision. As will be demonstrated in this paper, even by processing a large corpus of 500 million web pages with the most popular patterns, we are not able to extract correct labels for many (especially rare) entities. Even for popular terms incorrect ...

