Scientific report: "Experiments in Graph-based Semi-Supervised Learning Methods for Class-Instance Acquisition"

Partha Pratim Talukdar
Search Labs, Microsoft Research
Mountain View, CA 94043
partha@

Fernando Pereira
Google Inc.
Mountain View, CA 94043
pereira@

Research carried out while at the University of Pennsylvania, Philadelphia, PA, USA.

Abstract

Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.

1 Introduction

Traditionally, named-entity recognition (NER) has focused on a small number of broad classes such as person, location, and organization. However, those classes are too coarse to support important applications such as sense disambiguation, semantic matching, and textual inference in Web search. For those tasks, we need a much larger inventory of specific classes and accurate classification of terms into those classes.

While supervised learning methods perform well for traditional NER, they are impractical for fine-grained classification because sufficient labeled data to train classifiers for all the classes is unavailable and would be very expensive to obtain. To overcome these difficulties, seed-based information extraction methods have been developed over the years (Hearst, 1992; Riloff and Jones, 1999; Etzioni et al., 2005; Talukdar et al., 2006; Van Durme and




