TAILIEUCHUNG - Scientific report: "Weakly Supervised Learning for Cross-document Person Name Disambiguation Supported by Information Extraction"

Among the major categories of named entities (NEs, which in this paper refer to entity names, excluding the MUC time and numerical NEs), company and product names are often trademarked or uniquely registered, and hence less subject to name ambiguity. This paper focuses on cross-document disambiguation of person names. Previous research on cross-document name disambiguation applies the vector space model (VSM) for context similarity, using only co-occurring words [Bagga & Baldwin 1998].

Weakly Supervised Learning for Cross-document Person Name Disambiguation Supported by Information Extraction

Cheng Niu, Wei Li, and Rohini K. Srihari
Cymfony Inc., 600 Essjay Road, Williamsville, NY 14221, USA
cniu wei rohini @

Abstract

It is fairly common for different people to be associated with the same name. In tracking person entities in a large document pool, it is important to determine whether multiple mentions of the same name across documents refer to the same entity or not. The previous approach to this problem measures context similarity based only on co-occurring words. This paper presents a new algorithm that uses information extraction support in addition to co-occurring words. A learning scheme with minimal supervision is developed within the Bayesian framework. Maximum entropy modeling is then used to represent the probability distribution of context similarities based on heterogeneous features. Statistical annealing is applied to derive the final entity coreference chains by globally fitting the pairwise context similarities. Benchmarking shows that our new approach significantly outperforms the existing algorithm by 25 percentage points in overall F-measure.

1 Introduction

Cross-document name disambiguation is required for various tasks of knowledge discovery from textual documents, such as entity tracking, link discovery, information fusion, and event tracking.
This task is part of the co-reference task: if two mentions of the same name refer to the same (different) entities, by definition they should (should not) be co-referenced. As far as names are concerned, co-reference consists of two sub-tasks: (i) name disambiguation, to handle the problem of different entities happening to use the same name; (ii) alias association, to handle the problem of the same entity using multiple names (aliases). The Message Understanding Conference (MUC) community has established within-document coreference standards [MUC-7 1998]. Compared with within-document name disambiguation which
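
The baseline mentioned above measures context similarity with a vector space model over co-occurring words. A minimal Python sketch of that idea follows, assuming plain whitespace tokenization and raw term-frequency weights. The function names and example contexts are hypothetical, chosen only for illustration; this is not the authors' implementation and it does not include the information extraction features the paper adds.

```python
import math
from collections import Counter


def context_vector(text):
    """Build a bag-of-words term-frequency vector for one mention's context.

    Assumption: lowercasing plus whitespace splitting stands in for a
    real tokenizer.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both contexts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty context shares nothing with any other context
    return dot / (norm_a * norm_b)


# Hypothetical contexts around three mentions of the same name.
ctx1 = context_vector("john smith the senator voted on the budget bill")
ctx2 = context_vector("senator john smith proposed a new budget")
ctx3 = context_vector("john smith scored twice in a football match")

sim_12 = cosine_similarity(ctx1, ctx2)
sim_13 = cosine_similarity(ctx1, ctx3)
```

In this toy data the two "senator" contexts score higher than the senator/football pair, which is the signal a pairwise disambiguation step would use when deciding whether two mentions share an entity.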
