Scientific report: "A Probabilistic Model for Canonicalizing Named Entity Mentions"

Dani Yogatama, Yanchuan Sim, Noah A. Smith
Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
{dyogatama, ysim, nasmith}@

Abstract

We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, first-order dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.

1 Introduction

Proper handling of mentions in text of real-world entities (identifying and resolving them) is a central part of many NLP applications. We seek an algorithm that infers a set of real-world entities from mentions in a text, mapping each entity mention token to an entity, and discovers general categories of words used in names (e.g., titles and last names). Here, we use a probabilistic model to infer a structured representation of canonical forms of entity attributes through transductive learning from named entity mentions with a small number of seeds (see Table 1). The input is a collection of mentions found by a named entity recognizer, along with their contexts, and, following Eisenstein et al. (2011), the output is a table in which entities are rows (the number of which is not pre-specified) and attribute words are organized into columns.

This paper contributes a model that builds on the approach of Eisenstein et al. (2011), but also incorporates context of the mention to help with disambiguation and to allow mentions that do not share words to be merged liberally; conditions against shape features, which ...
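The output representation described in the introduction (entities as rows, attribute parts as columns, with no fixed number of rows) can be pictured as a small data structure. What follows is only a minimal sketch under stated assumptions, not the authors' model: the class `EntityTable`, its methods, the column names, and the seed rows are all hypothetical, and the token-overlap lookup is a naive stand-in for the probabilistic inference the paper describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EntityTable:
    """Rows are entities; columns are attribute parts (hypothetical names)."""
    columns: List[str]
    rows: List[Dict[str, Optional[str]]] = field(default_factory=list)

    def add_entity(self, **attributes: str) -> int:
        """Append a new row; the number of rows is not fixed in advance."""
        unknown = set(attributes) - set(self.columns)
        if unknown:
            raise ValueError(f"unknown columns: {sorted(unknown)}")
        self.rows.append({c: attributes.get(c) for c in self.columns})
        return len(self.rows) - 1

    def overlap(self, row_index: int, mention: str) -> int:
        """Count mention tokens that equal a canonical value in the row."""
        values = {v.lower() for v in self.rows[row_index].values() if v}
        return sum(1 for tok in mention.lower().split() if tok in values)

    def best_row(self, mention: str) -> Optional[int]:
        """Naive stand-in for inference: the row sharing the most tokens."""
        if not self.rows:
            return None
        scores = [self.overlap(i, mention) for i in range(len(self.rows))]
        best = max(range(len(scores)), key=scores.__getitem__)
        return best if scores[best] > 0 else None


# Hypothetical seed rows with fictional names, for illustration only.
table = EntityTable(columns=["title", "first", "last"])
table.add_entity(title="Sen.", first="Jane", last="Doe")
table.add_entity(title="Gov.", first="John", last="Roe")

print(table.best_row("Senator Doe"))  # index of the row containing "Doe"
print(table.best_row("Mr. Smith"))    # None: no token matches any row
```

Word overlap is used here only to keep the sketch short. The introduction states that the model uses mention context so that mentions sharing no words can still be merged, which this lookup cannot do.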


