Scientific report: "Dictionary Definitions based Homograph Identification using a Generative Hierarchical Model"

Anagha Kulkarni, Jamie Callan
Language Technologies Institute, School of Computer Science, Carnegie Mellon University
5000 Forbes Ave, Pittsburgh, PA 15213, USA
anaghak callan @

Abstract

A solution to the problem of identifying homographs (words with multiple distinct meanings) is proposed and evaluated in this paper. It is demonstrated that a mixture-model-based framework is better suited to this task than the standard classification algorithms: a relative improvement of 7% in F1 measure and 14% in Cohen's kappa score is observed.

1 Introduction

Lexical ambiguity resolution is an important research problem for the fields of information retrieval and machine translation (Sanderson, 2000; Chan et al., 2007). However, making fine-grained sense distinctions for words with multiple closely related meanings is a subjective task (Jorgenson, 1990; Palmer et al., 2005), which makes it difficult and error-prone. Fine-grained sense distinctions aren't necessary for many tasks, so a possibly simpler alternative is lexical disambiguation at the level of homographs (Ide and Wilks, 2006).

Homographs are a special case of semantically ambiguous words: words that can convey multiple distinct meanings. For example, the word "bark" can imply two very different concepts, the outer layer of a tree trunk or the sound made by a dog, and thus is a homograph. Ironically, the definition of the word "homograph" is itself ambiguous and much debated; however, in this paper we consistently use the above definition.

If the goal is to do word-sense disambiguation of homographs in a very large corpus, a manually generated homograph inventory may be impractical. In this case, the first step is to determine which words in a lexicon are homographs. This problem is the subject of this paper.

2 Finding the Homographs in a Lexicon

Our goal is to identify the homographs in a large lexicon. We assume that manual labor is a scarce resource
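
The abstract reports results as relative improvements in F1 measure and Cohen's kappa. As a rough illustration of how these two measures and a relative improvement are computed for a binary homograph / non-homograph labelling, here is a minimal Python sketch. The function names, the label encoding (1 = homograph, 0 = not a homograph) and the toy label lists are assumptions made for this example; they are not taken from the paper, and this is not the authors' evaluation code.

```python
from collections import Counter


def f1_score(gold, pred, positive=1):
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(gold, pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(gold)
    observed = sum(1 for g, p in zip(gold, pred) if g == p) / n
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    # Chance agreement: product of the two marginal label distributions.
    expected = sum(gold_counts[c] * pred_counts[c] for c in gold_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both label sets use a single class
    return (observed - expected) / (1 - expected)


def relative_improvement(baseline, new):
    """Relative change of `new` over `baseline` (0.07 corresponds to 7%)."""
    return (new - baseline) / baseline


# Hypothetical toy labels, for illustration only (1 = homograph).
gold = [1, 1, 0, 0, 1, 0, 0, 0]
pred = [1, 0, 0, 0, 1, 1, 0, 0]
print(f1_score(gold, pred), cohen_kappa(gold, pred))
```

Kappa is the more conservative of the two measures here because it discounts the agreement that two labellings would reach by chance given their class frequencies.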
