Scientific report: "Supersense Tagging of Unknown Nouns using Semantic Similarity"

Supersense Tagging of Unknown Nouns using Semantic Similarity

James R. Curran
School of Information Technologies
University of Sydney
NSW 2006, Australia
james@

Abstract

The limited coverage of lexical-semantic resources is a significant problem for NLP systems, one that can be alleviated by automatically classifying unknown words. Supersense tagging assigns unknown nouns one of 26 broad semantic categories used by lexicographers to organise their manual insertion into WordNet. Ciaramita and Johnson (2003) present a tagger which uses synonym set glosses as annotated training examples. We describe an unsupervised approach, based on vector-space similarity, which does not require annotated examples but significantly outperforms their tagger. We also demonstrate the use of an extremely large shallow-parsed corpus for calculating vector-space semantic similarity.

1 Introduction

Lexical-semantic resources have been applied successfully to a wide range of Natural Language Processing (NLP) problems, ranging from collocation extraction (Pearce, 2001) and class-based smoothing (Clark and Weir, 2002) to text classification (Baker and McCallum, 1998) and question answering (Pasca and Harabagiu, 2001). In particular, WordNet (Fellbaum, 1998) has significantly influenced research in NLP.
Unfortunately, these resources are extremely time-consuming and labour-intensive to develop and maintain manually, requiring considerable linguistic and domain expertise. Lexicographers cannot possibly keep pace with language evolution: sense distinctions are continually made and merged, words are coined or become obsolete, and technical terms migrate into the vernacular. Technical domains, such as medicine, require separate treatment, since common words often take on special meanings and a significant proportion of their vocabulary does not overlap with everyday vocabulary. Burgun and Bodenreider (2001) compared an alignment of WordNet with the UMLS medical resource and found only a very small degree of overlap.
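
As a rough illustration of the similarity-based idea described in the abstract, the sketch below tags an unknown noun by finding its most similar known nouns in a vector space and letting them vote on a supersense. The sparse context vectors, the cosine measure, the value of k, the similarity-weighted vote, and all toy data are assumptions made for illustration; this excerpt does not specify the paper's actual similarity measure or voting scheme.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def tag_supersense(unknown_vec, known_vecs, known_supersenses, k=3):
    """Pick the supersense with the largest similarity-weighted vote
    among the k known nouns most similar to the unknown noun."""
    ranked = sorted(
        ((cosine(unknown_vec, vec), noun) for noun, vec in known_vecs.items()),
        reverse=True,
    )
    votes = defaultdict(float)
    for sim, noun in ranked[:k]:
        votes[known_supersenses[noun]] += sim
    return max(votes, key=votes.get) if votes else None


# Hypothetical context counts (grammatical relation : word) for known nouns.
known_vecs = {
    "bread": {"obj:eat": 4, "obj:bake": 3},
    "cheese": {"obj:eat": 3, "mod:fresh": 1},
    "dog": {"subj:bark": 5, "obj:feed": 2},
    "hammer": {"obj:swing": 4},
}
# Supersense labels follow the WordNet lexicographer file naming style.
known_supersenses = {
    "bread": "noun.food",
    "cheese": "noun.food",
    "dog": "noun.animal",
    "hammer": "noun.artifact",
}
# Hypothetical contexts observed for a noun missing from the resource.
unknown_vec = {"obj:eat": 2, "obj:bake": 1, "mod:fresh": 1}

print(tag_supersense(unknown_vec, known_vecs, known_supersenses))
```

No annotated training examples are needed here: the only inputs are context vectors gathered from a corpus and the supersenses of nouns already in the resource, which is what makes an approach of this kind unsupervised in the sense the abstract uses.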
