Scientific report: "Scaling Context Space"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 231-238.

James R. Curran and Marc Moens
Institute for Communicating and Collaborative Systems, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom
jamesc marc @

Abstract

Context is used in many NLP systems as an indicator of a term’s syntactic and semantic function. The accuracy of the system is dependent on the quality and quantity of contextual information available to describe each term. However, the quantity variable is no longer fixed by limited corpus resources. Given fixed training time and computational resources, it makes sense for systems to invest time in extracting high quality contextual information from a fixed corpus. However, with an effectively limitless quantity of text available, extraction rate and representation size need to be considered. We use thesaurus extraction with a range of context extracting tools to demonstrate the interaction between context quantity, time and size on a corpus of 300 million words.

1 Introduction

Context plays an important role in many natural language tasks. For example, the accuracy of part of speech taggers or word sense disambiguation systems depends on the quality and quantity of contextual information these systems can extract from the training data. When predicting the sense of a word, for instance, the immediately preceding word is likely to be more important than the tenth previous word; similar observations can be made about POS taggers or chunkers. A crucial part of training these systems lies in extracting from the data high-quality contextual information, in the sense of defining contexts that are both accurate and correlated with the information (the POS tags, the word senses, the chunks) the system is trying to extract. The quality of contextual information is often determined by the size of the training corpus with less data available ...
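The observation that a nearby word tends to matter more than a distant one can be illustrated with a small window-based context extractor. This is only a minimal sketch and not one of the context extracting tools used in the paper: the function name `weighted_window_contexts`, the window size, and the 1/distance weighting are all assumptions chosen for illustration.

```python
from collections import defaultdict


def weighted_window_contexts(tokens, window=2):
    """Collect context words around each term, weighted by distance.

    Hypothetical illustration: a context word at distance d from the
    term contributes 1/d, so adjacent words count more than distant ones.
    """
    contexts = defaultdict(lambda: defaultdict(float))
    for i, term in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j == i:
                continue  # a term is not its own context
            distance = abs(j - i)
            contexts[term][tokens[j]] += 1.0 / distance
    return contexts


tokens = "the cat sat on the mat".split()
contexts = weighted_window_contexts(tokens, window=2)
```

Widening `window` captures more contextual information per term, but it also enlarges the representation and slows extraction, which is the trade-off between quantity, time and size that the abstract describes.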
