TAILIEUCHUNG - Scientific report: "Towards the Orwellian Nightmare"

Towards the Orwellian Nightmare: Separation of Business and Personal Emails

Sanaz Jabbari, Ben Allison, David Guthrie, Louise Guthrie
Department of Computer Science, University of Sheffield
211 Portobello St., Sheffield S1 4DP

Abstract

This paper describes the largest-scale annotation project involving the Enron email corpus to date. Over 12,500 emails were classified by humans into the categories “Business” and “Personal”, and then subcategorised by type within these categories. The paper quantifies how well humans perform on this task, as measured by inter-annotator agreement. It presents the problems experienced in separating these language types. In its final section, the paper presents preliminary results from using a machine to perform this classification task.

1 Introduction

Almost since email became a global phenomenon, computers have been examining and reasoning about it. For the most part this intervention has been good-natured and helpful: computers have been trying to protect us from unscrupulous blanket advertising mail shots. However, the use of computers for more nefarious surveillance of email has so far been limited. The sheer volume of email sent means that even government agencies that can legally intercept all mail must either filter email by some preconceived notion of what is interesting or employ teams of people to sift manually through the volumes of data.
For example, the NSA has had massive parallel machines filtering e-mail traffic for at least ten years. The task of developing such automatic filters at research institutions has been almost impossible, but for the opposite reason: there is no shortage of willing researchers, but progress has been hampered by the lack of any data. One's email is often hugely private, and the prospect of surrendering it in its entirety for research purposes is somewhat unsavoury. Recently a data resource has become available where exactly this ...
