Assessing the Effect of Inconsistent Assessors on Summarization Evaluation

Karolina Owczarzak
National Institute of Standards and Technology, Gaithersburg, MD 20899

Peter A. Rankel
University of Maryland, College Park, Maryland
rankel@

Hoa Trang Dang
National Institute of Standards and Technology, Gaithersburg, MD 20899

John M. Conroy
IDA Center for Computing Sciences, Bowie, Maryland
conroy@

Abstract

We investigate the consistency of human assessors involved in summarization evaluation in order to understand its effect on system ranking and on automatic evaluation techniques. Using Text Analysis Conference data, we measure annotator consistency based on human scoring of summaries for Responsiveness, Readability, and Pyramid scoring. We identify inconsistencies in the data and measure the extent to which these inconsistencies affect the ranking of automatic summarization systems. Finally, we examine the stability of the automatic metrics ROUGE and CLASSY with respect to the inconsistent assessments.

1 Introduction

Automatic summarization of documents is a research area that, unfortunately, depends on human feedback. Although attempts have been made to automate the evaluation of summaries, none is yet good enough to remove the need for human assessors. Human judgment of summaries, however, is not perfect either. We investigate two ways of measuring evaluation consistency in order to see what effect it has on summarization evaluation and on the training of automatic evaluation metrics.

2 Assessor consistency

In the Text Analysis Conference (TAC) Summarization track, participants are allowed to submit more than one run (usually two), and this option is often used to test different settings or versions of the same summarization system. When the system versions are not too divergent, they sometimes produce identical summaries for a given topic. Summaries are randomized within each topic before they are evaluated, so the identical copies are usually [...]
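The consistency check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the record layout, topic IDs, summary texts, and scores below are hypothetical stand-ins for the actual TAC distribution format. The idea is to flag pairs of identical summaries submitted for the same topic and count how often the assessor gave both the same Responsiveness score.

    from collections import defaultdict
    from itertools import combinations

    # Hypothetical records: (topic_id, run_id, summary_text, responsiveness).
    # Real TAC data uses different identifiers and file formats.
    records = [
        ("D1001", "run01", "The storm made landfall on Monday, causing ...", 4),
        ("D1001", "run02", "The storm made landfall on Monday, causing ...", 3),
        ("D1001", "run03", "A hurricane struck the coast, and officials ...", 5),
        ("D1002", "run01", "Parliament passed the bill after a long debate ...", 2),
        ("D1002", "run02", "Parliament passed the bill after a long debate ...", 2),
    ]

    def assessor_consistency(records):
        """Count score (dis)agreements over pairs of identical summaries
        submitted for the same topic."""
        by_topic = defaultdict(list)
        for topic, run, text, score in records:
            by_topic[topic].append((text.strip(), score))
        agree = disagree = 0
        for summaries in by_topic.values():
            for (t1, s1), (t2, s2) in combinations(summaries, 2):
                if t1 == t2:           # identical summaries ...
                    if s1 == s2:
                        agree += 1     # ... scored consistently
                    else:
                        disagree += 1  # ... scored inconsistently
        return agree, disagree

    agree, disagree = assessor_consistency(records)
    print(f"identical pairs: {agree + disagree}, consistent: {agree}")

On real TAC data, the same comparison could be run per assessor and per measure (Responsiveness, Readability, Pyramid) to quantify how often a single judge scores identical text differently.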
