Scientific report: "Discrete vs. Continuous Rating Scales for Language Evaluation in NLP"

Anja Belz, Eric Kow
School of Computing, Engineering and Mathematics
University of Brighton
Brighton BN2 4GJ, UK

Abstract

Studies assessing rating scales are very common in psychology and related fields, but are rare in NLP. In this paper we assess discrete and continuous scales used for measuring quality assessments of computer-generated language. We conducted six separate experiments designed to investigate the validity, reliability, stability, interchangeability and sensitivity of discrete vs. continuous scales. We show that continuous scales are viable for use in language evaluation, and offer distinct advantages over discrete scales.

1 Background and Introduction

Rating scales have been used for measuring human perception of various stimuli for a long time, at least since the early 20th century (Freyd, 1923). First used in psychology and psychophysics, they are now also common in a variety of other disciplines, including NLP. Discrete scales are the only type of scale commonly used for qualitative assessments of computer-generated language in NLP (e.g. in the DUC/TAC evaluation competitions). Continuous scales are commonly used in psychology and related fields, but are virtually unknown in NLP.
While studies assessing the quality of individual scales and comparing different types of rating scales are common in psychology and related fields, such studies hardly exist in NLP, and so at present little is known about whether discrete scales are a suitable rating tool for NLP evaluation tasks, or whether continuous scales might provide a better alternative. A range of studies from sociology, psychophysiology, biometrics and other fields have compared discrete and continuous scales. Results tend to differ for different types of data. E.g., results from pain measurement show a continuous scale to outperform a discrete scale (ten Klooster et al., 2006). Other results (Svensson, 2000) from ...
