TAILIEUCHUNG - Scientific report: "A Simple Measure to Assess Non-response"

A Simple Measure to Assess Non-response

Anselmo Penas and Alvaro Rodrigo
UNED NLP & IR Group
Juan del Rosal, 16, 28040 Madrid, Spain
anselmo alvarory@

Abstract

There are several tasks where not responding is preferable to responding incorrectly. This idea is not new, but despite several previous attempts there isn't a commonly accepted measure to assess non-response. We study here an extension of the accuracy measure that has this feature and an interpretation that is very easy to understand. The proposed measure (c@1) has a good balance of discrimination power, stability and sensitivity properties. We also show how this measure is able to reward systems that maintain the same number of correct answers while decreasing the number of incorrect ones by leaving some questions unanswered. This measure is well suited for tasks such as Reading Comprehension tests, where multiple choices per question are given but only one is correct.

1 Introduction

There is some tendency to consider that an incorrect result is simply the absence of a correct one. This is particularly true in the evaluation of Information Retrieval systems where, in fact, the absence of results is sometimes the worst output. However, there are scenarios where we should consider the possibility of not responding, because this behavior has more value than responding incorrectly.
For example, during the process of introducing new features in a search engine, it is important to preserve users' confidence in the system. Thus, a system must decide whether to give a result in the new fashion or keep the old kind of output. A similar example is the decision about whether or not to show ads related to the query. Showing wrong ads harms the business model more than showing nothing. A third example, more related to Natural Language Processing, is Machine Reading evaluation through reading comprehension tests. In this case, where multiple choices for a question are offered, choosing a wrong option should be
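
The property claimed in the abstract (fewer incorrect answers at the same number of correct ones should score higher) can be illustrated with a small sketch. This is not the authors' code. The formula is not stated in this excerpt; it is assumed here from the published definition of the measure, c@1 = (n_R + n_U * n_R / n) / n, where n_R is the number of correct answers, n_U the number of unanswered questions and n the total number of questions. The function name c_at_1 and the example counts are hypothetical.

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """Sketch of c@1, assuming c@1 = (n_R + n_U * n_R / n) / n.

    Each unanswered question is credited with the accuracy the system
    reaches over the whole question set (n_R / n) instead of zero.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_correct < 0 or n_unanswered < 0 or n_correct + n_unanswered > n_total:
        raise ValueError("inconsistent counts")
    accuracy = n_correct / n_total
    return (n_correct + n_unanswered * accuracy) / n_total


# Hypothetical systems evaluated on 100 questions each.
# System A answers everything: 60 correct, 40 incorrect.
score_a = c_at_1(n_correct=60, n_unanswered=0, n_total=100)
# System B keeps the 60 correct answers but withholds 20 of the wrong ones.
score_b = c_at_1(n_correct=60, n_unanswered=20, n_total=100)

print(f"A: {score_a:.2f}  B: {score_b:.2f}")
```

Under this assumed definition, a system that answers every question gets a score equal to plain accuracy, and a system that answers nothing gets zero, so withholding answers only pays off when some answers given are correct.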
