Learning optimal threshold for Bayesian posterior probabilities to mitigate the class imbalance problem

Journal of Science and Technology, Volume 48, Issue 4, 2010, pp. 38-50

LEARNING OPTIMAL THRESHOLD FOR BAYESIAN POSTERIOR PROBABILITIES TO MITIGATE THE CLASS IMBALANCE PROBLEM

NGUYEN THAI-NGHE, THANH-NGHI DO, AND LARS SCHMIDT-THIEME

ABSTRACT

Class imbalance is one of the problems that degrade a classifier's performance. Researchers have introduced many methods to tackle it, including pre-processing, internal classifier processing, and post-processing, the last of which mainly relies on posterior probabilities. The Bayesian Network (BN) is known as a classifier that produces good posterior probabilities. This study proposes two methods that utilize Bayesian posterior probabilities to deal with imbalanced data. In the first method, we optimize the threshold on the posterior probabilities produced by BNs so as to maximize the F1-Measure; once the optimal threshold is found, we use it for the final classification. We investigate this method on several Bayesian classifiers: Naive Bayes (NB), BN, Tree-Augmented Naive Bayes (TAN), BN-Augmented Naive Bayes (BAN), and the Markov Blanket BN. In the second method, instead of learning each classifier separately as in the first, we combine these classifiers in a voting ensemble. Experimental results on 20 benchmark imbalanced datasets from the UCI repository show that our methods significantly outperform the baseline NB; they also perform as well as state-of-the-art sampling methods, and significantly better in certain cases.

1. INTRODUCTION

In binary classification problems, class imbalance means that the majority class outnumbers the minority one by a large factor. This phenomenon appears in many machine learning applications, such as credit card fraud detection, intrusion detection, oil-spill detection, disease diagnosis, and many other areas [1 - 3]. Most classifiers in supervised machine learning are designed to maximize the accuracy of their models; thus, when learning from imbalanced data, they are usually overwhelmed by the majority class.
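To make the first method concrete, here is a minimal sketch of threshold learning on Bayesian posteriors. It is an illustration, not the authors' implementation: scikit-learn's GaussianNB stands in for the Bayesian classifiers studied in the paper, and the synthetic imbalanced dataset and the simple grid search over thresholds are assumptions made for this example.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.metrics import f1_score

    def learn_optimal_threshold(posterior, y_true, grid=np.linspace(0.01, 0.99, 99)):
        """Return the cut-off on P(minority | x) that maximizes F1 = 2PR / (P + R)."""
        scores = [f1_score(y_true, (posterior >= t).astype(int)) for t in grid]
        return grid[int(np.argmax(scores))]

    # Synthetic imbalanced data: roughly 5% of examples belong to class 1 (minority).
    X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # Fit a Naive Bayes model and learn the threshold on the training posteriors.
    nb = GaussianNB().fit(X_tr, y_tr)
    t_star = learn_optimal_threshold(nb.predict_proba(X_tr)[:, 1], y_tr)

    # Compare the default 0.5 cut-off with the learned threshold on held-out data.
    p_te = nb.predict_proba(X_te)[:, 1]
    print("F1 at 0.50:", f1_score(y_te, (p_te >= 0.5).astype(int)))
    print("F1 at %.2f:" % t_star, f1_score(y_te, (p_te >= t_star).astype(int)))

For the second method, the same threshold learning would be applied to the combined posteriors of several Bayesian classifiers rather than to a single model; averaging the predict_proba outputs of the individual classifiers (soft voting) is one simple way to obtain such combined posteriors.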
