HUPSMT: An efficient algorithm for mining high utility-probability sequences in uncertain databases with multiple minimum utility thresholds

Journal of Computer Science and Cybernetics, (2019), 1–20

TRUONG CHI TIN1,∗, TRAN NGOC ANH1, DUONG VAN HAI1,2, LE HOAI BAC2
1 Department of Mathematics and Computer Science, University of Dalat
2 Department of Computer Science, VNU-HCMC University of Science
∗ tintc@

Abstract. The problem of high utility sequence mining (HUSM) in quantitative sequence databases (QSDBs) is more general than that of mining frequent sequences in sequence databases. An important limitation of HUSM is that a user-predefined minimum utility threshold is used to decide if a sequence is high utility. However, this is not suitable for many real-life applications as sequences may differ in importance. Another limitation of HUSM is that data in QSDBs are assumed to be precise. But in the real world, data collected by sensors, or other means, may be uncertain. Thus, this paper proposes a framework for mining high utility-probability sequences (HUPSs) in uncertain QSDBs (UQSDBs) with multiple minimum utility thresholds using a minimum utility. Two new width and depth pruning strategies are also introduced to eliminate low utility or low probability sequences as well as their extensions early, and to reduce the sets of candidate items for extensions during the mining process. Based on these strategies, a novel efficient algorithm named HUPSMT is designed for discovering HUPSs. Finally, an experimental study conducted with both real-life and synthetic UQSDBs shows the performance of HUPSMT in terms of time and memory consumption.

Keywords. High utility-probability sequence; Uncertain quantitative sequence database; Upper and lower-bounds; Width and depth pruning strategies.

1. INTRODUCTION

Discovering frequent itemsets in transaction databases and frequent sequences in sequence databases .
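To make the thresholding idea in the abstract concrete, the following is a minimal sketch, not the HUPSMT algorithm itself. It assumes that each item carries its own minimum utility threshold, that the threshold of a sequence is the smallest threshold among its items, and that a sequence qualifies as a HUPS when its utility reaches that threshold and its probability reaches a minimum probability threshold. The function names, the parameter names, and all numeric values are hypothetical, and the utility and probability of a sequence are taken as given rather than computed from a UQSDB.

```python
from typing import Dict, Iterable


def sequence_threshold(items: Iterable[str], item_min_util: Dict[str, float]) -> float:
    """Assumed rule: a sequence's threshold is the smallest per-item threshold."""
    return min(item_min_util[item] for item in set(items))


def is_hups(
    items: Iterable[str],
    utility: float,
    probability: float,
    item_min_util: Dict[str, float],
    min_prob: float,
) -> bool:
    """Check both conditions: utility and probability must reach their thresholds."""
    items = list(items)
    return (
        utility >= sequence_threshold(items, item_min_util)
        and probability >= min_prob
    )


# Hypothetical per-item minimum utility thresholds.
item_min_util = {"a": 30.0, "b": 50.0, "c": 20.0}

# The sequence over items a and b gets threshold min(30, 50) = 30.
print(is_hups(["a", "b"], utility=35.0, probability=0.6,
              item_min_util=item_min_util, min_prob=0.5))
```

Under these assumptions, a sequence fails the check when either condition fails, which is the situation the two pruning strategies are meant to detect early.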
