TAILIEUCHUNG - Data Streams Models and Algorithms- P4

Data Streams Models and Algorithms- P4: In recent years, the progress in hardware technology has made it possible for organizations to store and record large streams of transactional data. Such data sets which continuously and rapidly grow over time are referred to as data streams. In addition, the development of sensor technology has resulted in the possibility of monitoring many events in real time. | Frequent Pattern Mining in Data Streams 75 StreamMining-Bounded Stream D. 0 e global Lattice local Buffer T local Transaction t -fr T f t i - l 2 c 0 Number of ReducFreq Invocations foreach t G P T-Tu t Update t 1 Update t 2 if ir2 rw ReducFreq 2 c i - 2 while i foreach i G T Update t i ReducFreq i T 0 foreach s G if 0 D c s - Output Figure . StreamMining-Bounded Algorithm with a Bound on Accuracy lease purchase PDF Split-Merge on to remove this watermark. 76 DATA STREAMS MODELS AND ALGORITHMS will report a superset of itemsets occurring with frequency more than NOe. We record the number of invocations of ReducFreq c in the first step. Clearly c is bounded by N0e. In the second step we remove all items whose reported frequency is less than N0 c NO 1 e . This is achieved by the last foreach loop. The new algorithm has the following property 1 if an itemset has frequency more than 0 it will be reported. 2 if an itemset is reported as a potential frequent itemset it must have a frequency more than 0 1 e . Theorem 2 formally states this property and its proof is available in a technical report 23 . Theorem 2 In using the algorithm StreamMining-Bounded on a set of transactions with a fixed length for any k 2 6k Ç k Ç. Ck Note that the number of invocations of ReducFreq c is usually much smaller than N0e after processing a data stream. Therefore an interesting property of this approach is that it produces a very small number of false frequent itemsets even with relatively large e. The experiments in 22 also support this observation. The following lemma claims that the memory cost of 2 is increased by a factor proportional to 1 e. Lemma In using the algorithm StreamMining-Bounded on a set of transactions with a fixed length the size of 2 is bounded by 1 0e 1 2 Dealing with Variable Length Transactions In this subsection we present our final algorithm which improves upon the algorithm from the previous subsection by dealing with

TAILIEUCHUNG - Chia sẻ tài liệu không giới hạn
Địa chỉ : 444 Hoang Hoa Tham, Hanoi, Viet Nam
Website : tailieuchung.com
Email : tailieuchung20@gmail.com
Tailieuchung.com là thư viện tài liệu trực tuyến, nơi chia sẽ trao đổi hàng triệu tài liệu như luận văn đồ án, sách, giáo trình, đề thi.
Chúng tôi không chịu trách nhiệm liên quan đến các vấn đề bản quyền nội dung tài liệu được thành viên tự nguyện đăng tải lên, nếu phát hiện thấy tài liệu xấu hoặc tài liệu có bản quyền xin hãy email cho chúng tôi.
Đã phát hiện trình chặn quảng cáo AdBlock
Trang web này phụ thuộc vào doanh thu từ số lần hiển thị quảng cáo để tồn tại. Vui lòng tắt trình chặn quảng cáo của bạn hoặc tạm dừng tính năng chặn quảng cáo cho trang web này.