Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 51

Knowledge discovery demonstrates intelligent computing at its best and is the most desirable and interesting end product of information technology. Many researchers and practitioners are working to discover and extract knowledge from data. A great deal of hidden knowledge is waiting to be discovered; this is the challenge created by today's abundance of data. The Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Swagatam Das and Ajith Abraham

Clustering can also be performed in two different modes: crisp and fuzzy. In crisp clustering the clusters are disjoint and non-overlapping, so any pattern belongs to one and only one class. In fuzzy clustering a pattern may belong to all the classes, each with a certain fuzzy membership grade (Jain et al., 1999).

The most widely used iterative k-means algorithm (MacQueen, 1967) for partitional clustering aims at minimizing the ICS (Intra-Cluster Spread), which for k cluster centers can be defined as

$$\mathrm{ICS}(C_1, C_2, \ldots, C_k) = \sum_{i=1}^{k} \sum_{X_t \in C_i} \lVert X_t - m_i \rVert^2$$

where $m_i$ is the centroid of cluster $C_i$.

The k-means (or hard c-means) algorithm starts with k cluster centroids; these are initially selected randomly or derived from some a priori information. Each pattern in the data set is then assigned to the closest cluster center. Centroids are updated by using the mean of the associated patterns. The process is repeated until some stopping criterion is met. In the c-medoids algorithm (Kaufman and Rousseeuw, 1990), on the other hand, each cluster is represented by one of the representative objects in the cluster, located near the center.
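The k-means loop described above (random initial centroids, assignment to the closest centroid, mean update, repeat until a stopping criterion is met) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `seed` and `max_iter` parameters, the handling of empty clusters, and the choice of "assignments stop changing" as the stopping criterion are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(patterns, k, max_iter=100, seed=0):
    """Hard k-means sketch; returns (centroids, labels, ics)."""
    rng = random.Random(seed)
    # Random initialisation: pick k distinct patterns as starting centroids.
    centroids = [tuple(p) for p in rng.sample(patterns, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each pattern goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in patterns
        ]
        # Stopping criterion (an assumption): assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its patterns.
        for i in range(k):
            members = [p for p, lab in zip(patterns, labels) if lab == i]
            if members:  # assumption: keep the old centroid if a cluster is empty
                centroids[i] = mean_vector(members)
    # Intra-Cluster Spread: summed squared distance of patterns to their centroid.
    ics = sum(squared_distance(p, centroids[lab])
              for p, lab in zip(patterns, labels))
    return centroids, labels, ics
```

Calling `kmeans(patterns, k=2)` on a list of equal-length tuples returns the final centroids, one cluster index per pattern, and the ICS value of that partition. Different seeds can lead to different local minima of the ICS, which is why the initial centroids matter.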
Partitioning around medoids (PAM) (Kaufman and Rousseeuw, 1990) starts from an initial set of medoids and iteratively replaces one of the medoids by one of the non-medoids if doing so improves the total distance of the resulting clustering. Although PAM works effectively for small data sets, it does not scale well to large ones. Clustering large applications based on randomized search (CLARANS) (Ng and Han, 1994) uses randomized sampling to deal with this scalability issue.

Fuzzy c-means (FCM) (Bezdek, 1981) seems to be the most popular algorithm in the field of fuzzy clustering. In the classical FCM algorithm, a within-cluster sum function $J_m$ is minimized to evolve the proper cluster centers:

$$J_m = \sum_{j=1}^{n} \sum_{i=1}^{c} (u_{ij})^m \lVert X_j - V_i \rVert^2$$

where $V_i$ is the i-th cluster center, $X_j$ is the j-th d-dimensional data vector, n is the number of data vectors, c is the number of clusters, and $\lVert \cdot \rVert$ is an inner product-induced norm.
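To make the FCM within-cluster sum concrete, the sketch below evaluates J_m for a given set of centers and memberships. The excerpt breaks off before defining every symbol, so several readings here are assumptions taken from the standard FCM formulation: u_ij is treated as the membership grade of pattern j in cluster i, m as the fuzzifier exponent, and the norm as the Euclidean norm. The function names and the `memberships[i][j]` layout are hypothetical.

```python
def squared_norm(x, v):
    """Squared Euclidean norm ||x - v||^2 (one inner product-induced norm)."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def fcm_objective(data, centers, memberships, m=2.0):
    """Evaluate J_m = sum over j, i of (u_ij ** m) * ||X_j - V_i||^2.

    memberships[i][j] holds u_ij, the assumed membership grade of
    pattern j in cluster i; m is the assumed fuzzifier exponent.
    """
    total = 0.0
    for j, x in enumerate(data):          # loop over data vectors X_j
        for i, v in enumerate(centers):   # loop over cluster centers V_i
            total += (memberships[i][j] ** m) * squared_norm(x, v)
    return total
```

With crisp memberships (each u_ij equal to 0 or 1) this quantity reduces to the k-means ICS for the same centers, which is one way to see FCM as a soft generalization of hard c-means.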
