TAILIEUCHUNG - Data Mining and Knowledge Discovery Handbook, 2 Edition part 19

Data Mining and Knowledge Discovery Handbook, 2 Edition part 19. Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery. | 160 Lior Rokach and Oded Maimon However this correction still produces an optimistic error rate. Consequently one should consider pruning an internal node t if its error rate is within one standard error from a reference tree namely Quinlan 1993 e pruned T t S e T S i s T S - 1 - e T S s The last condition is based on statistical confidence interval for proportions. Usually the last condition is used such that T refers to a sub-tree whose root is the internal node t and S denotes the portion of the training set that refers to the node t. The pessimistic pruning procedure performs top-down traversing over the internal nodes. If an internal node is pruned then all its descendants are removed from the pruning process resulting in a relatively fast pruning. Error-based Pruning EBP Error-based pruning is an evolution of pessimistic pruning. It is implemented in the well-known algorithm. As in pessimistic pruning the error rate is estimated using the upper bound of the statistical confidence interval for proportions. T S T S Z J y S where e T S denotes the misclassification rate of the tree T on the training set S. Z is the inverse of the standard normal cumulative distribution and a is the desired significance level. Let subtree Tt denote the subtree rooted by the node t. Let maxchild Tt denote the most frequent child node of t namely most of the instances in S reach this particular child and let St denote all instances in S that reach the node t. The procedure performs bottom-up traversal over all nodes and compares the following values 1. eUB subtree T t St 2. eUB pruned subtree T t t St 3. UB subtree T maxchild T t Smaxchild T t According to the lowest value the procedure either leaves the tree as is prune the node t or replaces the node t with the subtree rooted by maxchild Tt . Optimal Pruning The issue of finding optimal pruning has been studied in Bratko and Bohanec 1994 and Almuallim 1996 . The first research introduced an algorithm which .

TỪ KHÓA LIÊN QUAN
TAILIEUCHUNG - Chia sẻ tài liệu không giới hạn
Địa chỉ : 444 Hoang Hoa Tham, Hanoi, Viet Nam
Website : tailieuchung.com
Email : tailieuchung20@gmail.com
Tailieuchung.com là thư viện tài liệu trực tuyến, nơi chia sẽ trao đổi hàng triệu tài liệu như luận văn đồ án, sách, giáo trình, đề thi.
Chúng tôi không chịu trách nhiệm liên quan đến các vấn đề bản quyền nội dung tài liệu được thành viên tự nguyện đăng tải lên, nếu phát hiện thấy tài liệu xấu hoặc tài liệu có bản quyền xin hãy email cho chúng tôi.
Đã phát hiện trình chặn quảng cáo AdBlock
Trang web này phụ thuộc vào doanh thu từ số lần hiển thị quảng cáo để tồn tại. Vui lòng tắt trình chặn quảng cáo của bạn hoặc tạm dừng tính năng chặn quảng cáo cho trang web này.