Data Preparation for Data Mining - P17

Data Preparation for Data Mining - P17: Ever since the Sumerian and Elam peoples living in the Tigris and Euphrates River basin some 5500 years ago invented data collection using dried mud tablets marked with tax records, people have been trying to understand the meaning of, and get use from, collected data. More directly, they have been trying to determine how to use the information in that data to improve their lives and achieve their objectives.

... include points that should otherwise be excluded. Or again, in the nearest-neighbor methods, neighborhoods were unbalanced.

How does preparation help? The figure shows the data range normalized in state space on the left. The data with both range and distribution normalized is shown on the right. The range-normalized and redistributed space is a toy representation of what full data preparation accomplishes. This data is much easier to characterize: manifolds are more easily fitted, cluster boundaries are more easily found, neighbors are more neighborly. The data is simply easier to access and work with. But what real difference does it make?

Figure: Some of the effects of data preparation: normalization of data range (left) and normalization and redistribution of data set (right).

Neural Networks and the CREDIT Data Set

The CREDIT data set is a derived extract from a real-world data set. Full data preparation and surveying enable the miner to build reasonable models, that is, reasonable in terms of addressing the business objective. But what does data preparation alone achieve in this data set? In order to demonstrate that, we will look at two models of the data, one on prepared data and the other on unprepared data. Any difficulty in showing the effect of preparation alone is due to the fact that, with ingenuity, much better models can be built with the prepared data in many circumstances than with the data unprepared.
All this demonstrates, however, is the ingenuity of the miner! To try to level the playing field, as it were, for this example the neural network models will use all of the inputs, have the same number of nodes in the hidden layer, and use no extracted features. There is no change in network architecture for the prepared and unprepared data sets. Thus, this uses no knowledge gleaned from either the data assay or the data survey. Much, if not most, of the useful information discovered about the data set, and how to build better models, is simply discarded so that the
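
The contrast drawn earlier between normalizing range alone and normalizing both range and distribution can be illustrated with a small Python sketch. This is only a toy under stated assumptions: the data is a synthetic skewed variable, not the CREDIT data set, and the function names `normalize_range`, `redistribute`, and `share_below` are hypothetical. The rank-based redistribution used here is a simple stand-in chosen for illustration; it should not be read as the book's own redistribution procedure.

```python
import random


def normalize_range(values):
    """Linearly rescale values to [0, 1]; the shape of the distribution is unchanged."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no range information; place it mid-range.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def redistribute(values):
    """Rank-based stand-in for redistribution (an assumption, not the book's method).

    Each value is replaced by its average rank scaled to [0, 1], so the
    output is spread roughly evenly across the range.
    """
    n = len(values)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2  # tied values share their average rank
        for k in range(i, j + 1):
            out[order[k]] = avg_rank / (n - 1)
        i = j + 1
    return out


def share_below(values, cut=0.25):
    """Fraction of values falling below `cut`."""
    return sum(v < cut for v in values) / len(values)


# Hypothetical skewed variable: most points crowd near the low end.
rng = random.Random(0)
raw = [rng.expovariate(1.0) for _ in range(1000)]

ranged = normalize_range(raw)  # range normalized only
flat = redistribute(raw)       # range and distribution normalized

print("share below 0.25, range only:   ", share_below(ranged))
print("share below 0.25, redistributed:", share_below(flat))
```

With range normalization alone, the skewed points stay crowded into the low end of the interval. After the rank-based step they are spread nearly evenly, which is the sense in which neighbors become "more neighborly" in the toy picture.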
