Data Preparation for Data Mining - P2

Ever since the Sumerian and Elam peoples, living in the Tigris and Euphrates River basin some 5500 years ago, invented data collection using dried mud tablets marked with tax records, people have been trying to understand the meaning of, and get use from, collected data. More directly, they have been trying to determine how to use the information in that data to improve their lives and achieve their objectives.

This brief look at the process of data exploration emphasizes that none of the pieces stands alone. Problems need to be identified, which leads to identifying potential solutions, which leads to finding and preparing suitable data that is then surveyed and finally modeled. Each part has an inextricable relationship to the other parts. Modeling, the types of tools, and the types of models made also have a very close relationship with how data is best prepared, so before leaving this introduction, a first look at modeling is helpful to set the frame of reference for what follows.

1.2 Data Mining Modeling and Modeling Tools

One major purpose for preparing data is so that mining can discover models. But what is modeling? In actual fact, what is being attempted is very simple. The ways of doing it may not be so simple, but the actual intent is quite straightforward.

It is assumed that a data set, either one immediately available or one that is obtainable, might contain information that would be of interest if we could only understand what was in it. Therein lies the rub. Since we don't understand the information that is in the data just by looking at it, some tool is needed that will turn the information enfolded in the data set into a form that is understandable. That's all. That's the modeling part of data mining: a process for transforming information enfolded in data into a form amenable to human cognition.
1.2.1 Ten Golden Rules

As discussed earlier in this chapter, the data exploration process helps build a framework for data mining so that appropriate tools are applied to appropriate data that is appropriately prepared to solve key business problems and deliver required solutions. This framework, or one similar to it, is critical to helping miners get the best results and return from their data mining projects. In addition to this framework, it may be helpful to keep in mind the 10 Golden Rules for Building Models:

1. Select clearly defined problems that will yield tangible .