TAILIEUCHUNG - Data Preparation for Data Mining- P10

Ever since the Sumerian and Elam peoples living in the Tigris and Euphrates River basin some 5500 years ago invented data collection using dried mud tablets marked with tax records, people have been trying to understand the meaning of, and get use from, collected data. More directly, they have been trying to determine how to use the information in that data to improve their lives and achieve their objectives.

TABLE: The effect of missing values (.) on the summary values of x and y. Columns: n, x, y, x², y², xy. Rows: instances 1 to 5, then Sum.

The problem is what to do if values are missing when the complete totals for all the values are needed. Regressions simply do not work with any of the totals missing. Yet if any single number is missing, it is impossible to determine the necessary totals. Even a single missing x value destroys the ability to know the sums for x, x², and xy. What to do?

Since getting the aggregated values correct is critical, the modeler requires some method to determine the appropriate values, even with missing values. This sounds a bit like pulling oneself up by one's bootstraps: estimate the missing values to estimate the missing values! However, things are not quite so difficult.

In a representative sample, for any particular joint distribution, the ratios between the various values Σx and Σx², and Σy and Σy², remain constant. So too do the ratios between Σx and Σxy, and Σy and Σxy. When these ratios are found, they are the equivalent of setting the value of n to 1. One way to see why this is so is that in any representative sample the ratios are constant regardless of the number of instance values, and that includes n = 1. More mathematically, the effect of the number of instances cancels out. The end result is that when using ratios, n can be set to unity.
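The two claims above can be checked with a small sketch. The data values, the helper names `summary_sums` and `ratios`, and the use of NaN as the missing-value marker are all assumptions made for illustration; they are not the values from the book's table. Repeating the sample three times stands in for a larger sample drawn from the same joint distribution.

```python
import math

def summary_sums(xs, ys):
    """Return the totals that a simple linear regression needs."""
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

def ratios(s):
    """Ratios between totals; the instance count cancels out of each one."""
    return {
        "x_to_x2": s["sum_x"] / s["sum_x2"],
        "y_to_y2": s["sum_y"] / s["sum_y2"],
        "x_to_xy": s["sum_x"] / s["sum_xy"],
        "y_to_xy": s["sum_y"] / s["sum_xy"],
    }

# Hypothetical illustrative values (assumption, not from the source table).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 3.0, 5.0, 4.0, 6.0]
complete = summary_sums(xs, ys)

# One missing x value, marked here with NaN.
xs_missing = [1.0, 2.0, math.nan, 4.0, 5.0]
damaged = summary_sums(xs_missing, ys)

# Every total that involves x is now unknown; the y-only totals survive.
unknown = sorted(k for k, v in damaged.items() if math.isnan(v))
print("unknown totals:", unknown)

# A sample three times as large with the same joint distribution
# has larger totals but the same ratios between them.
bigger = summary_sums(xs * 3, ys * 3)
print(ratios(complete))
print(ratios(bigger))
```

The totals of the larger sample are three times those of the smaller one, so each ratio is unchanged, which is the sense in which the number of instances cancels.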
In the linear regression formulae, values are multiplied by n, and multiplying a value by 1 leaves the original value unchanged. When multiplying by n = 1, the n can be left out of the expression. In the calculations that follow, that piece is dropped since it has no effect on the result.
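A minimal sketch of why n can be dropped, assuming the standard least-squares formulae for a single input variable (slope = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)). The function name `fit_from_totals` and the data values are hypothetical. The same formula is evaluated twice: once with the raw totals and the true n, and once with every total divided by n and the count set to 1.

```python
def fit_from_totals(n, sx, sy, sx2, sxy):
    """Least-squares slope and intercept from n and the four totals."""
    slope = (n * sxy - sx * sy) / (n * sx2 - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# Hypothetical illustrative values (assumption, not from the source).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(xs)

sx = sum(xs)
sy = sum(ys)
sx2 = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# Usual form: raw totals together with the instance count n.
slope_totals, intercept_totals = fit_from_totals(n, sx, sy, sx2, sxy)

# Ratio form: scale every total by 1/n, then use the same formula with n = 1.
slope_unit, intercept_unit = fit_from_totals(1, sx / n, sy / n, sx2 / n, sxy / n)

print(slope_totals, intercept_totals)
print(slope_unit, intercept_unit)
```

Both calls give the same line up to floating-point rounding, because n appears as a common factor in the numerator and denominator of the slope.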
