Data Preparation for Data Mining - P17

Data Preparation for Data Mining - P17: Ever since the Sumerian and Elam peoples living in the Tigris and Euphrates River basin some 5500 years ago invented data collection using dried mud tablets marked with tax records, people have been trying to understand the meaning of, and get use from, collected data. More directly, they have been trying to determine how to use the information in that data to improve their lives and achieve their objectives.

... include points that should otherwise be excluded. Or again, in the nearest-neighbor methods, neighborhoods were unbalanced.

How does preparation help? The figure shows the data with range normalized in state space on the left. The data with both range and distribution normalized is shown on the right. The range-normalized and redistributed space is a toy representation of what full data preparation accomplishes. This data is much easier to characterize: manifolds are more easily fitted, cluster boundaries are more easily found, neighbors are more neighborly. The data is simply easier to access and work with. But what real difference does it make?

Figure: Some of the effects of data preparation: normalization of data range (left) and normalization and redistribution of data set (right).

Neural Networks and the CREDIT Data Set

The CREDIT data set is a derived extract from a real-world data set. Full data preparation and surveying enable the miner to build reasonable models, reasonable in terms of addressing the business objective. But what does data preparation alone achieve in this data set? In order to demonstrate that, we will look at two models of the data, one on prepared data and the other on unprepared data. Any difficulty in showing the effect of preparation alone is due to the fact that, with ingenuity, much better models can be built with the prepared data in many circumstances than with the data unprepared. All this demonstrates, however, is the ingenuity of the miner! To try to level the playing field, as it were, for this example the neural network models will use all of the inputs, have the same number of nodes in the hidden layer, and will use no extracted features. There is no change in network architecture for the prepared and unprepared data sets. Thus this uses no knowledge gleaned from either the data assay or the data survey. Much, if not most, of the useful information discovered about the data set, and how to build better models, is simply discarded so that the ...
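The difference between normalizing range alone and normalizing both range and distribution, as described for the two panels of the figure, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function names `normalize_range` and `redistribute` are hypothetical, and the rank-based mapping used for redistribution is one simple way to spread values evenly, not necessarily the method the book uses.

```python
def normalize_range(values):
    """Linearly rescale values into the range 0..1 (min-max scaling)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no range information; place it mid-range.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def redistribute(values):
    """Map each value to its average rank, scaled to 0..1.

    Assumption: a rank-based mapping stands in for redistribution here.
    It spreads the values evenly across the range and gives tied values
    the same output.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


# A skewed variable: one large value dominates the range.
skewed = [1, 2, 3, 4, 100]
range_only = normalize_range(skewed)
range_and_distribution = redistribute(skewed)
print(range_only)
print(range_and_distribution)
```

With range normalization alone, the four small values stay crowded near 0 while the large value sits alone at 1. After redistribution the values are evenly spaced across 0..1, which is the sense in which neighbors become "more neighborly".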
