Mining Database Structure; Or, How to Build a Data Quality Browser

Tamraparni Dasu, Theodore Johnson, S. Muthukrishnan, Vladislav Shkapenyuk
AT&T Labs-Research
tamrjohnsont muthu vshkap @resear

ABSTRACT
Data mining research typically assumes that the data to be analyzed has been identified, gathered, cleaned, and processed into a convenient form. While data mining tools greatly enhance the ability of the analyst to make data-driven discoveries, most of the time spent in performing an analysis goes to identifying, gathering, cleaning, and processing the data. Similarly, schema mapping tools have been developed to help automate the task of using legacy or federated data sources for a new purpose, but they assume that the structure of the data sources is well understood. However, the data sets to be federated may come from dozens of databases containing thousands of tables and tens of thousands of fields, with little reliable documentation about primary keys or foreign keys. We are developing a system, Bellman, which performs data mining on the structure of the database.
In this paper we present techniques for quickly identifying which fields have similar values, identifying join paths, estimating join directions and sizes, and identifying structures in the database. The results of the database structure mining allow the analyst to make sense of the database content. This information can be used to, e.g., prepare data for data mining, find foreign key joins for schema mapping, or identify steps to be taken to prevent the database from collapsing under the weight of its complexity.

1. INTRODUCTION
A seeming invariant of large production databases is that they become disordered over time. The disorder arises from a variety of causes, including incorrectly entered data, incorrect use of the database (perhaps due to a lack of documentation), and use of the database to model unanticipated events and entities (e.g., new services or customer types). Administrators and users of these .
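As a rough illustration of what "identifying which fields have similar values" and "estimating join directions" can mean in practice, the sketch below compares the distinct-value sets of every pair of fields. It is a minimal sketch under stated assumptions, not a description of how Bellman works: it computes exact set overlaps in memory, and the function names, the 0.5 threshold, and the toy field names are all hypothetical.

```python
from itertools import combinations


def resemblance(a, b):
    """Jaccard resemblance |A & B| / |A | B| of two fields' distinct values."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of A's distinct values that also appear in B."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa) if sa else 0.0


def candidate_joins(fields, threshold=0.5):
    """Return (field_x, field_y, resemblance) for pairs at or above threshold.

    `fields` maps a hypothetical "table.column" name to its list of values.
    """
    results = []
    for (name_x, vals_x), (name_y, vals_y) in combinations(fields.items(), 2):
        r = resemblance(vals_x, vals_y)
        if r >= threshold:
            results.append((name_x, name_y, r))
    # Most similar pairs first.
    return sorted(results, key=lambda t: -t[2])


# Hypothetical toy data, invented for illustration only.
fields = {
    "customers.id": [1, 2, 3, 4],
    "orders.customer_id": [1, 2, 2, 3],
    "orders.amount": [10, 20, 30, 40],
}

for x, y, r in candidate_joins(fields):
    # Asymmetric containment hints at join direction: the field whose values
    # are fully contained in the other behaves like a foreign key.
    print(x, y, r, containment(fields[y], fields[x]), containment(fields[x], fields[y]))
```

Resemblance is symmetric, so it only flags a pair as related; the two containment values differ when one field's values are a subset of the other's, which is what suggests a direction for the join. Exact set comparison over all pairs would not scale to thousands of tables, so a real system would need some form of summarization in its place.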
