Query Optimization In Compressed Database Systems

Unfortunately, there is a lack of effective data management tools that can help users manage such data and apply models, forcing them to use external tools for this purpose. Scientists, for instance, typically import the raw data into an analysis package such as Matlab, where they apply various models to the data. Once the data has been filtered, they typically process it further using customized programs that are often quite similar to database queries (e.g., programs that find peaks in the cleaned data, extract particular subsets, or compute aggregates over different regions). It is impractical for them to use databases for this later processing, because data has already.

Zhiyuan Chen (Cornell University, zhychen@), Johannes Gehrke (Cornell University, johannes@), Flip Korn (AT&T Labs-Research, flip@)

ABSTRACT

Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on disk. Despite the abundance of string-valued attributes in relational schemas, there is little work on compression for string attributes in a database context. Moreover, none of the previous work suitably addresses the role of the query optimizer. During query execution, data is either eagerly decompressed when it is read into main memory, or data lazily stays compressed in main memory and is decompressed on demand only. In this paper, we present an effective approach for database compression based on lightweight, attribute-level compression techniques. We propose a Hierarchical Dictionary Encoding strategy that intelligently selects the most effective compression method for string-valued attributes.
We show that eager and lazy decompression strategies produce sub-optimal plans for queries involving compressed string attributes. We then formalize the problem of compression-aware query optimization and propose one provably optimal and two fast heuristic algorithms for selecting a query plan for relational schemas with compressed attributes. Our algorithms can easily be integrated into existing cost-based query optimizers. Experiments using TPC-H data demonstrate the impact of our string compression methods and show the importance of compression-aware query optimization. Our approach results in up to an order of magnitude speedup over existing approaches.

1. INTRODUCTION

Over the last decades improvements in CPU speed
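The eager and lazy decompression strategies named in the abstract can be illustrated with a minimal sketch. It assumes a plain, single-level dictionary encoding of one string column; this is not the paper's Hierarchical Dictionary Encoding, and the function names (dictionary_encode, eager_filter, lazy_filter) are hypothetical names chosen for illustration only.

```python
from typing import List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each string with an integer code into a sorted dictionary.

    Assumption: a simple single-level dictionary, used only for illustration.
    """
    dictionary = sorted(set(values))
    code_of = {s: i for i, s in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def eager_filter(dictionary: List[str], codes: List[int], target: str) -> List[int]:
    """Eager strategy: decompress every value first, then compare strings."""
    decoded = [dictionary[c] for c in codes]
    return [row for row, value in enumerate(decoded) if value == target]


def lazy_filter(dictionary: List[str], codes: List[int], target: str) -> List[int]:
    """Lazy strategy: keep the column compressed and compare integer codes.

    The predicate constant is translated to its code once; no row is decoded.
    """
    if target not in dictionary:
        return []  # the constant never occurs in the column
    target_code = dictionary.index(target)
    return [row for row, code in enumerate(codes) if code == target_code]


column = ["RAIL", "AIR", "RAIL", "TRUCK", "AIR"]
dictionary, codes = dictionary_encode(column)
print(dictionary, codes)
print(eager_filter(dictionary, codes, "RAIL"))
print(lazy_filter(dictionary, codes, "RAIL"))
```

Both functions return the same row positions; they differ only in when decompression happens. The sketch does not show which strategy is cheaper for a given plan. That choice depends on the operators that follow, which is the decision the abstract assigns to a compression-aware optimizer.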
