Keyword Search in Databases - P24: Conceptually, a database can be viewed as a data graph $G_D(V, E)$, where $V$ represents a set of objects and $E$ represents a set of connections between objects. In this book, we concentrate on two kinds of databases: a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values, some of which are strings or full text; there is a connection between two objects if there exists at least one reference from one to the other.

CHAPTER 5
Other Topics for Keyword Search on Databases

In this chapter, we discuss several interesting research issues regarding keyword search on databases. In Section 5.1, we discuss approaches proposed for selecting a few RDBs among many to answer a keyword query. In Section 5.2, we discuss keyword search in a spatial database. In Section 5.3, we introduce a PageRank-based approach for RDBs called ObjectRank, and an approach that projects a database so that it contains only the tuples related to a keyword query.

5.1 KEYWORD SEARCH ACROSS DATABASES

There are two main issues to be considered in keyword search across multiple databases:

1. When the number of databases is large, a proper subset of the databases needs to be selected, namely those most suitable to answer a keyword query. This is the problem of keyword-based selection of the top-k databases, and it is studied in M-KS [Yu et al., 2007] and G-KS [Vu et al., 2008].
2. The keyword query needs to be executed across the databases that are selected. This problem is studied in Kite [Sayyadian et al., 2007].

5.1.1 SELECTION OF DATABASES

In order to rank a set of databases $\mathcal{D} = \{D_1, D_2, \dots\}$ according to their suitability to answer a certain keyword query $Q$, a score function $\mathrm{score}(D, Q)$ is defined for each database $D \in \mathcal{D}$. In the ideal case, if the keyword query is evaluated in each database individually, the best database to answer the query is the one that generates the highest-quality results.
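The ideal-case ranking can be sketched in Python. This is only a minimal illustration, assuming the query has already been evaluated on every database and each result has been given a numeric score, and assuming a database's score is the plain sum of its result scores. The names database_score and rank_databases and the sample data are hypothetical and do not come from M-KS or G-KS.

```python
from typing import Dict, List


def database_score(result_scores: List[float]) -> float:
    """Aggregate the scores of all results found in one database.

    Assumption: the aggregate is a plain sum; any other monotone
    aggregate could be substituted here.
    """
    return sum(result_scores)


def rank_databases(results_per_db: Dict[str, List[float]], k: int) -> List[str]:
    """Return the names of the k highest-scoring databases.

    results_per_db maps a database name to the scores of the results
    that the keyword query produced on that database (hypothetical input).
    Ties are broken by database name so the output is deterministic.
    """
    scored = {name: database_score(s) for name, s in results_per_db.items()}
    return sorted(scored, key=lambda name: (-scored[name], name))[:k]


# Hypothetical per-result scores for three databases.
results = {"D1": [0.5, 0.25], "D2": [1.0], "D3": []}
top2 = rank_databases(results, k=2)
```

With this sample input, D2 (total 1.0) ranks ahead of D1 (total 0.75), and D3, which yields no results, is left out of the top 2. The sketch also makes the cost visible: it needs the full result set from every database before it can rank anything.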
Suppose $\mathcal{T} = \{T_1, T_2, \dots\}$ is the set of results (MTJNTs, see Chapter 2) for query $Q$ over database $D$. The following equation can be used to score database $D$:

$$\mathrm{score}(D, Q) = \sum_{T \in \mathcal{T}} \mathrm{score}(T, Q) \qquad (5.1)$$

where $\mathrm{score}(T, Q)$ can be any scoring function for the MTJNT $T$, as discussed in Chapter 2.

In practice, it is inefficient to evaluate $Q$ on every database $D \in \mathcal{D}$. A straightforward way to solve the problem efficiently is to calculate the keyword statistics for each keyword $k \in Q$ on each database $D \in \mathcal{D}$ and summarize the statistics as a score reflecting the relevance of $Q$ to $D$. There are two