Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ
Tải xuống
Keyword Search in Databases- P16:Conceptually, a database can be viewed as a data graph GD(V ,E), where V represents a set of objects, and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases, a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other | 74 3. GRAPH-BASED KEYWORD SEARCH memory and the edges between innernodes are stored in cache or on disk in the form of adjacency lists the edges between supernode and innernode do not need to be stored explicitly. The weight of different kinds of edges are defined as follows. supernode supernode S S The edge weight of .V S2 is defined as the minimum weight of those edges between the innernodes of si and that of S2 i.e. we si S2 minViSsi V2Ss2 we vi V2 where weight of edge vi V2 is defined to be to if it does not exist. supernode innernode S I The edge weight of si V2 is defined as we si V2 minViSsi we vi V2 . These edges need not necessarily be explicitly represented. During the graph traversal if si is an unexpanded supernode and there is a supernode S2 in the adjacency list of supernode si and S2 is expanded such edges can be enumerated by locating all innernodes V2 e s21 the adjacency list of V2 contains some inner node in si . innernode supernode I S The edge weight in this case is defined in an analogous fashion to the previous case. innernode innernode I I Edge weight is the same as in the original graph. When searching the multi-granular graph the answers generated may contain supernodes called supernode answer. If an answer does not contain any supernodes it is called pure answer. The final answer returned to users must be pure answer. The Iterative Expansion Search algorithm IES Dalvi et al. 2008 is a multi-stage algorithm that is applicable to mulit-granular graphs as shown in Algorithm 27. Each iteration of IES can be broken up into two phases. Explore phase Run an in-memory search algorithm on the current state of the multi-granular graph. The multi-granular graph is entirely in memory whereas the supernode graph is stored in main memory and details of expanded supernodes are stored in cache. When the search reaches an expanded supernode it searches on the corresponding innernodes in cache. Expand phase Expand the supernodes found in top-n n k results .