Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ
Tải xuống
Keyword Search in Databases- P12:Conceptually, a database can be viewed as a data graph GD(V ,E), where V represents a set of objects, and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases, a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other | 54 3. GRAPH-BASED KEYWORD SEARCH Algorithm 17 BackwardSearch Gd Q Input a data graph Gd and an l-keyword query Q ki ki . Output Q-subtrees in increasing weight order. 1 Find the sets of nodes containing keywords Si Si S J 1 Si 2 3 ItHeap 0 OutHeap 0 for each keyword node v e S do 4 5 Create a single source shortest path iterator Iv with v as the source node ItHeap. insert Iv and the priority of Iv is the distance of the next node it will return 6 while ItHeap 0 and more results required do 7 8 9 Iv - ItHeap.pop u - Iv.next if Iv has more nodes to return then 10 ItHeap. insert Iv 11 if u is not visited before by any iterator then 12 Create u.Li and set u.Li 0 for 1 i l 13 CP v x j i u.Lj where v e Si 14 Insert v into u.Li 15 for each tuple e CP do 16 Create ResultTree from tuple 17 if root of ResultTree has only one child then 18 continue 19 if OutHeap is full then 20 Output and remove top result from OutHeap 21 Insert ResultTree into OutHeap ordered by its weight 22 output all results in OutHeap in increasing weight order The connected tress generated by BackwardSearch are only approximately sorted in increasing weight order. Generating all the connected trees followed by sorting would increase the computation time and also lead to a greatly increased time to output the first result. A fixed-size heap is maintained as a buffer for the generated connected trees. Newly generated trees are added into the heap without outputting them line 21 . Whenever the heap is full the top result tree is output and removed line 20 . Although BackwardSearch is a heuristic algorithm the first Q-subtree output is an l-approximation of the optimal steiner tree and the Q-subtrees are generated in increasing height order. The Q-subtrees generated by BackwardSearch is not complete as BackwardSearch only considers the shortest path from the root of a tree to nodes containing keywords. 3.3. STEINER TREE-BASED KEYWORD SEARCH 55 Figure 3.4 Optimal Substructure Ding et al. 2007 3.3.2 DYNAMIC .