Keyword Search in Databases - P4: Conceptually, a database can be viewed as a data graph G_D(V, E), where V represents a set of objects and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases: a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values, where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other.

2. SCHEMA-BASED KEYWORD SEARCH ON RELATIONAL DATABASES

[Figure 2.1: DBLP Database Schema (Qin et al., 2009a). Relations shown: Author, Write, Paper, Cite.]

... relation r(R_i). Together with the two values, a tuple is uniquely identified in the entire RDB. For simplicity and without loss of generality, in the following discussions we assume primary keys are TID, and we use primary key and TID interchangeably.

Given an RDB on the schema graph G_S, we say two tuples t_i and t_j in the RDB are connected if there exists at least one foreign key reference from t_i to t_j or vice versa, and we say two tuples t_i and t_j are reachable if there exists at least one sequence of connections between t_i and t_j. The distance between two tuples t_i and t_j, denoted dis(t_i, t_j), is defined as the minimum number of connections between t_i and t_j.

An RDB can be viewed as a database graph G_D(V, E) on the schema graph G_S. Here, V represents a set of tuples and E represents a set of connections between tuples. There is a connection between two tuples t_i and t_j in G_D if there exists at least one foreign key reference from t_i to t_j or vice versa (undirected) in the RDB. In general, two tuples t_i and t_j are reachable if there exists a sequence of connections between t_i and t_j in G_D. The distance dis(t_i, t_j) between two tuples t_i and t_j is defined the same as over an RDB. It is worth noting that we use G_D to explain the semantics of keyword search but do not materialize G_D over the RDB.
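The definitions of connected, reachable, and dis(t_i, t_j) can be illustrated with a breadth-first search over a small in-memory graph. This is only a minimal sketch for intuition: the text explicitly does not materialize G_D, and the helper names `build_adjacency` and `dis`, the `(relation, TID)` tuple identifiers, and the toy references below are all assumptions made for illustration, not part of the book.

```python
from collections import defaultdict, deque


def build_adjacency(references):
    """Build an undirected adjacency map from foreign-key references.

    `references` is an iterable of (referencing_tuple, referenced_tuple)
    pairs, where each tuple is identified by a (relation, TID) pair.
    """
    adjacency = defaultdict(set)
    for src, dst in references:
        adjacency[src].add(dst)
        adjacency[dst].add(src)  # a connection holds in either direction
    return adjacency


def dis(adjacency, t_i, t_j):
    """Minimum number of connections between t_i and t_j.

    Returns None when the two tuples are not reachable from each other.
    """
    if t_i == t_j:
        return 0
    seen = {t_i}
    queue = deque([(t_i, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt == t_j:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None


# Hypothetical toy instance: two authors of one paper, which cites another.
references = [
    (("Write", "w1"), ("Author", "a1")),
    (("Write", "w1"), ("Paper", "p1")),
    (("Write", "w2"), ("Author", "a2")),
    (("Write", "w2"), ("Paper", "p1")),
    (("Cite", "c1"), ("Paper", "p1")),
    (("Cite", "c1"), ("Paper", "p2")),
]
adjacency = build_adjacency(references)
```

In this toy instance, the two co-authors are reachable through the path Author, Write, Paper, Write, Author, so `dis(adjacency, ("Author", "a1"), ("Author", "a2"))` counts four connections, while a tuple with no references is unreachable from every other tuple.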
Example 2.1 A simple DBLP database schema, G_S, is shown in Figure 2.1. It consists of four relation schemas: Author, Write, Paper, and Cite. Each relation has a primary key TID. Author has a text attribute Name. Paper has a text attribute Title. Write has two foreign key references: AID refers to the primary key defined on Author, and PID refers to the primary key defined on Paper. Cite specifies a citation relationship between two papers using two foreign key references, namely PID1 and PID2 (paper PID2 is cited by paper PID1), and both refer to the primary key defined on Paper.
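The schema in Example 2.1 could be written down as SQL along the following lines, here using Python's built-in `sqlite3` module. This is a sketch under assumptions: the relation and attribute names come from the example, but the column types, the `NOT NULL` constraints, the choice of SQLite, and the helper name `create_dblp_schema` are illustrative and are not specified in the text.

```python
import sqlite3

# Relation and attribute names follow Example 2.1; types are assumed.
SCHEMA = """
CREATE TABLE Author (TID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Paper  (TID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE "Write" (
    TID INTEGER PRIMARY KEY,
    AID INTEGER NOT NULL REFERENCES Author(TID),
    PID INTEGER NOT NULL REFERENCES Paper(TID)
);
CREATE TABLE Cite (
    TID  INTEGER PRIMARY KEY,
    PID1 INTEGER NOT NULL REFERENCES Paper(TID),  -- citing paper
    PID2 INTEGER NOT NULL REFERENCES Paper(TID)   -- cited paper
);
"""


def create_dblp_schema(conn):
    """Create the four relations of Example 2.1 on an open connection."""
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_dblp_schema(conn)
```

With foreign keys enabled, inserting a Write or Cite row that references a missing Author or Paper TID is rejected, which mirrors the foreign key references drawn in the schema graph.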