
A Logic Query Language and its Algebraic Optimization for a Multiprocessor Database Machine

Maurice A.W. Houtsma, Hendricus J.A. van Kuijk, Jan Flokstra, Peter M.G. Apers, Martin L. Kersten

Memorandum INF-88-52, December 1988

The work reported in this document was conducted in the PRISMA project, a joint effort with Philips Research Laboratories Eindhoven, partially supported by the Dutch "Stimuleringsprojectteam Informaticaonderzoek Nederland (SPIN)". University of Twente, Computer Science Department, P.O. Box 217, 7500 AE Enschede, the Netherlands. Centre for Mathematics and Computer Science, Kruislaan 413, Amsterdam, the Netherlands.

Abstract
A logic query language, called PRISMAlog, is introduced. The language is one of the interfaces of a multiprocessor, main-memory database machine, called PRISMA. It is a language with a purely declarative semantics; the meaning of a program is given by its least fixed-point. Besides allowing recursive rules, PRISMAlog supports operations like negation, arithmetic, aggregates, and group-by. Optimization of PRISMAlog programs is completely algebraic, and focuses on the use of distributed database techniques to introduce parallelism. The optimization criterion is minimization of response time. Techniques used to optimize PRISMAlog programs and to produce parallel schedules are illustrated.

Chapter 1

Introduction
In the PRISMA project one of the main research issues is to develop a multiprocessor, main-memory database machine. The research is focused on the use of distributed database design techniques to achieve a high degree of parallelism, which is used to improve query response time. Besides a powerful relational database machine, we would like to offer a powerful query language as well. Therefore, we have developed a logic database query language, called PRISMAlog. The choice for a pure logic language was made for several reasons:
- It offers a high level of abstraction, allowing specification of rules and complex views, thereby offering possibilities for support of reasoning.
- It allows for recursive rules.
- It is a good basis for extensions, such as complex objects.
- It has a clear semantics, but allows for different control structures specifying its execution model, for instance, tuple-oriented resolution or set-oriented database compilation.
- It is amenable to parallel processing.
The choice of logic-based query languages for database systems has been made in several research projects, such as [12,24,28]. We will introduce our approach through comparison with some of the predominant approaches. To extend the expressive power of a database, research has been conducted on coupling a logic language, like Prolog, to an existing database system [10,18]. Because of the close fit between a Horn clause-like language

and a relational database, this coupling is conceptually simple. However, Prolog has a sequential execution model, which heavily influences the efficiency of this coupling. It leads to single-tuple calls to the database system, and a nested-loop join employed by the Prolog program to combine tuples in different relations, whereas the relational database system itself works in a set-oriented way and can, for example, choose from a variety of join strategies, making use of all its knowledge about indexes, sorting, etc. Moreover, Prolog contains some non-logical features, like the cut operator, which turn it into a rather imperative language with a strict sequential execution model. This makes it more difficult to write programs, and heavily influences its amenability to parallel processing. The aforementioned drawbacks have triggered research on developing better logic languages. For instance, LDL is a logic-based data language that has a non-sequential execution model. It uses compilation techniques to achieve a set-oriented model of computation, and to support some extra features [28]. The design of LDL is based on pure Horn clause logic; it contains sets as primitive data objects and it supports complex terms. The development of LDL takes place in the context of pure logic. Its semantics is described by means of a term model, which is not completely satisfactory because this description leads to a very complex and extensive model. For example, capturing the semantic properties of the equality relation (=) is a very tedious, extensive task [6]. Evaluable predicates (like arithmetic functions) are not included in the language, and neither is typing. A number of possible extensions are being investigated [7,20]. In PRISMAlog, we start from a pure Horn clause language too (sometimes called Datalog [24]). We have restricted PRISMAlog to be a pure data retrieval language. In this way, we can supply a simple semantics for the language.
Data definition primitives and updates are not yet well understood in the context of logic, and would thus lead to a very complex semantics. Therefore, data definition and updates have to take place through the SQL interface of the PRISMA database machine. Because we noticed that a language such as SQL offers a number of retrieval capabilities not encountered in logic languages, we have sought ways to incorporate these in PRISMAlog as well. This is done for arithmetic functions, group by, and aggregate operations. PRISMAlog is based on a strict set-oriented model of computation. The meaning of a program is given by its least fixed-point, and therefore the order of the rules, and of the predicates within a rule, is of no importance for its meaning. Hence, PRISMAlog is a purely declarative language. Actually, the semantics

of PRISMAlog is defined in terms of an eXtended Relational Algebra (XRA). This allows a complete algebraic optimization of a PRISMAlog program, and gives the relational query optimizer ample opportunities to employ any of the known optimization strategies. The translation of recursive rules uses a fixed-point operator, and a rewriting strategy is employed by the query optimizer to rewrite a recursive program into a number of relational expressions and transitive closure operations. This rewriting strategy itself is described in more detail in [1,2,15]. In this document we show how the query optimizer handles recursive programs. We also briefly discuss the transitive closure operation and the use of parallelism to compute it. The remainder of the document is structured as follows: in Ch. 2 we discuss the PRISMA machine and the architecture of the database system, in Ch. 3 we describe PRISMAlog in full detail, and finally in Ch. 4 we describe the optimization of PRISMAlog programs and the production of parallel schedules.

Chapter 2

The PRISMA Database Machine


The PaRallel Inference and Storage MAchine (PRISMA) is a highly parallel machine for data and knowledge processing. The PRISMA machine is designed to support both a main-memory database system and an expert system shell. In this chapter we focus on the architecture, as far as it is relevant for the database machine, and on the design of the database system. More information about the PRISMA database machine can be found in [3,19]. The PRISMA database machine consists of 64 nodes. Each node is composed of a data processor, a communication processor, and 16 Mbyte of memory. The nodes are connected by a high-speed network. The machine is built from commercially available, state-of-the-art hardware. Most database management systems are organized as a set of tightly coupled programs. The requirement of good performance often results in systems that are coupled more tightly than advisable for good software maintainability. Because the theory of distributed database systems has passed its infancy, we claim that it becomes possible to apply distributed techniques within a single database management system as well. According to this philosophy, which we have adopted for the PRISMA database system, a traditional database management system is viewed as a tightly-coupled distributed system. This approach is effectuated in PRISMA by fragmenting the relations, and letting each fragment be managed by its own local database management system. Such a local database management system is called a One-Fragment Manager (OFM). It contains all functionality normally encountered in a full-blown DBMS. These OFMs are regarded as the unit of parallelism in the database machine. Parallelism is obtained by having a number of OFMs working simultaneously on several nodes. This shows that mainly coarse-grain parallelism is considered. The design supports two major ways of improving the performance of a database system. First, the query of a user is transformed into a parallel schedule. This leads to several OFMs working simultaneously to process the query, and minimizes response time. Second, each user has a private instantiation of a parser, a query optimizer, and a transaction manager. Hence, concurrent users are working in parallel on the machine, if the consistency of the database allows it. The PRISMA database machine has two different user interfaces. An SQL interface is included to support existing applications and gateways to other systems. A logic programming interface, PRISMAlog, is supported to allow for complex query formulation, including recursion. In the next chapter, PRISMAlog is discussed in full detail.

Chapter 3

The logic query language PRISMAlog


A logic query language like PRISMAlog allows for the formulation of complex queries, and therefore a powerful database machine is required to obtain acceptable performance. To use the available parallelism in the database machine, it should be a language with a set-oriented model of computation, which does not contain extra-logical features (such as the Prolog cut) that affect this model. The reason is that in this way we can leave all the issues regarding `inference' and program execution up to the relational query optimizer. Additional constructs, which do not affect the set-oriented model of computation and are supported by the relational database, can be integrated in the PRISMAlog language. The approach taken in the design of PRISMAlog is the following:
- It is based on pure Horn clause logic; no arbitrary functions are allowed.
- It has a purely declarative semantics; the meaning of a program is given by its least fixed-point.
- It is a data retrieval language; there are no data definition or update predicates.
- It uses a derived typing mechanism.
Because a program is based on pure Horn clause logic, and its meaning is given by its least fixed-point, the sequential order of execution of the rules

in a program has been removed. Also, the evaluation order of the subgoals within a rule is not determined by their syntactic ordering. This relieves the programmer from the burden of controlling the efficient execution of a program, and it means he does not have to worry about non-termination of his program caused by an improper order of the rules/subgoals (cf. Prolog, where programs can be written that never terminate because of such an improper order). We have not introduced any data definition constructs in PRISMAlog, as is done, for instance, in LDL [28]. For a language to contain data definition constructs it should, in our opinion, at least contain typing, and the meaning of data definition constructs and their consequences should be perfectly clear (which is true for a relational environment). Unfortunately, the concept of an update (be it at the schema level or at the tuple level) is not yet well understood in the context of logic languages. Instead, we rely on the SQL interface to the PRISMA database machine for the definition and maintenance of relations. Since relations are defined through an SQL interface, we can use the data type of the attributes as an extra check on the correctness of a PRISMAlog program. Whenever we use a relation in a PRISMAlog program, we can thus determine the type of the variables from the relation definition in the data dictionary. We will now describe the PRISMAlog language, its extra constructs, and its meaning (which is given by a translation into eXtended Relational Algebra).

3.1 Simple Horn Clauses


PRISMAlog resembles other logic languages, like Prolog, in its syntax. Its alphabet is composed of constants, variables, predicates, and the boolean connective & (`logical and'). We adopt the Prolog convention that variables are denoted by identifiers starting with an uppercase character, predicates are denoted in lowercase characters, and constants are denoted in lowercase characters or between quotes. A PRISMAlog program consists of a set of Horn clauses. The three types of Horn clauses that can occur are facts, which are unit clauses [23], rules, and queries, which are definite goals that start with a question mark. An example of a PRISMAlog program, which derives all employees and their salary that either work for the accounting department or for the sales

department is:

  acc_or_sales(Enr, Sal) ← employee(Enr, Name, accounting, Sal).
  acc_or_sales(Enr, Sal) ← employee(Enr, Name, sales, Sal).
  ?- acc_or_sales(X, Y).

In this program we suppose a relation EMPLOYEE with attributes employee number, name, department, and salary to exist in the database, and we solve the program w.r.t. the actual database extension. This amounts to a proof of satisfiability in the model-theoretic sense [13], where the database extension is viewed as an interpretation and the query as a formula to be evaluated on this interpretation. On the PRISMA database machine, however, we do not take a logic approach to solving a query. Instead, a query is translated into an eXtended Relational Algebra (XRA) expression. This expression is regarded as the meaning of a query, and its result is the least fixed-point of the query. The meaning of the above-mentioned program would thus be given by the following XRA expression:

  π1,4(σ2='accounting' EMPLOYEE) ∪ π1,4(σ2='sales' EMPLOYEE)

As can be seen, every predicate definition leads to one XRA expression, and when there are several clauses that define the same predicate, a union of their corresponding XRA expressions is taken. By considering the meaning of a program in this way, it is guaranteed that the order of the predicates in a rule does not influence the meaning, or execution, of the program; the same is true for the order of the rules. The translation of non-recursive Horn clauses into Relational Algebra is straightforward [9]. Effectively, every predicate definition (which can be composed of several PRISMAlog rules) is translated into a view definition in XRA. The translation of recursive rules is not completely straightforward; this is explained in the next section.
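The select-project-union reading of the example can be made concrete with a small Python sketch. This is illustrative only: the relation contents and the helper names `select` and `project` are invented here, and PRISMA of course evaluates XRA expressions, not Python.

```python
# Hypothetical EMPLOYEE(enr, name, dep, sal) extension, as a set of tuples.
EMPLOYEE = {
    (1, "smith", "accounting", 3000),
    (2, "jones", "sales", 2800),
    (3, "brown", "research", 3500),
}

def select(rel, pred):
    """sigma: keep the tuples satisfying the predicate."""
    return {t for t in rel if pred(t)}

def project(rel, cols):
    """pi: keep the listed attribute positions."""
    return {tuple(t[c] for c in cols) for t in rel}

# acc_or_sales: union of one select-project expression per clause.
acc_or_sales = (
    project(select(EMPLOYEE, lambda t: t[2] == "accounting"), (0, 3))
    | project(select(EMPLOYEE, lambda t: t[2] == "sales"), (0, 3))
)
print(sorted(acc_or_sales))  # [(1, 3000), (2, 2800)]
```

Because both operands are sets, the order in which the two clauses are evaluated is irrelevant, mirroring the declarative semantics.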

3.2 Recursive Rules


One important feature of logic languages is the possibility to specify recursive rules. Let us present an example in which we assume a relation CONN in our database, with attributes departure city and arrival city. This relation represents all direct connections between two cities that can be made by train. A rule that defines all possible connections is specified as follows:

  ind_conn(Dep, Arr) ← ind_conn(Z, Arr) & conn(Dep, Z).
  ind_conn(Dep, Arr) ← conn(Dep, Arr).

This example clearly shows that the order of specification of the rules is irrelevant. As a Prolog program it would never terminate because of the order of the rules; as a PRISMAlog program there is no problem. The least fixed-point of the program defines its meaning and no inference strategy is imposed by the language. Since we cannot specify recursion in Relational Algebra, we have extended Relational Algebra with a fixed-point operator, called a μ-calculus expression. This concept is borrowed from theoretical computer science, where it is used for describing the semantics of sequential programs [27]. The translation of our example would now be as follows:

  IND_CONN = μX[ CONN ∪ π1,4(CONN ⋈2=1 X) ]

where IND_CONN has two attributes, just like the relation CONN, denoting departure and arrival city. The meaning of this expression is obtained by iterating over the variable X. First, the empty relation (∅) is substituted for X and the result of the expression is computed. Then, this result is substituted for X and the expression is computed again, and so on. Because all operations are monotone (negation is not allowed) and the database is finite, it is guaranteed that the least fixed-point of the expression exists. Therefore, the result of the expression will, at a certain iteration step, become stable; no new tuples are generated beyond this iteration step. This process is shown in Table 3.1, where CONN^i denotes π1,4(CONN ⋈2=1 CONN^(i-1)), CONN^1 = CONN, and CONN^0 is the identity relation. The projection is necessary to make the result of the expression union-compatible with the starting relation (here CONN), so that it can be substituted without any problems. Note that Table 3.1 gives the meaning of the expression; it does not imply that the result of a μ-calculus expression is obtained by such a computation. Actually, we will use a rewriting algorithm for regular recursive queries [1,15], and an iterative parallel strategy for non-regular recursive queries [2]. The terms regular and non-regular stem from formal languages, and mean there exists a corresponding regular or context-free (non-regular) grammar [14]. Note that in the context of logic languages one sometimes uses the term regular in a different way, to indicate what we shall call linear recursion. Linear recursion means there is precisely one recursive

  iteration   variable                     result
  1           ∅                            CONN
  2           CONN                         CONN ∪ π1,4(CONN ⋈2=1 CONN)
  ...         ...                          ...
  n           CONN^0 ∪ ... ∪ CONN^(n-1)    CONN^0 ∪ ... ∪ CONN^n

Table 3.1: Meaning of the μ-calculus expression

rule for each predicate, and the only recursive predicate allowed in the body of a rule is the one corresponding to the rule that is being defined. We will now extend our example to make a system of regular, mutually recursive rules. Assume a relation TRAIN in our database with as attributes departure city, arrival city, departure time, and arrival time. There also exists a relation BUS with the same schema, and a relation CHANGE with an attribute city that indicates one can change from bus to train and vice versa. When we want to define a, possibly alternating, sequence of train and bus connections, we have the following possibilities. First, we can take a simple connection by train. Second, we can take a single connection by train, followed by a number of train and bus connections. We have to make sure that we do not go back to the departure city, and we only take trips leaving a city later than our time of arrival in that city. Third, we can take a single train connection, change to a bus, and continue with a number of bus and train connections. The same restrictions concerning time of departure and place of arrival apply. The same three cases as distinguished for trips starting with a train can be described for trips starting with a bus. The complete example, which forms a system of regular, mutually recursive rules, and defines a, possibly alternating, trip of bus and train connections, is expressed in PRISMAlog as follows:

  train_trip(Dep, Arr, Dtime, Atime) ← train(Dep, Arr, Dtime, Atime).
  train_trip(Dep, Arr, Dtime, Atime) ← train(Dep, Z, Dtime, At) &
      train_trip(Z, Arr, Dt, Atime) & Arr≠Dep & At < Dt.
  train_trip(Dep, Arr, Dtime, Atime) ← train(Dep, A, Dtime, At) & change(A) &
      Arr≠Dep & At < Dt & bus_trip(A, Arr, Dt, Atime).
  bus_trip(Dep, Arr, Dtime, Atime) ← bus(Dep, Arr, Dtime, Atime).
  bus_trip(Dep, Arr, Dtime, Atime) ← bus(Dep, Z, Dtime, At) &
      bus_trip(Z, Arr, Dt, Atime) & Arr≠Dep & At < Dt.
  bus_trip(Dep, Arr, Dtime, Atime) ← bus(Dep, A, Dtime, At) & change(A) &
      Arr≠Dep & At < Dt & train_trip(A, Arr, Dt, Atime).

When we translate these rules into μ-calculus expressions we get the following:

  TRAIN_TRIP = μX[ TRAIN ∪ π1,6,3,8(TRAIN ⋈2=1∧1≠2∧4<3 X)
                         ∪ π1,6,3,8(TC ⋈2=1∧1≠2∧4<3 BUS_TRIP) ]
  BUS_TRIP   = μY[ BUS ∪ π1,6,3,8(BUS ⋈2=1∧1≠2∧4<3 Y)
                       ∪ π1,6,3,8(BC ⋈2=1∧1≠2∧4<3 TRAIN_TRIP) ]

with

  TC = π1,2,3,4(TRAIN ⋈2=1 CHANGE)
  BC = π1,2,3,4(BUS ⋈2=1 CHANGE)

For simplicity and readability we have introduced some virtual variables here, but since μ-calculus expressions can be nested it would be no problem to write the translation down in one XRA expression. The notation used to express Relational Algebra operations is that of [29]. Note that we have only considered how to express recursion in XRA. Optimization of recursive queries (which has nothing to do with the way they are expressed) is described later, in Sec. 4.2. The same example has been used in [15,16] to demonstrate the aforementioned rewriting mechanism, and is used in Sec. 4.2 when explaining the production of schedules for computation of the result of recursive rules. In the next section we describe some of the additional constructs allowed in PRISMAlog, such as the comparison operators used in the previous example.
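The iteration of Table 3.1 can be sketched in a few lines of Python. This is a minimal illustration of the fixed-point semantics, not the PRISMA evaluation strategy (which rewrites regular recursion into transitive closure operations); the CONN extension is invented.

```python
# Example CONN(dep, arr) extension.
CONN = {("a", "b"), ("b", "c"), ("c", "d")}

def step(x):
    # CONN  U  pi[1,4](CONN join[2=1] X): extend each direct connection
    # with an already-derived indirect connection.
    return CONN | {(d, a2) for (d, a) in CONN for (d2, a2) in x if a == d2}

x = set()            # iteration 1: substitute the empty relation for X
while True:
    nxt = step(x)
    if nxt == x:     # no new tuples: the least fixed point is reached
        break
    x = nxt

IND_CONN = x
print(sorted(IND_CONN))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```

Monotonicity of `step` and finiteness of the database guarantee that the loop terminates, exactly as argued in the text.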

3.3 Additional Constructs


Although the most important feature of a logic language is the specification of recursive rules, the language would be rather limited if some of the common operations that are available in a normal relational database were not included. Since there also is an SQL interface available on the PRISMA database machine, and, therefore, retrieval operations like comparisons, set difference, aggregates, and grouping are supported, we have sought ways to integrate these operations in PRISMAlog as well; of course, without compromising the set-oriented model of computation that allows efficient query optimization. These extensions are described below.

3.3.1 Comparisons

A straightforward extension was already illustrated in the last example, namely the use of comparisons. In PRISMAlog, we allow the use of comparison operators to compare the values of two variables, or of a variable and a constant. Comparison operations are easy to support because, unlike their role in inference processing, their semantics can be directly expressed by selections or join conditions. This means the set-oriented model of computation is not affected.

3.3.2 Negation

A second extension is the introduction of a kind of negation. It has been widely accepted that negation by failure, as used in Prolog, is not appropriate in a database context [28]. But this does not mean that we cannot introduce negation in a logic language. After all, common relational database systems already support a certain, well-defined, type of negation. The type of negation supported by relational database systems is set difference. This type of negation can, with the necessary care, be integrated in the PRISMAlog language. The meaning of this type of negation is very clear: a set of tuples is always negated w.r.t. a certain superset. Therefore, we allow the negation of a predicate in a PRISMAlog program whenever there is another predicate with exactly the same schema. The relation corresponding to this predicate then functions as a superset w.r.t. the relation corresponding to the negated predicate. The restriction that there has to appear a predicate with exactly the same schema as the negated predicate in the body of the rule is not enforced in LDL. There, a negated predicate is translated into a complement expression; this means the compiler tries to find a superset for the negated predicate and then generates a set difference expression [28]. This leads to problems such as defining a partial order on the relations, and a, in our opinion, sometimes counter-intuitive semantics of negation. An example of the way we use negation is the following:

  conn(Dep, Arr) ← train(Dep, Arr, Dt, At).
  early_conn(Dep, Arr) ← train(Dep, Arr, Dt, At) & Dt < 7:30.
  late_conn(Dep, Arr) ← conn(Dep, Arr) & ¬early_conn(Dep, Arr).

Its translation is given by the following XRA statement:

  LATE_CONN = π1,2(TRAIN) - π1,2(σ3<'7:30' TRAIN)

When there are several predicates in the body of a rule with the same schema as the negated predicate, brackets have to be used to indicate which one serves as a superset to the negated predicate. A problem with this approach is that set difference, which gives negation its required semantics, is a non-monotonic operation. Careless use would not guarantee that the least fixed-point of a program exists. This problem can be solved by only allowing so-called stratified programs [4]. In effect, this leads to splitting a program into several layers, each containing a number of definitions. Whenever a predicate is negated, its definition should have been completed in a previous layer. This means that the corresponding relation can be computed completely before the negation, without interfering with the definition at hand. An example of incorrect use of negation, which is not allowed in PRISMAlog, is the following:

  rule1(X, Y) ← rel1(X, Y) & ¬rule2(X, Y).
  rule2(X, Y) ← rel2(X, Y) & Y > 100.
  rule2(X, Y) ← rel3(X, Z) & rule1(Z, Y).

In this program rule2 uses rule1 in its definition. Therefore, the negation of rule2 in the body of rule1 leads to a definition of rule2 in terms of its own negation. This kind of use of negation is prevented by only allowing stratified programs.
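The set-difference reading of negation can be sketched as follows; the TRAIN extension and the city abbreviations are invented for illustration, and times are compared as strings purely for convenience of the sketch.

```python
# Hypothetical TRAIN(dep, arr, dtime, atime) extension.
TRAIN = {
    ("ens", "ams", "07:00", "09:05"),
    ("ens", "utr", "08:15", "09:40"),
    ("utr", "ams", "10:00", "10:30"),
}

# conn = pi[1,2](TRAIN); early_conn = pi[1,2](sigma[3 < "07:30"](TRAIN))
conn       = {(d, a) for (d, a, dt, at) in TRAIN}
early_conn = {(d, a) for (d, a, dt, at) in TRAIN if dt < "07:30"}

# LATE_CONN = pi[1,2](TRAIN) - pi[1,2](sigma[3 < "07:30"](TRAIN)):
# every negated set of tuples is subtracted from an explicit superset.
late_conn = conn - early_conn
print(sorted(late_conn))  # [('ens', 'utr'), ('utr', 'ams')]
```

Because `early_conn` is computed completely before the difference is taken, the sketch also illustrates why stratification makes the fixed-point well defined.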

3.3.3 Arithmetic

Many database query languages, like SQL, support some kind of arithmetic. We have chosen to incorporate arithmetic in PRISMAlog in the form of evaluable predicates. Care should be taken in two respects: we should not introduce an operational semantics, and we should avoid programs that do not have a fixed-point solution. The first problem is avoided by not allowing evaluable predicates in the body of a rule, but only in the head. This means the order of evaluation of the predicates in the body of a rule remains free. It also stays in line with our

previous view on rules: the body of a rule is the actual view definition and the head determines the view name and the appropriate projection, which determines the presentation to the user. In fact, arithmetic is regarded as a kind of extended projection. An example of the use of arithmetic is the following:

  emp_view(E, D, add(Sal, Bonus)) ← sales_represent(E, D, Sal, Bonus).

Its translation is given by the following XRA expression:

  π1,2,3+4(SALES_REPRESENT)

Note the extended projection that models the arithmetic operation. To avoid the second problem we have, at least for the moment, taken a rigorous approach: it is not allowed to use evaluable predicates in rules that are recursively defined. Let us illustrate by an example why evaluable predicates in a recursively defined rule can lead to problems.

  p(1).
  p(add(X, 1)) ← p(X).
  ?- p(X).

This program has no finite least fixed-point, and concludes to p(1), p(2), ... Therefore, we do not allow evaluable predicates in a recursively defined rule.

3.3.4 Aggregates and Grouping

By analogy we now introduce aggregates, which are a special kind of arithmetic function. However, in many query languages aggregates appear in conjunction with groupings. For instance: "For each department, give the average salary of its employees." The restriction of aggregates to the head of a rule allows us to integrate grouping into PRISMAlog as well. When aggregate functions are being used, the other variables in the head determine the grouping. This means the above-mentioned question is formulated in PRISMAlog as follows:

  mean_sal_per_dep(Dep, avg(Sal)) ← employee(Emp, Dep, Sal).

When there are no other variables in the head, there is no grouping and the aggregate is computed over the whole relation generated by the view definition. Hence, the computation of the average salary paid by a company to its employees can be formulated as follows:

  mean_sal(avg(Sal)) ← employee(Emp, Dep, Sal).

The translation of the two PRISMAlog statements, using the same notation as in [11], is as follows:

  MEAN_SAL_PER_DEP = GB(2),AVG(3)(EMPLOYEE)
  MEAN_SAL = GB(),AVG(3)(EMPLOYEE)

The construct in XRA to express group-by and aggregate operations is essentially an extension of Relational Algebra as presented in [11]. By integrating group-by and aggregates in PRISMAlog in this way, we have the same expressive power as this extension to Relational Algebra.
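The grouping convention (non-aggregated head variables determine the groups) can be sketched as follows. The relation contents and the helper `group_avg` are invented for illustration; this is not PRISMA's GB/AVG operator itself.

```python
from collections import defaultdict

# Hypothetical EMPLOYEE(emp, dep, sal) extension.
EMPLOYEE = {
    (1, "acc", 3000), (2, "acc", 3400), (3, "sales", 2800),
}

def group_avg(rel, group_col, agg_col):
    """Group on one attribute position and average another."""
    groups = defaultdict(list)
    for t in rel:
        groups[t[group_col]].append(t[agg_col])
    return {(k, sum(v) / len(v)) for k, v in groups.items()}

# mean_sal_per_dep(Dep, avg(Sal)) <- employee(Emp, Dep, Sal).
print(sorted(group_avg(EMPLOYEE, 1, 2)))  # [('acc', 3200.0), ('sales', 2800.0)]

# mean_sal(avg(Sal)): no grouping variable, so average the whole relation.
print(sum(t[2] for t in EMPLOYEE) / len(EMPLOYEE))
```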


Chapter 4

Producing Parallel Schedules


In this chapter we introduce the production of parallel schedules for PRISMAlog programs. First, we show how a schedule is produced for a standard, non-recursive program. Then, we show the optimization path of a recursive program and its resulting schedule. Finally, we focus on the transitive closure operation, because it is crucial to the computation of a large, relevant subclass of recursive programs.

4.1 Non-Recursive Programs


Basically, the task of a Query Processor is divided into two parts: generating an efficient query evaluation plan, and executing the corresponding schedule. In PRISMA/DB, minimizing response time is chosen as the main optimization criterion. To facilitate query optimization, the work has been divided into three stages:
1. Fragmentation and views are removed from the query.
2. Selections and projections are pushed down the query tree as far as possible. During this stage, fragmentation criteria are used to derive extra information.
3. The query tree is transformed to discard operands detected to result in the empty relation. During this stage, efficient join sequences are generated to exploit parallelism as much as possible.

4.1.1 Exploiting Parallelism

In general, using more than one data processor to exploit parallelism in query processing is a non-trivial problem. In PRISMA/DB the following potential sources of parallelism in query processing are explored: parallelism in the DBMS, and parallelism in the execution of schedules. Parallelism in the DBMS is obtained by allowing several instances of functional DB-components to operate in parallel. For each individual concurrent user session a Parser, a QO, and a Transaction Manager are created. These components can be located at different processors, thereby increasing parallelism. Moreover, several users can work simultaneously, thereby increasing throughput. Parallelism in the execution of schedules can be obtained in several ways. First, pipelining can be used to send tuples to a next operation as soon as they are created. Second, several parts of the query tree can be executed in parallel (depending on the fragmentation). This can be viewed as splitting a tree into several subtrees that are executed in parallel. Third, intermediate results can be delivered in a distributed way; the results are then kept on different processors. Fourth, distributed algorithms can be used to implement some of the operations, for instance, a distributed computation of a join or a transitive closure. This is explained in more detail in Sec. 4.1.2 and Sec. 4.3. Consider the example schedule depicted in Fig. 4.1, which illustrates the different types of parallelism exploited. The join operations Q ⋈ R and S ⋈ T can be executed in parallel. Their results are pipelined to the next join, and to the selection. The final union does not have to be computed; the result can be kept at different processors. The degree of parallelism in the execution of the schedules produced by the query optimizer depends on the partitioning of global relations into fragments. To exploit parallelism in query execution, the fragments of a global relation are stored at different locations.
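The pipelining idea can be sketched with Python generators, purely as an analogy for tuple-at-a-time dataflow between operators (PRISMA's OFMs stream tuples between processors; the operator names here are invented):

```python
def scan(rel):
    """Leaf operator: produce the stored tuples one by one."""
    for t in rel:
        yield t

def select(tuples, pred):
    """Filter operator: forward each qualifying tuple immediately."""
    for t in tuples:
        if pred(t):
            yield t          # no need to wait for the full input

def project(tuples, cols):
    for t in tuples:
        yield tuple(t[c] for c in cols)

R = [(1, "a"), (2, "b"), (3, "c")]
pipeline = project(select(scan(R), lambda t: t[0] > 1), (1,))
result = list(pipeline)      # downstream pulls tuples through the chain
print(result)                # [('b',), ('c',)]
```

Each operator starts consuming as soon as its producer emits a tuple, which is the response-time benefit pipelining gives between OFMs on different nodes.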
4.1.2 Horizontal Fragmentation

Currently, we only consider horizontal fragmentation. This partitioning is done satisfying the reconstruction, completeness, and disjointness conditions [11]. When a distributed join is computed for global relations R and S, which are each distributed in n fragments, the number of joins will in general be n². If S has a fragmentation that is derived from R, the join

[Figure 4.1: Schedule graph, a query tree over the relations R, S, and T]

graph is simplified and this number can be reduced to n, as described in [11] and illustrated below:
  Q = R ⋈ S
    = (R1 ∪ R2 ∪ ... ∪ Rn) ⋈ (S1 ∪ S2 ∪ ... ∪ Sn)
    = (R1 ⋈ S1) ∪ (R1 ⋈ S2) ∪ ... ∪ (R1 ⋈ Sn)
      ∪ ...
      ∪ (Rn ⋈ S1) ∪ (Rn ⋈ S2) ∪ ... ∪ (Rn ⋈ Sn)
    = (R1 ⋈ S1) ∪ (R2 ⋈ S2) ∪ ... ∪ (Rn ⋈ Sn)
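The reduction from n² fragment joins to n can be checked on a toy example. The fragment contents are invented; the point is only that when S is fragmented on the same key ranges as R, non-aligned fragment pairs contribute nothing.

```python
# R and S fragmented on the same key ranges (S derived from R).
R_frags = [[(1, "r1"), (2, "r2")], [(3, "r3")]]
S_frags = [[(1, "s1")], [(3, "s3"), (3, "s3b")]]

def join(r, s):
    """Equijoin two fragments on the first attribute."""
    return [(k, a, b) for (k, a) in r for (k2, b) in s if k == k2]

# General case: all n*n fragment joins.
full = [t for r in R_frags for s in S_frags for t in join(r, s)]

# Derived fragmentation: only the n aligned joins R_i |x| S_i are needed.
aligned = [t for r, s in zip(R_frags, S_frags) for t in join(r, s)]

assert sorted(full) == sorted(aligned)   # same result, n joins instead of n*n
print(sorted(aligned))  # [(1, 'r1', 's1'), (3, 'r3', 's3'), (3, 'r3', 's3b')]
```

With n nodes, the n aligned joins can all run in parallel, one per fragment pair.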

In the context of minimizing response time, it is very important to design a fragmentation and allocation resulting in a time-balanced schedule. A component called the Data Allocation Manager is responsible for fragmentation and allocation. Currently, the fragments are designed to be of more or less equal size; however, special attention is given to the design of fragments with appropriate semantic properties. This can lead to early detection of operations not contributing to the final answer of a query. We will discuss this now in some more detail. Special attention is given to selection operations on global relations. During the first stage of query optimization, fragmentation and views are removed. During the second stage, selections are pushed down the query tree as far as possible:


Q = σF(Rg)
  = σF(R1 ∪ R2 ∪ … ∪ Rn)
  = σF(R1) ∪ σF(R2) ∪ … ∪ σF(Rn)
  = σF∧C1(R1) ∪ σF∧C2(R2) ∪ … ∪ σF∧Cn(Rn)

For each fragment Ri there is a fragmentation condition Ci; all tuples of a fragment satisfy its fragmentation condition. The fragmentation condition can therefore be added to the selection predicate. Studying the new selection predicates Fi = F ∧ Ci, the following situations are distinguished by the QO (for more details see [22]):

- F and Ci are inconsistent: the entire subquery σF∧Ci(Ri) can be discarded, since none of the tuples of Ri can ever satisfy F.
- Ci implies F: the selection operation on Ri is redundant, since all tuples of fragment Ri satisfy F.
- Neither of the above holds: the selection operation must be evaluated.

During the third stage of the optimization process, the query is rewritten according to the above situations to discard unnecessary selection operations. Let us illustrate this optimization by an example. Consider the global relation EMPLOYEE defined before. This relation is partitioned into three horizontal fragments EMP1, EMP2, and EMP3, defined by fragmentation criteria C1, C2, and C3 involving a single attribute e#:

C1: 0 < e# ≤ 100
C2: 100 < e# ≤ 200
C3: 200 < e# ≤ 300

EMPLOYEE = EMP1 ∪ EMP2 ∪ EMP3.

Consider a request for the departments of all employees with an e# less than 150.


Q = πdep(σ0<e#≤150(EMPLOYEE))
  = πdep(σ0<e#≤150(EMP1 ∪ EMP2 ∪ EMP3))
  = πdep(σ0<e#≤150(EMP1) ∪ σ0<e#≤150(EMP2) ∪ σ0<e#≤150(EMP3))
  = πdep(EMP1 ∪ σ100<e#≤150(EMP2))

None of the tuples of EMP3 satisfy the selection predicate, some of the tuples of EMP2 may satisfy it, and all of the tuples of EMP1 are guaranteed to satisfy it.

4.1.3 Knowledge-Based Approach

A query optimizer should generate efficient query execution plans. Because the PRISMA machine is still under development, the system parameters needed by the QO are still more or less unknown. To allow for as much flexibility as possible, a knowledge-based approach to query optimization is taken in PRISMA/DB [21]. This allows easy adaptation to changing system parameters, and enhancement with new application domains. At the heart of the QO is a general-purpose rewriting mechanism. For each of the individual subproblems to be solved during query optimization, a separate set of transformation rules has been defined, and care is taken that each set of transformation rules is consistent. Because the individual rule sets are small, the rewriting mechanism is relatively efficient. Having relatively small and independent rule bases enables easy extension and modification of the QO. The architecture also facilitates a future extension to semantic query manipulation. An extension being implemented at this moment is the detection of common subexpressions, on syntactic grounds.
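The fragment analysis described above (discard, redundant, or evaluate) can be sketched for single-attribute range predicates. The following is our own illustrative Python, not part of PRISMA/DB; each range condition is represented as a half-open interval (lo, hi], following the EMPLOYEE example:

```python
# Classify one fragment's condition Ci against selection predicate F,
# both given as half-open intervals (lo, hi] over the attribute e#.
def classify(query, frag):
    """Return 'discard', 'redundant', or 'evaluate' for one fragment."""
    (qlo, qhi), (flo, fhi) = query, frag
    if qhi <= flo or fhi <= qlo:          # F and Ci are inconsistent
        return 'discard'
    if qlo <= flo and fhi <= qhi:         # Ci implies F
        return 'redundant'
    return 'evaluate'                     # selection must be evaluated

F = (0, 150)                              # selection: 0 < e# <= 150
C = {'EMP1': (0, 100), 'EMP2': (100, 200), 'EMP3': (200, 300)}
result = {name: classify(F, cond) for name, cond in C.items()}
```

Running this reproduces the example's outcome: the selection on EMP1 is redundant, EMP2 must be evaluated, and EMP3 is discarded.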
4.2 Recursive Programs


As described in Sec. 3.2, we distinguish two types of recursion: regular and non-regular. Both types of recursion can be expressed in PRISMAlog as well as in XRA, and executed by the PRISMA database machine. Non-regular recursion leads to an iterative, parallel strategy, as described in [2]; that optimization strategy is not described in this document. The optimization of the much more common class of regular recursion is described now.

The optimization of recursive queries consists of two main stages, illustrated in Fig. 4.2. In the first stage an XRA expression is sent to the so-called Intelligent Query-optimizer (IQ).

[Figure 4.2: Optimization of recursive queries. A query enters as one XRA expression, possibly including many μ-calculus expressions; the IQ rewrites it into a sequence of XRA statements, possibly including several transitive closure operations; the CQ turns these into a schedule.]

The IQ uses a rewriting mechanism [1,16] to remove the μ-calculus expressions indicating recursion. The output of the IQ is a sequence of XRA statements: simple statements consisting of normal relational algebra operations and transitive closure operations. This sequence of XRA statements is input to the Common Query-optimizer (CQ). While the IQ produces its output, the CQ can start optimizing these XRA expressions one by one. It rewrites view definitions, introduces fragmentation, and chooses applicable transitive closure algorithms. Every optimized statement is transformed into a parallel schedule and sent to the database to be executed. The production of these parallel schedules was already discussed in Sec. 4.1, except for the transitive closure.

When we consider again the mutually recursive rules concerning train and bus connections, as described in Sec. 3.2, the output of the IQ would be the following sequence of XRA statements:

TC  = π1,2,3,4(TRAIN ⋈2=1 CHANGE)
BC  = π1,2,3,4(BUS ⋈2=1 CHANGE)
T1  = TCL(BUS)
T2  = TCL(TRAIN ∪ π1,6,3,8(TC ⋈2=1∧6=2∧4<3 π1,6,3,8(T1 ⋈2=1∧6=2∧4<3 BC)))
T3  = TRAIN ∪ π1,6,3,8(TC ⋈2=1∧6=2∧4<3 π1,6,3,8(T1 ⋈2=1∧6=2∧4<3 BUS))
SOL = T3 ∪ π1,6,3,8(T2 ⋈2=1∧6=2∧4<3 T3)

These statements are presented in a simplified form; the actual transitive closure operation has two additional parameters, indicating the projection list and join condition of the operation. The implementation of the production of these statements is not described in this document; instead, we concentrate on the production of parallel schedules for these statements. The XRA statements just presented are generated one by one, and a schedule is produced for each statement. Some of them can be executed in parallel, and pipelining can be used to further extend the degree of parallelism. For instance, TC and BC can be computed in parallel; so can T2 and T3, using the results of T1 and TC while these are being produced. When the query contains a selection, the CQ can push it down into, for instance, transitive closure operations. Consider the question about train trips leaving Amsterdam before 7:30. The last statement would thereby be changed into:

SOL = σ1=Amsterdam∧3<7:30(T3 ∪ π1,6,3,8(T2 ⋈2=1∧6=2∧4<3 T3))

This selection can now be pushed into the expression T2, and thereby leads to the choice of a transitive closure algorithm for this operation that uses the initial selection to compute only the relevant part of the transitive closure efficiently. An example of such an algorithm is presented in the next section.

4.3 Transitive Closure Operation


As can be concluded from the last section, good transitive closure algorithms are important for our approach; this concerns both computation of the transitive closure at one processor and parallel computation of the transitive closure. Both cases are discussed now, and some example algorithms are given. When starting the computation of the transitive closure, we can encounter two situations: the relation is located at one processor, or it is fragmented. Conceptually these situations are the same, because, at the cost of some overhead, they can be transformed into each other. For both situations, there are three options in computing the transitive closure:

- On one processor, using a fast main-memory algorithm.
- Distributed, conceptually starting from a single global relation.
- Distributed, starting from a fragmented relation and using some precomputed information.

begin
    New    := π1,2(Rel ⋈1=1 start-node);
    Result := New;
    while New ≠ ∅ do
        New    := π1,4(Result ⋈2=1 Rel) − Result;
        Result := Result ∪ New
    end;
    return Result
end

Figure 4.3: Transitive closure starting from a certain point
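A runnable sketch of the loop of Fig. 4.3 (our own Python, not the PRISMA implementation; the relation is assumed to be a set of (from, to) pairs, and the frontier New is used in the join step):

```python
# Transitive closure of an edge relation, restricted to paths that start
# from one given node, as in Fig. 4.3.
def closure_from(rel, start):
    # New: edges leaving the start node (the selection on the start node).
    new = {(a, b) for (a, b) in rel if a == start}
    result = set(new)
    while new:
        # Join the frontier with Rel, keeping only genuinely new pairs.
        new = {(a, d) for (a, b) in new for (c, d) in rel if b == c} - result
        result |= new
    return result

edges = {(1, 2), (2, 3), (3, 4), (4, 2)}
paths = closure_from(edges, 1)
```

Subtracting Result from the freshly joined pairs guarantees termination even when the graph contains cycles, as in the example above.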

[Figure 4.4: Pipelining approach. A coordinator node applies the selection σ1=A and distributes work over the processors holding the fragments of R; the drawing itself is not recoverable from the extracted text.]

The first option assumes that the result of a transitive closure will not exceed the memory capacity of a processor. If this holds, we can choose an arbitrary algorithm, for instance, the one in Fig. 4.3. The second option allows for several algorithms. A possible approach is a pipelining one, as depicted in Fig. 4.4. Other approaches are described in [15,16]. A point we observed from simulations is that transmitting intermediate results through the machine leads to poor performance. In general, the work performed in parallel should be relatively independent. It should not require much overhead (in the form of transmitting intermediate results), because this severely reduces the degree of parallelism and leads to a number of processors that are continually waiting for each other's results. Therefore, approaches like [25], which lead to a large number of union operations and thereby communication overhead, are in fact inefficient.

The third option uses a clever partitioning to create relatively independent processes and a large degree of parallelism. The transitive closure can then be computed locally to the fragments, using some minimal precomputed information. This approach (described in [17] for connection, bill-of-material, and minimum cost path queries) is very promising according to simulations. For all three options, two points are very important: the structure of the relations (graphs), and the use of selections in the initial query. From previous studies in a centralized environment we know that there is no transitive closure algorithm that works efficiently for all kinds of relations [8]; the structure of the graph is of paramount importance. When the graph is heavily connected and contains many cycles, a delta approach is satisfactory [9]. When few cycles are present (e.g., a tree), a squaring approach is more appropriate [1]. The use of selections in the initial query has mostly been studied in the context of logic approaches to recursive query optimization [5,26,30], but it can also be done, automatically, by the query optimizer. Whenever a query is processed, the optimizer should choose an algorithm that exploits a selection when one is present. Both the algorithms in Fig. 4.3 and Fig. 4.4 assumed a selection on the start node; equivalent algorithms are used for a selection on the end node.
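For contrast with the delta-style loop of Fig. 4.3, a toy version of the squaring approach (our own sketch; the set representation and names are assumptions) composes the path relation with itself, roughly doubling the covered path length per iteration:

```python
# Compose two path relations: (a, b) and (b, d) yield (a, d).
def compose(p, q):
    return {(a, d) for (a, b) in p for (c, d) in q if b == c}

# Squaring approach: each iteration covers paths of up to double the
# length, so a chain of length n converges in O(log n) iterations.
def closure_squaring(rel):
    result = set(rel)
    while True:
        nxt = result | compose(result, result)
        if nxt == result:
            return result
        result = nxt

chain = {(1, 2), (2, 3), (3, 4), (4, 5)}
tc = closure_squaring(chain)
```

On the cycle-free chain above, the closure contains all ten pairs (i, j) with i < j; on heavily cyclic graphs, repeatedly composing the full result recomputes many known pairs, which is why the delta approach is preferred there.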


Chapter 5

Conclusions
In this document we have introduced a logic database query language called PRISMAlog, which serves as a user interface on the PRISMA database machine. PRISMAlog is defined as a Horn clause language with a purely declarative semantics; the meaning of a program is given by its least fixed-point. This allows us to adopt a set-oriented model of computation, and to perform all optimizations on an algebraic level. Extensions incorporated into PRISMAlog are: the use of comparisons, a strictly defined type of negation, arithmetic operations, and aggregates. We also described how group-by operations and aggregates are integrated.

Optimization focuses on the use of distributed database techniques to introduce parallelism and reduce the amount of work to be done. The overall goal is to optimize response time. Parallelism is introduced by having several users work simultaneously, by splitting a query into several subqueries that can be processed in parallel, and by pipelining results of operations.

An eight-node prototype of the PRISMA machine is being built at the moment. The PRISMAlog parser is completely implemented and generates XRA code for the query optimizer. The query optimizer uses standard optimization for non-recursive programs; optimization of regular recursive programs is being implemented right now. A first prototype of the complete database system is running on a workstation, and its functionality is continuously enhanced.

Subjects for further research are extensions of PRISMAlog for support of complex objects, and more advanced use of arithmetic. Parallel algorithms for computation of the transitive closure are being developed, using simulation results as a guideline. Especially the use of some minimal precomputed information to enable a distributed approach seems very promising.

Acknowledgements
We thank all the members of PRISMA/DB.

References
1. Apers, P. M. G., Houtsma, M. A. W., and Brandse, F. Extending a relational interface with recursion. In Proc. of the 6th Advanced Database Symposium, Tokyo, Japan, Aug. 29-30, 1986, 159-166.
2. Apers, P. M. G., Houtsma, M. A. W., and Brandse, F. Processing recursive queries in relational algebra. In Proc. of the Second IFIP 2.6 Working Conf. on Database Semantics, "Data and Knowledge" (DS-2), R. A. Meersman and A. C. Sernadas, Eds. North-Holland, Albufeira, Portugal, Nov. 3-7, 1986, 17-39.
3. Apers, P. M. G., Kersten, M. L., and Oerlemans, H. PRISMA database machine: a distributed, main-memory approach. In Advances in Database Technology - EDBT '88, J. W. Schmidt, S. Ceri, and M. Missikoff, Eds. Lect. Notes in Comp. Sci., no. 303, Springer-Verlag, New York-Heidelberg-Berlin, March 14-18, 1988.
4. Apt, K. R., Blair, H. A., and Walker, A. Towards a theory of declarative knowledge. In Foundations of Deductive Databases and Logic Programming, J. Minker, Ed. Morgan Kaufmann, Los Altos, USA, 1987.
5. Bancilhon, F., Maier, D., Sagiv, Y., and Ullman, J. D. Magic sets and other strange ways to implement logic programs. In Proc. ACM Principles of Database Systems, 1986, 1-15.
6. Beeri, C., Naqvi, S., Ramakrishnan, R., Shmueli, O., and Tsur, S. Sets and negation in a logic database language (LDL1). MCC DB-375-86, Austin, Texas, USA, Nov. 10, 1986.
7. Beeri, C., Nasr, R., and Tsur, S. Embedding ψ-terms in a Horn-clause logic language. In Proc. of the 3rd Int. Conf. on Data and Knowledge Bases, C. Beeri, J. W. Schmidt, and U. Dayal, Eds. Jerusalem, Israel, June 28-30, 1988, 347-359.


8. Brehler, J. Transitive closure operation in a relational database. M.Sc. thesis, University of Twente, Enschede, the Netherlands, March 1988.
9. Ceri, S., Gottlob, G., and Lavazza, L. Translation and optimization of logic queries: the algebraic approach. In Proc. of the 12th Int. Conf. on Very Large Data Bases, Kyoto, Japan, Aug. 25-28, 1986, 395-402.
10. Ceri, S., Gottlob, G., and Wiederhold, G. Interfacing relational databases and Prolog efficiently. In Proc. of the First Int. Conf. on Expert Database Systems, vol. 2, Charleston, South Carolina, April 1-4, 1986, 141-153.
11. Ceri, S., and Pelagatti, G. Distributed Databases: Principles and Systems. McGraw-Hill, New York, NY, 1985.
12. van Emde Boas, G., and van Emde Boas, P. Storing and evaluating Horn-clause rules in a relational database. IBM Journal of Research and Development 30, 1 (January 1986).
13. Gallaire, H., Minker, J., and Nicolas, J.-M. Logic and databases: a deductive approach. Computing Surveys 16, 2 (June 1984), 153-185.
14. Hopcroft, J. E., and Ullman, J. D. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass., 1979.
15. Houtsma, M. A. W., and Apers, P. M. G. Processing regular recursive queries. In Proc. Computer Science in the Netherlands, Utrecht, Nov. 3-4, 1988, 495-514.
16. Houtsma, M. A. W., and Apers, P. M. G. Processing regular recursive queries in a truly database way. University of Twente, Oct. 1988. Submitted for publication.
17. Houtsma, M. A. W., Apers, P. M. G., and Ceri, S. Parallel computation of transitive closure queries on fragmented databases. University of Twente/University of Modena. Submitted for publication.
18. Jarke, M., Clifford, J., and Vassiliou, Y. An optimizing Prolog front-end to a relational query system. In Proc. ACM-SIGMOD Int. Conf. on Management of Data, Boston, June 18-21, 1984, 296-306.
19. Kersten, M. L., Apers, P. M. G., Houtsma, M. A. W., van Kuyk, H. J. A., and van de Weg, R. L. W. A distributed, main-memory database machine. In Proc. of the Fifth Int. Workshop on Database Machines, Karuizawa, Japan, October 5-8, 1987.

20. Krishnamurthy, R., and Naqvi, S. Towards a real Horn clause language. In Proc. of the 14th Int. Conf. on Very Large Data Bases, Los Angeles, Cal., USA, Aug. 1988, 252-263.
21. van Kuijk, H. J. A., and Apers, P. M. G. Knowledge-based query optimization in a distributed database management system. University of Twente, INF-87-39, the Netherlands, Dec. 1987.
22. van Kuijk, H. J. A. Application of range constraints in query optimization. University of Twente, INF-88-55, Enschede, the Netherlands, Dec. 1988.
23. Lloyd, J. W. Foundations of Logic Programming. Second edition, Springer-Verlag, New York-Heidelberg-Berlin, 1987.
24. Morris, K., Ullman, J. D., and Van Gelder, A. Design overview of the NAIL! system. Stanford University, STAN-CS-86-1108, Stanford, CA, May 1986.
25. Raschid, L., and Su, S. Y. W. A parallel strategy for evaluating recursive queries. In Proc. of the 12th Int. Conf. on Very Large Data Bases, Kyoto, Japan, Aug. 25-28, 1986, 412-419.
26. Roelants, D. Recursive rules in logic databases. Philips Research Labs., Brussels, Belgium, March 1987. Submitted for publication.
27. Scott, D. S., and de Bakker, J. W. A theory of programs. IBM, Vienna, 1969. Unpublished seminar notes.
28. Tsur, S., and Zaniolo, C. LDL: a logic-based data-language. In Proc. of the 12th Int. Conf. on Very Large Data Bases, Kyoto, Japan, Aug. 1986, 33-41.
29. Ullman, J. D. Principles of Database Systems. Second edition, Computer Science Press, Rockville, Maryland, 1982.
30. Ullman, J. D. Implementation of logical query languages for databases. ACM Transactions on Database Systems 10, 3 (Sept. 1985), 289-321.

