Вы находитесь на странице: 1из 14

Lecture 10 Tue, April 14, 2009 1800 : 2100 FAST NU, Karachi

Physical Database Design


Logical database design
What are the facts and dimensions? Goals: Simplicity, Expressiveness Make the database easy to understand

Make queries easy to ask

Physical database design


How should the data be arranged on disk? Goal: Performance

Manageability is an important secondary concern Make queries run fast

Load vs. Query Trade-off


Trade-off between query performance and load

performance To make queries run fast


Precompute as much as possible Build lots of data structures

Indexes Materialized views Cube structures (MOLAP/ROLAP)

A Lesson in Data Warehouse Evolution


ROLAP MOLAP MATERAILIZED VIEWS

AGGREGATE TABLES? Whats the difference??


While Star Schema Data Warehouses originally gained high

performance from well-designed database indexes, both MOLAP and ROLAP took the approach of aggregating data grouped by dimensional hierarchies to speed up queries by orders of magnitude. When this came to the attention of the database research community, it became clear that MOLAP/ROLAP efficiency, leaving specialized semantics apart, could be traced to an approach of pre-computing query results, or in database terms materializing views, and a tremendous outpouring of papers on materialized views followed
4

Materialized Views (MVs)


View A derived relation (or a function) defined in terms of base (stored) relations Typically recomputed every time the view is referenced Materialized view A view can be materialized by storing the tuples of the view in the database Works like a cache Why use materialized views Provides fast access to data Critical in applications with high query rate and complex views not possible to re-compute the view for every query Many DBMSs support materialized views Goal: faster response for related queries
5

View Maintenance
The process of updating a materialized view in

response to changes to the underlying data MV gets dirty whenever the underlying base relations are modified Incremental View Maintenance
It is wasteful to maintain a view by re-computing it from

scratch Often it is cheaper to use the heuristic of inertia (only a part of the view changes in response to changes in the base relations) Compute only the changes in the view to update its materialization
6

Classification of the View Maintenance Problem


Four dimensions along which the view maintenance

problem can be studied


Information Dimension Modification Dimension Language Dimension Instance Dimension

Information Dimension
Information Dimension The amount of information available for view maintenance Information may include base relations, materialized view itself and knowledge of constraints and keys

Information Dimension (Example)


Consider relation part (part_num; part_cost;

contract); and View expensive_ parts (part_num) = part_num where part_cost > 1000 Consider maintaining the view when a tuple p1 is inserted into relation part Different view maintenance algorithms can be designed depending upon the information available Consider following cases
9

Information Dimension (Example)


CASE 1: The materialized view alone is available
Use the old materialized view to determine if part_num

already is present in the view If so, there is no change to the materialization, else insert part p1 into the materialization

CASE 2: The base relation part alone is available


Use relation part to check if an existing tuple in the relation

has the same part_num but greater or equal cost If such a tuple exists then the inserted tuple does not contribute to the view

CASE 3: It is known that part no is the key


Infer that part_num cannot already be in the view, so it must

be inserted

10

Modification Dimension
Modification Dimension What modifications can the view maintenance algorithm handle? Modifications may include insertion and deletion of tuples to base relations, direct updates, deletions followed by insertions, changes to the view definition etc.

11

Language Dimension
Language Dimension Is the view expressed as relational algebra (SQL or a subset of SQL) Can it have duplicates? Can it use aggregation?

12

Instance Dimension
Instance Dimension Database instance: Does the view maintenance algorithm work for all instances of the database? Modification instance: Does it work for all instances of the modification?

13

Instance Dimension (Example)


Extend the previous example with a new view
View supplier_parts as the equijoin between relations

supplier (supplier_num; part_num; price) and part (part_num,) The view contains the distinct part numbers that are supplied by at least one supplier

Maintenance
Use of a join makes it impossible to maintain supplier_parts

in response to insertions to part when using only the view View supplier_ parts is maintainable if the view contains part_num p1 but not otherwise Maintainability of a view depends also on the particular instances

14

Вам также может понравиться