Serge Abiteboul
INRIA Saclay, Collège de France, and ENS Cachan
20/03/2012
Organization
The principles
Abstraction
Universality
Independence
The principles
DBMS
Goal: the management of large amounts of data
Large amounts of data: database
Software that does this: DBMS
Mediation
The data management system acts as a mediator between intelligent users and objects that store information
∃t ∃d (Film(t, d, Bogart) ∧ Séance(t, s, h))
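The logical query above can be handed to a relational system almost unchanged. The following is a minimal sketch using Python's built-in sqlite3 module. The column names (title, director, actor, theater, hour) and the sample rows are assumptions made for illustration; the slides only give the shape Film(t, d, actor) and Séance(t, s, h).

```python
import sqlite3

def make_demo_db():
    """Build a small in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE film (title TEXT, director TEXT, actor TEXT)")
    conn.execute("CREATE TABLE seance (title TEXT, theater TEXT, hour TEXT)")
    conn.executemany("INSERT INTO film VALUES (?, ?, ?)", [
        ("Casablanca", "Curtiz", "Bogart"),
        ("Casablanca", "Curtiz", "Bergman"),
        ("Vertigo", "Hitchcock", "Stewart"),
    ])
    conn.executemany("INSERT INTO seance VALUES (?, ?, ?)", [
        ("Casablanca", "Odeon", "20:00"),
        ("Vertigo", "Rex", "18:00"),
        ("Casablanca", "Rex", "22:00"),
    ])
    return conn

def bogart_showings(conn):
    """Return the (theater, hour) pairs where a film with Bogart is shown."""
    # The quantified variables t and d are not projected; the shared
    # variable t becomes the join condition between the two relations.
    return conn.execute(
        """
        SELECT DISTINCT s.theater, s.hour
        FROM film AS f JOIN seance AS s ON f.title = s.title
        WHERE f.actor = 'Bogart'
        ORDER BY s.theater, s.hour
        """
    ).fetchall()
```

The user states what is wanted (the free variables s and h) and the system decides how to compute it, which is the mediation role described above.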
Simple data structure
Relations
Trees
Graphs
Formal language for queries
Logics
Declarative vs. procedural
Graphical languages
Complex graphical queries with MS Access
In reality
Less structured data are often stored in files
Too intense applications require specialized software
Today, more and more specialized systems
Reliability and security
Data distribution
More
Scaling
Volume of data
Volume of requests
Performance
Response time: the time per operation
Throughput: the number of operations per time unit
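The two measures above are related but not interchangeable. A small sketch with made-up numbers (the durations and the elapsed time are hypothetical) shows how each is computed:

```python
def response_time(durations):
    """Average time per operation, in seconds."""
    return sum(durations) / len(durations)

def throughput(n_operations, elapsed_seconds):
    """Number of operations completed per second of wall-clock time."""
    return n_operations / elapsed_seconds

# Hypothetical workload: 4 operations of 0.5 s each, executed two at a
# time, so that all of them finish within 1.0 s of wall-clock time.
durations = [0.5, 0.5, 0.5, 0.5]
avg = response_time(durations)
rate = throughput(len(durations), 1.0)
```

With concurrency, throughput is not simply the inverse of response time: here each operation takes 0.5 s, yet the system completes 4 operations per second.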
Large variety of applications with important needs for data management
Two main classes
OLTP: Online Transaction Processing
Transactional
E-commerce, banking, etc.
Simple transactions, known in advance
Very high load in number of transactions per second
OLAP: Online Analytical Processing
Decision making
Logical level
Independence
Physical: we can change the physical organization without changing the logical level
Logical: we can evolve the logical level without modifying the applications
External: we can change or add views without affecting the logical level
Physical level
Abstraction
The relational model
Little abstraction
Languages
Navigational
Procedural
Record at a time
Supplier(sno, sname, sadd)
Part(pno, pname)
Order(ono, qty, price)
Abstraction
Universality: functionalities
Here with a very relational viewpoint
For this, two main tools
Optimization
Parallelism
Dependencies
Laws about the data
To protect data
To optimize queries
To design schemas
To explain data
Examples
Séance[titre] ⊆ Film[titre]
Inclusion dependency: only known films are shown
Séance: salle heure → titre
Functional dependency: only one movie is shown at a time in a theater
tgds
Some of the most sophisticated developments in db theory
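The two example dependencies can be checked mechanically on a database instance. The sketch below represents tuples as Python dictionaries; the helper names and this representation are assumptions chosen for illustration, not part of the slides.

```python
def satisfies_inclusion(seance, film):
    """Séance[titre] ⊆ Film[titre]: every title shown is a known film."""
    known = {f["titre"] for f in film}
    return all(s["titre"] in known for s in seance)

def satisfies_fd(rows, lhs, rhs):
    """Functional dependency lhs -> rhs: equal lhs values force equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault records the first value seen for this key and
        # returns it, so any later disagreement is a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True
```

For the functional dependency of the example, the call would be `satisfies_fd(seance, ["salle", "heure"], ["titre"])`: two tuples that agree on theater and hour must agree on the title.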
Update anomalies
Null values
Hot standby: second system running simultaneously
Availability: users should not have to wait beyond what is seen as reasonable for an application
Distributed transactions
Two-phase commit
Typically too heavy for Web applications
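The control flow of two-phase commit can be sketched in a few lines. This is a simplified, in-process illustration: the Participant class is hypothetical, and logging, timeouts, and recovery after failures (which real implementations need) are left out.

```python
class Participant:
    """A hypothetical resource manager taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): unanimous yes means commit, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

Every participant is contacted twice and must wait for the coordinator's decision between the two phases, which gives an idea of the cost of the protocol.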
More
Security
Protect content against unauthorized users (humans or programs)
Confidentiality: access control, authentication, authorization
Independence: views
Views
Definition:
Function f: Database → View
One of the most fundamental topics in databases
[Figure: database states db1 to db6 mapped by f to view states v1 and v2]
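A view can be written directly as a function from database states to view states. In the sketch below, the view (the set of titles currently shown) and the three database states are hypothetical examples chosen for illustration.

```python
def view(db):
    """A hypothetical view f: the set of titles shown in some theater."""
    return frozenset(title for (title, theater, hour) in db["seance"])

# Three database states; each seance tuple is (title, theater, hour).
db1 = {"seance": {("Casablanca", "Odeon", "20:00")}}
db2 = {"seance": {("Casablanca", "Rex", "22:00")}}
db3 = {"seance": {("Vertigo", "Rex", "18:00")}}

v1 = view(db1)  # same view state as view(db2)
v2 = view(db3)  # a different view state
```

Several database states can share one view state (here db1 and db2), so a view state alone does not determine the database state it came from.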
Implicit definition and recursion
Datalog
Dependencies (tgds)
[Figure: example with states and resorts (Lake Tahoe, Aspen) and the values 1 meter and 2 meters]
Update: propagate
Base → view: costly view maintenance
View → base: ambiguous
Query: simple
Queries are complex
Updates are complex
Definitions
Global as view: v = φ(db1, ..., dbn)
Local as view: dbi = φi(v) for each i
Arbitrarily complex constraints between the database and the views
Sometimes called alignments between them
Optimization
(a) For each f in film, for each s in séance do
(b) If few tuples pass the selection
(c) Using the index
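The three strategies can be compared on a small example. The sketch below assumes the query is the Bogart query (theaters and hours showing a film with Bogart) and represents tuples as dictionaries with hypothetical attribute names; all three plans return the same answer and differ only in the work they do.

```python
def plan_nested_loops(film, seance):
    """(a) For each f in film, for each s in seance: test everything inside."""
    out = []
    for f in film:
        for s in seance:
            if f["actor"] == "Bogart" and f["title"] == s["title"]:
                out.append((s["theater"], s["hour"]))
    return out

def plan_select_first(film, seance):
    """(b) Apply the selection first; pays off if few tuples pass it."""
    selected = [f for f in film if f["actor"] == "Bogart"]
    return [(s["theater"], s["hour"])
            for f in selected for s in seance if f["title"] == s["title"]]

def build_index(seance):
    """Build a simple in-memory index on seance by title."""
    index = {}
    for s in seance:
        index.setdefault(s["title"], []).append(s)
    return index

def plan_with_index(film, seance_by_title):
    """(c) Probe the index on title instead of scanning seance."""
    return [(s["theater"], s["hour"])
            for f in film if f["actor"] == "Bogart"
            for s in seance_by_title.get(f["title"], [])]
```

Plan (a) examines every pair of tuples, plan (b) scans seance only for the selected films, and plan (c) replaces that scan by a direct lookup.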
Optimization
Using access structures
Hash
B-trees
Using sophisticated algorithms
Join
Cost evaluation to select an execution plan
Problem: search space is too large
Technique: rewrite queries based on heuristics to explore only part of it
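To see why the search space is a problem, consider choosing a join order by exhaustive cost evaluation. The sketch below uses a deliberately crude cost model (the sum of estimated intermediate result sizes, with one fixed selectivity for every join); the model, the relation sizes, and the function names are assumptions for illustration only.

```python
from itertools import permutations
from math import factorial

def plan_cost(order, sizes, selectivity):
    """Toy cost of a left-deep join order: sum of intermediate result sizes."""
    current = sizes[order[0]]
    cost = 0
    for name in order[1:]:
        # Estimated size of joining the current result with the next relation.
        current = current * sizes[name] * selectivity
        cost += current
    return cost

def best_plan(sizes, selectivity=0.01):
    """Enumerate all join orders and keep the cheapest (small inputs only)."""
    return min(permutations(sizes),
               key=lambda order: plan_cost(order, sizes, selectivity))

def search_space_size(n_relations):
    """Number of left-deep join orders for n relations."""
    return factorial(n_relations)
```

With three relations there are only 6 orders to cost, but with 10 relations there are already 3,628,800, which is why optimizers explore only part of the space.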
[Figure: execution plan with a filter operator on f]
Complexity and expressivity
Complexity http://www.cs.rice.edu/~vardi/papers/sigmod08.pdf
Complexity: for a fixed query q,
Testing, given (I, t), whether t is in q(I), as a function of the size of I
Focus on Boolean queries to not depend on output size
Polynomial in the tree width
Expressivity
One cannot compute transitive closure
Add a fixed point
Inflationary: fixpoint
Or not: while
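Transitive closure becomes computable once a fixpoint is available. A minimal sketch of the inflationary version, assuming the graph is given as a set of (source, target) pairs:

```python
def transitive_closure(edges):
    """Inflationary fixpoint: keep adding derived pairs until nothing changes."""
    closure = set(edges)
    while True:
        # One step: extend every known path by one edge.
        derived = {(x, z)
                   for (x, y) in closure
                   for (y2, z) in edges
                   if y == y2}
        if derived <= closure:
            # No new pair was produced: the fixpoint is reached.
            return closure
        closure |= derived
```

The computation is inflationary because pairs are only ever added, never removed, so on a finite graph the loop is guaranteed to stop.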
Conclusion
Conclusion
And then: always question everything
Revisit the models, languages, principles
Why?
To scale to always more data and queries
To support extreme applications that cannot be supported by standard technology: Google, Visa transactions
To facilitate application development
To offer more in terms of performance, reliability, security, etc.
Conclusion
Relational model
Entries in relations = atomic values
Data are regular
ACID
Universal
Data are persistent
Data are static
Constraints are static (FDs, etc.)
Beyond
Entries are sets of values
Missing data, probabilistic data
Semistructured
Weaker concurrency
Specialized: noSQL
Queries on data flows
Data & behavior: object databases
Active databases
Triggers
Thank you!