9/8/2016

Query optimization - Wikipedia, the free encyclopedia

Query optimization
From Wikipedia, the free encyclopedia

Query optimization is a function of many relational database management systems. The query optimizer
attempts to determine the most efficient way to execute a given query by considering the possible query plans.
Generally, the query optimizer cannot be accessed directly by users: once queries are submitted to the database
server and parsed by the parser, they are passed to the query optimizer, where optimization occurs. However, some
database engines allow guiding the query optimizer with hints.
A query is a request for information from a database. It can be as simple as "finding the address of a person with
SS# 123-45-6789," or more complex like "finding the average salary of all the employed married men in
California between the ages 30 to 39, that earn less than their wives." Query results are generated by accessing
relevant database data and manipulating it in a way that yields the requested information. Since database structures
are complex, in most cases, and especially for not-very-simple queries, the needed data for a query can be collected
from a database by accessing it in different ways, through different data structures, and in different orders. Each
different way typically requires different processing time. Processing times of the same query may have large
variance, from a fraction of a second to hours, depending on the way selected. The purpose of query optimization,
which is an automated process, is to find the way to process a given query in minimum time. The large possible
variance in time justifies performing query optimization, though finding the exact optimal way to execute a query,
among all possibilities, is typically very complex, time-consuming by itself, may be too costly, and is often
practically impossible. Thus query optimization typically tries to approximate the optimum by comparing several
common-sense alternatives to provide in a reasonable time a "good enough" plan which typically does not deviate
much from the best possible result.

Contents
1 General consideration
2 Implementation
2.1 Join ordering
2.2 Query planning for nested SQL queries
2.3 Cost estimation
3 Extensions
3.1 Parametric Query Optimization
3.2 Multi-Objective Query Optimization
3.3 Multi-Objective Parametric Query Optimization
4 See also
5 References
6 External links

General consideration
There is a trade-off between the amount of time spent figuring out the best query plan and the quality of the choice;
the optimizer may not choose the best answer on its own. Different qualities of database management systems have
different ways of balancing these two. Cost-based query optimizers evaluate the resource footprint of various
query plans and use this as the basis for plan selection. These assign an estimated "cost" to each possible query
plan, and choose the plan with the smallest cost. Costs are used to estimate the runtime cost of evaluating the
query, in terms of the number of I/O operations required, CPU path length, amount of disk buffer space, disk
storage service time, and interconnect usage between units of parallelism, and other factors determined from the
data dictionary. The set of query plans examined is formed by examining the possible access paths (e.g., primary
index access, secondary index access, full file scan) and various relational table join techniques (e.g., merge join,
hash join, product join). The search space can become quite large depending on the complexity of the SQL query.
There are two types of optimization. These consist of logical optimization, which generates a sequence of
relational algebra to solve the query, and physical optimization, which is used to determine the means of
carrying out each operation.
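
As a rough illustration of cost-based selection, the sketch below assigns a single estimated cost to each candidate plan and picks the smallest. The plan names are taken from the access paths listed above, but the resource estimates and the weights `IO_WEIGHT` and `CPU_WEIGHT` are invented for the example and do not describe any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Estimated resource usage of one candidate plan (hypothetical numbers)."""
    name: str
    io_ops: int      # estimated number of I/O operations
    cpu_units: int   # estimated CPU work, in arbitrary units


# Assumed weights: one I/O is treated as 100 times as expensive as one CPU unit.
IO_WEIGHT = 100.0
CPU_WEIGHT = 1.0


def estimated_cost(plan: PlanEstimate) -> float:
    """Collapse the resource estimates into a single scalar cost."""
    return IO_WEIGHT * plan.io_ops + CPU_WEIGHT * plan.cpu_units


def choose_plan(candidates: list[PlanEstimate]) -> PlanEstimate:
    """Pick the candidate with the smallest estimated cost."""
    return min(candidates, key=estimated_cost)


candidates = [
    PlanEstimate("full file scan", io_ops=5_000, cpu_units=200_000),
    PlanEstimate("secondary index access", io_ops=120, cpu_units=3_000),
    PlanEstimate("primary index access", io_ops=4, cpu_units=50),
]
best = choose_plan(candidates)
print(best.name, estimated_cost(best))
```

A real optimizer would derive such estimates from statistics in the data dictionary rather than from fixed constants.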

Implementation
Most query optimizers represent query plans as a tree of "plan nodes". A plan node encapsulates a single operation
that is required to execute the query. The nodes are arranged as a tree, in which intermediate results flow from the
bottom of the tree to the top. Each node has zero or more child nodes; those are nodes whose output is fed as
input to the parent node. For example, a join node will have two child nodes, which represent the two join
operands, whereas a sort node would have a single child node (the input to be sorted). The leaves of the tree are
nodes which produce results by scanning the disk, for example by performing an index scan or a sequential scan.
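
The tree structure can be sketched with three node types. This is a toy model, not the design of any real engine: the class names `SeqScan`, `Sort` and `NestedLoopJoin`, the `rows()` method, and the sample tables are all assumptions made for illustration, and the "disk" is an in-memory list.

```python
from typing import Iterator

Row = dict  # a row is modeled as a plain dict of column name -> value


class SeqScan:
    """Leaf node: produces rows by scanning a table (an in-memory list here)."""

    def __init__(self, table: list[Row]):
        self.table = table
        self.children = []  # leaves have zero children

    def rows(self) -> Iterator[Row]:
        yield from self.table


class Sort:
    """Sort node: a single child, whose output is the input to be sorted."""

    def __init__(self, child, key: str):
        self.children = [child]
        self.key = key

    def rows(self) -> Iterator[Row]:
        yield from sorted(self.children[0].rows(), key=lambda r: r[self.key])


class NestedLoopJoin:
    """Join node: two children, one per join operand."""

    def __init__(self, left, right, left_col: str, right_col: str):
        self.children = [left, right]
        self.left_col, self.right_col = left_col, right_col

    def rows(self) -> Iterator[Row]:
        left, right = self.children
        inner = list(right.rows())  # materialize the inner operand once
        for left_row in left.rows():
            for right_row in inner:
                if left_row[self.left_col] == right_row[self.right_col]:
                    yield {**left_row, **right_row}


people = [{"pid": 2, "name": "Bo"}, {"pid": 1, "name": "Al"}]
orders = [{"oid": 10, "buyer": 1}, {"oid": 11, "buyer": 2}, {"oid": 12, "buyer": 1}]

# Results flow bottom-up: two scans feed the join, and the join feeds the sort.
plan = Sort(NestedLoopJoin(SeqScan(people), SeqScan(orders), "pid", "buyer"), key="oid")
result = list(plan.rows())
```

Calling `rows()` on the root pulls rows up from the leaves, which mirrors the bottom-to-top flow of intermediate results described above.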

Join ordering
The performance of a query plan is determined largely by the order in which the tables are joined. For example,
when joining 3 tables A, B, C of size 10 rows, 10,000 rows, and 1,000,000 rows, respectively, a query plan that
joins B and C first can take several orders of magnitude more time to execute than one that joins A and C first.
Most query optimizers determine join order via a dynamic programming algorithm pioneered by IBM's System R
database project. This algorithm works in stages:
1. First, all ways to access each relation in the query are computed. Every relation in the query can be accessed
via a sequential scan. If there is an index on a relation that can be used to answer a predicate in the query, an
index scan can also be used. For each relation, the optimizer records the cheapest way to scan the relation, as
well as the cheapest way to scan the relation that produces records in a particular sorted order.
2. The optimizer then considers combining each pair of relations for which a join condition exists. For each
pair, the optimizer will consider the available join algorithms implemented by the DBMS. It will preserve
the cheapest way to join each pair of relations, in addition to the cheapest way to join each pair of relations
that produces its output according to a particular sort order.
3. Then all three-relation query plans are computed, by joining each two-relation plan produced by the previous
phase with the remaining relations in the query.
The optimizer tracks sort order for two reasons. First, a particular sort order can avoid a redundant sort operation
later on in processing the query. Second, a particular sort order can speed up a subsequent join because it clusters
the data in a particular way.
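
The staged construction above can be sketched as a small dynamic program over sets of relations. This is a deliberately simplified illustration, not the System R algorithm itself: it ignores sort orders and the choice of join algorithm, considers only left-deep plans, assumes the join graph is connected, and uses an assumed cost model (the sum of intermediate result sizes). The function name `best_join_order` and the selectivity values are hypothetical; only the table sizes come from the A, B, C example in the text.

```python
from fractions import Fraction
from itertools import combinations


def best_join_order(sizes, selectivity):
    """Return (cost, order) of the cheapest left-deep join order.

    sizes:       {relation name: row count}
    selectivity: {frozenset({a, b}): Fraction} for each join predicate
    Cost model (an assumption): sum of the sizes of all join results.
    """
    names = sorted(sizes)

    def card(subset):
        # Estimated rows after joining every relation in `subset`.
        rows = Fraction(1)
        for name in subset:
            rows *= sizes[name]
        for pair, sel in selectivity.items():
            if pair <= subset:
                rows *= sel
        return rows

    # Stage 1: one-relation "plans" (scan cost is ignored in this sketch).
    best = {frozenset([n]): (Fraction(0), (n,)) for n in names}

    # Later stages: extend each (k-1)-relation plan by one more relation.
    for k in range(2, len(names) + 1):
        for combo in combinations(names, k):
            subset = frozenset(combo)
            for last in combo:
                rest = subset - {last}
                if rest not in best:
                    continue
                # Only join where a join condition exists (no cross products).
                if not any(frozenset({last, other}) in selectivity for other in rest):
                    continue
                cost = best[rest][0] + card(subset)
                if subset not in best or cost < best[subset][0]:
                    best[subset] = (cost, best[rest][1] + (last,))
    return best[frozenset(names)]


sizes = {"A": 10, "B": 10_000, "C": 1_000_000}
selectivity = {  # hypothetical join predicates A-C and B-C
    frozenset({"A", "C"}): Fraction(1, 1_000_000),
    frozenset({"B", "C"}): Fraction(1, 10_000),
}
cost, order = best_join_order(sizes, selectivity)
print(cost, order)
```

Under these assumed selectivities, joining A and C first yields a 10-row intermediate result, whereas joining B and C first yields one million rows, so the chosen order starts with A and C.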

Query planning for nested SQL queries
A SQL query to a modern relational DBMS does more than just selections and joins. In particular, SQL queries
often nest several layers of SPJ blocks (Select-Project-Join), by means of group by, exists, and not exists operators.
In some cases such nested SQL queries can be flattened into a select-project-join query, but not always. Query
plans for nested SQL queries can also be chosen using the same dynamic programming algorithm as used for join
ordering, but this can lead to an enormous escalation in query optimization time. So some database management
systems use an alternative rule-based approach that uses a query graph model.
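
To make flattening concrete, the sketch below runs a nested query that uses exists alongside a hand-flattened select-project-join version and collects both results. It uses Python's standard `sqlite3` module with an in-memory database; the `dept` and `emp` tables and their contents are made up for the example, and the rewrite is done by hand rather than by any optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'Research'), (3, 'Empty');
    INSERT INTO emp  VALUES (10, 1), (11, 1), (12, 2);
""")

# Nested form: departments that have at least one employee.
nested = """
    SELECT d.id, d.name FROM dept d
    WHERE EXISTS (SELECT 1 FROM emp e WHERE e.dept_id = d.id)
    ORDER BY d.id
"""

# Flattened form: a single join block. DISTINCT removes the duplicates the
# join introduces; this is safe here because dept.id is a primary key.
flattened = """
    SELECT DISTINCT d.id, d.name
    FROM dept d JOIN emp e ON e.dept_id = d.id
    ORDER BY d.id
"""

nested_rows = conn.execute(nested).fetchall()
flat_rows = conn.execute(flattened).fetchall()
print(nested_rows == flat_rows)
```

The equivalence relies on the key constraint; without it, or with a not exists subquery, a simple join rewrite like this one would not be valid, which is one reason flattening is not always possible.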

Cost estimation

One of the hardest problems in query optimization is to accurately estimate the costs of alternative query plans.
Optimizers cost query plans using a mathematical model of query execution costs that relies heavily on estimates
of the cardinality, or number of tuples, flowing through each edge in a query plan. Cardinality estimation in turn
depends on estimates of the selection factor of predicates in the query. Traditionally, database systems estimate
selectivities through fairly detailed statistics on the distribution of values in each column, such as histograms. This
technique works well for estimation of selectivities of individual predicates. However, many queries have
conjunctions of predicates such as select count(*) from R where R.make='Honda' and R.model='Accord'. Query
predicates are often highly correlated (for example, model='Accord' implies make='Honda'), and it is very hard to
estimate the selectivity of the conjunct in general. Poor cardinality estimates and uncaught correlation are among
the main reasons why query optimizers pick poor query plans. This is one reason why a database administrator
should regularly update the database statistics, especially after major data loads/unloads.
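
The effect of correlation can be seen in a small numeric sketch. It multiplies per-column selectivities as if the predicates were independent, which is a common simplifying assumption and not something the text attributes to any specific system, and compares the result with the true count. The table contents are invented for the example.

```python
# Hypothetical table R of cars, 100 rows of (make, model).
R = (
    [("Honda", "Accord")] * 20
    + [("Honda", "Civic")] * 20
    + [("Toyota", "Camry")] * 30
    + [("Ford", "Focus")] * 30
)


def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return sum(1 for r in rows if predicate(r)) / len(rows)


# Per-column selectivities, as a single-column histogram could provide.
sel_make = selectivity(R, lambda r: r[0] == "Honda")
sel_model = selectivity(R, lambda r: r[1] == "Accord")

# Estimate under the independence assumption.
independent_estimate = sel_make * sel_model * len(R)

# True cardinality of the conjunction.
actual = sum(1 for r in R if r[0] == "Honda" and r[1] == "Accord")

print(independent_estimate, actual)
```

Because every Accord is a Honda, the true count equals the count for model='Accord' alone, and the independence-based estimate is too low by the factor `1 / sel_make`.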

Extensions
Classical query optimization assumes that query plans are compared according to one single cost metric, usually
execution time, and that the cost of each query plan can be calculated without uncertainty. Both assumptions are
sometimes violated in practice[1] and multiple extensions of classical query optimization have been studied in the
research literature that overcome those limitations. Those extended problem variants differ in how they model the
cost of single query plans and in terms of their optimization goal.

Parametric Query Optimization
Classical query optimization associates each query plan with one scalar cost value. Parametric query
optimization[2] assumes that query plan cost depends on parameters whose values are unknown at optimization
time. Such parameters can for instance represent the selectivity of query predicates that are not fully specified at
optimization time but will be provided at execution time. Parametric query optimization therefore associates each
query plan with a cost function that maps from a multi-dimensional parameter space to a one-dimensional cost
space.
The goal of optimization is usually to generate all query plans that could be optimal for any of the possible
parameter value combinations. This yields a set of relevant query plans. At run time, the best plan is selected out of
that set once the true parameter values become known. The advantage of parametric query optimization is that
optimization (which is in general a very expensive operation) is avoided at run time.
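
A minimal sketch of this idea, with one parameter (a predicate selectivity `s` between 0 and 1), is shown below. The three plans and their cost functions are hypothetical, and the relevant set is approximated by sampling the parameter space on a grid, which is a shortcut chosen for brevity rather than the approach of the cited work.

```python
# Each plan maps the unknown selectivity s to a scalar cost (made-up formulas).
plans = {
    "index scan": lambda s: 10 + 1000 * s,
    "sequential scan": lambda s: 200.0,
    "index scan + sort": lambda s: 50 + 1200 * s,  # never the cheapest
}


def relevant_plans(plans, samples):
    """Optimization time: keep every plan that is cheapest for some sample."""
    keep = set()
    for s in samples:
        keep.add(min(plans, key=lambda name: plans[name](s)))
    return {name: plans[name] for name in keep}


def pick_at_runtime(relevant, s):
    """Run time: pick the cheapest relevant plan for the now-known parameter."""
    return min(relevant, key=lambda name: relevant[name](s))


samples = [i / 100 for i in range(101)]  # grid over the parameter space [0, 1]
relevant = relevant_plans(plans, samples)

print(sorted(relevant))
print(pick_at_runtime(relevant, 0.01), "|", pick_at_runtime(relevant, 0.5))
```

At run time only the small relevant set is evaluated, so no plan enumeration or costing of the full search space is repeated.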

Multi-Objective Query Optimization
There are often other cost metrics in addition to execution time that are relevant to compare query plans [1]
(https://www.youtube.com/watch?v=EZ9FHvOJ0Ws). In a cloud computing scenario for instance, one should compare
query plans not only in terms of how much time they take to execute but also in terms of how much money their
execution costs. Or in the context of approximate query optimization, it is possible to execute query plans on
randomly selected samples of the input data in order to obtain approximate results with reduced execution
overhead. In such cases, alternative query plans must be compared in terms of their execution time but also in
terms of the precision or reliability of the data they generate.
Multi-objective query optimization[3] models the cost of a query plan as a cost vector where each vector
component represents cost according to a different cost metric. Classical query optimization can be considered as a
special case of multi-objective query optimization where the dimension of the cost space (i.e., the number of cost
vector components) is one.

Different cost metrics might conflict with each other (e.g., there might be one plan with minimal execution time
and a different plan with minimal monetary execution fees in a cloud computing scenario). Therefore, the goal of
optimization cannot be to find a query plan that minimizes all cost metrics but must be to find a query plan that
realizes the best compromise between different cost metrics. What the best compromise is depends on user
preferences (e.g., some users might prefer a cheaper plan while others prefer a faster plan in a cloud scenario). The
goal of optimization is therefore either to find the best query plan based on some specification of user preferences
provided as input to the optimizer (e.g., users can define weights between different cost metrics to express relative
importance or define hard cost bounds on certain metrics) or to generate an approximation of the set of Pareto-
optimal query plans (i.e., plans such that no other plan has better cost according to all metrics) such that the user
can select the preferred cost trade-off out of that plan set.
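
The two optimization goals can be sketched on a handful of cost vectors. The plans and their (time, fee) values below are invented. The `dominates` helper uses the common dominance definition (no worse on every metric and strictly better on at least one), which is slightly stricter than the wording above; that choice is an assumption of this sketch.

```python
# Hypothetical cost vectors: (execution time in seconds, fee in dollars).
plans = {
    "fast": (10, 8.0),
    "cheap": (60, 1.0),
    "balanced": (25, 3.0),
    "wasteful": (70, 9.0),
}


def dominates(a, b):
    """True if a is no worse than b on every metric and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(plans):
    """Plans whose cost vector is not dominated by any other plan's vector."""
    return {
        name: cost
        for name, cost in plans.items()
        if not any(dominates(other, cost) for o, other in plans.items() if o != name)
    }


def best_by_weights(plans, weights):
    """User preference as weights: minimize the weighted sum of the metrics."""
    return min(plans, key=lambda n: sum(w * c for w, c in zip(weights, plans[n])))


def fastest_within_fee(plans, max_fee):
    """User preference as a hard bound: fastest plan whose fee is within max_fee."""
    allowed = {n: c for n, c in plans.items() if c[1] <= max_fee}
    return min(allowed, key=lambda n: allowed[n][0])


front = pareto_front(plans)
print(sorted(front))
print(best_by_weights(plans, (1, 0)), best_by_weights(plans, (1, 10)))
print(fastest_within_fee(plans, 2.0))
```

Here the plan named `wasteful` is worse than every other plan on both metrics, so it drops out of the front, while the remaining three each represent a different trade-off between time and fee.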

Multi-Objective Parametric Query Optimization
Multi-objective parametric query optimization[1] generalizes parametric and multi-objective query optimization.
Plans are compared according to multiple cost metrics and plan costs may depend on parameters whose values are
unknown at optimization time. The cost of a query plan is therefore modeled as a function from a multi-
dimensional parameter space to a multi-dimensional cost space. The goal of optimization is to generate the set of
query plans that can be optimal for each possible combination of parameter values and user preferences.
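
Combining both extensions, each plan below maps a selectivity `s` to a (time, fee) vector, and a plan is kept if it is non-dominated for at least one sampled value of `s`. As before, the plans, formulas and fee rate are hypothetical, and grid sampling is only a rough stand-in for reasoning over the whole parameter space; it is not the algorithm of the cited paper.

```python
FEE_PER_NODE_SECOND = 0.01  # assumed cloud price, for illustration only


def make_plan(time_fn, nodes):
    """Build a cost function s -> (time, fee), with fee = time * nodes * rate."""
    def cost(s):
        t = time_fn(s)
        return (t, t * nodes * FEE_PER_NODE_SECOND)
    return cost


plans = {
    "index, 1 node": make_plan(lambda s: 10 + 1000 * s, nodes=1),
    "scan, 1 node": make_plan(lambda s: 200.0, nodes=1),
    "scan, 4 nodes": make_plan(lambda s: 60.0, nodes=4),
    "scan, 2 nodes (slow)": make_plan(lambda s: 220.0, nodes=2),
}


def dominates(a, b):
    """True if a is no worse than b on every metric and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_at(plans, s):
    """Names of the plans that are non-dominated for parameter value s."""
    costs = {name: fn(s) for name, fn in plans.items()}
    return {
        name
        for name, c in costs.items()
        if not any(dominates(o, c) for other, o in costs.items() if other != name)
    }


def relevant_plans(plans, samples):
    """Union of the per-sample Pareto sets."""
    relevant = set()
    for s in samples:
        relevant |= pareto_at(plans, s)
    return relevant


samples = [i / 100 for i in range(101)]
relevant = relevant_plans(plans, samples)
print(sorted(relevant))
```

Once `s` is known at run time, a user preference (weights or bounds) only has to be applied to the Pareto set for that value, for example `pareto_at(plans, 0.5)`.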

See also
Join selection factor
Sargable query

References
1. Trummer, Immanuel; Koch, Christoph (2015). "Multi-Objective Parametric Query Optimization". VLDB: 221-232.
2. Ioannidis, Yannis; Ng, Raymond T.; Shim, Kyuseok; Sellis, Timos K. (1997). "Parametric Query Optimization".
VLDB: 132-151.
3. Trummer, Immanuel; Koch, Christoph (2014). "Approximation Schemes for Many-Objective Query Optimization".
SIGMOD. pp. 1299-1310.

Chaudhuri, Surajit (1998). "An Overview of Query Optimization in Relational Systems". Proceedings of the
ACM Symposium on Principles of Database Systems. pp. 34-43. doi:10.1145/275487.275492.
Ioannidis, Yannis (March 1996). "Query optimization". ACM Computing Surveys. 28 (1): 121-123.
doi:10.1145/234313.234367.
Selinger, P. G.; Astrahan, M. M.; Chamberlin, D. D.; Lorie, R. A.; Price, T. G. (1979). "Access Path
Selection in a Relational Database Management System". Proceedings of the 1979 ACM SIGMOD
International Conference on Management of Data. pp. 23-34. doi:10.1145/582095.582099.
ISBN 089791001X.

External links
SQL query optimization tool (http://sqltuning.com)
Talk on Multi-Objective Query Optimization (https://www.youtube.com/watch?v=EZ9FHvOJ0Ws)
Retrieved from "https://en.wikipedia.org/w/index.php?title=Query_optimization&oldid=733922708"
Categories: Database management systems, Database algorithms, SQL
This page was last modified on 10 August 2016, at 23:55.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.
By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia is a registered trademark
of the Wikimedia Foundation, Inc., a non-profit organization.
