Incremental Loading for Dimension Table

Written by DWBI Concepts Team

Last Updated: 16 September 2014

In our previous article (/etl/etl/53methodsofincrementalloadingindatawarehouse.html) we discussed the
concept of incremental loading in general. In this article we will see how to perform incremental
loading for dimension tables.

Should we do incremental loading for dimensions?
In a dimensional model, we may perform incremental loading for dimension tables as well. One may argue
that this is unnecessary because data volumes in dimension tables are not as high as those in fact
tables, so we can simply do a full load every time.
I personally do not agree with this argument. Over the last few years I have seen tremendous growth in
the data held in dimension tables, and things can get quite heavy, especially if we are trying to load
SCD Type 2 dimensions. Anyway, without much ado, let's delve deep.

Standard Method of Loading
As before, we will assume that we have the customer table below in our source system, from which we
need to load the data.
CustomerID  CustomerName  Type        LastUpdatedDate
1           John          Individual  22 Mar 2012
2           Ryan          Individual  22 Mar 2012
3           Bakers'       Corporate   23 Mar 2012

As discussed in the previous article, a typical SQL query to extract data incrementally from this source
system looks like this:
SELECT t.*
FROM Customer t
WHERE t.lastUpdatedDate > (SELECT nvl(
                               max(b.loaded_until),
                               to_date('01011900', 'MMDDYYYY')
                           )
                           FROM batch b
                           WHERE b.status = 'Success');
Here "batch" is a separate table that stores the date until which we have successfully extracted the data.
Batch_ID  Loaded_Until  Status
1         22 Mar 2012   Success
2         23 Mar 2012   Success
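
The extraction query and the batch table can be tried end to end with a small sketch. It is only an illustration under a few assumptions: it uses Python's built-in sqlite3 module in place of the Oracle database the query above targets, so nvl becomes COALESCE and dates are stored as ISO strings instead of going through to_date. The function name extract_incremental is hypothetical, and the batch table is seeded with the first batch only so that one record is still pending.

```python
import sqlite3

# In-memory SQLite database standing in for the source system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id       INTEGER PRIMARY KEY,
        customer_name     TEXT,
        type              TEXT,
        last_updated_date TEXT   -- ISO text, so string order equals date order
    );
    CREATE TABLE batch (
        batch_id     INTEGER PRIMARY KEY,
        loaded_until TEXT,
        status       TEXT
    );
    INSERT INTO customer VALUES
        (1, 'John',     'Individual', '2012-03-22'),
        (2, 'Ryan',     'Individual', '2012-03-22'),
        (3, 'Bakers''', 'Corporate',  '2012-03-23');
    -- Only the first batch has run so far.
    INSERT INTO batch VALUES (1, '2012-03-22', 'Success');
""")


def extract_incremental(conn):
    """Return the customers changed after the last successful batch."""
    return conn.execute("""
        SELECT t.*
        FROM customer t
        WHERE t.last_updated_date > (
            SELECT COALESCE(MAX(b.loaded_until), '1900-01-01')
            FROM batch b
            WHERE b.status = 'Success'
        )
        ORDER BY t.customer_id
    """).fetchall()


rows = extract_incremental(conn)
print(rows)  # only customer 3, updated on 23 Mar 2012, is picked up
```

Once the second batch is recorded as successful with Loaded_Until set to 23 Mar 2012, the same query returns no rows until the source changes again.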

Which one to use: "Entry Date" / "Load Date" or "Last Update Date"?
In an incremental load methodology, we should extract a record when it is first created and again
whenever it is updated. Therefore, we should always use the "last update date" column for extracting
records. "Entry date" or "load date" columns in the source system are not enough to tell whether a
record was updated at a later point in time.

Source systems often maintain two different columns, load_date and last_update_date. When extracting
data based on "last update date", ensure that the source system always populates the "last update date"
field with the "load date" when the record is first created.
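
The difference between the two columns shows up as soon as an existing record changes. The sketch below is a hedged illustration, again using Python's sqlite3 module rather than a real source system. The helper names (insert_customer, rename_customer, changed_since), the renamed value 'Jon' and the 24 Mar 2012 dates are made up for the example. New rows get last_update_date equal to load_date, as recommended above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        customer_name    TEXT,
        load_date        TEXT,
        last_update_date TEXT
    )
""")


def insert_customer(conn, customer_id, name, today):
    # On creation, last_update_date is populated with the load date.
    conn.execute(
        "INSERT INTO customer VALUES (?, ?, ?, ?)",
        (customer_id, name, today, today),
    )


def rename_customer(conn, customer_id, new_name, today):
    # An update touches last_update_date only; load_date never changes.
    conn.execute(
        "UPDATE customer SET customer_name = ?, last_update_date = ? "
        "WHERE customer_id = ?",
        (new_name, today, customer_id),
    )


def changed_since(conn, column, watermark):
    """Return the customer ids whose given date column is after the watermark."""
    if column not in ("load_date", "last_update_date"):
        raise ValueError(f"unexpected column: {column}")
    query = (
        f"SELECT customer_id FROM customer WHERE {column} > ? "
        "ORDER BY customer_id"
    )
    return [row[0] for row in conn.execute(query, (watermark,))]


insert_customer(conn, 1, "John", "2012-03-22")
insert_customer(conn, 2, "Ryan", "2012-03-22")
rename_customer(conn, 1, "Jon", "2012-03-24")     # hypothetical later update
insert_customer(conn, 3, "Bakers'", "2012-03-24")

watermark = "2012-03-23"  # date until which data was already extracted
by_load_date = changed_since(conn, "load_date", watermark)
by_last_update = changed_since(conn, "last_update_date", watermark)
print(by_load_date)    # [3]: the update to customer 1 is missed
print(by_last_update)  # [1, 3]: both the new and the updated record
```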

What are the benefits of incremental loading of dimension tables?
Once we extract records incrementally based on their last update date, we can compare each record with
the target based on its natural key and determine whether it is a new record or an updated one.
However, if we do not extract incrementally (and instead extract all the records from the source every
time), the number of records to compare against the target will be much higher, which degrades
performance. With incremental loading, records that have not changed do not come through; only new or
updated records do. With a full load, everything comes through irrespective of any change.
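
The natural-key comparison can be sketched as follows. This is a minimal illustration with assumptions: the table names stg_customer (the incremental extract) and dim_customer (the target), the surrogate key customer_key, and the functions classify and apply_type1 are all hypothetical. The apply step uses a simple overwrite (SCD Type 1) only to keep the example short; an SCD Type 2 load would expire the old row and insert a new version instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Incremental extract: only records new or changed since the last batch.
    CREATE TABLE stg_customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT,
        type          TEXT
    );
    -- Target dimension; customer_id is the natural key.
    CREATE TABLE dim_customer (
        customer_key  INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id   INTEGER UNIQUE,
        customer_name TEXT,
        type          TEXT
    );
    INSERT INTO dim_customer (customer_id, customer_name, type) VALUES
        (1, 'John', 'Individual'),
        (2, 'Ryan', 'Individual');
    INSERT INTO stg_customer VALUES
        (2, 'Ryan',     'Corporate'),   -- existing key, changed attribute
        (3, 'Bakers''', 'Corporate');   -- key not yet in the target
""")


def classify(conn):
    """Label each extracted record as INSERT or UPDATE by natural key."""
    return conn.execute("""
        SELECT s.customer_id,
               CASE WHEN d.customer_id IS NULL THEN 'INSERT' ELSE 'UPDATE' END
        FROM stg_customer s
        LEFT JOIN dim_customer d ON d.customer_id = s.customer_id
        ORDER BY s.customer_id
    """).fetchall()


def apply_type1(conn):
    """Overwrite matching rows, then insert the rows that are new."""
    conn.execute("""
        UPDATE dim_customer
        SET customer_name = (SELECT s.customer_name FROM stg_customer s
                             WHERE s.customer_id = dim_customer.customer_id),
            type          = (SELECT s.type FROM stg_customer s
                             WHERE s.customer_id = dim_customer.customer_id)
        WHERE customer_id IN (SELECT customer_id FROM stg_customer)
    """)
    conn.execute("""
        INSERT INTO dim_customer (customer_id, customer_name, type)
        SELECT s.customer_id, s.customer_name, s.type
        FROM stg_customer s
        WHERE s.customer_id NOT IN (SELECT customer_id FROM dim_customer)
    """)


actions = classify(conn)
print(actions)  # [(2, 'UPDATE'), (3, 'INSERT')]
```

Customer 1 never enters the comparison because it was not part of the incremental extract, which is where the saving over a full load comes from.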

Copyright
Except where otherwise noted, contents of DWBI.ORG by Intellip LLP (http://intellip.com) are licensed
under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
(https://creativecommons.org/licenses/byncsa/4.0/).