Академический Документы
Профессиональный Документы
Культура Документы
ThishandoutshowshowtheweeklybeersalesseriesmightbeanalyzedwithStata(thesoftware
packagenowusedforteachingstatsatKellogg),forpurposesofcomparingitsmodelingtoolsandease
ofusetothoseofFSBForecast.Toanalyzetheweeklybeersalesseries,thefirststepistoimportthe
datafromtheExcelfile.AnystatisticalsoftwarepackagecanimportExcelfileseasily.
ThedialogboxforimportingtheExcelfileofferstheoptionofreadingthevariablenamesfromthefirst
row,whichisalsostandard.So,thesamedatafilethatworkedforFSBForecastwillworkhere.
Inordertobeabletousetimetransformationoptionslateron,itisnecessarytodeclarethevariablesto
betimeseries.Todothis,dontgototheDatamenu,whichiswheremostdatadefinitionoperations
areperformed.Instead,gototheStatisticsmenuandlookunderthetimeseriesoptionsthere.
Forsimplicity,letsjustusetheweeknumberasthegenerictimeindex:
Thisexecutesthefollowingcommandwhichisshownintheoutputwindow:
Underthehoodthisisacommandlanguageprogram,asareSPSSandSAS.Choosingoptionsfromthe
menucausestheappropriatecodetobegeneratedandexecuted.Mostserioususersofprogramslike
Statawritetheircodedirectlyratherthanlettingamenusystemdoitforthem.
Thenumericalresultsofyouranalysiswillbewrittentotheoutputwindowalongwiththecodethat
createdthem,intheformofasinglescrollinglogfile.Beforeproceedingwithyouranalysis,youneed
toopenandassignanametothelogfileforyoursession,sothatyoucansaveyourresultslater:
Thelogfileisaplaintextfilethatcontainstheoutputthatyouseescrollingbywhiledoingyouranalysis.
Itcontainsonlythetextoutput,notthegraphs.ItcanbeopenedandeditedlaterwithMicrosoftWord
orothertexteditingsoftware,andyoucanusetheappendoptiontoreopenthelogfileandadd
moreanalysistoitlater.IfyouopenitinMicrosoftWord,youshouldchangethefontsizeto8pointsto
avoidwrappingthelinesanduseafixedwidthfontsuchasCouriersothattablecontentswilllineup.
HereisKelloggscustommenufortheircorestatisticsclass,whichcanbeloadedbytypingthedo
statementshowninthecommandwindowattheverybottomofthescreen:
Theunivariatestatisticscommandprovidessummarystatisticsofsomeorallvariables:
Ifyouchoosethestandardsummarystatisticsreport,whichshowspreselectedstatisticsforall
variablesinthefile,youmustclickthroughthisscreen:
Youthengetthefollowingoutput:
Apparentlythereisnowayaroundthedefaultabbreviationofvariablenamesonsomereports.Bestto
useshortnames(8charactersorless)!
Inacustomunivariatestatisticsreport,youcanchooseasubsetofvariablestoanalyze,andyoucan
chooseupto8statsbyasequenceofstepsinwhichyouuseseparatepulldownmenus:
Hereistheoutputofthisparticularcustomanalysis,whichshows4statsfor6variables:
Howtogenerateacorrelationmatrix:
Thiscommandopensadialogboxinwhichyoucanchoosealistofvariablesbyclickingonthem.As
eachoneisclicked,itisaddedtothelistinthewindow,whichistypicalofallproceduresinStatathat
operateonmultiplevariables.
This6variablecorrelationmatrixissmallenoughtofitintheStataoutputwindow.Alargermatrix
wouldbebrokenintopieceswhenitwasdisplayed,andIdonotthinkitwouldbepossibletocopyittoa
spreadsheetorotherdocumentwhereyoucouldseeitasasingletriangulararray.Again,itisbestto
useshortvariablenames.
Howtogetasinglescatterplot:
Thegraphisdisplayedinaseparategraphwindowsthatopensup:
Themaingraphicsmenuhasotherplotoptions,includingascatterplotmatrix:
Thenewgraphreplacestheoldoneinthegraphwindow.Youcantgetmultipleopengraphwindows
(i.e.,morethanonegraphvisibleatatime)usingonlymenucommands.Youcandoitbywritingcode,
though.Thegraphthatiscurrentlyinthewindowcanbedirectlycopiedandpastedtoother
documentslikethisWordfile.Thisisanicelyformattedplot,although(aswithacorrelationmatrix)you
havetoreadacrossanddowntodeterminewhichvariablesareshowninagivenplot,aswellastheir
axisscalenumbers.Axisscalenumbersareprovidedoneachseparatescatterplotinthematrixin
FSBforecast,alongwiththecorrelationsandtheirsquaredvaluesorregressionslopecoefficients.
Howtoplot2timeseriesonthesamechartbyusingthetimeseriesgraphsoptiononthemain
graphicsmenu:
13
13.5
14
14.5
PRICE CANS_30PK
CASES CANS_30PK
200
400
600
15
800
Hereyoucantrytojudgehowthepeaksandvalleysinthetwoserieslineup:
10
20
30
40
50
Week
CASES CANS_30PK
PRICE CANS_30PK
10
Themultipleregressionprocedure:
Selectthedependentandindependentvariablesfrompulldownlists(besttouseshortvariablenames
ifyouwanttoseethefullnameofthedependentvariablehere):
Hereistheregressionoutputthatyougetbydefault:
11
Youneedtorunsomeseparateprocedurestogetadditionalmodelstatsandcharts:
-200
-100
Residuals
0
100
200
300
Theresidualvs.predictedplotistheonlyresidualplotthatisincludedonthismenu.Othertypesof
residualplotscanbegeneratedfromtheGraphicsmenu.(Moreaboutthislater.)
100
200
300
400
Fitted values
500
600
12
TheResiduals,outliers,andinfluentialobservationscommandstorestheresidualsandstandardized
residualsandafewotherstats(leverage,CooksD)onthedataworksheet.Beforeclickingthroughits
dialogboxyoumightwanttoeditthenewvariablenamesifyouaresavingtheresultsofdifferent
modelsinthesamefilee.g.,Model_1_residuals.
TheJarqueBeranormalitytestisatestthatisbasedonlyontheskewnessandkurtosiscoefficientsof
theresiduals,unliketheAndersonDarlingorKolmogorovSmirnofftestswhicharebasedontheentire
cumulativedistribution.
Theresultofthistestiswrittenbacktotheoutputwindowandlookslikethis:
Inthiscasethepvalueof1.1e05indicatesthatthenormalityhypothesisisstronglyrejected.
13
Foracloserlookattheerrordistribution,aresidualhistogramchartcanbegeneratedbygoingtothe
maingraphicsmenuandusingthehistogramprocedurewiththenewlycreatedresidualsvariableas
theinput:
Hereisthedialogboxforthehistogramprocedure:
Thedefaultbinspecificationsareprettycoarsehereiswhatyougetinthiscase.Youmightwantto
fiddlewiththebinsettingsinordertoshowmorefinedetail.
14
TheDurbinWatsonstatistic(whichrequiresexecutinganotherseparatemenucommandinordertobe
reported)isatestforautocorrelationatlag1intheresiduals.Clickthroughthisdialogbox:
TheresultingreportoftheDWstatlookslikethis.Theuserneedstoknowwhetheravalueof1.4is
significant,becausenopvalueisreportedforit.
IngeneraltheDWstatisticisapproximatelyequalto2(1r1)wherer1isthelag1residualautocorrelation.
Itsrangeisfrom0to4anditapproaches2whenthelag1autocorrelationapproaches0.Thelag1
residualautocorrelationforthismodelis0.281,whichistheteststatisticthatisshowninFSBForecast
output.FSBForecastalsoreportstheresidualautocorrelationsforlags2through7andlag12,along
withtheir95%significancelimits.The95%significancelimitfortestingthelagkautocorrelationis
2/SQRT(nk),wherenisthesamplesize,whichworksouttobe0.280forlag1inthismodel,sothisisa
borderlinesignificantvalue.
Forecastingfromaregressionmodel:
StatageneratesforecastsinamannersimilartoFSBForecast.Ifvaluesfortheindependentvariable(s)
aretypedoralreadyexistinadditionalrowsatthebottomofthedataset,thePrediction,usingmost
recentregression,usingthedatasheetcommandwillcausethecorrespondingforecaststobe
computed.Heresomeadditionalpricevaluesweretypedinatthebottomofthedataworksheet:
15
Thepredictionusingdatasheetoptionwaschosennext:
Youthenhavetoclickthroughthisboxinwhichyoucaneditthenamesoftheforecaststatisticsthat
thatwillbeshown.Heretooyoumightwanttoeditthenamesofthevariablestobecreated,ifyou
arefittingmorethan1modelusingthesamefile:
Theforecastsandtheirstandarderrorsandconfidencelimitsarewrittenbacktothedataspreadsheet,
butnotplotted:
16
MathematicaltransformationscanbeappliedwiththecreatenewvariableoptionontheDatamenu:
Forexample,hereyoucanapplythenaturallogtransformation.Youneedtotypeanameforthenew
variableandthenyouneedtotypetheformulatocomputeit.Thereisalsoanexpressionbuilder
dialogboxthatyoucanbringupbyhittingthecreatebutton.Itshowsthelistofavailable
transformationsandcantypetheirnamesforyouifyouclickonthem.
However,therearenotimetransformationsonthefunctionlist:nolagordifferenceordifferenceof
naturallogorpercentagedifferencerelativetopreviousobservations.
17
Youcangetadifferenceofnaturallogfromoneperiodagotransformationbygoingbacktothetop
levelofthecreatenewvariableprocedureandthenapplyingthedifferenceoperator,whosesyntaxis
D.,totheloggedvariable.Hereagainyoumusttypeanameforthenewvariableaswellasthe
mathematicalexpressionthatcreatesit.
Thecodethatwasexecutedbythecreatevariableprocedureintheprocessofapplyingthelogand
differencetransformationsisshownbelow,alongwiththeresultsoffittingaregressionmodeltothe
loggedanddifferencedvariables.ThisisthesameasModel3thatwasfittedtothesamedatasetwith
FSBForecast.
Fromhereyoucangoonandconstructtherestoftheregressionoutputonepieceatatimeasshown
earlier.
18