
KCG COLLEGE OF TECHNOLOGY-CHENNAI-97

COMPUTER SCIENCE AND ENGINEERING


VI SEM CSE
CS1352 Principles of Compiler Design Unit-IV Question and answers
UNIT IV CODE GENERATION 9
Issues in the design of code generator - The target machine - Runtime Storage management -
Basic Blocks and Flow Graphs - Next-use Information - A simple Code generator - DAG
representation of Basic Blocks - Peephole Optimization.
1) What is the role of code generator in a compiler?
CODE GENERATION

The final phase in our compiler model is the code generator. It takes as input an
intermediate representation of the source program and produces as output an equivalent target
program.
The requirements traditionally imposed on a code generator are severe. The
output code must be correct and of high quality, meaning that it should make effective use of
the resources of the target machine. Moreover, the code generator itself should run efficiently.

fig. 1

2) Write in detail the issues in the design of code generator.
ISSUES IN THE DESIGN OF A CODE GENERATOR
While the details are dependent on the target language and the operating system, issues
such as memory management, instruction selection, register allocation, and evaluation order are
inherent in almost all code generation problems.
INPUT TO THE CODE GENERATOR
The input to the code generator consists of the intermediate representation of the
source program produced by the front end, together with information in the symbol table that is
used to determine the runtime addresses of the data objects denoted by the names in the
intermediate representation.
There are several choices for the intermediate language, including: linear representations
such as postfix notation, three-address representations such as quadruples, virtual machine
representations such as stack machine code, and graphical representations such as syntax trees
and dags.
We assume that prior to code generation the front end has scanned, parsed, and translated the
source program into a reasonably detailed intermediate representation, so the values of names
appearing in the intermediate language can be represented by quantities that the target machine
can directly manipulate (bits, integers, reals, pointers, etc.). We also assume that the necessary
type checking has taken place, so type conversion operators have been inserted wherever necessary
and obvious semantic errors (e.g., attempting to index an array by a floating-point number) have
already been detected. The code generation phase can therefore proceed on the assumption that its
input is free of errors. In some compilers, this kind of semantic checking is done together with
code generation.
TARGET PROGRAMS
The output of the code generator is the target program. The output may take on a variety
of forms: absolute machine language, relocatable machine language, or assembly language.
Producing an absolute machine language program as output has the advantage that it
can be placed in a fixed location in memory and immediately executed. A small program can be
compiled and executed quickly. A number of student-job compilers, such as WATFIV and PL/C,
produce absolute code.
Producing a relocatable machine language program as output allows subprograms to
be compiled separately. A set of relocatable object modules can be linked together and loaded for
execution by a linking loader. Although we must pay the added expense of linking and loading if
we produce relocatable object modules, we gain a great deal of flexibility in being able to compile
subroutines separately and to call other previously compiled programs from an object module. If
the target machine does not handle relocation automatically, the compiler must provide explicit
relocation information to the loader to link the separately compiled program segments.
Producing an assembly language program as output makes the process of code
generation somewhat easier. We can generate symbolic instructions and use the macro facilities of
the assembler to help generate code. The price paid is the assembly step after code generation.
Because producing assembly code does not duplicate the entire task of the assembler, this choice
is another reasonable alternative, especially for a machine with a small memory, where a compiler
must use several passes.
MEMORY MANAGEMENT
Mapping names in the source program to addresses of data objects in runtime memory
is done cooperatively by the front end and the code generator. We assume that a name in a
three-address statement refers to a symbol table entry for the name.
If machine code is being generated, labels in three-address statements have to be
converted to addresses of instructions. This process is analogous to backpatching. Suppose
that labels refer to quadruple numbers in a quadruple array. As we scan each quadruple in turn, we
can deduce the location of the first machine instruction generated for that quadruple simply by
maintaining a count of the number of words used for the instructions generated so far. This count
can be kept in the quadruple array (in an extra field), so if a reference such as j: goto i is
encountered, and i is less than j, the current quadruple number, we may simply generate a jump
instruction with the target address equal to the machine location of the first instruction in the
code for quadruple i. If, however, the jump is forward, so i exceeds j, we must store on a list for
quadruple i the location of the first machine instruction generated for quadruple j. Then, when we
process quadruple i, we fill in the proper machine location for all instructions that are forward
jumps to i.
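The label-resolution scheme described above can be sketched in Python. This is only a minimal illustration, assuming the number of machine words generated for each quadruple is already known; the names resolve_jumps, sizes, jumps, and pending are hypothetical and do not come from these notes.

```python
def resolve_jumps(sizes, jumps):
    """Assign machine addresses to quadruples and resolve goto targets.

    sizes[j] -- number of machine words generated for quadruple j (assumed known)
    jumps    -- dict mapping a quadruple number j to its goto target i
    Returns (addr, target): addr[j] is the address of the first instruction of
    quadruple j, and target[j] is the machine address that jump j transfers to.
    """
    addr = []        # the "extra field" kept with each quadruple
    pending = {}     # i -> list of forward jumps j still waiting for quadruple i
    target = {}
    location = 0     # running count of words generated so far
    for j, size in enumerate(sizes):
        addr.append(location)
        # Quadruple j is now placed: fill in every forward jump that targets it.
        for waiting in pending.pop(j, []):
            target[waiting] = location
        if j in jumps:
            i = jumps[j]
            if i <= j:
                # Backward jump: the address of quadruple i is already known.
                target[j] = addr[i]
            else:
                # Forward jump: remember j on the list kept for quadruple i.
                pending.setdefault(i, []).append(j)
        location += size
    return addr, target


# Quadruple 1 jumps forward to 3; quadruple 3 jumps back to 0.
addr, target = resolve_jumps(sizes=[2, 3, 1, 2], jumps={1: 3, 3: 0})
print(addr)    # [0, 2, 5, 6]
print(target)  # {1: 6, 3: 0}
```

A real code generator would patch the address field of the already emitted jump instruction rather than fill a dictionary; the dictionary simply keeps the sketch short.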
INSTRUCTION SELECTION
The nature of the instruction set of the target machine determines the difficulty of
instruction selection. The uniformity and completeness of the instruction set are important
factors. If the target machine does not support each data type in a uniform manner, then each
exception to the general rule requires special handling.
Instruction speeds and machine idioms are other important factors. If we do not care
about the efficiency of the target program, instruction selection is straightforward. For each type
of three-address statement we can design a code skeleton that outlines the target code to be
generated for that construct.
For example, every three-address statement of the form x := y + z, where x, y, and z are statically
allocated, can be translated into the code sequence
MOV y,R0 /* load y into register R0 */
ADD z,R0 /* add z to R0 */
MOV R0,x /* store R0 into x */
Unfortunately, this kind of statement-by-statement code generation often produces poor code. For
example, the sequence of statements
a := b + c
d := a + e
would be translated into
MOV b,R0
ADD c,R0
MOV R0,a
MOV a,R0
ADD e,R0
MOV R0,d
Here the fourth statement is redundant, and so is the third if a is not subsequently used.
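The skeleton-based translation and the redundant load it produces can be reproduced with a short Python sketch. The function names naive_codegen and drop_redundant_loads are hypothetical, and the clean-up pass is only one simple way to remove the fourth instruction; it assumes the instructions lie in one basic block with no label between them.

```python
def naive_codegen(statements):
    """Translate each (x, y, op, z) statement with a fixed code skeleton."""
    opcodes = {"+": "ADD", "-": "SUB", "*": "MULT"}
    code = []
    for x, y, op, z in statements:
        code.append(f"MOV {y},R0")             # load y into register R0
        code.append(f"{opcodes[op]} {z},R0")   # combine z with R0
        code.append(f"MOV R0,{x}")             # store R0 into x
    return code


def drop_redundant_loads(code):
    """Drop a MOV that merely undoes the MOV directly before it.

    Example: 'MOV R0,a' followed by 'MOV a,R0' leaves R0 unchanged, so the
    second instruction can go (assuming no jump lands between the two).
    """
    result = []
    for instr in code:
        if result and instr.startswith("MOV ") and result[-1].startswith("MOV "):
            src, dst = instr[4:].split(",")
            prev_src, prev_dst = result[-1][4:].split(",")
            if src == prev_dst and dst == prev_src:
                continue   # the value is already where this MOV would put it
        result.append(instr)
    return result


code = naive_codegen([("a", "b", "+", "c"), ("d", "a", "+", "e")])
print(code)
print(drop_redundant_loads(code))
```

Removing the third instruction as well would need liveness information for a, which this sketch does not track.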
The quality of the generated code is determined by its speed and size.
A target machine with a rich instruction set may provide several ways of implementing a given
operation. Since the cost differences between different implementations may be significant, a
naive translation of the intermediate code may lead to correct but unacceptably inefficient target
code. For example, if the target machine has an increment instruction (INC), then the three-address
statement a := a + 1 may be implemented more efficiently by the single instruction INC a, rather
than by a more obvious sequence that loads a into a register, adds one to the register, and then
stores the result back into a:
MOV a,R0
ADD #1,R0
MOV R0,a
Instruction speeds are needed to design good code sequences but, unfortunately, accurate
timing information is often difficult to obtain. Deciding which machine code sequence is best for a
given three-address construct may also require knowledge about the context in which that construct
appears.
REGISTER ALLOCATION
Instructions involving register operands are usually shorter and faster than those
involving operands in memory. Therefore, efficient utilization of registers is particularly
important in generating good code. The use of registers is often subdivided into two subproblems:
1. During register allocation, we select the set of variables that will reside in registers at a
point in the program.
2. During a subsequent register assignment phase, we pick the specific register that a variable
will reside in.
Finding an optimal assignment of registers to variables is difficult, even with single
register values. Mathematically, the problem is NP-complete. The problem is further complicated
because the hardware and/or the operating system of the target machine may require that certain
register usage conventions be observed.
Certain machines require register pairs (an even and next odd-numbered register) for
some operands and results. For example, in the IBM System/370 machines integer multiplication
and integer division involve register pairs. The multiplication instruction is of the form
M x,y
where x, the multiplicand, is the even register of an even/odd register pair.
The multiplicand value is taken from the odd register of the pair. The multiplier y is a single
register. The product occupies the entire even/odd register pair.
The division instruction is of the form
D x,y
where the 64-bit dividend occupies an even/odd register pair whose even register is x; y represents
the divisor. After division, the even register holds the remainder and the odd register the
quotient.
Now consider the two three-address code sequences (a) and (b) in fig. 2, in which the only
difference is the operator in the second statement. The shortest assembly sequences for (a) and (b)
are given in fig. 3. Ri stands for register i. L, ST and A stand for load, store and add
respectively; M and D are the multiply and divide instructions described above. SRDA R0,32 (shift
right double arithmetic) shifts the dividend into R1 and clears R0 so that all its bits equal the
sign bit. The optimal choice for the register into which a is to be loaded depends on what will
ultimately happen to t.
t := a + b        t := a + b
t := t * c        t := t + c
t := t / d        t := t / d
(a)               (b)
fig. 2 Two three-address code sequences
L  R1,a           L    R0,a
A  R1,b           A    R0,b
M  R0,c           A    R0,c
D  R0,d           SRDA R0,32
ST R1,t           D    R0,d
                  ST   R1,t
(a)               (b)

fig. 3 Optimal machine code sequences

CHOICE OF EVALUATION ORDER

The order in which computations are performed can affect the efficiency of the target
code. Some computation orders require fewer registers to hold intermediate results than others.
Picking a best order is another difficult, NP-complete problem. Initially, we shall avoid the
problem by generating code for the three-address statements in the order in which they have been
produced by the intermediate code generator.

APPROACHES TO CODE GENERATION


The most important criterion for a code generator is that it produce correct code.
Correctness takes on special significance because of the number of special cases that a code
generator must face. Given the premium on correctness, designing a code generator so it can be
easily implemented, tested, and maintained is an important design goal.
3) What are basic blocks and flow graphs?
BASIC BLOCKS AND FLOW GRAPHS
A graph representation of three-address statements, called a flow graph, is useful for
understanding code-generation algorithms, even if the graph is not explicitly constructed by a
code-generation algorithm. Nodes in the flow graph represent computations, and the edges represent
the flow of control. The flow graph of a program can be used as a vehicle to collect information
about the intermediate program. Some register-assignment algorithms use flow graphs to find the
inner loops where a program is expected to spend most of its time.
BASIC BLOCKS
A basic block is a sequence of consecutive statements in which flow of control enters
at the beginning and leaves at the end without halt or possibility of branching except at the end.
The following sequence of three-address statements forms a basic block:
t1 := a * a
t2 := a * b
t3 := 2 * t2
t4 := t1 + t3
t5 := b * b
t6 := t4 + t5
A three-address statement x := y + z is said to define x and to use y and z. A name in a basic
block is said to be live at a given point if its value is used after that point in the program,
perhaps in another basic block.
The following algorithm can be used to partition a sequence of three-address statements into basic
blocks.
Algorithm 1: Partition into basic blocks.
Input: A sequence of three-address statements.
Output: A list of basic blocks with each three-address statement in exactly one block.
Method:
1. We first determine the set of leaders, the first statements of basic blocks.
The rules we use are the following:
I) The first statement is a leader.
II) Any statement that is the target of a conditional or unconditional goto is a leader.
III) Any statement that immediately follows a goto or conditional goto statement is a leader.
2. For each leader, its basic block consists of the leader and all statements up to but not
including the next leader or the end of the program.
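Algorithm 1 can be sketched in Python as follows. The representation is an assumption made for illustration: each statement is a pair (text, target), where target is the 0-based index of the statement a goto jumps to, or None for a statement that is not a goto. The function name partition_into_blocks is hypothetical. The input is the dot-product three-address code used in the example of these notes.

```python
def partition_into_blocks(statements):
    """Split a list of (text, target) statements into basic blocks."""
    if not statements:
        return []
    leaders = {0}                                  # rule I: first statement
    for index, (_, target) in enumerate(statements):
        if target is not None:
            leaders.add(target)                    # rule II: target of a goto
            if index + 1 < len(statements):
                leaders.add(index + 1)             # rule III: follows a goto
    # Each block runs from one leader up to, but not including, the next.
    bounds = sorted(leaders) + [len(statements)]
    return [[text for text, _ in statements[start:end]]
            for start, end in zip(bounds, bounds[1:])]


dot_product = [
    ("prod := 0", None), ("i := 1", None),
    ("t1 := 4 * i", None), ("t2 := a[t1]", None),
    ("t3 := 4 * i", None), ("t4 := b[t3]", None),
    ("t5 := t2 * t4", None), ("t6 := prod + t5", None),
    ("prod := t6", None), ("t7 := i + 1", None),
    ("i := t7", None), ("if i <= 20 goto (3)", 2),  # statement (3) has index 2
]
blocks = partition_into_blocks(dot_product)
for number, block in enumerate(blocks, start=1):
    print(f"B{number}: {block}")
```

Because the fragment ends at the conditional goto, rule III adds no further leader here, and the result is the two blocks described in the text.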
Example 3: Consider the fragment of source code shown in fig. 7; it computes the dot product of two
vectors a and b of length 20. A list of three-address statements performing this computation on our
target machine is shown in fig. 8.
begin
    prod := 0;
    i := 1;
    do begin
        prod := prod + a[i] * b[i];
        i := i + 1;
    end
    while i <= 20
end
fig 7: program to compute dot product
Let us apply Algorithm 1 to the three-address code in fig. 8 to determine its basic
blocks. Statement (1) is a leader by rule (I) and statement (3) is a leader by rule (II), since the
last statement can jump to it. By rule (III) the statement following (12) is a leader. Therefore,
statements (1) and (2) form a basic block. The remainder of the program beginning with statement
(3) forms a second basic block.
(1) prod := 0
(2) i := 1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)

fig 8. Three-address code computing dot product



4) What are the structure preserving transformations on basic blocks?
TRANSFORMATIONS ON BASIC BLOCKS
A basic block computes a set of expressions. These expressions are the values of the
names live on exit from the block. Two basic blocks are said to be equivalent if they compute the
same set of expressions.
A number of transformations can be applied to a basic block without changing the
set of expressions computed by the block. Many of these transformations are useful for improving
the quality of code that will be ultimately generated from a basic block. There are two important
classes of local transformations that can be applied to basic blocks; these are the
structure-preserving transformations and the algebraic transformations.
STRUCTURE-PRESERVING TRANSFORMATIONS
The primary structure-preserving transformations on basic blocks are:
1. common sub-expression elimination
2. dead-code elimination
3. renaming of temporary variables
4. interchange of two independent adjacent statements
We assume basic blocks have no arrays, pointers, or procedure calls.
1. Common sub-expression elimination
Consider the basic block
a := b + c
b := a - d
c := b + c
d := a - d
The second and fourth statements compute the same expression,
namely b + c - d, and hence this basic block may be transformed into the equivalent block
a := b + c
b := a - d
c := b + c
d := b
Although the 1st and 3rd statements in both cases appear to have the same expression on
the right, the second statement redefines b. Therefore, the value of b in the 3rd statement
is different from the value of b in the 1st, and the 1st and 3rd statements do not compute the
same expression.
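A minimal Python sketch of this transformation follows. It is not the method of these notes but one simple way to obtain the same result: every name carries a version counter that increases on each assignment, so two right-hand sides match only when their operands have the same versions. The function name and the (x, y, op, z) tuple format are assumptions, and commutativity (b + c versus c + b) is deliberately ignored.

```python
def eliminate_common_subexpressions(block):
    """Replace a recomputed expression by a copy of the name that holds it."""
    version = {}     # name -> number of assignments seen so far
    available = {}   # (y, y_version, op, z, z_version) -> (holder, holder_version)
    result = []
    for x, y, op, z in block:
        key = (y, version.get(y, 0), op, z, version.get(z, 0))
        hit = available.get(key)
        if hit is not None and version.get(hit[0], 0) == hit[1]:
            # Same operands, same versions, and the holder is still intact.
            result.append(f"{x} := {hit[0]}")
            version[x] = version.get(x, 0) + 1
        else:
            result.append(f"{x} := {y} {op} {z}")
            version[x] = version.get(x, 0) + 1
            available[key] = (x, version[x])
    return result


block = [("a", "b", "+", "c"),
         ("b", "a", "-", "d"),
         ("c", "b", "+", "c"),
         ("d", "a", "-", "d")]
optimized = eliminate_common_subexpressions(block)
print(optimized)  # ['a := b + c', 'b := a - d', 'c := b + c', 'd := b']
```

The third statement is left alone because b has version 1 there but version 0 in the first statement, which is exactly the redefinition argument made above.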
2. Dead-code elimination
Suppose x is dead, that is, never subsequently used, at the point where the statement
x := y + z appears in a basic block. Then this statement may be safely removed without
changing the value of the basic block.
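Dead-code elimination inside one block can be sketched with a single backward scan, assuming the set of names live on exit from the block is supplied from outside. The function name, the (x, y, op, z) tuple format, and the example block are hypothetical.

```python
def eliminate_dead_code(block, live_on_exit):
    """Remove statements whose target is dead at the point of assignment."""
    live = set(live_on_exit)
    kept = []
    for x, y, op, z in reversed(block):
        if x not in live:
            continue                 # x is never used afterwards: drop it
        live.discard(x)              # x is redefined here ...
        # ... and its operands become live (constants such as "2" are skipped)
        live.update(name for name in (y, z) if name.isidentifier())
        kept.append((x, y, op, z))
    kept.reverse()
    return kept


block = [("x", "y", "+", "z"),   # x is never used below and is not live on exit
         ("t", "a", "*", "b"),
         ("u", "t", "+", "c")]
print(eliminate_dead_code(block, live_on_exit={"u"}))
# [('t', 'a', '*', 'b'), ('u', 't', '+', 'c')]
```

Scanning backwards lets one removal expose another: once a dead statement is dropped, its operands are not marked live, so the statements that only fed it are removed in the same pass.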
3. Renaming temporary variables
Suppose we have a statement t := b + c, where t is a temporary. If we change this statement
to u := b + c, where u is a new temporary variable, and change all uses of this instance of
t to u, then the value of the basic block is not changed. In fact, we can always transform
a basic block into an equivalent block in which each statement that defines a temporary
defines a new temporary. We call such a basic block a normal-form block.
4. Interchange of statements
Suppose we have a block with the two adjacent statements
t1 := b + c
t2 := x + y
Then we can interchange the two statements without affecting the value of the block if and
only if neither x nor y is t1 and neither b nor c is t2. A normal-form basic block permits all
statement interchanges that are possible.
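The interchange condition can be written as a small predicate. The function name and the (x, y, op, z) tuple format are assumptions. The extra test that the two statements define different names is an addition for safety; in the example above it holds automatically because t1 and t2 are distinct.

```python
def can_interchange(first, second):
    """True if two adjacent statements (x, y, op, z) may be swapped."""
    target1, left1, _, right1 = first
    target2, left2, _, right2 = second
    return (target1 not in (left2, right2)       # second does not use t1
            and target2 not in (left1, right1)   # first does not use t2
            and target1 != target2)              # added: different targets


print(can_interchange(("t1", "b", "+", "c"), ("t2", "x", "+", "y")))   # True
print(can_interchange(("t1", "b", "+", "c"), ("t2", "t1", "+", "y")))  # False
print(can_interchange(("t1", "t2", "+", "c"), ("t2", "x", "+", "y")))  # False
```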

5) What are the instructions and address modes of the target machine?
The target machine characteristics are:
- Byte-addressable, 4 bytes/word, n registers
- Two-operand instructions of the form
      op source, destination
- Example opcodes: MOV, ADD, SUB, MULT
- Several addressing modes
- An instruction has an associated cost
- Cost corresponds to length of instruction
Addressing Modes & Extra Costs

6) Generate target code for the source language statement
(a-b) + (a-c) + (a-c);
The 3AC for this can be written as
t := a - b
u := a - c
v := t + u
d := v + u //d live at the end
Show the code sequence generated by the simple code generation algorithm
What is its cost? Can it be improved?

Total cost=12
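The generated code sequence itself did not survive in these notes. The Python sketch below shows one sequence that is consistent with the stated total of 12, under the assumption of the usual textbook cost model: each instruction costs 1, plus 1 for every memory operand, while register operands add nothing. The sequences, the helper names, and the suggested improvement are reconstructions for illustration, not the original answer.

```python
def is_register(operand):
    """Treat R0, R1, ... as registers; everything else is a memory operand."""
    return operand.startswith("R") and operand[1:].isdigit()


def instruction_cost(instruction):
    """Assumed model: 1 for the instruction word + 1 per memory operand."""
    _, operands = instruction.split(None, 1)
    return 1 + sum(not is_register(op.strip()) for op in operands.split(","))


def total_cost(code):
    return sum(instruction_cost(instruction) for instruction in code)


simple = [
    "MOV a,R0",    # t := a - b
    "SUB b,R0",
    "MOV a,R1",    # u := a - c
    "SUB c,R1",
    "ADD R1,R0",   # v := t + u
    "ADD R1,R0",   # d := v + u
    "MOV R0,d",    # d is live at the end, so it is stored
]
improved = [
    "MOV a,R0",
    "MOV R0,R1",   # copy a between registers instead of reloading it
    "SUB b,R0",
    "SUB c,R1",
    "ADD R1,R0",
    "ADD R1,R0",
    "MOV R0,d",
]
print(total_cost(simple))    # 12
print(total_cost(improved))  # 11
```

Under these assumptions, one possible improvement is to load a from memory only once, which saves one unit of cost.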

7) What is an activation record?
Information needed by a single execution of a procedure is managed using a contiguous
block of storage called an activation record or frame. It is customary to push the
activation record of a procedure on the runtime stack when the procedure is called and
to pop the activation record off the stack when control returns to the caller.
8) What are the contents of activation record?
