BSTJ 60: 2. February 1981: Compiling Three-Address Code for C Programs. (Reiser, J.F.)
Compiling Three-Address Code for C Programs

By J. F. REISER

(Manuscript received February 27, 1980)

This paper describes a postprocessor that improves the assembly language code generated by the portable C compiler. The novel ability to change a sequence of two-address instructions into an equivalent three-address instruction distinguishes this particular code improver from other "peephole" improvers. The combined compiler-improver generates good three-address code for the Digital Equipment Corporation VAX-11* computer without requiring extensive changes in the compiler itself, which was designed to accommodate machine architectures with at most two addresses per instruction. For typical programs the improver reduces the number of bytes in the instruction stream by 10 to 23 percent. This paper emphasizes the technique used to transform two-address code into three-address code.

I. INTRODUCTION

The portable C compiler is an effective tool for quickly constructing a C compiler for a general purpose digital computer. With reasonable effort the resulting compiler generates correct code, and the quality of the translation into assembly language is acceptable. However, users frequently demand better code if they anticipate prolonged or extensive use of programs written for a particular application. A postprocessor that reads the assembly language generated by the compiler and writes better assembly language having the equivalent effect can satisfy much of the demand.
(Here "better" code requires fewer bytes for instructions or less time to execute, or both.) This paper describes a program that improves code generated for the Digital Equipment Corporation VAX-11 computer, paying particular attention to the technique used to transform two-address codes into three-address codes.

One reason why a code improver can be effective is that the portable C compiler often generates code in the easiest possible correct manner, even if such code is suboptimal over a wide range of machines. The compiler expects that a postprocessor will clean up after it. For example, the compiler translates the C program fragment

    while (...) {
        if (b > 0) break;
    }

as if it were written

    while (...) {
        if (b <= 0) goto L100;
        goto L101;
    L100: ;
    }
    L101: ;

which contains a conditional jump around an unconditional jump. It should not be difficult to compile the original fragment as if it were

    while (...) {
        if (b > 0) goto L101;
    }
    L101: ;

but the compiler does not do this, so one of the standard tasks for a code improver is to replace "skips over jumps" with jumps on the negated conditions.

Another reason that a code improver can produce better code is that the compiler's model of code generation may ignore or not take full advantage of architectural features found on a specific machine. The portable C compiler understands one-address instructions and two-address instructions but does not understand three-address instructions or instructions which use an address as an immediate operand. Similarly, the compiler thrives on certain addressing modes (external, pointer, displacement from a register base) and has difficulty fully exploiting others (auto increment, double indexing).

A code improver can also be effective because C language statements or compilation on a statement-by-statement basis may be too low level. The concept "turn off bit 26" may have a direct hardware implementation, but must be expressed in C language as a Boolean AND operation.
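The "skips over jumps" replacement described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the `struct insn` record, the `enum op` values, and the function names `negate`, `is_cond`, and `fix_skips` are hypothetical, and a flat array stands in for the improver's linked list of instructions.

```c
/* Hypothetical instruction record, used for illustration only. */
enum op { JEQL, JNEQ, JLEQ, JGTR, JBR, LABEL, OTHER };

struct insn {
    enum op op;
    int target;   /* label number for jumps and LABEL; unused otherwise */
};

/* Return the conditional jump that tests the opposite condition. */
static enum op negate(enum op op)
{
    switch (op) {
    case JEQL: return JNEQ;
    case JNEQ: return JEQL;
    case JLEQ: return JGTR;
    case JGTR: return JLEQ;
    default:   return op;
    }
}

static int is_cond(enum op op)
{
    return op == JEQL || op == JNEQ || op == JLEQ || op == JGTR;
}

/* Rewrite   jcc L1 ; jbr L2 ; L1:   as   jncc L2 ; L1:
 * in place and return the new instruction count.  The label L1 is
 * kept; removing it if it becomes unreferenced is a separate step. */
static int fix_skips(struct insn *code, int n)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        if (i + 2 < n && is_cond(code[i].op) &&
            code[i + 1].op == JBR &&
            code[i + 2].op == LABEL &&
            code[i + 2].target == code[i].target) {
            code[out].op = negate(code[i].op);
            code[out].target = code[i + 1].target;
            out++;
            i++;          /* drop the jbr; the label is copied next */
        } else {
            code[out++] = code[i];
        }
    }
    return out;
}
```

Applied to the sequence `jleq L100; jbr L101; L100:`, the sketch leaves `jgtr L101; L100:`, which corresponds to the third form of the loop shown above.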
The portable C compiler attempts no analysis of interstatement information flow, nor does it always take advantage of hardware idioms. A code improver can often perform some flow analysis and recognize more hardware idioms.

The idea of a code improver is not new. "Peephole optimizers" are well known. One C compiler for the PDP-11 computer has had a code improver for many years. The section of Ref. 3 on the FINAL compilation pass describes a code improver used internally by a BLISS-11 compiler.

The code improver described here makes the portable C compiler usable as the workhorse compiler in a serious production environment. Measurements indicate that for typical programs the improver reduces the number of bytes in the instruction stream by 10 to 23 percent; the novel technique reported here accounts for as much as one-half of the reduction. The time required to execute the code is also reduced by 4 to 8 percent. The improver produces good three-address code from the two-address code generated by the compiler.

II. IMPROVING CODE FOR THE VAX

An existing improver of code compiled for the PDP-11 served as a model and outline for the VAX-11 code improver. The improver reads a file of assembly language and divides the file into segments corresponding to C procedures. For each procedure it constructs a doubly-linked list of the instructions and label definitions, with additional links for references to labels. The improver then combs the list, repeatedly trying to apply any one of several incremental transformations. The transformations satisfy a principle of optimality: any local improvement is guaranteed to be a global improvement at least as large, and conversely, if the program as a whole can be made smaller or faster, then there is a collection of local changes which will account for the improvement.
When no further transformation can be made, the improver prints the list and moves on to the next procedure. Many of the transformations depend little on the particular machine. Straightforward adaptation of the old program yielded code to transitively close jumps to jumps, delete instructions that immediately follow unconditional jumps, delete jumps to the immediately following instruction, remove unreferenced or redundant labels, merge common tail sequences, move basic blocks to the point of sole use, and interchange the physical order of the consequent and alternative to a test. Simple modifications also produced a program to rotate loops to place a single conditional jump at the bottom, handle skips over jumps, eliminate redundant setting of the condition code, move common antecedents of jumps into the merged tail, eliminate constant tests or tests which are subsumed by a preceding test, exploit add-compare-branch ("DO-loop") instructions, and remember values already in a register.

* Trademark of Digital Equipment Corporation.

III. THREE ADDRESSES FROM TWO

Fully utilizing the three-address instructions available on the VAX-11 presented a new challenge. Table I illustrates a common opportunity to use a three-address instruction. In this example the variables a, b, c are assumed to reside in memory (either global or local) and not in registers. The first column gives a translation for the PDP-11 that cannot be improved in either time or space. (If some of the variables reside in registers, then improvements are possible.) Both the production and the portable C compiler for the PDP-11 produce this translation without the aid of a code improver. The second column contains the code generated by the portable C compiler for the VAX. The compiler saves one instruction by doing the work of the first two PDP-11 instructions in one three-address VAX-11 instruction.

[Table I: Translations of a = b + c]
However, it will not generate the code in the rightmost column, where a single instruction suffices for the whole statement. Internally the portable C compiler uses a binary tree to represent each parsed statement. The height of a binary tree with three external nodes (each explicit variable is represented by an external node) must be at least two. Furthermore, the pattern-matching algorithms used by the compiler are restricted to subtrees of height one. (The pattern matcher has since been generalized to match subtrees of arbitrary height.) Thus the compiler generates two separate instructions for this case. It does have the flexibility to use an instruction with three addresses, but the destination operand of a three-address instruction must always be one of the compiler's temporary locations, usually a register. The challenge to the code improver is to recognize situations like this one and change the code appropriately.

Table II illustrates a complication. Here the addition and assignment are embedded in an expression whose value is passed as an actual argument in a procedure call. Although the same addl3 and movl instructions appear together, the value in r0 is needed later and r0 cannot be elided. In standard terminology, the value in register r0 is live, or alternatively register r0 is busy. The improver can elide a register only when the value in the register is known to be dead, or the register is free.

For an arbitrary program, determining which registers are free at a given point requires a fair amount of work. The register usage and flow of control through any part of the program can affect whether or not a register is busy in any other part of the program. Code generated by the portable C compiler has a property that makes busy/free analysis much simpler. All registers are free any time the compiler generates a backward branch instruction.
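The two-into-one rewrite discussed above, and the condition under which it is allowed, might look roughly like the following. This is a minimal sketch and not the improver's actual code: the `struct insn` layout, the helper `is_reg`, and the function `fold_move` are hypothetical names, operands are plain strings, and the caller is assumed to supply the busy/free answer as `free_after`. Condition codes are ignored here.

```c
#include <string.h>

/* Hypothetical textual instruction record, for illustration only. */
struct insn {
    char op[8];         /* mnemonic, e.g. "addl3" or "movl"        */
    char a[16], b[16];  /* source operands (b is unused by movl)   */
    char dst[16];       /* destination operand                     */
    int  dead;          /* set to 1 when the instruction is elided */
};

/* A register operand is written r0, r1, ... in this sketch. */
static int is_reg(const char *s)
{
    return s[0] == 'r' && s[1] >= '0' && s[1] <= '9';
}

/* Try to fold   op3 x,y,rN ; movl rN,z   into   op3 x,y,z.
 * free_after must be nonzero only if rN is free (its value is dead)
 * after the movl.  Returns 1 if the pair was rewritten. */
static int fold_move(struct insn *first, struct insn *second,
                     int free_after)
{
    size_t len = strlen(first->op);

    if (len == 0 || first->op[len - 1] != '3')      /* three-address op? */
        return 0;
    if (strcmp(second->op, "movl") != 0)
        return 0;
    if (!is_reg(first->dst))                        /* temporary register */
        return 0;
    if (strcmp(first->dst, second->a) != 0)         /* movl reads it      */
        return 0;
    if (strstr(second->dst, first->dst) != NULL)    /* e.g. (r0): address */
        return 0;                                   /* still needs rN     */
    if (!free_after)                                /* value needed later */
        return 0;

    strcpy(first->dst, second->dst);
    second->dead = 1;
    return 1;
}
```

With `free_after` nonzero, the pair `addl3 b,c,r0; movl r0,a` becomes the single instruction `addl3 b,c,a`. With `free_after` zero, as when the value is also passed to a procedure, both instructions are left alone.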
The portable C compiler generates code on line, completely translating the current expression or statement before proceeding to the following expression or statement. The use of a temporary expression always occurs physically after its generation. Thus the entire busy/free analysis can be done in a single backward scan over the generated code. The backward scan marks a register busy each time the register is read or used as a source operand. Some instruction occurring closer to the front of the file must have put a live value into the register, or else the register would contain garbage. Analogously, the backward scan marks a register free each time the register is written or used as a destination operand. Since the write destroys whatever used to be in the register, no one could have wanted that dead value.

The backward scan must take precautions to record each use of a temporary register, including the implicit uses. The return instruction ret implicitly reads r0, the register in which C code returns function values. Thus r0 is busy just before each ret. The overall code-generation strategy of the compiler assures that each procedure call instruction calls writes all the temporary registers. Thus all the temporary registers are free just before a procedure call.

The busy/free information can also be used to eliminate dead code. An instruction that writes only into free registers does no useful work, except possibly for the side effects it causes. If the address computations contain no side effects, then only the condition code could matter.
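For straight-line code, the backward scan described above can be sketched as follows. The sketch rests on several assumptions that are not taken from the paper: the `struct insn` layout, the names `backward_scan` and `is_dead`, the use of one bit per register, and the choice of r0 through r5 as the temporary registers. Labels and jumps are left out.

```c
#define R(n) (1u << (n))        /* bit n stands for register rn */

/* Assumed set of compiler temporaries (an assumption of this sketch). */
#define TEMPS (R(0) | R(1) | R(2) | R(3) | R(4) | R(5))

enum kind { PLAIN, CALL, RET };

struct insn {
    enum kind kind;
    unsigned reads;       /* registers used as source operands      */
    unsigned writes;      /* registers used as destination operands */
    unsigned busy_after;  /* filled in by backward_scan()           */
};

/* One pass from the last instruction to the first.  busy holds the
 * registers whose values are still wanted at the current point. */
static void backward_scan(struct insn *code, int n)
{
    unsigned busy = 0;

    for (int i = n - 1; i >= 0; i--) {
        code[i].busy_after = busy;
        switch (code[i].kind) {
        case RET:               /* ret implicitly reads r0 */
            busy = R(0);
            break;
        case CALL:              /* a call writes every temporary */
            busy &= ~TEMPS;
            break;
        case PLAIN:             /* a write frees, then a read marks busy */
            busy = (busy & ~code[i].writes) | code[i].reads;
            break;
        }
    }
}

/* An instruction that writes only free registers is a candidate for
 * deletion, provided it has no side effects and the condition code
 * is not needed (neither is checked in this sketch). */
static int is_dead(const struct insn *p)
{
    return p->kind == PLAIN && p->writes != 0 &&
           (p->writes & p->busy_after) == 0;
}
```

For the pair `addl3 b,c,r0; movl r0,a` with nothing else reading r0, the scan reports r0 free after the movl. If an instruction that reads r0 follows, r0 is reported busy after the movl instead.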
The condition code is set by each nonbranch instruction, and the condition code itself is free unless the instruction which logically follows is a conditional branch.

The backward scan must also be careful with code generated from conditional expressions. There can be no busy register at the time of a backward jump, as noted earlier. Since the compiler performs no interstatement data flow analysis (and in particular does not recognize common subexpressions), there can be no busy registers at the time of a forward jump generated from an entire C statement. Since labels exist only because jump instructions branch to them, these two facts might suggest that a register cannot be busy at any label, either. A register can, however, be busy at a forward jump (and thus at a label) with one of the values of a conditional expression. Table III illustrates one such situation.

[Table II: Translations of f(a = b + c)]

Even though the instruction movl c,r0 writes r0, the register is busy at the jbr because (if a is true) it contains the value of b to be stored into x. Thus the busy/free status of each register must be associated with each label as the label is passed during the backward scan, and retrieved from the corresponding label at each jump. This can be done efficiently by keeping a bit vector associated with each label, initializing all the bits to "free," and recording busy registers as labels are passed. Because backward jumps have no busy registers and the backward scan encounters the destination label of a forward jump before seeing the jump itself, the bits will always be correct.

In general the code improvements other than insertion of three-address instructions and elimination of dead code by consulting the busy/free information destroy the property that no temporary register is busy at a backward jump.
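The per-label bit vectors described above can be added to a backward scan along the following lines. Again this is a hedged sketch: `struct insn`, `scan_with_labels`, `MAXLABEL`, and the small-integer label numbering are hypothetical, and the treatment of a conditional jump as the union of the fall-through and target sets is an assumption of this sketch rather than a detail stated in the text.

```c
#define R(n) (1u << (n))   /* bit n stands for register rn */
#define MAXLABEL 16        /* labels are small integers in this sketch */

enum kind { PLAIN, LABEL, JUMP, CJUMP };

struct insn {
    enum kind kind;
    unsigned reads, writes;  /* register sets, PLAIN only             */
    int label;               /* label defined (LABEL) or targeted     */
    unsigned busy_before;    /* filled in by scan_with_labels()       */
};

static void scan_with_labels(struct insn *code, int n)
{
    unsigned label_busy[MAXLABEL] = {0};  /* every bit starts "free" */
    unsigned busy = 0;

    for (int i = n - 1; i >= 0; i--) {
        struct insn *p = &code[i];
        switch (p->kind) {
        case LABEL:   /* record what is busy as the label is passed */
            label_busy[p->label] = busy;
            break;
        case JUMP:    /* control continues only at the target */
            busy = label_busy[p->label];
            break;
        case CJUMP:   /* fall-through path or target path */
            busy |= label_busy[p->label];
            break;
        case PLAIN:
            busy = (busy & ~p->writes) | p->reads;
            break;
        }
        p->busy_before = busy;
    }
}
```

Take an illustrative translation of `x = a ? b : c` (not a reproduction of Table III): `tstl a; jeql L1; movl b,r0; jbr L2; L1: movl c,r0; L2: movl r0,x`. The scan finds r0 busy at the `jbr` even though the physically following instruction overwrites r0. For a backward jump the target label has not been passed yet, so its bits are still all "free," which matches the property the text relies on.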
This implies that using a single backward sweep over the code for the entire procedure to determine busy/free is valid only once, at the beginning before other improvements are tried. Fortunately, once is enough.

IV. OTHER USES OF THE BACKWARD SCAN

The backward prescan is also a good time to recognize hardware idioms. The VAX-11 has a number of instructions to set, clear, and test single bits, and to extract contiguous bit fields of arbitrary size. Appropriate uses of these instructions are often concealed in C with various Boolean or shift-and-mask operators or sequences of operators. Computing with the addressing modes by using instructions in which an address is used as an immediate operand often saves time and space. Powerful addressing modes often depend heavily on register usage, and the backward pass is already computing this information. Since the backward scan is performed only once, time will not be wasted searching for hardware idioms more than once, as part of the general iterative improvement strategy. Table IV gives some example improvements.

V. DEVICE DRIVERS

On the VAX-11, the control and data registers for input/output devices lie in the memory address space. Programs manipulate the registers in much the same way as they manipulate memory, and the assembly-language code for a device driver cannot be identified solely by its form. However, certain instructions and addressing modes do not work properly when addressed to device registers. Generally these are exactly the instructions and addressing modes that the code improver wants to introduce. For example, neither of the first two improvements in Table IV is legal on a device register. Thus the code improver must be told when it is improving the code for a device driver, so it can avoid those improvements that cause problems.
Reading or writing a device register typically has side effects that are different from reading or writing a memory location, and other hardware considerations such as bus widths, circuit board area, and number of words of microcode are often important. Yet from a software viewpoint such special cases are irritating and error prone, and it would be desirable to get rid of the complication.

VI. CONCLUSIONS

A single backward scan enables the code improver to determine register usage and introduce three-address instructions where appropriate. The backward scan takes advantage of the fact that all registers are free at each backward jump, a property that would otherwise be considered a weakness in the compiler. The single backward scan also recognizes hardware idioms at a lower cost than previous algorithms.

[Table IV: Improvements using VAX-11 hardware idioms]

REFERENCES

1. W. M. McKeeman, "Peephole Optimization," Comm. ACM, July 1965.
2. A. V. Aho and J. D. Ullman, Principles of Compiler Design, Reading, Mass.: Addison-Wesley.
3. W. A. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs, and C. M. Geschke, The Design of an Optimizing Compiler, New York: American Elsevier, 1975.