In the second approach, all relocation information is grouped into a relocation table (dictionary) by the translator. The relocation table contains a pointer to each machine language instruction that must have its address field relocated.

It should be noted that in most systems the total number of times an address field is relocated must be zero or one. The reason is that, in the case of the relocation bit approach, the translator introduces only one additional bit (two states) to indicate "relocation" or "no relocation." The restriction that at most one relocation can be associated with an address field could, if it were meaningful, be changed to "at most n relocations" by expanding the relocation bit from one bit to log2(n + 1) bits. A similar reasoning applies in the case of a relocation table.

The relocating loader is responsible for loading into main memory a program in relocatable binary form and updating (relocating) all relative addresses. Note that when a relocating loader is used, the allocation of memory to a given program will remain bound (i.e., fixed) for the duration of that program's execution.

3. LINKERS

The linking† of subprograms together to form a composite program is of great value in the modular development of software. To place the linking function in perspective, let us view Figure 1 as a function of time. Source program coding must be performed first, translation second, loading third, and execution fourth. Linking, however, could be carried out at seven different times, namely: 1) at source program coding time; 2) after coding but before translation time; 3) at translation time; 4) after translation but before loading time; 5) at loading time; 6) after loading but before execution time; or 7) at execution time.

† The linking process has also been called binding (Burroughs), collecting (UNIVAC), and building (IBM 1800).

At this point it is worthwhile to introduce the concepts of physical and logical (virtual, name) address space [6, 7]. The physical space consists of the set of main memory locations where information may be stored. The logical space consists of the set of identifiers that may be used by a program to reference information. The translation or mapping of a logical into a physical address is called address binding.

The linking process may be viewed as binding (combining) independent logical spaces into one composite logical space. In essence, the binding of a set of subprograms in logical space is equivalent to fixing their positions relative to a common base. Let us now discuss the seven cases listed above in more detail.

To link at or before translation time implies a separate translation for each different combination of subprograms. This represents an important drawback. The IBM 1401 Autocoder is an example of a system that performed linkage at translation time.

The approach of linking after translation but before loading time is employed in the IBM System/360 (Linkage Editor) and the UNIVAC 9400. In the basic case (refer to Figure 2) the input to the linker consists of one or more subprograms in binary symbolic form. (These subprograms can come from secondary storage and/or directly from translation.) This form is similar to relocatable binary except that an additional table (dictionary) is included with each subprogram presented to the linker to indicate the definition and use of external symbols [8]. External symbols are symbols that are declared to be "public" and, as a result, can be referenced by other programs. External symbols are the primary facility through which independently translated subprograms communicate. (External symbols are discussed in detail in Section 4.1.1.) It is the responsibility of the linker to combine the input subprograms into a single (relocatable) output module in which all external references have been resolved, if the module is to be loaded for execution.

[Figure 2. Independently translated subprograms (from translation or from secondary storage) enter the LINKER, which produces one relocatable module composed of independently translated subprograms; the LOADER then places one module in main memory, ready for execution.]

Carrying out the linking process after translation but before loading time (or later) allows the composition of a set of linked subprograms to change without forcing retranslation of the entire collection;
only linking has to be performed anew. The output of the linker can be supplied to the loader for immediate loading or it can be stored in secondary memory for future linking and/or loading. This alternative provides for a flexible system. Furthermore, the linker represents a natural base for the incorporation of subprogram editing facilities.

Linking could be performed at loading time (i.e., by a linking loader) as in many existent systems (e.g., Loader in IBM System/360, XDS-UTS, CDC-SCOPE, and SEL 810B-BOS). The popularity of linking loaders is a result of their simplicity, since the loading stage is a natural place to bind subprograms together. On the other hand, since linking implies loading, there is less flexibility than when the two functions are implemented as separate modules, as in the case discussed earlier. The input to a linking loader consists of one or more subprograms in binary symbolic form. The output consists of one module in main memory ready for execution. Again, all external references must be resolved before execution.

Linking after loading but before execution (case 6) would only be logical if we could keep a very large number of subprograms resident in main memory. In today's systems main memory is a scarce resource; thus, this alternative is not attractive.

Finally, it is possible to link (bind) at execution time. Such an approach is called segmentation. A segment is a self-contained logical entity of related information defined and named by the programmer (e.g., subprogram, data array). All intersegment references are achieved through symbolic names that are resolved at execution time. The most general implementation of this concept is that embodied in the Honeywell 645 MULTICS system (formerly the GE-645). For an excellent discussion of segmentation as well as the MULTICS system the reader is referred to [7]. Briefly, from the linking point of view the principal advantages of segmentation include: the binding of segments to the composite logical space only when required; possible segment growth during execution, since segmentation allows the management of logical space; and sharing (of segments) in its most general form, since the relative position of a shared segment in one logical space is independent of its position in other logical spaces. The main problems with segmentation include: overhead costs, since execution time binding is less efficient although more flexible than earlier binding; and the importance of an integrated (hardware and software) design.

To treat linking and loading in more detail we discuss the implementation of these functions on the IBM System/360. The System/360 Operating System (OS) provides two alternatives. On one hand there exists a linking loader referred to as Loader; on the other hand there exists a sophisticated linker, called the Linkage Editor, and a simple relocating loader referred to as Program Fetch. The linking power and flexibility of Loader are a subset of those provided by the Linkage Editor, so only the function and structure of the Linkage Editor and Program Fetch are examined in detail. The case in question is that outlined in Figure 2. There is an additional facility (LINK) provided in the System/360 OS which allows two separate logical spaces to communicate at execution time through general registers. This will not be discussed here. The objective of the remaining sections of this paper is to illustrate basic implementation techniques and to point out system trade-offs. (Terms that refer to the IBM System/360 are capitalized throughout.)

[...] of the machine language code for the translated program, relocation information, and a table indicating the definition and use of external symbols. As a result, Object Modules correspond to the binary symbolic form of a program that was discussed in Section 3. Each module (see Figure 3) is divided into the following four sections [9]: External Symbol Dictionary, Text, Relocation Dictionary, and End Record.
4.1.1 External Symbol Dictionary (ESD)
[...] memory. Consequently, the relocation table is of maximum length.

In the System/360, the size of the relocation table is greatly reduced because the system architecture utilizes a base register approach in calculating the effective address. The effective address is formed by always adding the contents of a base register to the contents of the instruction memory address field. (When indexing is specified, the contents of an additional index register are added in when forming the effective address.) Therefore, the address portion of machine language instructions that reference memory can be represented by the ordered pair: (Base Register, Displacement). The first element of the pair indicates which one of the 16 general-purpose registers is being used as a base register; the second element is a displacement (in bytes) from the address contained in the base register. The hardware calculates the effective memory address at execution time by adding the displacement field to the contents of the appropriate base register (and, if specified, the contents of another general-purpose register, an index, are also added).

It is the responsibility of the language translator to place the proper base register and displacement in the address portion of the machine language instructions being generated. In the System/360 Assembly Language, the USING pseudo-operation exists to inform the Assembler: 1) which of the sixteen general registers is to be used as a base register, and 2) the relative address that will be in the base register at execution time [11]. These two pieces of information enable the Assembler to correctly build the address portion of each memory reference instruction. When writing in 360 Assembly Language the programmer is responsible for including instructions in his program that at execution time will load the base registers with the appropriate addresses. As a result of this organization, the address fields of machine language instructions in the System/360 do not have to be modified when a module is loaded. In effect, the necessary relocation occurs at execution time when the hardware adds the displacement to the contents of the base register to obtain the effective memory address.

This discussion illustrates some important trade-offs that exist at system design time. On the one hand there is the trade-off between hardware and software as far as contributions to the relocation function are concerned (e.g., base registers). On the other hand there is the trade-off between various software modules. For example, the loader in the System/360 has shifted some of its responsibilities to the language translators, which now have to output addresses in a base plus displacement format. Moreover, note that with a base register approach the binding of a memory address is not completed until the last possible moment: execution time.

In the System/360 approach the only parts of the Text that require relocation are those entries that represent Address Constants. It is sufficient for us to think of an Address Constant as simply a cell that will contain an absolute memory address at execution time. Address Constants are used primarily for: 1) initializing base registers, and 2) communicating between Control Sections.

In discussing Address Constants, we must distinguish between: 1) the cell (Text entry) that contains the constant, and 2) the value of the constant (i.e., the contents of the cell).

In the 360 Assembly Language, Address Constants are normally established with a DC (Define Constant) pseudo-instruction. For example,

    JW   DC   A(LP)

defines a cell labeled JW which at execution time will contain the actual memory address of the cell labeled LP. The Text entry that contains the Address Constant cannot be completed at translation time since the address of the symbol specified in the reference field of the DC instruction will not be known until the corresponding module is loaded. Therefore, Address Constants must be completed (relocated) by the loader.

There are two principal kinds of Address Constants that require relocation: A-type and V-type [9].
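As a rough illustration of the two ideas above, the following Python sketch (not System/360 code) models the formation of an effective address from a (Base Register, Displacement) pair and the one-time completion of an Address Constant by a loader. The register number, displacement, cell offsets, and load origins are hypothetical values chosen only for the example, and the Text is modeled as a plain dictionary of cells.

```python
def effective_address(registers, base, displacement, index=None):
    """Add the displacement to the base register (and an index register, if given)."""
    address = registers[base] + displacement
    if index is not None:
        address += registers[index]
    return address


# Hypothetical address field of one instruction: base register 12, displacement 64.
address_field = (12, 64)

registers = [0] * 16            # the 16 general-purpose registers
registers[12] = 2000            # the program loads its base register at execution time
first = effective_address(registers, *address_field)

registers[12] = 5000            # same instruction, module placed elsewhere
second = effective_address(registers, *address_field)


def relocate_address_constant(text, cell, load_origin):
    """Complete an Address Constant: add the load origin to the relative value in the cell."""
    text[cell] += load_origin


# Hypothetical layout for "JW DC A(LP)": LP at relative address 120, JW at relative address 300.
text = {300: 120}               # cell JW holds the relative address of LP
relocate_address_constant(text, 300, load_origin=2000)
```

The instruction's address field is identical in both runs; only the contents of the base register differ. The Address Constant, by contrast, is a stored absolute address, so it has to be patched once the load origin is known.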
[...] USING) which base register is to be used.

3) Multiple base registers facilitate the sharing of programs and allow the splitting of programs for loading into noncontiguous areas of main memory.

In addition to the static relocation of Address Constants there are other features of the System/360 architecture that prevent the dynamic relocation of programs. Among these are the fact that working registers can contain absolute addresses (e.g., return addresses from Branch-And-Link instructions), and the fact that working and base registers are drawn from the same register set.

4.1.4 End Record

The End Record indicates to the Linkage Editor that the end of the Object Module has been reached.

With the structure of Object Modules in mind (Figure 3), we can now discuss in more detail the primary Linkage Editor function of linking together one or more Object Modules.

4.2 Linking Together a Set of Modules

In linking together a set of modules, the Linkage Editor is primarily responsible for [9]: 1) assigning addresses; 2) relocating Address Constants; and 3) creating an output module (called a Load Module).

4.2.1 Assigning Addresses

Each input Object Module may consist of one or more Control Sections. To produce a single loadable module, the Linkage Editor assigns consecutive relative addresses to each Control Section encountered. This is done by assigning an address of zero to the first Control Section and then assigning addresses relative to this origin to all other Control Sections. During this process the External Symbol Dictionaries of the input modules are merged together to form a Composite External Symbol Dictionary (CESD). The Assembled Origin fields of all SD, PC, and LD type entries are updated to reflect the new addresses that were assigned.

4.2.2 Relocating Address Constants

Once contiguous addresses have been assigned to the Control Sections in the input modules, all A-type and V-type Address Constants must be relocated relative to the beginning of the output module being created. This relocation is accomplished in the following manner [9]:

1) Every entry in the RLD for each individual input module is read. The R and P pointers are updated to point to the correct CESD entry. The Address field is updated by adding to it the contents of the Assembled Origin field of the CESD entry pointed to by the new P pointer. The Flag field is examined to determine the type of Address Constant represented by the RLD entry.

2) If the RLD entry represents a V-type Address Constant, then the constant corresponds to an external reference. This means that the symbol referenced is not defined in the input module containing the Address Constant, but is defined (it is hoped) in one of the other input modules being linked together. In the ESD of the input module the external reference is represented by an entry with Type set to ER. When the Composite External Symbol Dictionary is formed during the first step of linking, each ER type entry should be matched by an SD, LD, or CM type entry that has the same name field. External references that are matched in this way are said to be resolved since the referenced symbol corresponds to either a Control Section Name (SD type entry), an Entry Point Name (LD type entry), or a Common area (CM type entry). Only one entry (the SD, LD, or CM entry) is retained in the CESD. On the other hand, if there is no matching SD, LD, or CM entry, the ER entry is placed in the CESD and the external reference is said to be unresolved. Relocation of a V-type Address Constant is accomplished in the following manner. The constant is accessed through the Address field of the [...]
[...] Relocation Dictionary of an input module. The end of the Load Module is indicated by an EOM (End-Of-Module) record.

At this point let us discuss an example in detail.

4.4 Linking Example

In this example two Object Modules, each containing one Control Section, are linked together to form one output Load Module. The format of the first module is shown in Figure 7.

Object Module One has one named Control Section (CSECT A), one Entry Point (the statement labeled BILL), and one V-type Address Constant (DC V(B)). As shown in the External Symbol Dictionary, there are three external symbols: A (a Control Section name), BILL (an Entry Point), and B (an external reference). There is one entry in the RLD for the single (relocatable) Address Constant present in Object Module One. The entry indicates that the value of the constant depends on the address of the external symbol B and that the constant is defined in Control Section A at relative byte location 300. The value of the constant has been set to zero by the Assembler.

FIG. 7. Object Module One
  ESD (Name, Type, Assembled Origin, Length or ESD ID): A, SD, 0, 500; BILL, LD, 200; B (external reference).
  Text: CSECT A, 500 bytes, with ENTRY BILL.
  RLD (R, P, Flag, Address): one V-type entry with Address 300.

The format of the second input module is shown in Figure 8. There are two entries in the External Symbol Dictionary, one for the Control Section name B and one for the external reference BILL. As indicated in the Relocation Dictionary, there are two Address Constants, DC A(JOE) which is a local reference, and DC V(BILL) which is an external reference. The R and P pointers in the RLD entry for the local reference, DC A(JOE), are the same since the Address Constant is contained in Control Section B, and the value of the constant depends on the address assigned to Control Section B.

FIG. 8. Object Module Two
  ESD (Name, Type, Assembled Origin, Length or ESD ID): B (Control Section name), origin 0, length 300; BILL, ER.
  Text: CSECT B, 300 bytes, containing the label JOE, DC A(JOE) with value 100, and DC V(BILL) with value 0 (values of these constants set by the Assembler).
  RLD (Flag, Address): A-type, 200; V-type, 204.

As said, it is the responsibility of the language translator (e.g., compiler, assembler) to create the Object Module in the format defined. Thus, Object Module One and Object Module Two would have been produced by two previous and independent translation processes.

Processing by the Linkage Editor yields one output Load Module, the format of which is shown in Figure 9. As indicated in the Assembled Origin field of the Composite External Symbol Dictionary, Control Section B has been assigned an address relative to Control Section A. The R and P fields in the Relocation Dictionaries have been updated to point to the correct CESD entries, and the Address fields have been changed to reflect the new addresses assigned. The three Address Constants have been relocated relative to the start of the module, and the two constants that represented external references have been resolved.

FIG. 9. Format of output Load Module
  CESD (Name, Type, Assembled Origin, Length or ESD ID): includes A with origin 0 and length 500, and B, SD, 500, 300.
  Text of Object Module One: CSECT A, ENTRY BILL, DC V(B) with value 500.
  Text of Object Module Two: DC A(JOE) with value 600, DC V(BILL) with value 200.
  RLD columns: R, P, Flag, Address.

Having discussed the primary function of the Linkage Editor (linking together independently translated modules), and having also described one class of Linkage Editor inputs (input modules), let us now briefly discuss the secondary functions of the Linkage Editor and the other class of Linkage Editor inputs (control statements).
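The linking example of Section 4.4 can be replayed with a small Python sketch. This is a simplified model under stated assumptions, not the Linkage Editor's actual record formats: each module has exactly one Control Section assembled at origin 0, the R and P pointers are represented by symbol names rather than ESD identifiers, only the Address Constant cells of the Text are modeled, and a resolved V-type constant is completed by adding the address of the referenced symbol to its zero value. The dictionary layout and the function name `link` are hypothetical.

```python
# ESD entries: (name, type, assembled origin, length)
# Text: {relative address of an Address Constant cell: value set by the Assembler}
# RLD entries: (R symbol, P symbol, flag, address of the cell)
module_one = {
    "esd": [("A", "SD", 0, 500), ("BILL", "LD", 200, None), ("B", "ER", None, None)],
    "text": {300: 0},                                   # DC V(B)
    "rld": [("B", "A", "V", 300)],
}
module_two = {
    "esd": [("B", "SD", 0, 300), ("BILL", "ER", None, None)],
    "text": {200: 100, 204: 0},                         # DC A(JOE), DC V(BILL)
    "rld": [("B", "B", "A", 200), ("BILL", "B", "V", 204)],
}


def link(modules):
    """Link single-Control-Section modules into one Load Module (simplified)."""
    cesd = {}            # name -> (type, origin in the output module, length)
    next_address = 0     # the first Control Section is assigned address zero

    # Step 1: assign consecutive addresses and merge the ESDs into a CESD.
    for module in modules:
        base = next_address
        for name, kind, origin, length in module["esd"]:
            if kind in ("SD", "LD"):
                cesd[name] = (kind, base + origin, length)
            if kind == "SD":
                next_address = base + length
    for module in modules:
        for name, kind, _, _ in module["esd"]:
            if kind == "ER" and name not in cesd:
                cesd[name] = ("ER", None, None)         # unresolved external reference

    # Step 2: update each RLD entry and relocate the constant it describes.
    text, rld, unresolved = {}, [], set()
    for module in modules:
        for r, p, flag, address in module["rld"]:
            new_address = address + cesd[p][1]          # origin of the section holding the cell
            value = module["text"][address]
            kind, origin, _ = cesd[r]
            if kind == "ER":
                unresolved.add(r)                       # value left as assembled
            else:
                value += origin                         # address the value depends on
            text[new_address] = value
            rld.append((r, p, flag, new_address))
    return {"cesd": cesd, "text": text, "rld": rld, "unresolved": unresolved}


load_module = link([module_one, module_two])
```

Under these assumptions the sketch places Control Section B at 500 and yields the constant values given for the output Load Module: 500 for DC V(B), 600 for DC A(JOE), and 200 for DC V(BILL), with no unresolved references.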
4.5 Linkage Editor Control Statements

In general, Linkage Editor control statements modify or augment the processing performed by the Linkage Editor in its primary function of linking modules. Control statements specify to the Linkage Editor which of the secondary functions (Overlay Processing, Program Modification, or Library Access) are to be performed. To convey the flavor of what takes place, some key Linkage Editor control statements will be discussed as they relate to the appropriate secondary function. Since we are more interested in concepts than in details we will not concern ourselves with the syntax of control statements.

4.5.1 Overlay Processing

The tendency in modern computer architecture is toward systems with hardware-aided (main) memory management (e.g., paging) [6]. As a result of these hardware facilities the software effort necessary to implement a dynamic (i.e., execution time) [...]
Linkers and Loaders • 163
[...] Program must find another job in the system (i.e., a program in memory other than the one that has just requested main storage) and write all the memory allocated to that job (its Region) onto secondary storage. The space occupied by the rolled out job is then made available to the requesting program.

Once the request for memory is satisfied, the appropriate amount of main storage is allocated. This storage will be used to hold the Text of the module being loaded.

5.2 Loading and Relocating the Text

For each Text/RLD pair in the Load Module (see Figure 6) the following actions are performed [12].

1) The Text is read into the next available section of the memory allocated.

2) The RLD (Relocation Dictionary) is read into a buffer in the relocating loader's work area.

3) Each Address Constant in the Text just loaded is updated in the following manner. The cell that represents the corresponding Address Constant is accessed from the Address field (a pointer) in the RLD entry; the starting Text address is then added to the contents of the Address Constant.

These steps are repeated until an End-Of-Module (EOM) indication is found. At that point, the Load Module is in main memory ready for execution.

[...] loader's work area, and the Address Constant in CSECT A is updated (relocated) by adding the starting Text address (2000) to the value of the constant (Figure 11(b)). The Text for CSECT B is then read into the next available section of memory (location 2500), and the two Address Constants are relocated (Figure 11(c)).

It should be noted that the relocating loader does not reference the Composite External Symbol Dictionary of the module being loaded. As mentioned earlier, the CESDs are retained to allow the reprocessing of Load Modules by the Linkage Editor. Actually, for purposes of relocation it is only the Address field of the RLD entries that is of interest.

Thus, in essence, the relocatable binary form of a program in the IBM System/360 consists of the Text and RLD portions of the Load Module. As previously described, when a program is loaded into memory, the relocation information (RLD) employed to load (relocate) it is discarded. Therefore, a program that has been rolled out to secondary storage cannot be brought back into main memory (Rollin operation) until the space that it previously occupied is made available; this represents a serious disadvantage. (Other reasons that require a rolled out program to be returned (rolled in) to the exact space from which it was removed are mentioned in Section 4.1.3.)
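The three loading steps of Section 5.2 can be sketched in a few lines of Python. This is an illustrative model only, not Program Fetch itself: each Text/RLD pair is represented as a dictionary of Address Constant cells plus a list of RLD Address fields, the rest of the Text is omitted, and the function name `program_fetch` is hypothetical. The constant values are those of the linked example (500, 600, 200); the cell addresses 300, 700, and 704 are derived from the example's input modules rather than quoted from the text.

```python
# One (Text, RLD Address fields) pair per Control Section of the Load Module.
# Text is modeled as {address relative to module start: value of the Address Constant}.
load_module = [
    ({300: 500}, [300]),                    # CSECT A: DC V(B)
    ({700: 600, 704: 200}, [700, 704]),     # CSECT B: DC A(JOE), DC V(BILL)
]


def program_fetch(pairs, start):
    """Load each Text/RLD pair and relocate its Address Constants."""
    memory = {}
    for text, rld in pairs:
        # 1) Read the Text into the next available section of allocated memory.
        for address, value in text.items():
            memory[start + address] = value
        # 2) Read the RLD into a buffer in the loader's work area.
        rld_buffer = list(rld)
        # 3) Add the starting Text address to each Address Constant cell.
        for address in rld_buffer:
            memory[start + address] += start
        # The buffer is not kept: relocation information is discarded after use.
    return memory


memory = program_fetch(load_module, start=2000)
```

With a starting Text address of 2000, the cell for DC A(JOE) ends up at 2700 holding 2600 and the cell for DC V(BILL) at 2704 holding 2200. Because the RLD is not retained, nothing in `memory` records which cells hold absolute addresses, which is why a rolled out program must return to the space it previously occupied.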
FIG. 11. Loading a module. (a) Storage allocated by Control Program. (b) [...] loaded and relocated. (c) Module in memory ready for execution.
  Memory locations shown: CSECT A at 2000 with ENTRY BILL at 2200; CSECT B at 2500 with JOE at 2600; DC A(JOE) at 2700 containing 2600; DC V(BILL) at 2704 containing 2200.

6. SUMMARY
[...] Ed Balkovich, Willy Chiu, Don Dumont, Rex Kerley, Dick Mandell, and Ed Prichard for many helpful comments.

REFERENCES

1. PRESSER, L. "The translation of programming languages." In Computer science, CARDENAS, PRESSER, AND MARIN (Eds.), John Wiley & Sons, New York, 1972.
2. BRADEN, R. "Operating systems." In Computer science, CARDENAS, PRESSER, AND MARIN (Eds.), John Wiley & Sons, New York, 1972.
3. BARRON, D. W. Computer operating systems. Chapman and Hall, London, 1971.
4. BARRON, D. W. Assemblers and loaders. American Elsevier, New York, 1969.
5. KNUTH, D. E. "Von Neumann's first computer program." Computing Surveys 2, 4 (Dec. 1970), 247-260.
6. DENNING, P. J. "Virtual memory." Computing Surveys 2, 3 (Sept. 1970), 153-189.
7. WATSON, R. W. Time-sharing system design concepts. McGraw-Hill, New York, 1970.
8. McCARTHY, J.; CORBATO, F. J.; AND DAGGETT, M. M. "The linking segment subprogram language and linking loader." Comm. ACM 6, 7 (July 1963), 391-395.
9. IBM System/360 operating system linkage editor program logic manual. IBM Form No. Y28-6667-0.
10. IBM System/360 operating system linkage editor and loader. IBM Form No. C28-6538-8.
11. IBM System/360 operating system assembler language. IBM Form No. C28-6514-5.
12. IBM System/360 operating system MVT supervisor program logic manual. IBM Form No. GY28-6659-4.