Академический Документы
Профессиональный Документы
Культура Документы
David H, Bartley
John C. Jensen
xor AX.AX ; c l e a r AX
Permission to cOW/withoutfee all or part of this material is granted lods byte p t r ES:[SI] ; get next
provided that the copies are not made or dis~ribuUxl for direct mov BX,AX : opcod,
commercial advantage, the ACM copyright notice and the title of
the publication and its date appear, and notice is giventhat copying shl BX,I ; make word index
is by permission of the Association for Computing Machinery. To imp op_tabla*[BX] : and Jump
copy otherwise, or to republish, requires • fee and/or specfic
permission.
Other virtual machine architectures for LISP include PC Scheme memory consists of a linear address space
variations on the MIT LISP Machine [Greenblatt 84] and subdivided into logical pages to support a BIBOP (big bag
SPICE LISP /Wholey 84 I. RABBIT ISteele 78b], MAC- of pages) tagging mechanism. An object is represented by
SCHEME, and an unnamed "small Scheme implementa- a three byte pointer. The page number and page displace-
tion" at MIT [Schooler 84} propose VMs specific to the ment are stored in the top byte and bottom two bytes,
needs of Scheme. Of these, only RABBIT is register-based. respectively.
87
The default page size is computed automatically as The mark and sweep phase causes a pause in execu-
a function of available physical memory each time P C tion of 0.5 to 2.5 seconds for a 512KB 80286 system. The
Scheme is loaded and varies between 3072 and 5136 bytes. longest pause when compaction is needed is about six sec.
A pointer is mapped into its associated physical memory onds. We consider this delay more acceptable than the
address by looking up the base (segment) address for the VM overhead that a real-time collector would incur.
page in a table and adding its displacement component.
This mapping permits pages to be placed arbitrarily in 2.$ Control Model
physical memory. A given page may be made larger than
the default value as needed to accommodate extremely Unlike most LISps, Scheme provides ~escape functions"
large objects such as code blocks and arrays. called continuations which may be called any number of
(Although the release~ product is restricted to 640KB times; the invocation need not be within the dynamic scope
physical memory (768KB for TI PCs), experimental ver- of the binding of the continuation. This behavior rules
sions exploiting extended and expanded memories execute out the use of a simple contiguous control stack for the
somewhat more slowly due to address mapping complica- general case, so a judicious use of heap-allocated structures
tions.) is needed. Likewise, the existence of first-class function
objects, called ¢loeures, dictates heap allocation of some
The type of a referenced object is determined by in- lexical environment bindings.
specting a table of type attributes indexed by the number
of the page in which it resides. A simple bit test or byte The control stack holds the state needed to execute a
comparison suffices for all of the d a t a type predicates and continuation. Since the compiler is generally unable to de-
to distinguish the various representation formats. An in- termine which control frames will be retained indefinitely,
dex byte is also associated with each page entry to allow it optimistically allocates them in a contiguous stack area.
fast dispatching on type. Although each page holds objects When CALL-¥ITH-CURREIIT-COIITINUATIONis invoked, the
of only one type, all objects except list cells, fixnums, and stack is copied into the heap to make a continuation object.
characters have redundant type headers to aid the sweep To reduce the worst-case cost of this copying and to per-
mit arbitrary growth of the stack, a small buffer is used to
phase of garbage collection.
hold the topmost stack frames. When this buffer overflows,
The P C Scheme d a t a types implemented as heap- all frames but the currently active one are copied into a
allocated objects are list cells, arbitrary-precision integers, continuation object in the heap. Thus, a continuation is
64 bit IEEE floating point numbers, symbols, strings, vec- represented by a chain of such control stack segments.
tors, continuations, functions, code blocks, I / O ports, and
environments. The compiler identifies those variables which have in-
definite extent because they are freely referenced from clo-
Integers in the range -16,384 to +16,383 are repre- sures and arranges for them to be heap-allocated at run
sented as immediate values; a page number is reserved time. Thus, the d a t a structure stored in a closure object
as a 'fixnum' tag and the numeric value is stored in the does not reference the stack. Temporaries and variables
displacement field of the reference. Character objects are which are not accessed freely from any closure are allo-
represented similarly. cated to registers or to locations in the stack.
Smalitalk-80"" procedure activation environments (con-
2.4 Garbage Collector texts) correspond closely to Scheme closures and continu-
ations. [Deutech 84] and [Suzuki 84] report that 85% or
Long pauses for garbage collection are confusing and of-
more of Smalltalk-80 contexts may be allocated in a stack-
ten distressing to end-users of applications written in many
like manner since operations that require indefinite reten-
PC LISPs. Although "mark and sweep ~ is acceptably fast
tion of contexts occur infrequently. Deutsch allocates con-
on a nominal 512KB memory, much slower compaction
texts created by normal sends on the stack. When a refer-
algorithms are often employed to avoid the effective loss
ence to a context is created (making it an object), space is
of space to fragmentation. P C Scheme uses a mixed strat-
allocated in the heap but the context remains cached in the
egy. When a memory allocation request cannot be satisfied
from a free memory pool, a m~rk and sweep garbage collec- stack as long as possible. Heap-allocated contexts must be
returned to the stack before they may be executed.
tion is performed. If this doesn't recover enough memory,
a memory compaction phase is run to coalesce small frag- Suzuki also has multiple representations for Smailtalk-
ments of free memory into larger blocks. 80 contexts but permits a context to be executed in the
heap. As with P C Scheme, a small stack buffer behaves
The memory compaction operation copies referenced
objects from pages which are the least full to those which like a cache for heap-allocated environments and limits the
are the most full. Thus, it moves the fewest objects possi- time taken to save it to the heap.
ble to accomplish dense packing of data. Fewer than half Both Smalltalk-80 implementations rely on "lazy" allo-
of tl~e objects in the heap are moved, even in the worst case cation methods at run time rather than static analysis at
(typically much fewer since system code is packed densely compile time. We adopt the same approach for continua-
by the initial load and is never moved). After the objects tions because CALL-MITH-CORRENT-CONTINUATIONhas dy-
have been moved, a sequential pass is made through mem- namic effects on the retention of environments. We pre-
ory to relocate all pointers to moved objects. fer static analysis of free variable references from closures,
88
however, since it allows individual variables to be selec-
tively heap-allocated. Although we make no attempt to
delay transferring such variables from the stack (or regis- Compiler Phase Tsrn¢
ters) to the heap, it would be possible to do so. macro expansion 18 %
10cal optimization 5 %
closure and any. analysis 6 %-"
code generation 23 %'"
3 The Compiler apeephole" optimization ] 29 %
assembly [ 19 %
PC Scheme relies solely on its compiler for the evalua- Figure 2: Relative time spent in each compder phase
tion of Scheme programs. An interpreter was considered
but was not implemented for the following reasons:
89
( d e f i n e ( f i b n)
( i f (<? n 2)
n
(÷ ( f i b (- n 1)) ( f i b (- n 2 ) ) ) ) )
FIB
LOAD R3, (QUOTE 2) ; r3 := 2
LOAD R2, R1 ; r2 := n
<? R2, R3 ; r2 := (<? n 2 )
JUMP L5, MULL?. R2 ; (go L5) i f ( n u l l ? r2)
EXIT ; return value in r l
L5
PUSH R1 ; save n
~*IlO! R1, (QUOTE "1) : r l := ( - n 1)
CALL-0PEN FIB. 0, 0, (1) ; Jar t o FIB
PUSH R1 ; save (fib (- n I))
LOAD R3, (STACK 0 O) ; r3 :ffi n
LOAD El. R3 ; rl : - n
~*IMM RI, (QUOTE -2) ; rl := (- n 2)
CALL-OPEN FIB. O. O. (I) ; J s r to FIB
P0P R2 : r e s t o r e ( f i b (- n 1))
* RI, R2 ; (* ( f i b . . . ) ( f i b . . . ) )
EXIT ; r e t u r n value in r l
A variable has indefinite extent if it is freely referenced every lambda expression to be traced, and permits debug-
from a funarg or from a function which is called directly or ger access to all variable bindings. In this mode, programs
indirectly from a funarg. Rather than compute the transi- typically execute at about one fourth the normal speed.
tive closure of function calls, we consider a lexicai variable
to have indefinite extent if onp fun~rg is defined within 3.3 Register Allocation
its scope and it is freely referenced by any function. This
assumes that any function may be called by any other. The code generator uses a stack-like approach to allo-
(In retrospect, the decision not to compute the transi- cate variables and temps to ~ registers. Registers RI
tive closure of calls from funargs appears misguided. As through R n hold function arguments at function entry. As
shown in Figure 2, additional time spent in this phase LET and LETP.EC expressions are entered, the next avail-
would still be overshadowed by the costs of code gener- able registers in ascending numerical order are assigned
ation, peephole optimization, and assembly.) to hold the new bindings. (Registers are not allocated to
heap-allocated variables or to identifiers bound to open
Variables are heap-allocated only if they must exist at functions.) The temporary results from in-line expression
run time---identifiers which are bound to open functions
evaluation normally occur at this simulated ~top of stack,"
are never accessed as variables and so are not allocated
although most operations frequently access their operands
any storage at run time. More precisely, a iexical variable
directly without stack-like pushes and pops. The ~peep-
is heap-allocated if it has indefinite extent and must exist
hole" optimizer performs copy propagation and targeting
at run time; it must exist at run time if it is modified by
across entire blocks at a time, so redundant register moves
SET! or is initialized to some value other than an open
are infrequent.
function.
This straight-forward approach is an example of the
This analysis is complicated slightly by the existence in
"best simple" design philosophy. Its success is due to sev-
P C Scheme of the special form THEoFJVII~01MENT,which
¢ral factors:
reifies the lexicai environment at a point as a first-class
* The availability of 64 general VM registers alleviates
data object. The appearance of THEo£MVIlt0104FJT in a
the need to minimize the total number of registers in
variable's scope causes the variable to exist at run time
use at any one time.
with indefinite extent.
When the compiler is operating in debug mode (speci- . The peephole optimizer does well at rectifying locally
fied by the value of a flag variable), it skips most optimiza- inept instruction selection and register allocation.
tions and marks all lambda expressions for closure and all . Several common functions (e.g., MEMQ,*, HOT) are im-
variables for heap allocation. This preserves a closer corre- plemented as ~ instructions, avoiding register saves
spondence between the object code and its source, allows and restores around out-of-line calls
00
. Tail-recursive calls do not require register saving. 4 Performance
• Many functions have high-probability execution paths
that do not involve further function calling (e.g., Table I compares P C Scheme's execution speed against
(lambda (n)(if (<? n 2) n ...))). other PC-class LISP systems~Golden C o m m o n Lisp,(~)
Large M e m o r y (GCLISP L M TM) version 2.1;IQLISP TM ver-
Improved code quality at the cost of longer compilation sion 1.7; TLCT"-LISP version 1.51; and MacScheme/"
times could be obtained with a global register allocation Most of these benchmarks are taken from the Gabriel
strategy coupled with more inter-procedural variable anal- benchmark set [Gabriel 85[ and were run without modi-
ysis. R A B B I T ISteele78b] employs a stack-like allocator fication, except to account for differences in the LISP di-
similar to ours that works across procedure boundaries and alects. The 80286 runs were made on an I B M P C / A T TM
would work well with our V M . IMurtagh 84] describes an with 640K bytes of system memory, except for G C L I S P
algorithm for allocating environment frames and display LM, which requires additional protected memory to run
entriesso they may be shared among severalmutually non- the interpreter (I.5MB) and compiler (3MB). MacScheme
recursive procedures. We are investigating the application was tested on a 512K byte Apple~ Macintosh. T M
of these ideas to a register allocator for Scheme.
The numbers show the execution time in seconds av-
eraged over three test runs in a clean, garbage collected,
$.4 Function Calls system. The first run of a series was retested whenever it
appeared to be distorted by autoloading or other initial-
The VM registers are a globally allocated resource and ization effects.
must be preserved across function calls. PC Scheme uses
The timings show that the compiled systems con-
the ~caller saves ~ protocol in order to reuse the regis-
sistently outperform the interpreted ones and that PC
ters for argument passing and so that only active registers
Scheme is intermediate in performance between TLC-
are saved. The compiler minimizes the total number of
LISP, another "pseudo-coded" VM, and the native code
pushes and pops by delaying them until the last possible
version of GCLISP LM. These results confirm our ex-
moment. Thus, between out-of-line calls, saved variables
pectations about our register-based virtual machine and
are accessed directly from the stack instead of popped into
threaded code emulator. We have also observed that our
registers and then accessed. Indeed, pops are delayed until
generated code is considerably more compact than both
the two branches of an IF expression are rejoined, since the
interpreted S-expressions and native object code. Our ap-
stack height must be made consistent at such points. Tail
proach appears to be a good compromise between space
recursive calls use the same argument passing mechanism
and speed for small PCs, as intended.
but do not require register saving.
A key benefit to passing arguments in global registers Each system has its own unique strengths and weak-
is that subsequent calls with the same arguments in the nesses. W e note that P C Scheme's performance with in-
same positions in an argument listmay not require register tegers larger than + 2 " sufferson benchmarks with larger
shuffling. As a trivialexample, the code generated for the values; a benchmark that counts a million iterationsof an
function (LAMBDA (A B) (F00 (* it I) B) ) issimply the expression is dominated by bignum arithmetic.
following.
5 The Development Environment
Z, I194 R1,'1 ; Argt :ffi (1* A)
LOAD R3. (GLOBAL F00) : fetch closure The PC Scheme system includes an editor, debugger,
CALL-TR R3 ; Arg2 is unchanged and object-oriented program_ruing facility called SCOOPS.
Lisp mode Of host BROWSE DERIV DIV2 DIV2 FACT FIB TAK
system execution processor iter recur 1000 20 18 12 6
P C Scheme bytecode 80286 581.51 101.21 25.03 s8.~4 6.76 6.47 19.17
MacScheme bytecode 68000 (1) 243.36 60.17 143.69 (1) 22.50 72.24
TLC-LISP "P-code" 80286 743.45 146.34 88.00 110.47 (4) 18.46 55.11
GCLISP LM native code 80286 248.33 39.46 16.50 29.88 (3,4) 2 . 0 4 6.65
TLC-LISP native code 80286 (2) (2) 40.85 77.81 (4) ! 12.56i 34.0o
GCLISP LM interpreter 80286 2008.37 219.301253.68 221.09 (3,4) 35.30 I 144.36
IQLISP interpreter 80286 (1) 293.62 348.60 273.73 (3) i 37.08 154.10
TLC-LISP interpreter 80286 3107.76 369.74 J 325.62 (3) (4) 49.38 162.67
Notes: (I) This test was not run.
(2) Stack overflow during compilation.
(3) Stack overflow during execution.
(4) G C L I S P L M and TLC-LISP restrictinteger values to 32 bits.
Table L Comparatlve execution times for several PC LISPs
91
PC Scheme's editor is a substantially modified version code emulator. The principal difficulty for COMMON LISP,
of Edwin, an EMAC$-style editor which was originally de- the immense size of a complete implementation, is amelio-
veloped by the Scheme project at MIT. As a Scheme pro- rated by our compact encoding. We see no fundamen-
gram itself, Edwin was easily integrated into our develop- tal problems in extending our VM to encompass multiple
ment environment, although it was necessary to cut its size values, complex procedure calls, and other aspects of the
considerably. Edwin's Scheme mode provides parenthesis language. Indeed, our results so far corroborate Steele's
matching, indenting comm~mds, and the ability to eval- conjecture [Steele 78b] that Scheme is an excellent basis
uate an expression, region, or the entire buffer. Moving for compiling other languages--particularly other LISPs.
between the Scheme listener and the editor is facilitated
by the ability to split the screen into separate windows for
Acknowledgements
each.
Both the size and speed of the editor were considerable We would like to thank the following members of the
concerns at the onset of the port. The m~jority of the code CSC Computer Architecture Branch who assisted in the
size reduction came through careful selection of features to creation of PC Scheme: Rusty Haddock, Paul Kristoff,
omit. Achieving acceptable speed hinged on identifying a Don Oxley, Amitabh Srivastava, John Gateley, Mark
few key functions which could be made VIVl "instructions." Meyer, and Glen Slick. Keith Carlson, Terry Candill, and
Although PC Scheme does not provide extensive source- Jennifer Hso were responsible for its release by our product
level debugging support, it has extensive trace and break- group. Jan Stevens guided the development of the manu-
point facilities and an interactive Inspector with'com- als. Bill Cohagan prototyped an experimental translator
mands to display and manipulate call stack frames and from Pascal to Scheme and experimented with COMMON
lexicsl environments, edit variable bindings, trace back LISp with the help of Ken Lewis.
through a chain of procedure calls, and evaluate expres-
sions in the environment of a breakpoint. All nser-
correctable errors trap to the Inspector. References
SCOOPS is an experimental object-oriented program-
ming system with multiple and dynamic inheritance based [Bell 73] Bell, J. R., "Threaded Code." Communications
on first-class environments. Although it is similar in con- of the ACM, XVI, (1973) pp. 370-372.
cept and syntax to the LOOPS [Bobrow 83] and Flavors
{Bobrow 83] Bobrow, D.G.; and Stefik, M.J.. The LOOPS
[Weinreb 83] systems, the implementation of SCOOPS re-
Manual. Pslo Alto, CA: Xerox Corporation, 1983.
lies heavily on the features of the Scheme language.
Other development features include a pretty-printer, IDeutsch 84] Deutsch, L. Peter, and Schiffman, Allan M.
window system, color graphics, autoloading, structure ed- "Efficient Implementation of the Smalltalk-80 Sys-
itor, and a utility that converts compiled object files into tem." In Conference Record of the 11th Annual
fast-load format. ACM Symposinm on Principles of Programming
Languages. January 1984, pp. 297-302.
[Gabriel 85} Gabriel, Richard P., Performance and Evalu-
6 Concluding Remarks ation of Lisp Systems. The MIT Press, 1985.
IGreenblatt 84] Greenblatt, R. D., et al. "The LISP
PC Scheme has met our goals for high performance and Machine," Interactive Programming Environments,
compact code for AI applications on PC-class machines. D, R. Barstow, H. E. Shrobe, E. Sandewall, eds.
The byte-encoded VM and relatively simple compiler ap- McGraw-Hill, 1984.
pear to offer a better compromise in these respects than
either traditional interpreters or native code compilers. IMurtagh 84] Murtagh, Thomas P. "A Less Dynamic
Although we have emphasized performance over richness Memory Allocation Scheme for Algol-like Languages."
of the development environment, hundreds of users have In Conference Record of the l l th Annual A CM Sym-
found PC Scheme to be a productive and friendly system. posium on Principles o? programming Languages.
Our experience with PC Scheme has vindicated our se- January 1984, pp. 283-289,
lection of Scheme over COMMON LISP as our application IPowel184] Powell, Michael L. "A Portable Optimizing
language for small PCs. Scheme's radically simpler struc- Compiler for Modula-2." In Proceedings of the ACM
ture and absence of expensive features make it much eas- SIGPLAN '84 Symposium on Compiler Construc-
ier to compile and execute efficiently. Unfortunately, it is tion. June 1984, pp. 310-318.
harder to make many of COMMON LlSP's expensive fea-
tures "pay their own way." Moreover, COMMON LISP IRRRS 85] Clinger, W., ed. "The Revised Revised Report
lacks first-class continuations, engines, and environments, on Scheme." Massachusetts Institute of Technology
which are important to many users. AI Memo No. 848 (August 1985).
The focus of our work is now shifting towards support- !Schooler 84] Schooler, Richard, and Stamos, James W.
ing both Scheme and COMMON LISP with a portable byte- "Proposal for a Small Scheme Implementation."
92
Massachusetts Institute of Technology Laboratory
for Computer Science Memo No. TM-267 (October
1984).
[Steele 7~] Steele, Guy Lewis, Jr., and Sussmem, Get-
aid J. ~The Revised Report on Scheme, a Dialect
of Lisp." Massachusetts Institute of Technology AI
Memo No. 452 (January 1978).
[Steele 78b] Steele, Guy L. "RABBIT: a Compiler for
Scheme." Massachusetts Institute of Technology AI
Technical Report No. 474 (May 1978).
ISuzuki 84] Suzuki, Norihisa and Terada, Minoru. "Cre-
ating Efficient Systems for Object-Oriented Lan-
guages." In ConEerence Record of the l l t h An-
nual A CM Symposium on Principles of Programming
Languages. January 1984, pp. 290-296.
[Ungar 84] Ungar, David, et al. ~Architecture of SOAR:
Smalltalk on a RISC." In Proc. l l t h Annual Interna-
tional Symposium on Computer Architecture. 1984,
pp. 188-197.
[Weinreb 83] Weinreb, D.; Moon, D.; and Stallman, R.
Lisp Machine Manual. Cambridge, MA: Mas-
sachusetts Institute of Technology, 1983.
[Wholey 84] Wholey, Skef. "The Design of an Instruction
Set for Common Lisp.~ In Conference Record of the
1984 A CM Symposium on Lisp and Functional Pro-
gramming. August 1984, pp. 150-158.
Trademark Information
93