Вы находитесь на странице: 1из 5

Benefits of Compilation Server

Mrs. M. S. Nimbarte Prof. A. P. Bodkhe


PG Student Professor & Head
PRMIT and Research, Badnera PRMIT and Research, Badnera
mrudula_nimbarte@rediffmail.com
optimizing compilation of “hot” methods at run time
Abstract or combine a cheap compiler with an optimizing
compiler used only for hot methods. While these
Modern JVM implementations interleave combined approaches result in more efficient
execution with compilation of “hot” methods to execution of Java programs, they still incur the time,
achieve reasonable performance. Since compilation memory, and pause time overhead of optimizing
overhead impacts the execution time of the compilation.
application and induces run-time pauses, it is better This optimizing compilation overhead is already
to offload compilation onto a compilation server. quite significant in desktop systems, and the
Compilation server is the server which compiles and situation is worse in devices where one has to
optimizes Java bytecodes on behalf of its clients. It balance computing power with other attributes such
provides the following benefits: (i) lower execution as battery life. With the advent of mobile computing,
and pause times due to reducing the overhead of there is an increasing number of networked (wired or
optimization; (ii) lower memory consumption of the wireless) devices with limited CPU speed, small
client by eliminating allocations due to optimizing memory size, and/or limited battery life, where the
compilation and footprint of the optimizing optimization overhead may be prohibitive.
compiler; (iii) lower execution time of the Compilation Server (CS) is a server-assist
application due to sharing of profile information mechanism to eliminate or reduce the compilation
across different runs of the same application and overhead. CS can compile and optimize code on
runs of different applications.. Compilation server is behalf of clients. CS provides the following benefits
able to handle more than 50 concurrent clients while compared to the best performing Adaptive
still allowing them to outperform best performing configuration: (i) lower execution and pause times
adaptive configuration. by reducing the overhead of optimization; (ii) lower
. memory management load on the client by
eliminating allocation coming from optimizing
1. Introduction compilation and by reducing the footprint of the
optimizing compiler; (iii) lower execution time of
One of the key features behind the enormous the application due to sharing of profile information
success of the Java programming language is its across different runs of the same application and
network- and platform-independence: one can runs of different applications. CS can support over
compile Java programs into platform neutral 50 concurrent clients while still allowing them to
bytecode instructions and ship those instructions outperform the Adaptive configuration. The client
across a network to their final destination where a performance is only slightly affected by slow
Java Virtual Machine (JVM) executes them network speeds, and that profile-driven
[Lindholm and Yellin 1996]. Java’s “write once, run optimizations are feasible in the context of CS.
everywhere” paradigm is especially useful for
mobile computing, where applications may be
downloaded on demand. 2. Motivation
There are at least two basic methods of executing
Java bytecode instructions: (i) by interpreting the 2.1 Execution Model of Java Virtual
bytecode instructions; and (ii) by first compiling the Machines
bytecode instructions into native machine code
instructions and then executing the resulting native Running Java programs normally consists of two
code. The interpretation-only approach often results steps: converting Java programs into bytecode
in poor performance because of interpretation instructions (i.e., compiling Java source to .class
overheads, while the compilation-only approach files), and executing the resulting class files [Gosling
introduces additional run-time overhead because et al. 2000]. Because the compiled class files are
compilation occurs at run time. Thus, many JVMs network- and platform-neutral, one can easily ship
use hybrid methods: combine interpretation with
them across a network to any number of diverse 3. Design and Implementation of
clients without having to recompile them. Compilation Server
JVMs then execute these class files, and to
achieve reasonable performance, state-of-theart The primary design goal of CS is to minimize
JVMs, such as HotSpot [Paleczny et al. 2001] and client execution time. CS clients may include
Jikes RVM [Burke et al. 1999], also perform desktop PCs, laptops, and PDAs, and thus are likely
dynamic native code generation and optimization of to be limited in one form or another compared to CS,
selected methods. Instead of using an interpreter, which would be equipped with plenty of memory,
Jikes RVM [Arnold et al. 2000] includes a baseline fast disk drives, and fast network connection(s).
(non-optimizing) and an optimizing compiler. The Therefore, it would be beneficial to allow the server
baseline compiler is designed to be fast, and easy to to perform clients’ work to the extent possible.
implement correctly, while the optimizing compiler
is designed to produce more efficient machine code
3.1 Overview
by performing both traditional compiler
optimizations (such as common subexpression
Figure 1 shows the overall client and CS
elimination) and modern optimizations designed for
architecture. Broadly speaking, there are three steps
object-oriented programs (such as pre-existence-
involved in using CS:
based inlining [Detlefs and Agesen 1999]). In
(1) Client requests download of class files via CS.
addition to providing flags to control individual
In this respect, CS acts as a proxy server that
optimizations, Jikes RVM provides three
relays client’s request to the outside world.
optimization levels: O0, O1, and O2. The lower
However, unlike a regular proxy server, CS
levels (O0 and O1) perform optimizations that are
may choose to annotate the class files with
fast (usually linear time) and offer high payoff. For
profile directives. These directives instruct the
example, O0 performs inlining, which is considered
client as to which methods or which basic
to be one of the most important optimizations for
blocks within a method it should instrument
object-oriented programs. O2 contains more
and gather profile information for profile-
expensive optimizations such as ones based on static
driven optimizations.
single assignment (SSA) form [Cytron et al. 1991].
(2) Client compiles and executes downloaded
methods with the baseline compiler. The
2.2 Compilation Pause Times modified adaptive subsystem identifies “hot”
methods and sends optimization requests to
In addition to affecting overall running time of CS. These requests are accompanied by various
applications, dynamic compilation also affects their information.
responsiveness because it induces pauses in program (3) The front-end for CS, which we call the CS
execution. The use of pause times as a performance driver, stores individual client state data. The
metric is popular when evaluating garbage collection driver is also responsible for managing profile
algorithms [Cheng and Blelloch 2001], and it should information, reusing optimized code, driving
be equally important in evaluating dynamic the optimizing compiler, and sending
compilation systems. For example, H¨olzle and optimized machine code back to the clients.
Ungar [H¨olzle and Ungar 1994] uses the concept of The installer threads on clients are responsible
absolute pause times to evaluate the responsiveness for receiving and installing optimized code.
of the SELF programming system.
Since CS also acts as a proxy server as depicted
2.3 Memory Usage in Figure 1, make another simplifying assumption,
namely that CS has access to all client class files.
Performing code optimizations consumes memory Therefore, client requests for method optimization
and thus may degrade memory system performance. do not need to send the bytecode instructions to CS.
Since Java programs will be optimized at run time, However, since the size of the bytecode instructions
there are two memory costs for optimizations: (i) the is usually small, and network speed does not affect
data space cost, i.e., the space required by the client performance very much. This assumption is
optimizer to run; and (ii) the instruction space cost, not likely to affect client performance.
i.e., the footprint of the optimizer. (The final size of
optimized code may be larger or smaller than 3.2 Client Design and Implementation
unoptimized code, but this size effect is much
smaller than the other two.)
3.2.1 Method Selection. One of the most important
aspects that affects client performance is selecting
Fig 1. Overview of CS-client interaction

callees to CS. The reason for sending state


which methods to optimize on CS. Although it information of m’s callees to CS is because the
would be less demanding on clients to have CS
select which methods to optimize, having the server optimizing compiler may decide to inline them into
choose does not work well because CS does not have m and thus needs to know about their state
enough run-time information to make the best information. The collection of client state relies on
decision. the baseline compiler, and thus if a callee has not yet
While it is more efficient to optimize methods been compiled by the baseline compiler, its state
using CS rather than on the client, there still is some information is not sent to CS. CS needs to know
cost associated with using the server, such as some things about the state of the client VM. To
collection of client state information, the overhead of keep it simple, client state information is sent as
communication, and installation of compiled name (string), kind (byte), and value (integer) triples,
methods. Optimizing too many methods, i.e., too except for profile information. This representation is
eagerly, may increase this cost so significantly that it not optimized for size but for simplicity. It is
will offset any benefit gained by optimization. In any possible for CS and the client to agree upon a more
case it is impractical to optimize all methods on CS compact representation such as an index into the
since it would overload the server. On the other constant pool table in a class file. The size of client
hand, optimizing too few hot methods may eliminate state information is small and does not impact client
any performance benefit. performance much. To avoid resending information,
Thus the benefit model remains constant CS maintains per-client state information.
throughout client execution. The cost model is
dynamic. The cost is expressed in terms of the 3.2.3 Mode of Communication. Another important
number of bytecodes compiled per unit time, and design dimension that can affect client performance
each individual client maintains their own cost is the way it communicates with CS. Both the
model. amount of transmitted data and the mode of
communication (synchronous or asynchronous) can
3.2.2 Client State. Once a method m is identified as impact how clients perform. Offloading compilation
being “hot” using our modified cost-benefit model, to a server means that clients can continue executing
the adaptive controller issues a compilation request unoptimized machine code while they are waiting for
to CS (step 2 in Figure 1). In addition to sending a optimized code to arrive from the server. It may be
description of m, the adaptive controller also collects beneficial to wait for the server for long running
and sends client state corresponding to m and its methods since waiting and executing optimized code
may be shorter than executing unoptimized code
followed by optimized code. Furthermore, the may later be recompiled using a higher optimization
problem with long running methods can also be level, if that is estimated to be beneficial. Since using
addressed by performing on-stack replacement [Fink the same multi-level recompilation strategy in CS
and Qian 2003] of compiled methods. The use of may result in higher server load and increased
asynchronous communication also has the benefit of network traffic. These optimizations have high
reducing client pause times since a client needs to potential for benefit with low to moderate
pause only to send the request and to receive and compilation cost.
install compiled code. Split transaction is to split an
optimization request in two transactions as described 3.3.2 Speculative Optimizations. One may need to
below. The idea behind split transaction is reduce apply some speculative optimizations, such as
wait time for optimized code by sending the request guarded inlining, in order to generate best quality
as early as possible. In the first transaction, client code. However, speculative optimizations make
would identify “getting-warm” methods and send some assumptions about the state of the world, and
requests for these methods to CS. In the second these assumptions need to hold for correct execution
transaction, the client would identify “actually-hot” of programs. The Adaptive system maintains and
methods and send requests for these methods to CS. checks the validity of these assumptions, and when
Since the set of methods in “gettingwarm” would these assumptions are violated, it re-optimizes the
include most, if not all, of the methods in “actually- method in question. The same approach is used in
hot”, CS would simply send previously optimized the CS setting: in order to accommodate speculative
code back to the client without additional delay. In a optimizations, the server sends a list of class names
setting where CS is more capable and relatively idle, and method descriptions back to the client that affect
or in situations where optimization or analysis is the validity of a compiled method. The client is then
much more expensive, split transactions may be responsible for checking that these classes and
effective at reducing client wait time. methods are not sub classed and overridden,
respectively. Even though CS has access to all the
3.2.4 Security and Privacy. Any client-server classes a client may execute, CS cannot perform
architecture is affected by security and privacy these checks since it does not know which classes a
issues. Questions such as “How can clients trust client may load dynamically in the future. CS could
CS?” or “What kinds of privacy policies should be assume that any class that a client has access to could
implemented in CS?” are interesting but beyond the potentially be sub classed and its methods
scope of study. Assume that clients can trust CS to overridden, but that would result in code that is too
send correctly optimized code and that CS can trust conservative. If a client finds that these assumptions
its clients to send valid profile information. Also are violated, the server sends non-speculatively
assume that other appropriate security and privacy optimized code to the client upon request.
policies are in place, which is a reasonable
assumption to make if both client and server are 3.3.3 Inlining. One of the most effective
behind a firewall, for example. optimization method is inlining [Lee et al. 2004],
and CS sometimes is not as aggressive as the
3.3 Server Design and Implementation Adaptive system in inlining, due to its limited
knowledge about client states. In order for CS to
As shown in Figure 1, CS implementation inline a method m1 into another method being
consists of a driver that mediates between clients and compiled, m2, CS needs to have full information
the optimizing compiler. The CS driver receives and about m1. Unless m1 has been baseline compiled on
stores per-client state information, segregated by the client already, it is not possible for CS to know
kind so as to speed lookups by the optimizing about its state. Because clients ask only for “hot”
compiler. It is also responsible for combining profile methods to be compiled by CS, there is a high
information received from its clients. The CS driver probability that most of the frequent method
also drives the optimizing compiler and relays invocations contained within these “hot” methods
optimized code back to the clients. In addition to have already been executed (and therefore compiled
machine code, it also sends GC maps, inline maps, by the Baseline compiler) at least once, and thus all
bytecode maps, and the list of classes that must not of the frequent call sites can be inlined by CS. The
be subclassed and methods that must not be fact that CS cannot inline as aggressively as the
overridden back to its clients. optimizing compiler in Opt or Adaptive may seem
like a disadvantage, but in fact, it is not, because the
3.3.1 Optimization Selection. The Adaptive system filtering of call sites that happens on clients works as
allows multi-level recompilation using the various a primitive feedback-directed adaptive inlining
optimization levels. For example, a method which mechanism. Due to this filtering, CS does not bother
has been recompiled using optimization level O0
to inline infrequently executed call sites; inlining
them may result in worse instruction cache behavior. 5. References
3.3.4 Linking. While producing generic machine Arnold, M., Fink, S., Grove, D., Hind, M., and Sweeney, P. 2000.
Adaptive optimization in the Jalape˜no JVM. In Proceedings of
code that can be used on different clients has its
the Conference on Object-Oriented Programming, Systems,
benefits, it can be costly because each individual Languages, and Applications.ACM Press, Minneapolis, MN,
client must link the machine code before executing 47–65.
it. Therefore CS produces optimized machine code Burke, M., Choi, J.-D., Fink, S., Grove, D., Hind, M., Sarkar, V.,
Serrano, M., Sreedhar, V. C.,and Srinivasan, H. 1999. The
that clients can simply download, install, and execute
Jalape˜no dynamic optimizing compiler for Java. In ACM Java
without any modification. For example, if a static Grande Conference. ACM Press, San Francisco, CA, 129–141.
field is referenced in a method, the server will use Cheng, P. and Blelloch, G. E. 2001. A parallel, real-time garbage
the offset of the corresponding static field obtained collector. In Proceedings of the ACM SIGPLAN 2001
from the client to generate code for that particular Conference on Programming Language Design and
Implementation. ACM Press, Snowbird, UT, 125–136.
method. Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and Adeck,
F. K. 1991. Efficiently computing static single assignment form
3.3.5 Compiled Code Reuse. The issue of code and the control dependence graph. ACM Transactions on
reuse is to reduce compilation time and server load Programming Languages and Systems (TOPLAS) 13, 4, 451–
490.
even further. To reuse code, CS makes a scan of Detlefs, D. and Agesen, O. 1999. Inlining of virtual methods. In
generated machine code to find patch points: patch Proceedings of the 13th European Conference on Object-
points are name (string) and machine code offset Oriented Programming. Springer-Verlag, Lisbon, Portugal,
(integer) pairs that describe which bytes have to be 258–278.
Fink, S. J. and Qian, F. 2003. Design, implementation, and
patched before compiled code can be delivered to evaluation of adaptive recompilation with onstack replacement.
another client. When a request for the same method In Proceedings of the International Symposium on Code
comes in, CS simply walks over the list of patch Generation and Optimization.IEEE Computer Society, San
points modifying the compiled code as necessary. Francisco, California, 241–252.
Gosling, J., Joy, B., Steele, G., and Bracha, G. 2000. Java
However, it is sometimes not enough to patch Language Specification, Second Edition: The Java Series.
compiled code in this manner because there may be Addison-Wesley Longman Publishing Co., Inc., Boston, MA.
other differences between clients. For example, H¨olzle, U. and Ungar, D. 1994. A third-generation SELF
depending on whether a static field referenced in a implementation: reconciling responsiveness with performance.
In Proceedings of the Ninth Annual Conference on Object-
method has been resolved on the client, the server oriented Programming Systems, Language, and Applications.
may need to generate additional code to resolve the ACM Press, Portland, OR, 229–243.
field before accessing it. In the case where clients’ Lee, H. B., Von Dincklage, D., Diwan, A., and Moss, J. E. B.
states differ, the server simply re-optimizes the 2004. Understanding the behavior of compiler optimizations.
Tech. Rep. CU-CS-972-04, University of Colorado, Dept. of
method in question. Another approach would be to Computer Science, CB 430, Boulder, CO 80309-0430. Mar.
generate lowest common denominator code that Lee, H. B., Von Dincklage, D., Diwan, A., and Moss, J. E. B.
could be used universally across different clients, but 2007. Design, implementation, and evaluation of a compilation
such code would perform relatively poorly due to server. ACM Trans. Program. Lang. Syst. 29, 4, Article 18 (
August 2007 ).
additional checks.
Lindholm, T. and Yellin, F. 1996. The Java Virtual Machine
. Specification. Addison-Wesley, Boston, MA.
4. Conclusion Plezbert, M. P. and Cytron, R. K. 1997. Does ”just in time” =
”better late than never”? In Proceedings of the 24th ACM
SIGPLAN-SIGACT Symposium on Principles of Programming
To achieve reasonable performance, many Languages. ACM Press, Paris, France, 120–131.
modern virtual machines resort to Just-in-time
compilation. However, the cost of dynamic Prof. A.P.Bodkhe is Professor & Head of
compilation even in state-of-the-art virtual machines Information Technology Department at PRMIT
& Research, Badnera and Dean, Faculty of
is moderately high. A compilation server compiles Engineering & Technology at Sant Gadge Baba
client code at the granularity of methods to reduce or Amravati University, Amravati. He did B.E.
eliminate the cost associated with dynamic (EXTC) from Karnataka Uni.,Dharwad in 1984
compilation. Compilation server is able to handle and M.E.(Adv.Electronics) from Amravati
University,Amravati in 1990. He has got 24
more than 50 concurrent clients while still allowing years of teaching experience.
them to outperform best performing adaptive
configuration. CS is also effective at reducing clients Mrs. M. S. Nimbarte did B.E.(CSE) from GCOE
end-to-end execution times, pause times, and Amravati and currently working as a lecturer
since last six years in Bapurao Deshmukh college
memory consumption. Being able to migrate of Engineering, Sevagram and doing part time
compilation onto a remote server using CS approach M.E. from PRMIT and Research, Badnera. Her
will have significant impact on the way virtual area of interests are Compilers, System Software,
machines and optimizations are designed and Theory of Computation and Computer
Architecture and Organization.
implemented.

Вам также может понравиться