
Eclipse OMR
Reliable and reusable components for building
high performance language runtimes

“How does all this compare to LLVM?”

Mark Stoodley, project co-lead


1
Eclipse OMR
A Brief Refresher

2
Eclipse OMR Mission
Build an open reusable language runtime foundation for cloud platforms

• To accelerate advancement and innovation

• In full cooperation with existing language communities

• Engaging a diverse community of people interested in language runtimes


• Professional developers
• Researchers
• Students
• Hobbyists

3
Eclipse OMR technology components
contributed from IBM SDK for Java

port        platform abstraction (porting) library
thread      cross platform pthread-like threading library
vm          APIs to manage per-interpreter and per-thread contexts
gc          garbage collection framework for managed heaps
compiler    extensible compiler framework
jitbuilder  project to simplify bring up for a new JIT compiler
omrtrace    library for publishing trace events for monitoring/diagnostics
fvtest      language independent test framework built on the example glue
            so that components can be tested outside of a language runtime;
            uses the Google Test 1.7 framework
+ a few others

~800KLOC at this point, more components coming!

4
Eclipse OMR technology components
contributed from IBM SDK for Java

(same component list as the previous slide; the overlay annotation on this
slide marks the compiler and jitbuilder components as "Roughly Comparable
to LLVM")

5
OMR Compiler
• Originates from Testarossa Java Just In Time compiler created circa 1998
• Now used as compiler for static and dynamic languages as well as binary emulation
• Native code generation for X86, POWER, Z, ARM-32 & historically several others
• Hundreds of developer years of effort
• About 70 optimizations and analyses in OMR project
• Complete dataflow analysis engine and other analyses
• Local and global transformations and optimizations (not a complete list)
• Common subexpression elimination, expression simplification, dead tree and dead
store elimination, copy propagation, partial redundancy elimination, On Stack
Replacement (OSR) optimizations, local and global value propagation, global register
allocation, …
• Loop canonicalization, loop unrolling, loop versioning, loop replication, strip mining,
loop striding, ...
• Inlining, block ordering, coldness propagation, async check insertion/removal, store
sinking, …
• Long term focus on code quality, compile time, footprint, adaptive optimization
6
OMR JitBuilder project
• Simplified interface to the OMR compiler technology
• Designed to bootstrap a native code JIT compiler for interpreted methods
• Provides mostly sensible defaults for compiler configuration and setup
• Also a general cross platform native code generation toolkit
• Can be used on OSX as well as Linux for 64-bit X86, POWER, Z and 32-bit ARM

• Easy to get going:


$ git clone https://github.com/eclipse/omr
$ cd omr && ./configure
$ cd jitbuilder && make # builds release/libjitbuilder.a
$ cd release && make test # builds and runs examples
• Many (23) code samples in omr/jitbuilder/release/src

7
How does all this compare to LLVM?

8
First Answer: Project level (hand-wavy)

• In some respects, LLVM is a subset of Eclipse OMR function
  e.g. LLVM has no GC

• In other respects, Eclipse OMR is a subset of LLVM function
  e.g. OMR can’t generate object files
9
Second Answer: Compiler level

Two different compiler technologies


So let’s measure them

This Talk:
Part I: Compare LLVM and Eclipse OMR JitBuilder
implementations of
LLVM’s Kaleidoscope tutorial language
11
Third Answer: Big Picture

Technologies are compatible and even complementary

This Talk:
Part II: Swift + OMR JIT leverages LLVM for static compiler and
Eclipse OMR JitBuilder for JIT compile at runtime
13
Part I

Compare LLVM and Eclipse OMR JitBuilder


Using
Kaleidoscope

14
Kaleidoscope
• http://llvm.org/docs/tutorial/LangImpl01.html
• From tutorial introduction:
“Kaleidoscope is a procedural language that allows you to define
functions, use conditionals, math, etc. [...including…] if/then/else
construct, a for loop, user defined operators, JIT compilation with a
simple command line interface, etc.”

• Has a Read-Eval-Print-Loop (REPL); all values are 64-bit floating point;
  all expressions are compiled to native instructions

15
Example: Fibonacci numbers

# Compute the x'th fibonacci number.
def fib(x)
  if x < 3 then
    1
  else
    fib(x-1)+fib(x-2);

# This expression will compute the 40th fibonacci number.
fib(40);

16
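For reference, running this through the REPL looks roughly like the
following (the "ready>" prompt is the tutorial's; fib(40) is 102334155):

  ready> fib(40);
  Evaluated to 102334155.000000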
Simple, self-contained implementation
$ cd llvm/examples/Kaleidoscope/BuildingAJIT/Chapter2
$ wc -l toy.cpp KaleidoscopeJIT.h
    1233 toy.cpp
     134 KaleidoscopeJIT.h
    1367 total

# Kaleidoscope function define/execute and LLVM compilation control:
$ wc -l toyCompile.cpp KaleidoscopeJIT.h
      67 toyCompile.cpp
     134 KaleidoscopeJIT.h
     201 total              (~15% of the total)

# Kaleidoscope Abstract Syntax Tree (AST) to LLVM IR translation:
$ wc -l toyCodegen.cpp
     403 toyCodegen.cpp     (~29% of the total)

17
Setting up a JIT compiler: LLVM (page 1)
//===----- KaleidoscopeJIT.h - A simple JIT for Kaleidoscope ----*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Contains a simple JIT definition for use in the kaleidoscope tutorials.
//
//===----------------------------------------------------------------------===//

19
Setting up a JIT compiler: LLVM (page 2)
#ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
#define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

#include "llvm/ADT/STLExtras.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/JITSymbol.h"
#include "llvm/ExecutionEngine/RuntimeDyld.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
#include "llvm/ExecutionEngine/Orc/IRTransformLayer.h"
#include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
#include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/LegacyPassManager.h"

20
Setting up a JIT compiler: LLVM (page 3)
#include "llvm/IR/Mangler.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/GVN.h"
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

namespace llvm {
namespace orc {

21
Setting up a JIT compiler: LLVM (page 4)
class KaleidoscopeJIT {
private:
std::unique_ptr<TargetMachine> TM;
const DataLayout DL;
RTDyldObjectLinkingLayer<> ObjectLayer;
IRCompileLayer<decltype(ObjectLayer)> CompileLayer;

typedef std::function<std::unique_ptr<Module>(std::unique_ptr<Module>)>
OptimizeFunction;

IRTransformLayer<decltype(CompileLayer), OptimizeFunction> OptimizeLayer;

22
Setting up a JIT compiler: LLVM (page 5)
public:
typedef decltype(OptimizeLayer)::ModuleSetHandleT ModuleHandle;

KaleidoscopeJIT()
: TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
CompileLayer(ObjectLayer, SimpleCompiler(*TM)),
OptimizeLayer(CompileLayer,
[this](std::unique_ptr<Module> M) {
return optimizeModule(std::move(M));
}) {
llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
}

23
Setting up a JIT compiler: LLVM (page 6)
TargetMachine &getTargetMachine() { return *TM; }

ModuleHandle addModule(std::unique_ptr<Module> M) {
// Build our symbol resolver:
// Lambda 1: Look back into the JIT itself to find symbols that are part of
// the same "logical dylib".
// Lambda 2: Search for external symbols in the host process.

24
Setting up a JIT compiler: LLVM (page 7)
auto Resolver = createLambdaResolver(
[&](const std::string &Name) {
if (auto Sym = OptimizeLayer.findSymbol(Name, false))
return Sym;
return JITSymbol(nullptr);
},
[](const std::string &Name) {
if (auto SymAddr =
RTDyldMemoryManager::getSymbolAddressInProcess(Name))
return JITSymbol(SymAddr, JITSymbolFlags::Exported);
return JITSymbol(nullptr);
});
25
Setting up a JIT compiler: LLVM (page 8)
// Build a singleton module set to hold our module.
std::vector<std::unique_ptr<Module>> Ms;
Ms.push_back(std::move(M));

// Add the set to the JIT with the resolver we created above and a newly
// created SectionMemoryManager.
return OptimizeLayer.addModuleSet(std::move(Ms),
make_unique<SectionMemoryManager>(),
std::move(Resolver));
}

26
Setting up a JIT compiler: LLVM (page 9)
JITSymbol findSymbol(const std::string Name) {
std::string MangledName;
raw_string_ostream MangledNameStream(MangledName);
Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
return OptimizeLayer.findSymbol(MangledNameStream.str(), true);
}

void removeModule(ModuleHandle H) {
OptimizeLayer.removeModuleSet(H);
}
private:

27
Setting up a JIT compiler: LLVM (page 10)
std::unique_ptr<Module> optimizeModule(std::unique_ptr<Module> M) {
// Create a function pass manager.
auto FPM = llvm::make_unique<legacy::FunctionPassManager>(M.get());

// Add some optimizations.


FPM->add(createInstructionCombiningPass());
FPM->add(createReassociatePass());
FPM->add(createGVNPass());
FPM->add(createCFGSimplificationPass());
FPM->doInitialization();

// Run the optimizations over all functions in the module being added to
// the JIT.
for (auto &F : *M)
FPM->run(F);

return M;
}
28
Setting up a JIT compiler: LLVM (page 11)
};

} // end namespace orc


} // end namespace llvm

#endif // LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

29
Setting up a JIT compiler: LLVM (Summary?)
• All that code (137 lines) is in KaleidoscopeJIT.h

• Another 67 lines of code needed in the main .cpp file


• Invoke initialization code, compile methods, call generated code, etc.
• Does not count translation from Kaleidoscope to LLVM IR (that’s coming)

30
Setting up a JIT compiler (OMR JitBuilder)
#include "Jit.hpp" // header file from OMR

• Plus the code to initialize:

  {
     initializeJit();
  }

31
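A fuller sketch of the whole lifecycle, modeled loosely on the samples
shipped in omr/jitbuilder/release/src (FibMethodBuilder and the
double(double) signature here are hypothetical; the API calls shown are
the JitBuilder C++ API of this era):

  #include "Jit.hpp"                   // initializeJit(), compileMethodBuilder(), shutdownJit()
  #include "ilgen/MethodBuilder.hpp"
  #include "ilgen/TypeDictionary.hpp"

  int main() {
     if (!initializeJit())             // bring the OMR compiler up once per process
        return -1;

     TR::TypeDictionary types;
     FibMethodBuilder method(&types);  // hypothetical MethodBuilder subclass defining fib

     uint8_t *entry = nullptr;
     int32_t rc = compileMethodBuilder(&method, &entry);
     if (rc == 0) {
        auto fib = (double (*)(double))entry;  // cast must match the signature the builder declared
        fib(40.0);                             // call the freshly compiled native code
     }

     shutdownJit();                    // tear the compiler down before exit
     return rc;
  }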
Compiling a Kaleidoscope method
• Parser examines the string the user typed in

• Makes sure it's well formed

• Generates an Abstract Syntax Tree (AST)

33
Example: Abstract Syntax Tree for fib(x)

Function "fib"
└─ if
   ├─ Cond: Binary "<"
   │        ├─ Var "x"
   │        └─ Num 3.0
   ├─ Then: Num 1.0
   └─ Else: Binary "+"
            ├─ Call "fib"
            │  └─ Binary "-"
            │     ├─ Var "x"
            │     └─ Num 1.0
            └─ Call "fib"
               └─ Binary "-"
                  ├─ Var "x"
                  └─ Num 2.0

34
Compiling a Kaleidoscope method
• Each type of AST node has a class associated with it
  • NumberExprAST, VarExprAST, CallExprAST, IfExprAST, etc. (abridged class shapes below)
  • codegen() method generates LLVM IR for that node
  • Calls codegen() for children at appropriate places

• Once an expression is parsed, generate code for it:
  • Call codegen() method on the root node of the AST
  • Compile the full IR using LLVM to get a native method
  • Call that native method to compute the result

35
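Abridged from the tutorial's toy.cpp, the shape of these classes looks
like this (only two node types shown):

  // Base class for all expression nodes; each node knows how to emit its own IR.
  class ExprAST {
  public:
    virtual ~ExprAST() = default;
    virtual Value *codegen() = 0;   // returns the LLVM Value computed by this node
  };

  // Numeric literal, e.g. "3.0"
  class NumberExprAST : public ExprAST {
    double Val;
  public:
    NumberExprAST(double Val) : Val(Val) {}
    Value *codegen() override;
  };

  // Function call, e.g. "fib(x-1)"; argument children are codegen()'d at the call site
  class CallExprAST : public ExprAST {
    std::string Callee;
    std::vector<std::unique_ptr<ExprAST>> Args;
  public:
    CallExprAST(const std::string &Callee,
                std::vector<std::unique_ptr<ExprAST>> Args)
        : Callee(Callee), Args(std::move(Args)) {}
    Value *codegen() override;
  };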
Simple codegen() for constants (64-bit float)

LLVM:

  Value *NumberExprAST::codegen() {
    return ConstantFP::get(TheContext, APFloat(Val));
  }

JitBuilder:

  TR::IlValue *NumberExprAST::codegen(TR::IlBuilder *b) {
    return b->ConstDouble(Val);
  }

36
More complicated codegen(): if expression

Value *IfExprAST::codegen() {
  Value *CondV = Cond->codegen();
  if (!CondV)
    return nullptr;

  // Convert condition to a bool by comparing equal to 0.0.
  CondV = Builder.CreateFCmpONE(
      CondV, ConstantFP::get(TheContext, APFloat(0.0)), "ifcond");

  Function *TheFunction = Builder.GetInsertBlock()->getParent();

  // Create blocks for the then and else cases. Insert the 'then' block
  // at the end of the function.
  BasicBlock *ThenBB = BasicBlock::Create(TheContext, "then", TheFunction);
  BasicBlock *ElseBB = BasicBlock::Create(TheContext, "else");
  BasicBlock *MergeBB = BasicBlock::Create(TheContext, "ifcont");

  Builder.CreateCondBr(CondV, ThenBB, ElseBB);

  // Emit then value.
  Builder.SetInsertPoint(ThenBB);
  Value *ThenV = Then->codegen();
  if (!ThenV)
    return nullptr;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
  ThenBB = Builder.GetInsertBlock();

  // Emit else block.
  TheFunction->getBasicBlockList().push_back(ElseBB);
  Builder.SetInsertPoint(ElseBB);
  Value *ElseV = Else->codegen();
  if (!ElseV)
    return nullptr;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
  ElseBB = Builder.GetInsertBlock();

  // Emit merge block.
  TheFunction->getBasicBlockList().push_back(MergeBB);
  Builder.SetInsertPoint(MergeBB);
  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(TheContext), 2, "iftmp");
  PN->addIncoming(ThenV, ThenBB);
  PN->addIncoming(ElseV, ElseBB);
  return PN;
}

37
More complicated codegen(): if expression

(same code, annotated with the source construct each region implements)

  cond = <expression>      →  Cond->codegen() plus the CreateFCmpONE compare
  if (cond) { } else { }   →  the three BasicBlocks plus CreateCondBr
  { /* then branch */ }    →  Then->codegen() between SetInsertPoint(ThenBB) and CreateBr
  { /* else branch */ }    →  Else->codegen() between SetInsertPoint(ElseBB) and CreateBr
  /* result of if */       →  the PHI node merging ThenV and ElseV

39
LLVM-isms: basic blocks, insertion points and SSA

(same code once more: note how much of it is bookkeeping for basic blocks,
insertion points, and the SSA PHI node rather than the if logic itself)

40
If Expression code generation with JitBuilder

TR::IlValue *IfExprAST::codegen(TR::IlBuilder *b) {
   // cond = <expression>
   TR::IlValue *CondV = Cond->codegen(b);
   if (!CondV)
      return nullptr;

   // if (cond) { } else { }
   TR::IlBuilder *thenB = nullptr, *elseB = nullptr;
   b->IfThenElse(&elseB, &thenB,
   b->   EqualTo(
            CondV,
   b->      ConstDouble(0.0)));

   TR::IlValue *ThenV = Then->codegen(thenB);   // { /* then branch */ }
   TR::IlValue *ElseV = Else->codegen(elseB);   // { /* else branch */ }

   // result of if:
   // store "else" result into the same local variable as the "then" result
   elseB->StoreOver(ThenV, ElseV);

   // result represented by ThenV regardless which path executes
   return ThenV;
}

41
Let’s be a bit more scientific

43
Comparing Kaleidoscope implementations

Source code lines (compared using "cloc"):
                      LLVM    OMR JitBuilder
  All source           928      825   (~11% less)
  CodeGen+JIT only     373      280   (~25% less)

Final executable size (MB):
  LLVM            19.9
  OMR JitBuilder   7.1   (~64% smaller)

Measured on my MacBook Pro, x86-64

44
Compile time and performance?
• Ran experiments on my MacBook Pro
• Exit all applications, disabled all networks, disabled backup service
• Studied a few benchmarks, settled on a stripped down Mandelbrot
• Mandelbrot was one of the original tutorial’s examples
• Easily modified into a purely computational workload (original spends most of
its time outputting characters)
• Multiple functions gives more opportunities to measure compile time
• Instrumented Kaleidoscope to measure and report compile time and execution
time exactly (a minimal timing sketch follows)
• Ran 10 times, results were quite stable

45
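The instrumentation itself needs nothing exotic; a minimal sketch using
only std::chrono (the helper name and usage are mine, not from the talk):

  #include <chrono>
  #include <cstdio>

  // Wraps a compile (or run) step and reports elapsed milliseconds.
  template <typename Fn>
  double timedMillis(const char *label, Fn &&step) {
     auto start = std::chrono::steady_clock::now();
     step();
     auto stop = std::chrono::steady_clock::now();
     double ms = std::chrono::duration<double, std::milli>(stop - start).count();
     std::printf("%s: %.2f ms\n", label, ms);
     return ms;
  }

Usage would look like timedMillis("compile fib", [&]{ /* hand module to the JIT */ });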
Mandelbrot code: 9 functions + final benchmark

def unary!(v)
  if v then
    0
  else
    1;
!1.0;

def unary-(v)
  0-v;
-1.0;

def binary> 10 (LHS RHS)
  RHS < LHS;
1.0 > 2.0;

def binary| 5 (LHS RHS)
  if LHS then
    1
  else if RHS then
    1
  else
    0;
1.0 | 0.0;

def binary& 6 (LHS RHS)
  if !LHS then
    0
  else
    !!RHS;
1.0 & 2.0;

def mandelconverger(real imag iters creal cimag)
  if iters > 255 | (real*real + imag*imag > 4) then
    iters
  else
    mandelconverger(real*real - imag*imag + creal,
                    (real+real)*imag + cimag,
                    iters+1, creal, cimag);
mandelconverger(-2.3, -1.3, 0, -1.3, -1.3);

def mandelconverge(real imag)
  mandelconverger(real, imag, 0, real, imag);
mandelconverge(-2.3,-1.3);

def mandelhelp(xmin xmax xstep ymin ymax ystep)
  for y = ymin, y < ymax, ystep in (
    (for x = xmin, x < xmax, xstep in
       mandelconverge(x,y))
  );
mandelhelp(-2.3, -2.3+0.05*78, 0.05, -1.3, -1.3+0.07*40, 0.07);

def mandel(realstart imagstart realmag imagmag)
  mandelhelp(realstart, realstart+realmag*78, realmag,
             imagstart, imagstart+imagmag*40, imagmag);
mandel(-2.3, -1.3, 0.05, 0.07);

# final benchmark
for i=1,i<2500 in
  mandel(-2.3, -1.3, 0.05, 0.07);

46
Compile Time: JitBuilder averages 3.3X faster
LLVM: 5 opts    JitBuilder: 35 opts
(note: all opts are active but only a few perform significant transformations in this simple benchmark)

[Bar chart: compile time in milliseconds per compiled function;
LLVM bars range from roughly 1.5 to 3.9 ms,
OMR JitBuilder bars from roughly 0.2 to 1.5 ms]

Measured on my MacBook Pro, x86-64

47
Performance: OMR JitBuilder ~3% faster

Time for 2500 iterations of mandel (milliseconds):
  LLVM            2613
  OMR JitBuilder  2540

Measured on my MacBook Pro, x86-64

48
So how does all this compare to LLVM?
• Nowhere near a complete evaluation, but…
• For this dynamic compilation scenario, OMR JitBuilder has:
  • 25% less code generation source code, and it’s more straightforward to read
  • 64% smaller executable
  • 3.3X faster compilation time while performing 7X the number of optimizations
    (which all have compile time impact even though only a few have code impact)
  • 3% faster generated code

• Some caveats:
  • Procedural language, very small functions; most optimizations do little because
    there aren’t opportunities (still takes time to discover that :) )
  • As with all compilers and performance comparisons: Your Mileage May Vary

49
Part II

Prototype JIT Compiler for
Swift on the Server
using OMR JitBuilder

50
Swift: modern native language for mobile
Modern Native Language
• “C replacement”
• Inferred typing
• Implicit memory management

Small and Fast: designed for mobile environment


• Low memory usage
• Fast startup
• Immediate performance

Mobile “constraints”: memory, battery, CPU


• Compiled statically and no JIT

51
Swift: also for the server!
Mobile + server can be better than mobile alone
• Centralized data
• Computation heavy algorithms consume battery
• Server-based services enhance mobile experience

Swift on both mobile device and server makes sense


• Single developer ecosystem
• Better leverage on language investments
• Interoperability

Server tolerates very different trade-offs than mobile


• Maybe JIT more feasible? Hybrid approach?

52
Chris Lattner at IBM Programming Languages Day 2016

http://researcher.watson.ibm.com/researcher/files/us-lmandel/lattner.pdf

53
Project to prototype a Swift JIT compiler

1. Implement integration points needed to JIT compile


(compiler technology agnostic)

2. Implement a prototype JIT compiler using JitBuilder

54
Compiling a Swift program

.swift file → swiftc → object file → Linker (+ libraries, which can be
dynamic libraries) → Executable

Familiar "static" language flow:
  Run swiftc to compile Swift source code to object files
  Link object files against libraries to create executable
  Run executable!

55
Compiling a Swift program (one level deeper)

.swift file → swiftc [ .sil (Swift IL) → optimized SIL → LLVM bitcode ]
            → object file → Linker (+ libraries) → Executable

Familiar "static" language flow, but under the covers swiftc uses the
Swift Intermediate Language (SIL):
  Optimize code with knowledge of Swift semantics
  Convert to LLVM IR, use LLVM to do low level optimization, generate native code

56
1. Integration points for a Swift JIT compiler
• swiftc mods controlled by option (e.g. -enable-jit)
1. Modify main() to load JIT compiler and pass it control
before continuing execution
2. Ability to atomically redirect any method entry point to
the JIT compiler (patchable functions + 32B prefix; see the sketch below)
• Link time change:
• Add optimized Swift IL (SIL) for all functions to the final
executable in an otherwise unused section
57
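The deck doesn't show the patching code itself, but the idea of redirecting
an entry point through a patchable prefix can be sketched as follows
(x86-64, purely illustrative: it assumes the prefix page is already
writable and executable, and a real implementation must patch atomically
with respect to executing threads):

  #include <cstdint>
  #include <cstring>

  // Overwrite a function's padded prefix with: jmp [rip+0] ; .quad target
  void redirectEntryPoint(uint8_t *prefix, void *jitEntry) {
     uint8_t stub[14] = { 0xFF, 0x25, 0x00, 0x00, 0x00, 0x00 }; // jmp qword ptr [rip+0]
     std::memcpy(stub + 6, &jitEntry, sizeof(void *));          // 8-byte absolute target
     std::memcpy(prefix, stub, sizeof(stub));                   // must be atomic in production
     __builtin___clear_cache(reinterpret_cast<char *>(prefix),
                             reinterpret_cast<char *>(prefix) + sizeof(stub));
  }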
2. Implement Prototype JIT Compiler for Swift
• Compile Swift methods on first invocation (conceptual sketch below):
1. Load optimized Swift IL for target method from executable
2. Use JitBuilder API to compile method
• buildIL() walks SIL making JitBuilder calls to generate OMR compiler IL
3. Once compiled, replace original call target with compiled code entry
point
4. Invoke the compiled code which will return to original caller

• Entire Swift JIT Runtime is ~ 3,850 lines of mostly C++ code


• ~2000 lines for JitBuilder code to translate SIL
58
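Putting those four steps together, a conceptual sketch of the
first-invocation path (every name here is hypothetical; the talk gives
only the steps, not the prototype's code):

  // Hypothetical helpers, standing in for the prototype's internals:
  struct SILFunction;                              // parsed SIL for one method
  SILFunction *findSILForEntryPoint(void *entry);  // reads the extra executable section
  void *compileWithJitBuilder(SILFunction *sil);   // buildIL() walks SIL → OMR IL → native
  void patchEntryPoint(void *from, void *to);      // atomic entry-point redirection

  // Called when a patched Swift function is entered for the first time.
  extern "C" void *jitOnFirstInvocation(void *staticEntry) {
     SILFunction *sil = findSILForEntryPoint(staticEntry);  // 1. load optimized SIL
     void *jitEntry = compileWithJitBuilder(sil);           // 2. compile via JitBuilder
     patchEntryPoint(staticEntry, jitEntry);                // 3. redirect future calls
     return jitEntry;                                       // 4. invoke; returns to original caller
  }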
Current status of the prototype
• Implements about 1/3 of SIL instructions
• Primitive data types, optionals (for reference types), alloc/dealloc, reference
counting
• Limits Swift programs that can be compiled
• Targeted a moderately complex benchmark (richards)

• Patches every Swift function to be recompiled on first invocation


• Simple, but not the right long term approach

• Work happened primarily last year, so somewhat older technology levels
  • Swift 3 and a version of OMR from September 2016

59
richards benchmark
• The classic C benchmark simulating OS task scheduling
• Lots of pointer manipulation, data shuffling, integer
arithmetic & logical ops
• We translated it to ~400 lines of Swift
• Specifically avoided complex Swift-isms that we don’t
yet handle
• e.g. translated C structs to Swift classes to work around value
type issues
60
More details about richards
• Optimized swiftc binary: 17 functions
• 7 global initializers, 10 benchmark functions
• Lots of code has already been inlined
• We compile all 17 prior to their 1st invocation
• Keen interest to prototype JIT-specific opts
• Not yet concerned about startup & compilation cost
• Known inefficiencies in SIL translation to OMR compiler
IL (e.g. handling of tuples)
61
Early results: 4 measurements
1. Initial functional
   • We got richards to run and generate the right result!
2. Caller patching
   • Straightens calls to functions that have been recompiled
   • Redirects (patches) original call to directly call recompiled code entry point
3. Lazy initialization patching
   • Patch calls to swift_once() to NOP after first call
4. Inlined retain/release
   • Accelerate reference counting by inlining from Swift runtime library

(slide annotations: 1 and 2 are expected to be slower than the static
compiler; 3 and 4 should be getting better than static compilation)

62
Early results: startup & compilation overhead
JIT Recompiled vs. Static Binary, 1st iteration of richards
(relative time taken vs. static binary)

  Initial functional       67.74X
  Caller patching          40.81X
  Lazy init patched away   39.64X
  Inlined retain/release   37.59X

Measured on Ubuntu 16.04, x86-64

63
Early results: steady state performance
JIT Recompiled vs. Static Binary, next 1000 iterations of richards
(relative time taken vs. static binary)

  Initial functional       2.053X
  Caller patching          1.474X
  Lazy init patched away   1.343X
  Inlined retain/release   0.994X

Measured on Ubuntu 16.04, x86-64

64
Lots still to do!
• How should JIT compilers integrate into the Swift ecosystem?
• Will work with Swift community via mailing list to establish integration points
• Modernize to Swift 4 and latest OMR
• Updating to latest OMR brought more than 50% compile time improvement
• Expect to make some baseline performance improvements as well
• Continue to expand SIL instruction support
• Enable more complicated benchmarks
• Consider more aggressive optimizations
• More cross-module inlining is first target (JitBuilder work will help here)

65
Wrap Up

66
Eclipse OMR and LLVM
• Eclipse OMR project built from and for dynamic language environments
• Some overlap with LLVM, natural to ask how the two compare and relate

67
Part I: Compiler vs. Compiler
• For a dynamic compilation scenario, OMR JitBuilder did pretty well:
• 25% less source code, and it’s more straightforward to read
• 64% smaller executable
• 3.3X faster compilation time performing 7X the number of optimizations
(which all have compile time impact even though only a few have code impact)
• 3% faster generated code

No such thing as a “final” answer: lots more work and evaluation to do


Competition is healthy!

68
Part II: Static + dynamic compiler synergy
• Hybrid static – dynamic compiler model may have promise for Swift on Server
• Start with high performance statically compiled code using LLVM
• Optimize across module boundaries at runtime where it makes sense
• Remain fully compatible with Swift for mobile platforms

• Early prototype results both disappointing and promising


• Current 50% slowdown due to poor code quality handling tuples
• BUT lazy init patching and inlined retain/release made up for that 50%
• No obvious reasons Eclipse OMR could not make up the 50% gap

• Look forward to further explore JIT techniques with the Swift community

69
Want to learn more?
https://www.eclipse.org/omr
• Eclipse OMR https://github.com/eclipse/omr
https://developer.ibm.com/code/open-source-ibm/eclipse-omr/
• JitBuilder blog
https://developer.ibm.com/open/2016/07/19/jitbuilder-library-and-eclipse-omr-just-in-time-compilers-made-easy/
https://developer.ibm.com/open/2017/03/08/build-more-complicated-methods-using-the-jitbuilder-library/
https://developer.ibm.com/open/2017/04/14/building-jitbuilder-library-source-code/

• Language ports
• Base9 https://github.com/youngar/Base9
• SOM++ (Smalltalk) https://github.com/charliegracie/SOMpp/tree/jitbuilder_vmstate
• Lua Vermelha https://github.com/Leonardo2718/lua-vermelha
• Rosie Pattern Language JIT https://github.com/mstoodle/rosie-pattern-language
https://github.com/mstoodle/rosie-lpeg
• Ruby https://github.com/rubyomr-preview/rubyomr-preview
https://github.com/rubyomr-preview/rbjitglue
• OpenJ9 (Java) coming soon

70
Questions?

71
