
High Performance Graphics On The CPU Using ispc

Matt Pharr, Intel


Beyond Programmable Shading Course, ACM SIGGRAPH 2011

ispc: Goals
Deliver excellent performance to programmers who want to run SPMD programs on the CPU

Provide a thin abstraction layer: the programmer can cleanly reason about what the compiler will do

Allow close coupling and fine-grained interactions between C/C++ code and ispc code


ispc Execution Model


Program instances are executed n-wide in SPMD fashion when control transfers from C/C++ application code to ispc code (parallel_for)

n is typically 4 or 8 for 4-wide vector units (SSE)

You can use your own task/threading system to run over concurrent execution contexts

Or, use launch and sync in ispc to express concurrent tasks
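As a rough illustration of the n-wide SPMD model described above, the following C++ sketch emulates a gang of program instances with an ordinary loop. The names kProgramCount and scaleGang are hypothetical stand-ins (ispc exposes the gang width as programCount and the lane number as programIndex), and real ispc code runs the lanes together in SIMD registers rather than one after another.

```cpp
#include <cassert>

// Hypothetical stand-in for the gang width n (programCount in ispc).
constexpr int kProgramCount = 4;

// Emulates one call into SPMD code: the gang walks the array kProgramCount
// elements at a time, and each program instance handles one lane.
void scaleGang(float *data, int n, float s) {
    assert(n >= 0);
    for (int i = 0; i < n; i += kProgramCount) {
        // On real hardware these lanes execute together as one SIMD operation.
        for (int programIndex = 0; programIndex < kProgramCount; ++programIndex) {
            int idx = i + programIndex;
            if (idx < n) {          // lanes past the end of the array are masked off
                data[idx] *= s;
            }
        }
    }
}
```

The inner loop is what the hardware performs in a single vector operation; the bounds check plays the role of an execution mask for a partially filled final gang.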

ispc: Key Features


C-based syntax

Pointers, data structures shared with C/C++ code (no driver/data reformatting)

Only a function call boundary between C/C++ and ispc code

Recursion, externally-defined functions just work

Rich standard library: vectorized transcendentals, atomics, ...
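Sharing data structures by pointer only works if both sides agree on the memory layout. The sketch below shows one way the C++ side might guard that assumption for a hypothetical Vertex struct; the struct, the checks, and the sumX stand-in are illustrative and do not come from the slides or from ispc itself.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical struct that the C++ and ispc sides would both declare
// with the same fields in the same order.
struct Vertex {
    float x, y, z;
};

// Plain, C-compatible layout: no vtable, no hidden members.
static_assert(std::is_standard_layout<Vertex>::value,
              "Vertex must have C-compatible layout");
static_assert(std::is_trivially_copyable<Vertex>::value,
              "Vertex must be safe to share as raw memory");

// Stand-in for a callee across the language boundary: it receives the
// array by pointer, with no copy and no data reformatting.
float sumX(const Vertex *verts, int n) {
    assert(n >= 0);
    float total = 0.0f;
    for (int i = 0; i < n; ++i) {
        total += verts[i].x;
    }
    return total;
}
```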


Building Applications with ispc


[Diagram: build flow]
ispc Source -> ispc Compiler -> Object File
C/C++ Source -> C/C++ Compiler -> Object File
Object Files -> Linker -> Executable

Hello ispc
C++ Application Code
    int nVertices = ...;
    float *x = new float[nVertices];
    float *y = new float[nVertices];
    float *z = new float[nVertices];
    // fill in x[], y[], z[]
    float matrix[3][3] = { { ... }, ... };
    transform3x3(x, y, z, matrix, nVertices);

ispc Code

    export void transform3x3(uniform float xarray[], uniform float yarray[],
                             uniform float zarray[], uniform float m[3][3],
                             uniform int nVertices) {
        // Assumes nVertices is a multiple of programCount.
        uniform int i;
        for (i = 0; i < nVertices; i += programCount) {
            // Each program instance loads its own vertex.
            float x = xarray[i + programIndex];
            float y = yarray[i + programIndex];
            float z = zarray[i + programIndex];
            float xt = m[0][0]*x + m[0][1]*y + m[0][2]*z;
            float yt = m[1][0]*x + m[1][1]*y + m[1][2]*z;
            float zt = m[2][0]*x + m[2][1]*y + m[2][2]*z;
            xarray[i + programIndex] = xt;
            yarray[i + programIndex] = yt;
            zarray[i + programIndex] = zt;
        }
    }
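For checking the ispc output, a scalar C++ reference can be handy. The sketch below assumes a 3x3 matrix and applies the same arithmetic one vertex at a time; transform3x3Reference is a hypothetical name, not part of ispc or the original slides.

```cpp
#include <cassert>

// Scalar reference: one vertex per iteration, same arithmetic as the
// body of the ispc loop. Works for any nVertices >= 0.
void transform3x3Reference(float x[], float y[], float z[],
                           const float m[3][3], int nVertices) {
    assert(nVertices >= 0);
    for (int i = 0; i < nVertices; ++i) {
        // Read all three inputs before writing, since the update is in place.
        float xi = x[i], yi = y[i], zi = z[i];
        x[i] = m[0][0]*xi + m[0][1]*yi + m[0][2]*zi;
        y[i] = m[1][0]*xi + m[1][1]*yi + m[1][2]*zi;
        z[i] = m[2][0]*xi + m[2][1]*yi + m[2][2]*zi;
    }
}
```

Running both versions on the same input and comparing the arrays is a simple way to validate the vectorized code; because this loop advances one element at a time, it does not depend on the vertex count being a multiple of the gang width.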


A Ray Tracer in ispc


C++ Application Code
    int width = ..., height = ...;
    const float raster2camera[4][4] = { ... };
    const float camera2world[4][4] = { ... };
    float *image = new float[width*height];
    Triangle *triangles = new Triangle[nTris];
    LinearBVHNode *nodes = new LinearBVHNode[nNodes];
    // init triangles and nodes
    raytrace(width, height, raster2camera, camera2world,
             image, nodes, triangles);

ispc Code
    export void raytrace(uniform int width, uniform int height,
                         const uniform float raster2camera[4][4],
                         const uniform float camera2world[4][4],
                         uniform float image[],
                         const LinearBVHNode nodes[],
                         const Triangle triangles[]) {
        // ...
        // map program instances to rays
        // ...
        for (y = 0; y < height; y += yStep) {
            for (x = 0; x < width; x += xStep) {
                Ray ray;
                generateRay(raster2camera, camera2world, x+dx, y+dy, ray);
                BVHIntersect(nodes, triangles, ray);
                int offset = (y + idy) * width + (x + idx);
                image[offset] = ray.maxt;
                id[offset] = ray.hitId;
            }
        }
    }
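The slide elides how program instances are mapped to rays. One possible mapping, sketched here in C++ under the assumption of a 4-wide gang covering a 2x2 pixel tile, derives each lane's pixel offset from its index. All names (kProgramCount, kTileW, kTileH, laneOffset, coverage) are hypothetical, and the actual mapping used in the talk is not shown in the slides.

```cpp
#include <cassert>
#include <vector>

constexpr int kProgramCount = 4;      // assumed gang width
constexpr int kTileW = 2, kTileH = 2; // assumed pixel tile handled per gang
static_assert(kTileW * kTileH == kProgramCount, "one lane per tile pixel");

struct LaneOffset { int dx, dy; };

// Maps a lane number to its pixel offset inside the tile (row-major order).
LaneOffset laneOffset(int programIndex) {
    assert(programIndex >= 0 && programIndex < kProgramCount);
    return { programIndex % kTileW, programIndex / kTileW };
}

// Walks the image tile by tile, as the nested x/y loops do, and counts
// how many times each pixel is visited.
std::vector<int> coverage(int width, int height) {
    std::vector<int> hits(width * height, 0);
    for (int y = 0; y < height; y += kTileH) {
        for (int x = 0; x < width; x += kTileW) {
            for (int p = 0; p < kProgramCount; ++p) {
                LaneOffset o = laneOffset(p);
                int px = x + o.dx, py = y + o.dy;
                if (px < width && py < height) { // mask lanes outside the image
                    ++hits[py * width + px];
                }
            }
        }
    }
    return hits;
}
```

With this layout the loop steps (xStep, yStep in the slide) would equal the tile dimensions, and every pixel is visited exactly once even when the image size is not a multiple of the tile size.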


Performance
Dual Xeon X5680 (12 cores)

                     Ray Tracer   SH Radiance Probe Gen.   Deferred Shading
Serial C             1x           1x                       1x
ispc SPMD + tasks    102.25x      65.71x                   39.40x
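To get a rough sense of how much of each figure comes from vectorization rather than from the 12 cores, one can divide the total speedup by the core count. The helper below is a hypothetical back-of-the-envelope calculation; it assumes perfect scaling across cores, which the slides do not claim.

```cpp
#include <cassert>

// Total speedup divided evenly across cores: a crude estimate of the
// per-core gain, valid only under the assumption of linear core scaling.
double perCoreSpeedup(double totalSpeedup, int cores) {
    assert(cores > 0);
    return totalSpeedup / cores;
}
```

Under that assumption the three workloads come out at roughly 8.5x, 5.5x, and 3.3x per core (102.25/12, 65.71/12, and 39.40/12).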

Integration With Regular Debuggers


Try It Yourself!
ispc is available as open source from
http://ispc.github.com

Codegen uses the (excellent) LLVM compiler toolkit

Supports Linux, Windows, Mac OS X

x86 and x86-64 targets, SSE2 and SSE4 (AVX soon)

