
Performance optimization techniques for Java code

Who am I and why should you trust me?

Attila-Mihály Balázs
http://hype-free.blogspot.com/
Former malware researcher (low-level guy)
Current Java dev (high-level dude)
Spent the last ~6 months optimizing a large (1 000 000+ LOC) legacy system
Will spend the next 6 months on it too (at least)

Question everything!

What's this about


Core principles
Demo 1: collections framework
Demo 2, 3, 4: synchronization performance
Demo 5: ugly code, is it worth it?
Demo 6, 7, 8: playing with Strings
Conclusions
Q&A

What this is not about


Selecting efficient algorithms
High-level optimizations (architectural changes)
These are important too! (But they require more effort, and we are going for the quick win here.)

Core principles

Performance is a balance, an endless game of shifting bottlenecks. No silver bullets here!
[Diagram: your program and the resources it depends on: CPU, memory, disk, network]

Perform on all levels!

Performance has many levels:


Compiler (JIT): Java 5 to 6: 100% (1)
Memory: L1/L2 cache, main memory
Disk: cache, RAID, SSD
Network: 10 Mbit, 100 Mbit, 1000 Mbit

Until recently we had it easy (performance doubled every 18 months).
Now we need to do some work.

(1) http://java.sun.com/performance/reference/whitepapers/6_performance.html

Core principles

Measure, measure, measure! (Before, during, after.)
Try using realistic data!
Watch out for the Heisenberg effect (more on this later)
Some things are not intuitive:

Pop question: if processing 1000 messages takes 1 second, how long does the processing of 1 message take?

Core principles

Throughput
Latency
Thread context, context switching
Lock contention
Queueing theory
Profiling
Sampling
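The pop question has no single answer, because throughput and latency are different measurements. A minimal sketch of the relationship, using Little's law (messages in flight = throughput x latency); the class and method names are made up for illustration and are not from the talk:

```java
final class ThroughputVsLatency {
    /**
     * Steady-state throughput in messages per second, given how long one
     * message takes (latency) and how many are being processed at once.
     */
    static double throughputPerSecond(double latencySeconds, int messagesInFlight) {
        return messagesInFlight / latencySeconds;
    }

    public static void main(String[] args) {
        // One message at a time, 1 ms each.
        System.out.println(throughputPerSecond(0.001, 1));
        // 1000 messages processed concurrently, 1 s each.
        System.out.println(throughputPerSecond(1.0, 1000));
    }
}
```

Both configurations process about 1000 messages per second, yet a single message takes 1 ms in the first and a full second in the second.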

Feasibility numbers everyone should know (2)


L1 cache reference                           0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
Read 1 MB sequentially from disk      30,000,000 ns
Send packet CA->Netherlands->CA      150,000,000 ns
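These figures are mostly useful for back-of-the-envelope feasibility checks before writing any code. A small sketch of such an estimate, using three of the numbers listed above; the class and method names and the workload sizes are illustrative assumptions:

```java
final class BackOfEnvelope {
    // Figures copied from the list above, in nanoseconds.
    static final long READ_1MB_MEMORY_NS = 250_000L;
    static final long READ_1MB_DISK_NS = 30_000_000L;
    static final long DISK_SEEK_NS = 10_000_000L;

    /** Estimated seconds to read the given number of megabytes sequentially. */
    static double sequentialReadSeconds(long megabytes, long nsPerMegabyte) {
        return megabytes * nsPerMegabyte / 1e9;
    }

    /** Estimated seconds spent on seeks alone for the given number of random disk reads. */
    static double seekSeconds(long randomReads) {
        return randomReads * DISK_SEEK_NS / 1e9;
    }

    public static void main(String[] args) {
        System.out.println("1 GB from memory: " + sequentialReadSeconds(1024, READ_1MB_MEMORY_NS) + " s");
        System.out.println("1 GB from disk:   " + sequentialReadSeconds(1024, READ_1MB_DISK_NS) + " s");
        System.out.println("1000 disk seeks:  " + seekSeconds(1000) + " s");
    }
}
```

With these numbers, a request that needs 1000 random disk reads cannot meet a sub-second goal, no matter how well the Java code around it is tuned.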

(2) http://research.google.com/people/jeff/stanford-295-talk.pdf

Feasibility

Amdahl's law: The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program.
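In its usual form, the law gives the speedup as 1 / ((1 - p) + p / n), where p is the fraction of the run time that can be parallelized and n is the number of processors. A short sketch for trying out numbers; the class and parameter names are chosen here for illustration:

```java
final class Amdahl {
    /**
     * Speedup predicted by Amdahl's law.
     *
     * @param parallelFraction share of the run time that can run in parallel (0.0 to 1.0)
     * @param processors       number of processors used
     */
    static double speedup(double parallelFraction, int processors) {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / processors);
    }

    /** Upper bound on the speedup, however many processors are added. */
    static double maxSpeedup(double parallelFraction) {
        return 1.0 / (1.0 - parallelFraction);
    }

    public static void main(String[] args) {
        System.out.println("90% parallel, 4 CPUs: " + speedup(0.9, 4));
        System.out.println("90% parallel, limit:  " + maxSpeedup(0.9));
    }
}
```

Even with 90% of the work parallelizable, four processors give only about a 3x speedup, and no number of processors can push it past 10x.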

Course of action

1. Have a clear (written?), measurable goal: operation X should take less than 100 ms in 99.9% of the cases
2. Measure (profile)
3. Is the goal met? If so: The End
4. Optimize hotspots, go to step 2
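A goal phrased as "less than 100 ms in 99.9% of the cases" can be checked mechanically against recorded timings. A minimal sketch, assuming latencies were collected in milliseconds and using the nearest-rank percentile definition; the class and method names are hypothetical:

```java
import java.util.Arrays;

final class LatencyGoal {
    /**
     * Nearest-rank percentile, with the percentile given in parts per thousand
     * (999 means 99.9%) to keep the rank calculation in integer arithmetic.
     */
    static long percentile(long[] samplesMs, int perMille) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (perMille < 1 || perMille > 1000) {
            throw new IllegalArgumentException("perMille must be in 1..1000");
        }
        long[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) (((long) perMille * sorted.length + 999) / 1000); // ceiling division
        return sorted[rank - 1];
    }

    /** True if the given percentile of the samples is below the limit. */
    static boolean goalMet(long[] samplesMs, int perMille, long limitMs) {
        return percentile(samplesMs, perMille) < limitMs;
    }

    public static void main(String[] args) {
        long[] samples = new long[1000];
        Arrays.fill(samples, 50L);
        samples[0] = 500L; // a single slow outlier
        System.out.println(goalMet(samples, 999, 100L));
    }
}
```

Note that an average would hide the outlier entirely, which is one reason to state the goal as a percentile.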

Tools

VisualVM
JProfiler
YourKit
Eclipse TPTP
Netbeans Profiler

Demo 1: collections framework

Name 3 things wrong with this code:

Vector<String> v1;
if (!v1.contains(s)) {
    v1.add(s);
}

Demo 1: collections framework

Wrong data structure (list / array instead of set), hence slooow performance for large data sets (but not for small ones!)
Extra synchronization if used by a single thread only
Not actually thread safe! The contains + add pair is not atomic (only exception safe)
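One possible way to address all three points is to let a Set do the uniqueness check. This is a sketch rather than the code shown in the demo; the class and method names are made up, and whether the ordered or the concurrent variant fits depends on how the collection is used:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class UniqueStrings {
    /** For use by a single thread: hash-based lookup, keeps insertion order, no locking. */
    static Set<String> singleThreaded() {
        return new LinkedHashSet<String>();
    }

    /** For sharing between threads: add() checks and inserts as one atomic step. */
    static Set<String> shared() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    public static void main(String[] args) {
        Set<String> seen = shared();
        System.out.println(seen.add("a")); // newly added
        System.out.println(seen.add("a")); // already present, set unchanged
        System.out.println(seen.size());
    }
}
```

Set.add returns false when the element is already present, so the separate contains call disappears along with the race between the check and the insert.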

Demo 1: lessons

Use existing classes
Use realistic sample data
Thread safety is hard!
Heisenberg (observer) effect

Demo 2, 3, 4: synchronization performance

If I have N units of work and use 4 threads, it must be faster than using a single thread, right?
What does lock contention look like?
What does a synchronization train(wreck) look like?

Demo 2, 3, 4: lessons

Use existing classes: ReadWriteLock, java.util.concurrent.*
Use realistic sample data (too short / too long units of work)
Sometimes throwing a thread pool at it makes it worse!
Consider using a private copy of the variable for each thread
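The "private copy per thread" idea can be sketched as follows: each task accumulates into a local variable and the results are merged once at the end, so the hot loop touches no shared state and needs no lock. The workload (summing a range of numbers) and all names are illustrative assumptions, not the demo code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PrivateCounters {
    /** Sums the numbers 0 .. threads * perThread - 1, one slice per task. */
    static long sumOfRange(int threads, long perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                long start = t * perThread;
                parts.add(pool.submit(() -> {
                    long local = 0; // private to this task: no lock, no contention
                    for (long i = start; i < start + perThread; i++) {
                        local += i;
                    }
                    return local;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // merge once per task instead of once per item
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfRange(4, 1_000_000L));
    }
}
```

Whether this beats a single thread still depends on the size of each unit of work, so it needs measuring like everything else.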

Demo 5: ugly code, is it worth it?

Parsing a logfile

Demo 5: lessons

Sometimes yes, but always profile first!

Demo 6: String.substring

How are strings stored in Java?

Demo 6: Lesson

You can look inside the JRE when needed!
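The slides do not reproduce the demo itself. As background: in the JDK versions current at the time of this talk, String.substring returned a String that shared the parent's char[] array, so a small substring could keep a very large parent string alive; this changed in Java 7 update 6, where substring copies the characters. A sketch of the workaround commonly used on those older JDKs, assuming the goal is to keep only a short prefix of a long line (the class and method names are hypothetical):

```java
final class SubstringCopy {
    /**
     * Returns the first {@code length} characters of the line as a String with
     * its own backing storage. On JDKs before 7u6 the explicit copy stops the
     * result from pinning the whole line in memory; on later JDKs it is
     * redundant but harmless.
     */
    static String detachedPrefix(String line, int length) {
        return new String(line.substring(0, length));
    }

    public static void main(String[] args) {
        String line = "2010-03-01 INFO application started";
        System.out.println(detachedPrefix(line, 10));
    }
}
```

Reading the String source that ships with the JDK (src.zip) is the quickest way to check which behavior a given version has.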

Demo 7: repetitive strings

Demo 7: Lessons

You shouldn't use String.intern: it is slow, you have to use it everywhere, and it needs hand-tuning
Use a WeakHashMap for caching (don't forget to synchronize!)
Use String.equals (not ==)
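A minimal sketch of such a cache, assuming the goal is to keep one shared instance per distinct string value. The class and method names are made up. The values are wrapped in WeakReference because WeakHashMap holds its values strongly; storing the string itself as the value would keep its own key reachable and the entry would never be cleared:

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class StringCache {
    private final Map<String, WeakReference<String>> cache = new WeakHashMap<>();

    /**
     * Returns a shared instance equal to the argument. Synchronized because
     * WeakHashMap is not safe for concurrent use.
     */
    synchronized String canonical(String s) {
        WeakReference<String> ref = cache.get(s);
        String cached = (ref == null) ? null : ref.get();
        if (cached != null) {
            return cached;
        }
        cache.put(s, new WeakReference<>(s));
        return s;
    }

    public static void main(String[] args) {
        StringCache strings = new StringCache();
        String a = new String("GET /index.html");
        String b = new String("GET /index.html");
        // Both lookups yield the same instance, so b can be garbage collected.
        System.out.println(strings.canonical(a) == strings.canonical(b));
    }
}
```

Callers should still compare the results with equals; the cache saves memory, it does not make == a safe comparison.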

Demo 8: charsets

ASCII ISO-8859-1 UTF-8 UTF-16

Demo 8: lessons

Use UTF-8 where possible
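To see how the charsets listed for this demo differ in size, the encoded length of a string can be measured directly. A small sketch; the class name and the sample text are arbitrary, and the UTF-16 figure uses UTF_16BE so that no byte-order mark is counted:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedSize {
    /** Number of bytes the text occupies when encoded with the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café": three ASCII letters plus one accented letter
        Charset[] charsets = {
            StandardCharsets.US_ASCII,   // cannot represent the accented letter; it is replaced by '?'
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE
        };
        for (Charset charset : charsets) {
            System.out.println(charset.name() + ": " + encodedLength(text, charset) + " bytes");
        }
    }
}
```

For mostly-ASCII text, UTF-8 is as compact as the single-byte charsets while still being able to represent every character, and UTF-16 takes roughly twice the space.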

Conclusions

Measure twice, cut once
Don't trust advice you didn't test! (including mine)
Most of the time you don't need to sacrifice clean code for performant code

Conclusions

Slides: Google Groups, http://hype-free.blogspot.com/, x_at_y_or_z@yahoo.com

Source code: http://code.google.com/p/hypefree/source/browse/#svn/trunk/javaperfopt-201003

Profiler evaluation licenses

Resources

https://visualvm.dev.java.net/
http://www.ej-technologies.com/
http://blog.ej-technologies.com/
http://www.yourkit.com/
http://www.yourkit.com/docs/index.jsp
http://www.yourkit.com/eap/index.jsp

Thank you! Questions?
