I just got home from a wonderful two days at JFokus in Stockholm. The first day of the event was made up of half-day tutorial sessions. I decided to pick on Jeff Geneder's ESB talk. It was well attended and quite interesting. Jeff's talk on the next day did get whacked by the demo gods. He was connected to the wireless network when doing the demo and for some strange reason it completely messed up the demo. The whole thing worked perfectly the day before in the tutorial when he wasn't connected. Go figure. No matter, the talk was very complete and the demo would just have been icing on the cake.
My talk went ok. I started in on something Canadians and Swedes have in common: national pride in our respective hockey teams. Of course there were a few tongue-in-cheek digs about how people used to say the wimpy European/Swedish players couldn't hack the NHL. Now the league is filled with Swedes and they all seem to know how to look after themselves.
During the talk I made reference to a benchmark that had been prepared by Jeroen Borgers from Xebia. It was a simple single-threaded bench that appended strings together using StringBuilder and StringBuffer. I ran Jeroen's benchmarks and found that the overhead for using the synchronized version of append() was 168%. Conclusion: grabbing a lock is an expensive operation. In jumps a keen Googler who comments that lock acquisition isn't expensive unless the lock is contended. After a bit of back and forth I still didn't really get where he was coming from, and if I had been thinking I would have rolled the slides back to the benchmark. But then I hadn't really checked the benchmark as well as I should have, so at that point I'm wondering: did I screw myself by 1) listening to common wisdom that wasn't so wise, or 2) relying on a benchmark that was measuring something other than lock acquisition costs?
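For readers who want to poke at this themselves, here is a minimal sketch of that kind of single-threaded bench. It is not Jeroen's code: the class name AppendBench, the appended string and the iteration counts are all my own assumptions. Treat any numbers it prints with suspicion, since the JIT is free to optimize an uncontended lock on an object that never leaves the method.

```java
public class AppendBench {
    // Appends a short string `iterations` times using the unsynchronized
    // StringBuilder and returns the elapsed time in nanoseconds.
    static long timeBuilder(int iterations) {
        long start = System.nanoTime();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations; i++) {
            sb.append("abc");
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the loop cannot be discarded as dead code.
        if (sb.length() != 3 * iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed;
    }

    // Same loop, but StringBuffer.append() is a synchronized method.
    static long timeBuffer(int iterations) {
        long start = System.nanoTime();
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < iterations; i++) {
            sb.append("abc");
        }
        long elapsed = System.nanoTime() - start;
        if (sb.length() != 3 * iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // assumed size, not from the original bench
        // Repeat several rounds; the early ones mostly measure JIT warm-up.
        for (int round = 0; round < 10; round++) {
            long builder = timeBuilder(iterations);
            long buffer = timeBuffer(iterations);
            System.out.printf("round %d: StringBuilder %d us, StringBuffer %d us%n",
                    round, builder / 1_000, buffer / 1_000);
        }
    }
}
```

Only the later rounds are worth comparing, and even then the ratio will vary with the JVM version and flags.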
On the train home I pulled up the bench and started trolling through the JDK source code. As I'd remembered, append is only a wrapper for the method implemented in the super class AbstractStringBuilder. The only difference is that StringBuffer.append() is tagged with the synchronized modifier. So, the extra 168% run time is tied to that keyword, right? Well... maybe, and maybe not. The Google guy said that his calls in C were not expensive unless the thread was parked. This means that one of two things is going on: either the Google guy's bench is somehow not correct, or it is correct and the JVM overhead is responsible. Either way, I think I can safely say that acquiring a lock in Java is not a cheap operation.
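To make that wrapper relationship concrete, here is a stripped-down sketch of the pattern. These are not the JDK classes: AppendWrappers, BaseAppender, PlainAppender and LockedAppender are hypothetical names of mine, and the buffer growth policy is an assumption. The point is only that the two subclasses do identical work and differ by a single keyword.

```java
import java.util.Arrays;

public class AppendWrappers {
    // Hypothetical stand-in for AbstractStringBuilder: owns the buffer
    // and does the real work of appending.
    public static abstract class BaseAppender {
        char[] value = new char[16];
        int count;

        public BaseAppender append(String s) {
            int needed = count + s.length();
            if (needed > value.length) {
                // Grow to at least double the size (an assumed policy).
                value = Arrays.copyOf(value, Math.max(needed, value.length * 2));
            }
            s.getChars(0, s.length(), value, count);
            count = needed;
            return this;
        }

        @Override
        public String toString() {
            return new String(value, 0, count);
        }
    }

    // Plays the role of StringBuilder: delegates with no locking.
    public static final class PlainAppender extends BaseAppender {
        @Override
        public PlainAppender append(String s) {
            super.append(s);
            return this;
        }
    }

    // Plays the role of StringBuffer: same delegation, but the method
    // acquires the monitor of `this` on every call.
    public static final class LockedAppender extends BaseAppender {
        @Override
        public synchronized LockedAppender append(String s) {
            super.append(s);
            return this;
        }
    }
}
```

Calling new AppendWrappers.LockedAppender().append("ab").append("cd") takes and releases the monitor twice, where the PlainAppender version takes it zero times.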
Next step is to go surfing through the JVM source code to see what nuggets are hidden there. According to David Dice's blog, I should find the use of a CAS. While a CAS is not so cheap, it should be better than using kernel-based spin-locks. It will be an interesting troll.
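For anyone who hasn't met a CAS (compare-and-swap) before, here is a toy lock built on one using java.util.concurrent.atomic.AtomicBoolean. To be clear, this is not how the JVM implements monitors, and the class name CasLock is my own. It just shows the idea: in the uncontended case, taking the lock is one atomic instruction in user space with no trip into the kernel.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CasLock {
    // false = free, true = held
    private final AtomicBoolean held = new AtomicBoolean(false);

    // One CAS attempt: succeeds only if the lock was free.
    public boolean tryLock() {
        return held.compareAndSet(false, true);
    }

    // Spin until the CAS succeeds. With no contention the first
    // attempt wins and the loop body never runs.
    public void lock() {
        while (!held.compareAndSet(false, true)) {
            Thread.yield();
        }
    }

    // Release the lock. This toy version does not check ownership.
    public void unlock() {
        held.set(false);
    }
}
```

A real monitor has to handle far more than this (reentrancy, ownership, parking waiters), which is presumably where any extra cost would come from.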
Uncontended locks are cheap. Failure to optimize across locking boundaries
is expensive. Broken string-append benchmarks featuring O(n^2) work are
broken - I've caught well-published Java experts in this mistake before.
Be very very wary that the benchmark is really doing what you think it is.
Cliff