Java Performance Services
Training, Seminars, Benchmarking, Tuning

Latency is a performance bug

posted Thursday, 19 March 2009

In last month's newsletter, Jack wrote about how he spends most of his time looking for latency. I could be wrong, but I think he got his inspiration for the piece from an earlier discussion here where we both said that execution profiling doesn't seem to be as important as it once was. After reading what Jack has written, tempered by my own experiences, both old and recent, I've come to the conclusion that latency is a bug that, instead of being exposed as an incorrect result, is exposed as a long pause.

Systems that we build today differ from those that we've built in the past in that there are many more opportunities for our threads of execution to be doing nothing. The systems we used to build tended to run on one machine, and that machine tended to have one CPU. Even if an application was running on a mainframe that contained many CPUs, the process model was such that each application would run in a single thread, and that effectively confined it to a single CPU. Sure, you could fork. But a fork created an entirely new single-threaded process.

Though fork was useful, it didn't change the model. It was only with the introduction of POSIX threads that we started to see applications break out of this single-threaded mode. Once that happened, applications started putting pressure on the operating systems. Now operating systems needed to be threaded or, at least at first, thread-safe. One example of threading maturity back then (or lack of it) was an apparent decision by the Solaris team to put a single lock around the entire kernel. While you'd guess that application threads would be tripping over themselves fighting for the OS lock, the reality was that it didn't seem to make much of a difference. We were still very much bound by processor speeds and very much focused on execution hotspots, not the lack of them.

Fast forward and today we have plenty of CPUs and plenty of machines, and they are all fast enough for most problems. So we should be sitting pretty. But as is so elegantly expressed in Heinz's "Law of Sudden Riches", more isn't always helpful and in some cases can be harmful. At the heart of the matter is our well-known friend, Amdahl's Law. To get useful work done, we must often cooperate. It is this cooperation that reduces our ability to scale. When we cooperate, one party is invariably waiting on the other party. In other words, the amount of time our threads spend doing nothing is a function of how long it takes the other cooperating thread to come to the table. Since threads are still necessarily a limited resource, a thread doing nothing means that some other piece of useful work isn't getting done. The longer the thread has to wait, the less that gets done.
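
To put a number on that, here is a minimal sketch of Amdahl's Law in Java. The class name, the method name, and the 90% parallel fraction are all assumptions made for illustration; they are not measurements from any real system. The formula itself is the standard one: speedup = 1 / (serial + parallel / threads).

```java
final class AmdahlSketch {

    /**
     * Speedup predicted by Amdahl's Law.
     *
     * @param parallelFraction share of the work that can run concurrently (0.0 to 1.0)
     * @param threads          number of threads (or CPUs) applied to the parallel share
     */
    public static double speedup(double parallelFraction, int threads) {
        // The serial share is the part where everyone else waits.
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / threads);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 90% of the work parallelizes, 10% requires cooperation.
        double parallelFraction = 0.90;
        for (int threads : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.printf("%3d threads -> %.2fx speedup%n",
                    threads, speedup(parallelFraction, threads));
        }
    }
}
```

With a 10% serial share, the predicted speedup can never reach 10x no matter how many threads are added, which is the "sudden riches" problem in miniature.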

Threads that are doing nothing are not using the CPU. When that translates into threads not making forward progress fast enough, you've got a problem that execution profiling isn't designed to handle. Instead, one needs to look at thread dumps. Threads that are parked waiting for something to happen will be clearly visible in a thread dump.
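
As a rough illustration, a thread dump can also be taken from inside the JVM with the standard `java.lang.management` API. The sketch below is only one possible way to look for idle threads; the class and helper names are hypothetical, and treating BLOCKED, WAITING, and TIMED_WAITING as "doing nothing" is a simplifying assumption (an idle pool thread waiting for work looks the same as a thread stuck on a contended lock).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

final class IdleThreadSketch {

    /** Assumption for this sketch: any non-runnable live state counts as "doing nothing". */
    public static boolean isDoingNothing(Thread.State state) {
        return state == Thread.State.BLOCKED
                || state == Thread.State.WAITING
                || state == Thread.State.TIMED_WAITING;
    }

    /** Tallies how many threads in the dump are in each state. */
    public static Map<Thread.State, Integer> countByState(ThreadInfo[] dump) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : dump) {
            if (info == null) {
                continue; // defensive: skip entries for threads that no longer exist
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        // false, false: skip locked monitor and synchronizer details to keep the dump cheap.
        ThreadInfo[] dump = threadBean.dumpAllThreads(false, false);

        System.out.println("Threads by state: " + countByState(dump));
        for (ThreadInfo info : dump) {
            if (info == null || !isDoingNothing(info.getThreadState())) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String topFrame = stack.length > 0 ? stack[0].toString() : "(no stack)";
            // getLockName() is null when the thread is not waiting on a lock or condition.
            System.out.printf("%s [%s] waiting on %s at %s%n",
                    info.getThreadName(), info.getThreadState(), info.getLockName(), topFrame);
        }
    }
}
```

A single dump is only a snapshot. Taking several a few seconds apart and looking for threads that stay parked in the same place is usually more telling than any one sample.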

Another set of very useful tools in the war on latency are Firebug and YSlow. I have found Firebug invaluable in tracing latency. Using this tool has led clients to cancel scheduled work that we determined was unnecessary. This has saved hundreds of thousands of dollars. Moreover, it freed resources to work on tasks that drove the business forward rather than sideways. I can only assume that others who have used this tool have experienced very similar results.

What I'm not saying is that execution profiling is not useful. Nor am I saying that counting instructions in order to minimize execution time is a waste of time. Each of these techniques is still valid in today's world. What I am saying is that they are no longer near the top of my list of things to do when I go to performance tune an application. What I am suggesting is that profiling for nothing is almost always the first activity I find myself engaged in.


1. William Louth left...
Monday, 23 March 2009 9:56 am :: http://williamlouth.wordpress.com

I am curious how one goes about "profiling for nothing" on a production server without any prior knowledge of the software and system execution models and the daily/hourly workload/activity patterns. It would be rare to have every thread "parked" in a thread dump, as in production systems it is typically the small but frequent service waits that kill the overall throughput and possibly the performance. Also, thread dumps are pretty much useless in their current format because one cannot easily determine whether a series of call frames repeated across each dump is indeed the same request or another one following the same execution path (very common with business applications).

I am not saying latency analysis is not important. Actually, the opposite, considering that I was the first to introduce to Java profiling the collection of multiple metrics for each profiled/traced interval, including thread monitor waiting/blocking (latency) and GC (latency). But I do not think it is possible to do latency analysis without some prior understanding of the (expected) execution behavior of the various activities within a system, which is the point of execution profiling if performed at a high enough level of abstraction.

Instead of marking things black and white, maybe try framing the data collection technologies and techniques with a process.


2. Kirk Pepperdine left...
Monday, 23 March 2009 12:41 pm

William,

Do you mean to say that JXInsight can't profile nothing? You're going to have to fix that! Actually, I got interrupted with this posting and instead of hitting save, I hit post. There really needs to be more said here for it to be useful. I'll get to it when I get home.


3. William Louth left...
Monday, 23 March 2009 2:37 pm :: http://williamlouth.wordpress.com

Naturally JXInsight does this "profiling nothing" and so much more, ;-).

Honestly this type of runtime analysis is performed within our timeline analysis mode and to some extent the metrics (time window) mode though this will become much more important within our metrica release that is in the works.

I will post a blog entry this week showing how this is performed in our tool and across multiple distributed servers.

William