Each time I go into a round of preparing for course delivery I try to drop in information about any recurrent problem that I've run into. This time round the recurrent problem is object creation.
Included in the list of latest places where I've seen object creation become a hinderance to performance was the Scala compiler and a few applications that have been aggressively logging activity. In the former case the problem was due to frequent resizing of the main symbol table. Hopefully the next version of the Scala compiler will fix that problem. The later case had to do with premature construction of debug messages.
Most logging looks something like...
log.debug( "Some tag " + variable.toString() + ", " + someMoreInformation);
The problem is that log.debug(String msg) causes all of the stuff in the brackets to be executed. Not only can this coding style create a significant number of object, it is very hard to detect and very time consuming to fix in an application of significant enough size.
Why is object creation difficult to detect? It just looks as if your application is slow but for no measurable reason. Objects created by the String concatenation will most likely *never* escape from young generational space. Since the young generational spaces use evacuating collectors, the cost of collecting the object is close to zero. However, the cost of creating it isn't. More over object creation involves updating shared structures which implies it involves some sort of concurrency control. This implies a degree of serialization which implies an upper bound on the rate at which objects can be created.
In the case of log messages there is an easy solution, wrap each call in an if statement that checks the current log level. If the log message will never get logged, there isn't any point in creating all of those extra objects. The overall improvements may vary but I've never found them to insignificant.
Log4J
if ( log4j.isDebugEnabled())
log.debug(message + “ “ + otherMessage);
JDK Logging
if ( logjdk.isLoggable( Level.WARNING))
logjdk.warning(message + “ “ + otherMessage);
It's a simple trick and certainly not a new one. But I think it's worth mentioning again.
Come join me in Chania Greece at my performance tuning course if you are interested in more details on GC or any other performance related topic.
FYI - slf4j helps a bit here. It uses parameter substitution to format log
messages for you, and skips the formatting if the message won't be getting
output. e.g.:
It's sort of funny to see the "anti-if campaign" badge on this page.
I am familiar with the logging paradigm you advocate, but every time I see
it it leaves me thinking "there has to be a better way than this".
Michael, your comment just cracked me up.
Hi Kirk,
We had exactly the same rule, until we switched to the printf style
mentioned in the comment :)
BTW closures would help here again, to remove the if statement.
Regards,
Markus
Is it not so obvious that the solution is not to log in the first place and
most certainly not to write the actual statement in the code itself.
At my job I wrote wrappers for Log and LogFactory such that developers can
write complex MessageFormat style log statements without performance
penalties. It creates an anonymous Object (closure) for each formatted log
call with a toString() that creates and runs the MessageFormat. The only
space cost if the log level is off is the Object array the developer sends.
The big benefit at runtime is avoiding the quadratic order time cost of
string concatenation.
If only we had closures we could factor the if-statement away nicely.
>More over object creation involves updating shared structures
Dimitris I believe Kirk was referring to the actual object construction
(executing code) and not the memory allocation.
It surely doesn't solve the problem neither, but it allows sparring a
function call and makes the code "less unreadable": you can always declare
a static variable together with the Logger, like:
boolean debug = logger.isDebugEnabled();
if (debug) {
}
Dimitris, I won't say that I fully understand what is going on but there is
clearly a benefit to reducing object allocation (aside from GC benefits)
that goes beyond fast path allocations. It gives me the feeling of some
lock contention which is why used implies instead of a more definitive
word. Clearly the effect of thread local allocation is significant. Try
running with -XX:-UseTLAB and you'll see the difference.
But there are some possibilities to overcome the object creation thing.
Look at the logback project of the(or one of them dunno) founder of log4j.
There he uses a printf like possibility to parse arguments without parsing
the String. Ok it will create an array, because of the varargs and it will
execute all you parse to it. But it's one possibility to partly remove the
double check if no real execution is taken place.
Couldn't we use some AOP here using aspects to guard the calls with if
statements?