Concurrency, Databases, and Airports

posted Tuesday, 29 January 2008

I'm in Stockholm getting ready to speak at JFokus about concurrency and performance. Seems like it should be an obvious talk, yeah? If you want better performance, use more processors. Unfortunately I still see a number of applications that look like they should be able to achieve high levels of concurrency. That is, until you take a closer look and see that they have been, for one reason or another, tightly coupled to the database. In some cases the coupling is so tight that practically every user request requires the application to touch the database, sometimes 100s if not 1000s of times.

If you happen to use a JPA-compliant implementation such as Hibernate or TopLink, you may, just may, be able to inject some caching to short-circuit many of the calls. However, this isn't always an option, especially if the database group is in charge of your architecture. At this point you may be thinking: we are most certainly in charge of our architecture, so this isn't a problem. But the catch is that DB groups have become so powerful in many organizations that they've been able to do things that have boxed you in, cut off your choices, or otherwise limited what you can do. In effect, they are in charge of your architecture.

Traditionally, database technology has been leading the way in scalability. The database vendors have been able to deliver the goods, and this is why they've gained so much influence. However, much of the gain has been on the back of faster and faster clock speeds. It is old news that that gravy train has reached the end of the line. Even prior to the tailing off of faster clocks, database interactions were starting to become the primary bottleneck in an increasing number of systems. All I can say is that this is about to get much worse, quicker than most people believe.

The new reality is that hardware you may now consider exotic is moving into the mainstream. Companies like Azul Systems are producing hardware that absolutely crushes database technology. Although I know Azul to be a world-class company with world-class engineers working for them, they are a small company not in the mainstream of consciousness. So while this has limited their penetration into the market, the message in the results they've achieved on many standard benchmarks has not been lost on those who have cared to listen. The question is: what happens when more mainstream hardware vendors such as Sun start releasing the same type of technology? For example, Sun's Rock will include a hybrid form of transactional memory.

Why is this important? I can show you all kinds of graphics that demonstrate the effects of overloading a point in a system that is already known to be a bottleneck. I can tell you, it isn't a pretty sight. Azul (and potentially Rock) have the capability to take an ordinary existing Java application and have it run significantly faster by simply using transactional memory to remove the locks, those things that accidentally take pressure off of the already overworked database. Azul will crush your database!

In other words, all you have to do to unbalance your seemingly well-oiled application is upgrade your hardware. And don't think for one minute that Oracle and IBM don't realize this. They are fighting what I believe will be a losing battle against this new trend in hardware. It is coming at them too quickly for them to alter the fundamental architecture of their technology. After all, databases were always about ACID and not speed. I believe (and please comment if you think I'm wrong) that Oracle bought Tangosol maybe as a quick entry into the grid game. They needed a response to IBM's ObjectGrid, and Cameron and crew had done a very nice job of grabbing market share in the caching/grid space. However, what they really bought was a technology that, if used properly, could take a lot of pressure off of their database.

The key is, if it is used properly. And this is where developers and architects have to rethink everything they know. They need to somehow normalize inbound data flows so that they can avoid the database by running through the cache first, rather than tacking the cache onto the back of the O/R mapping tool.
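
To make "running through the cache first" a little more concrete, here is a minimal sketch in modern Java. It is an illustration only: `CacheFirstReader` and its `databaseLoader` function are hypothetical names, a plain `ConcurrentHashMap` stands in for a real data grid such as Tangosol's, and eviction and invalidation are left out entirely.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration: requests hit the cache first and only fall
// through to the database when the key has never been loaded.
final class CacheFirstReader<K, V> {
    // Stand-in for a data grid; a real deployment would use a distributed cache.
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Stand-in for a JDBC or O/R mapper call (an assumption, not a real API).
    private final Function<K, V> databaseLoader;
    private final AtomicInteger databaseHits = new AtomicInteger();

    CacheFirstReader(Function<K, V> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    V get(K key) {
        // computeIfAbsent invokes the loader only when the key is missing,
        // so repeated reads of the same key never reach the database.
        return cache.computeIfAbsent(key, k -> {
            databaseHits.incrementAndGet();
            return databaseLoader.apply(k);
        });
    }

    int databaseHits() {
        return databaseHits.get();
    }
}
```

With something like `new CacheFirstReader<String, String>(id -> "row-" + id)`, a thousand reads of the same id cost one trip to the loader. The design point is that the cache sits in front of the request path instead of behind the mapping layer.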

Don't get me wrong, databases are not dead. They are just as important as they ever were and will remain important long into the future. They will continue to scale, and there may well be a time when they return to the dominant role they've long enjoyed and deserved in many enterprise architectures. In fact, this important technology has not only set the stage for what is to come, it has helped us prepare for the oncoming multicore onslaught. And if you think quad-core is something, Azul's chips currently contain 20 cores, and even that number isn't large when you consider the possibilities. So while the RDB has been and will continue to be important, for the moment the current trends in hardware are not in favor of database-centric architectures. We need to recognize this and adjust, or we will be crushed.

How does all of this relate to airports? Well, on my journey I got re-routed due to some mechanical difficulties, which led me to Berlin. I'd not been in Berlin's airport before, and for some strange reason this triggered a question: just how many airports had I visited? Turns out that Tegel is the 100th airport on my list. I'm not sure if this is a good thing ;-)




1. William Louth left...
Tuesday, 29 January 2008 11:37 am :: http://blog.jinspired.com

Hi Kirk,

Oracle did not buy Tangosol just to make a "quick entry into the grid game". I believe they bought the technology built by Cameron and crew because they realized that data grids (and compute grids) will become fundamental building blocks for every single product or service Oracle delivers over the coming years. I do not see transactional memory systems replacing similar software-based solutions. Software systems will always provide higher levels of abstraction that transactional memory systems will not be able to inherently support. We will more than likely see some kind of marriage between the software- and hardware-based solutions for particular specialized fields.

If the database's days were numbered, then why would Sun, which is a hardware vendor, buy MySQL? Sun seems very excited by the planned upgrades to the MySQL product that are currently being developed by a previous employee of Borland (Interbase).

William


2. Kirk Pepperdine left...
Wednesday, 30 January 2008 11:17 am

Hi William,

As always it is fun to have you disagree. I don't believe that databases are going away any time soon. This isn't a "this is the end of the database" blog by any means. What I'm trying to suggest is that current database technology is old and that it may not be able to keep up with the changes that are coming in the hardware environment. That said, DB2 and Oracle are not standing still; they are investing in updating the technology for the new hardware reality. What I'm suggesting is that the change is happening faster than they can react. This isn't a slam on Oracle or IBM, as I think that all of us need time to adapt to changes.

What they are facing is Heinz's concurrency law, "Sudden Riches". The consequences of "Sudden Riches" are often unpredictable. You may think that you're going to be better off, only to find that you are breaking in unexpected ways. IOW, you can't test until you've got the hardware, and the hardware isn't here just yet.

That said, the indications from testing on hardware such as Azul's are that the database will become an even bigger bottleneck than it is today. Which means that if we are to take advantage of the new possibilities, we need to adjust our designs. IMHO, applications should avoid reading and writing directly to a database. I know, this is easier said than done, because databases have grown up and offer a lot of functionality that makes it easy to do things there. But the cost of this ease of use is that we may be obliged to visit the database on every single transaction. This activity, in my experience, has been a serious limit to scalability. It also takes out of our hands some techniques that can help take pressure off of the database. Using a product such as Tangosol is but one example.

The reason why compute/data grids are becoming important is that they are taking over the role that has traditionally been found in the database. There are many applications that will not tolerate the level of latency that an RDB imposes on them. Matching engines are a fine example of this. They have to perform tens of thousands of transactions per second with time budgets that are falling below 10ms. Obvious answer: set up a data grid with some remoting. It is still important that the data be persisted, so we still need the database. It's just that all of that activity needs to happen out of band with our business process. So, a different architecture with a different way of thinking.
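
One way to picture persistence happening "out of band" is a write-behind store. The following is a rough sketch under stated assumptions, not how any particular grid product works: `WriteBehindStore`, `batchWriter` and `flush` are hypothetical names, the in-memory map stands in for the grid, and a real system would call `flush` from a background thread and handle failures and retries.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical illustration: the business transaction only touches memory;
// the database write is queued and performed later in batches.
final class WriteBehindStore {
    private final Map<String, String> grid = new ConcurrentHashMap<>();
    private final BlockingQueue<Map.Entry<String, String>> pending = new LinkedBlockingQueue<>();
    // Stand-in for a batched database insert/update (an assumption).
    private final Consumer<List<Map.Entry<String, String>>> batchWriter;

    WriteBehindStore(Consumer<List<Map.Entry<String, String>>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    // Fast path: update memory and enqueue the change; no database call here.
    void put(String key, String value) {
        grid.put(key, value);
        pending.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
    }

    String get(String key) {
        return grid.get(key);
    }

    // Slow path, meant to run off the request thread: drain everything queued
    // so far and hand it to the writer as one batch. Returns the batch size.
    int flush() {
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }
}
```

The trade-off is the usual one: reads and writes stay at memory speed, but anything still sitting in the queue is lost if the process dies before the next flush.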

As for why Sun bought MySQL.. who really knows. It fits with their experience that being involved with OSS has been good for business. MySQL isn't going away either ;-)

Kirk


3. Peter H left...
Monday, 4 February 2008 6:27 pm :: http://blogs.azulsystems.com/peter

Kirk,

I suppose "crushing the database" is a loaded term... many of our customers are using Azul technology to be its salvation!

I have written a little more about this in my blog... http://blogs.azulsystems.com/peter/2008/02/use-cases-for-a.html


4. William Louth left...
Monday, 11 February 2008 1:22 pm :: http://blog.jinspired.com

Hi Kirk,

It was so easy to disagree with you on this one. There was so much wishful thinking and oversight of the ** many ** other reasons why companies use database technology other than just transactional ** row ** access.

I am also very interested in transactional/temporal/hierarchical memory systems, but I am not so drunk with excitement as to think such technologies (and supporting products) will replace the database technology underlying many business applications today. But I do agree that these technologies have the potential to greatly impact the architecture of future (green field) information management systems.

At the end of the day our technical analysis might be largely irrelevant as being a winner in this industry has less to do with the technology itself and more to do with "volatile" and "soft" issues.

William


5. Kirk Pepperdine left...
Monday, 11 February 2008 7:23 pm

William,

I didn't want to suggest that databases are going away or are somehow less important. All that I'm saying is that databases became the hammer for IT departments and consequently every problem looked like a nail. And for quite some time they've been successful. But the landscape is changing faster than the RDBs can, and those that continue to view every problem as something the DB will solve will be dusted by those that realize that in low-latency, high-throughput environments, the database is a boat anchor. BTW, every boat needs one.


6. Cliff Click left...
Tuesday, 12 February 2008 5:55 pm :: http://blogs.azulsystems.com/cliff/

Hi Kirk! I've been wondering if we could use an Azul box to make the whole DB-is-the-bottleneck thing go away. The basic idea is to use the Azul box as an in-memory DB. Downside: limited to ~500G in size and expensive (relative to 500G of disk and an el-cheapo linux blade anyways), and not disk-coherent. Upside: Using my NonBlockingHashTable technology I can sustain a billion (simple) read-only DB ops/sec concurrently with 10 million (simple) write DB ops/sec, or a mix of say 300 million read-ops/sec with 100 million write-ops/sec. Obviously larger DB transactions run proportionally slower, but I can certainly make all the DB transactions non-blocking (hence very high throughput under load) no matter the size. (Also, the DB can be slowly checkpointed such that it's never more than e.g. 60 secs out-of-date relative to the disk.)
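
As a rough picture of the checkpointing idea only, here is a small Java sketch. Everything in it is an assumption for illustration: `CheckpointedMemoryStore` and `diskWriter` are hypothetical names, the JDK's `ConcurrentHashMap` stands in for the non-blocking table described above, and the copy it takes is only weakly consistent rather than a true point-in-time snapshot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration: all reads and writes are served from memory,
// and a background task periodically copies the table out to "disk".
final class CheckpointedMemoryStore {
    private final ConcurrentHashMap<String, Long> table = new ConcurrentHashMap<>();
    // Stand-in for whatever writes a snapshot to disk (an assumption).
    private final Consumer<Map<String, Long>> diskWriter;

    CheckpointedMemoryStore(Consumer<Map<String, Long>> diskWriter) {
        this.diskWriter = diskWriter;
    }

    Long read(String key) {
        return table.get(key);
    }

    // Atomic per-key update; readers of other keys are not blocked.
    void increment(String key) {
        table.merge(key, 1L, Long::sum);
    }

    // Copy the current contents and hand the copy to the disk writer.
    // Concurrent updates made during the copy may or may not be included.
    void checkpoint() {
        diskWriter.accept(new HashMap<>(table));
    }

    // Run checkpoint() every periodSeconds, so the on-disk copy lags memory
    // by roughly one period plus the time a checkpoint takes.
    ScheduledExecutorService startCheckpointing(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::checkpoint, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling `startCheckpointing(60)` would correspond to the "60 secs out-of-date" figure; the caller is responsible for shutting the returned scheduler down.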

The question for you experts is: what is the market for such a beast? Who needs a DB of size 500G & 1 billion ops/sec? Or scale it down proportionally using a smaller Azul box: 60G & 200 million ops/sec - again is there a market? Should I bother trying to make this thing (ultra-fast in-memory DB)?

Cliff