I'm in Stockholm getting ready to speak at JFokus about concurrency and performance. Seems like it should be an obvious talk, yeah? If you want better performance, use more processors. Unfortunately, I still see a number of applications that look like they should be able to achieve high levels of concurrency. That is, until you take a closer look and see that they have been, for one reason or another, tightly coupled to the database. In some cases they are so tightly coupled that practically every user request requires the application to touch the database, in some cases 100s if not 1000s of times.
If you happen to use a JPA compliant implementation such as Hibernate or TopLink, you may, just may, be able to inject some caching to short-circuit many of the calls. However, this isn't always an option, especially if the database group is in charge of your architecture. At this point you may be thinking, we are most certainly in charge of our architecture, so this isn't a problem. But the catch is that DB groups have become so powerful in many organizations that they've been able to do things that have boxed you in, cut off your choices or otherwise limited what you can do. In effect, they are in charge of your architecture.
Traditionally, database technology has been leading the way in scalability. Database vendors have been able to deliver the goods, and this is why they've gained so much influence. However, much of the gain has been on the back of faster and faster clock speeds. It is old news that this gravy train has reached the end of the line. Even prior to the tailing off of faster clocks, database interactions were starting to become the primary bottleneck in an increasing number of systems. All I can say is that this is about to get much worse, quicker than most people believe.
The new reality is that machines containing hardware which you may now consider to be exotic are moving into the mainstream. Companies like Azul Systems are producing hardware that absolutely crushes database technology. Although I know Azul to be a world class company that has world class engineers working for them, they are a small company, not in the mainstream of consciousness. So while this has limited their penetration into the market, the message in the results that they've achieved in many standard benchmarks has not been lost on those that have cared to listen. The question is: what happens when more mainstream hardware vendors such as Sun start releasing the same type of technology? For example, Sun's Rock will include a hybrid form of transactional memory.
Why is this important? I can show you all kinds of graphics that demonstrate the effects of overloading a point in a system that is already known to be a bottleneck. I can tell you, it isn't a pretty sight. Azul (and potentially Rock) have the capability to take an ordinary existing Java application and have it run significantly faster by simply using transactional memory to remove the locks, those things that accidentally take pressure off of the already overworked database. Azul will crush your database!
In other words, all you have to do to unbalance your seemingly well oiled application is upgrade your hardware. And don't think for one minute that Oracle and IBM don't realize this. They are fighting what I believe will be a losing battle against this new trend in hardware. It is coming at them too quickly for them to alter the fundamental architecture of their technology. After all, databases were always about ACID and not speed. I believe (and please comment if you think I'm wrong) that Oracle bought Tangosol maybe as a quick entry into the grid game. They needed a response to IBM's ObjectGrid, and Cameron and crew had done a very nice job of grabbing market share in the caching/grid space. However, what they really bought was a technology that, if used properly, could take a lot of pressure off of their database.
The key is, if it is used properly. And this is where developers and architects have to rethink everything they know. They need to somehow normalize inbound data flows so that they can avoid the database by running through the cache first rather than tacking it onto the back of the O/R mapping tool.
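To make "running through the cache first" a bit more concrete, here is a minimal sketch of the idea in plain Java. It is only an illustration under my own assumptions, not how Coherence, ObjectGrid or any other product works: the class name `CacheFirstStore`, the `databaseLoader` and `databaseWriter` hooks and the `databaseHits` counter are all hypothetical, and a `ConcurrentHashMap` stands in for a real cache or grid. Callers talk only to the cache, and the cache is the one place that decides when the database gets touched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Hypothetical cache-first store: the database is only reached through the cache. */
final class CacheFirstStore<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> databaseLoader;   // stand-in for a real DAO or JDBC read
    private final BiConsumer<K, V> databaseWriter; // stand-in for a real DAO or JDBC write
    private final AtomicInteger databaseHits = new AtomicInteger();

    CacheFirstStore(Function<K, V> databaseLoader, BiConsumer<K, V> databaseWriter) {
        this.databaseLoader = databaseLoader;
        this.databaseWriter = databaseWriter;
    }

    /** Read-through: the loader runs only when the key is not already cached. */
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            databaseHits.incrementAndGet();
            return databaseLoader.apply(k);
        });
    }

    /** Write-through: update the database first, then the cached copy. */
    void put(K key, V value) {
        databaseHits.incrementAndGet();
        databaseWriter.accept(key, value);
        cache.put(key, value);
    }

    int databaseHits() {
        return databaseHits.get();
    }

    public static void main(String[] args) {
        Map<String, String> fakeDatabase = new HashMap<>(); // assumption: an in-memory fake
        fakeDatabase.put("42", "Ada");
        CacheFirstStore<String, String> store =
                new CacheFirstStore<>(fakeDatabase::get, fakeDatabase::put);
        for (int i = 0; i < 1000; i++) {
            store.get("42"); // only the first call reaches the fake database
        }
        System.out.println("database hits: " + store.databaseHits());
    }
}
```

The point of the design is where the cache sits. Tacked onto the back of the O/R mapper, every request still walks the persistence layer before the cache gets a say. Placed in front like this, a request that touches the same data 1000 times costs the database one read.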
Don't get me wrong, databases are not dead. They are just as important as they ever were and will remain important long into the future. They will continue to scale, and there may well be a time when they will return to the dominant role they've long enjoyed and deserved in many enterprise architectures. In fact, this important technology has not only set the stage for what is to come, it has helped us prepare for the oncoming multicore onslaught. In other words, if you think quad-core is something, Azul's chips currently contain 20 cores, and even that number isn't large when you consider the possibilities. So while the RDB has been and will continue to be important, for the moment the current trends in hardware are not in favor of database centric architectures. We need to recognize this and adjust, or we will be crushed.
How does all of this relate to airports? Well, on my journey I got re-routed due to some mechanical difficulties, which led me to Berlin. I'd not been in Berlin's airport before and for some strange reason this triggered a question: just how many airports had I visited? Turns out that Tegel is the 100th airport on my list. I'm not sure if this is a good thing ;-)
Hi Kirk! I've been wondering if we could use an Azul box to make the whole
DB-is-the-bottleneck thing go away. The basic idea is to use the Azul box
as an in-memory-DB. Downside: limited to ~500G in size and expensive
(relative to 500G of disk and an el-cheapo Linux blade anyways), and not
disk-coherent. Upside: Using my NonBlockingHashTable technology I can
sustain a billion (simple) read-only DB ops/sec concurrently with 10
million (simple) write DB ops/sec, or a mix of say 300 million read-ops/sec
with 100 million write-ops/sec. Obviously larger DB transactions run
proportionally slower, but I can certainly make all the DB transactions
non-blocking (hence very high throughput under load) no matter the size.
(also, the DB can be slowly checkpointed such that it's never more than
e.g. 60 secs out-of-date relative to the disk).
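
A rough Java sketch of the in-memory table plus slow checkpoint idea described in this comment follows. It is an illustration under stated assumptions only: `java.util.concurrent.ConcurrentHashMap` stands in for the NonBlockingHashTable mentioned above (its reads are lock-free but its writes can briefly lock a bin, so it is not fully non-blocking), and the `CheckpointedStore` class, the `key=value` file format and the file name are hypothetical. None of the throughput figures quoted above apply to this code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical in-memory table that is copied to disk on a fixed schedule. */
final class CheckpointedStore implements AutoCloseable {
    // Stand-in for a non-blocking hash table; see the caveat in the text above.
    private final ConcurrentHashMap<String, String> table = new ConcurrentHashMap<>();
    private final Path checkpointFile;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "checkpointer");
                t.setDaemon(true); // do not keep the JVM alive just for checkpoints
                return t;
            });

    CheckpointedStore(Path checkpointFile) {
        this.checkpointFile = checkpointFile;
    }

    String get(String key) {
        return table.get(key);
    }

    void put(String key, String value) {
        table.put(key, value);
    }

    /** The disk copy is then never more than roughly one interval behind memory. */
    void startCheckpointing(long intervalSeconds) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                checkpoint();
            } catch (IOException e) {
                // Swallow and report so one failure does not cancel later runs.
                System.err.println("checkpoint failed: " + e);
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    }

    /** Writes a weakly consistent snapshot without stopping readers or writers. */
    void checkpoint() throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        Path tmp = checkpointFile.resolveSibling(checkpointFile.getFileName() + ".tmp");
        Files.write(tmp, lines); // write to a temp file first, then swap it in
        Files.move(tmp, checkpointFile, StandardCopyOption.REPLACE_EXISTING);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("checkpoint", ".txt");
        try (CheckpointedStore store = new CheckpointedStore(file)) {
            store.put("a", "1");
            store.startCheckpointing(60); // e.g. at most about 60 secs out-of-date
            store.checkpoint();           // force one checkpoint for the demo
            System.out.println(Files.readAllLines(file));
        }
    }
}
```

Note that the snapshot is only weakly consistent: entries written while the checkpoint is iterating may or may not appear in it, which is the price of not blocking the table.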