Designing, coding and debugging Java Rich Client and Java EE Applications

Monday, 21 April 2008

Desktop versus Web

During the last ten years we saw an impressive rise of the web and web applications. And I have to admit: I really couldn't live without Google, online restaurant critics and hotel bookings any more. On the other hand this also meant the decline of Java desktop applications.
And when I talk about desktop applications here, then it's not meant in the MS Word or Photoshop sense, but rather in the sense of of rich client to some sort of enterprise backend.
While working as a consultant I had to realize, that most of the time the decision to build a new application based on web technology has already been made in advance. This makes a lot of sense in case of, for instance, an online banking application, because that way you can start sinking money into the stock market without having to download anything. It's fast it's, convenient and you can start loosing money right away, no mater where you are. However, if we talk about an application for editing master data that is used by a handful of people on the companies intranet, does it still make sense?
Well, if you asked that question to the person that made the decision in favor for the web application then the answer usually will revolve around the huge problems that supposedly arise from supporting a desktop application on a handful of PCs. While incompatibility issues have been completely unknown to HTML+Javascript clients (as long IE 5.5 on Windows 2000 is used), ensuring the presence of a certain Java version on all 7 desktop computers what were supposed to run a Java desktop application is often regarded as impossible.
Maybe everybody was so excited about web technology that they forgot entirely about Java Webstart that could take care of deployment issues and the richer experience and increased productivity that a desktop application could deliver. Unfortunately this sure was true of Sun. While they did great things with the J2EE platform during the last couple of years, little changes where made to Swing right until Java 6!
So, what is the state of desktop applications right now?
With the newly released Netbeans 6 ans 6.1 which incorporates early versions of the beans binding and swing application frameworks and an excellent GUI builder, it finally seems possible to develop Swing applications in a fast and convenient way. Java 6 update 10 will
feature alongside other things a very nice and clean new look and feel (called Nimbus) and improved performance for GUI rendering. Is this enough for the rebirth of desktop applications?
Well, I think most of the improvements made where really overdue. And if we talk a look at the state of web application now, we find that these are going stronger than ever with the arrival of AJAX and so on. The new type of web applications that uses Flash, GWT or other AJAX frameworks posses desktop like features like drag&drop, are graphically rich and responsive. All trades, that once set desktop applications apart (if they were done right). And with (Google) Gears web application running within the browser even gain offline capabilities. Then there is Adobe AIR: It allows for applications that look just like desktop apps, are programmed in Flex to run in the browser or (dramatic pause) on the desktop.
Does this mean the end of desktop apps then? Hmm, maybe the death of desktop applications AS WE KNOW THEM. But if you take an application written in Flex and an application written in Swing or JavaFX, then both may run inside your browser or outside of your browser in their own runtime. The code will be the same in either case, safe for some restrictions imposed by the browser sand boxing.
So if you ask me: If it looks like a desktop application, feels like a desktop application and is programmed like a desktop application the it IS a desktop application.
Even if it's embedded in a web page, the programming model will be very desktop like.
This means for future Java applications there will be two main options: A simple HTML fronted, maybe enhanced with some AJAX, for mass market applications like a reservation engines or restaurant listings and a web/desktop hybrid approach powered by either Flash/Flex or Swing/JavaFX for more versatile applications.

Sunday, 23 March 2008

Paralellizing image filter operations, part 2

Continued from my last post, let's take a closer look at how the parallelizing of image filter operations is done in the demo.

How to parallelize image operations?

First for the general idea: The image is split up into two or four tiles. Then each of these tiles is processed by it's own thread. When all threads have finished execution the tiles
are joined together to form a whole image again.
This approach has two advantages:
Fist, there are no issues with resource locking, since each thread operates on it's own piece of image data.
And second, from the callers perspective the multithreading is completely transparent: A simple image is passed in and a simple image is returned.

When splitting up and joining the image we have to deal with the fact that some image filter operation like sharpening or blur (which I used for this demo) have a radius of operation.
This means, when the image is split up, this has to be taken into account as an overlap between the different tiles.
Also we must create a frame with the width of the radius around the image, so we still have something to calculate when we reach the edge.

Splitting an image up into four tiles looks like this:




Joining the tiles again is just some copy operation, taking care of the overlap.

Class design

I took the time to think a little bit about the structure of the demo application, instead of just slamming all of the code into a JPanel.
First, as I wrote earlier, Beans Binding and Swing Application Framework features where used to get a clean separation of view and model. I don't want to go into detail now concerning the view part, though.

For now we focus on the model part.

The image to filter, the resulting image and all the settings to use for this operation (such as how many threads to use and the radius) are encapsulated in a Domain Model style ImageOpModel class.
The filter operation is encapsulated by the Interface ImageOperation. That way the "operation" can be just about any kind of image operation that can be performed on a BufferedImage.
For this demo GaussianBlurImageOp is the only implementation. It performs, well of course, the Gaussian blur filter.
Now, ImageOpModel executes the ImageOperation directly if no multithreading is to be used. Otherwise it uses an implementation of ImageOpExecuter to perform the ImageOperation.
The point here was having the overhead associated with the multithreaded approach only in case that multiple threads are really used.
After all we want to play fair with the single threaded approach and retain it's advantage: Less overhead.
For ImageOpExecuter there is a single implementation: MultithreadedImageOpExec.
This class takes care of parallelizing the ImageOpertion into 2 or 4 threads. And that's pretty much what the whole demo was about in the first place.

So, the design of the model is straightforward:

Implementation
Let's take a look at the implementation.
First I'd like to go a little bit into the necessary utility (or rather boilerplate?) code of the demo, starting with the Domain Model style ImageOpModel.

Here are the attribute declarations of ImageOpModel:

public class ImageOpModel {
//Path to the image resource to load
private String imagePath;
//Running in threaded mode?
private boolean multithreaded=false;
//The number of threads to use in multithreaded mode
private int numThreads=2;
//The radius for the image operation, if any
private int radius=25;
//The source image to perform the operation on
private BufferedImage sourceImage=null;
//The finished, filtered image
private volatile BufferedImage processedImage=null;
//Should the processed image or the source image be visible in the view(s)
private volatile boolean processedImageReady=false;
//How long did the last image operation take?
private double lastProcessingTime=0;
//The image operation to perform on sourceImage resulting in
//processedImage
private ImageOperation imageOp;
//The Executer for the multithreaded variant
private ImageOpExecuter executer;


Ok, thanks to the unusual generous inline commentary the attributes should be more or less clear.
Multithreaded, numThreads and radius are coupled to the GUI Form via Beans Binding. The sourceImage gets initialized by a method called loadImage() which is as straightforward as the name suggests.
ImageOp and executer are initialized the old fashioned way in the constructor:

public ImageOpModel(String imagePath) {
this.imagePath=imagePath;
this.imageOp=new GausianBlurImageOp(radius);
this.executer=new MultithreadedImageOpExec();
}

This would have been a good point to try out Google Guice as well, but I left this to next version of the demo.
The interesting part is what happens when the user presses the "Execute Operation" button in the GUI. Through the magic of an Application Framework Action the method execOp() in our model class gets executed.
Inside execOp() the following is going on:

public void execOp(){
//Show the resulting image only when we are finished
this.processedImageReady=false;
//Keep track of the time we need to execute the operation
long tStart=System.nanoTime();
//Create a copy of the image with a "radius" wide border for the operation
this.processedImage=new BufferedImage(sourceImage.getWidth() + 2 * this.getRadius(),
sourceImage.getHeight() + 2 * this.getRadius(),
BufferedImage.TYPE_INT_ARGB);
Graphics2D g2 = this.processedImage.createGraphics();
g2.drawImage(
sourceImage, this.getRadius(), this.getRadius(), null);
//If we are running in multithreaded mode, we use the executer
if(this.isMultithreaded()){
this.executer.executeOperation(this.processedImage, this.getImageOp(),
this.getNumThreads(), this.getRadius());
}else{
//otherwise we just perform the operation directly
this.processedImage=this.imageOp.executeOperation(this.processedImage);
}
//We are finished. Showtime!
this.processedImageReady=true;
//The last thing to do: See how long it took.
long tDelta=System.nanoTime()-tStart;
this.lastProcessingTime=(tDelta/1000000000d);
}
So basically what we do here is executing the ImageOp directly if we don't want to do multithreading, or using the executer to do so for the multithreaded mode.
Around these calls we have some utility code to keep track of the time used for the operation and setting A flag to indicate if an processed image is available.

Implementation of the multithreading
Now for the real interesting part of the demo: And I have to say it looks obvious as well. But that's not always a bad thing, right?

Here's the class with it's only attribute:

public class MultithreadedImageOpExec implements ImageOpExecuter {
private static ExecutorService executor = Executors.newCachedThreadPool();
...
The executor is an ExecuterService which is a concept introduced with Java 5 to create a new level of abstraction for multithreading.
An ExecuterService, as the name suggests, is a service that executes a piece of code. That "piece of code" has to be wrapped by either a Runnable or Callable interface.
Runnable has been around since Java 1.0 and it contains just one simple method to implement: void run().
Callable however is a new addition since Java 5 and a lot more useful. It's parametrized with generics
(Callable )and has only one method as well: V call().
Okay, that doesn't look too spectacular, but it means, that a Callable can actually return a value in a (type)safe way.
We use the Factory style Executors class to obtain a CachedTreadPool. This very sophisticated implementation not only takes care of creating new threads as necessary, but also
caches the threads for reuse. Creating a new thread is an expansive operation, so caching makes a lot of sense. It's a neat thing that this taken care of entirely by the Executer
framework.

Here's the method that executes the ImageOperation using the executor:
    public BufferedImage executeOperation(final BufferedImage sourceImage,
final ImageOperation operation, final int numThreads, final int overlap) {

try {
//Split the source image in a list of tiles.
List tiles = splitIntoTiles(sourceImage, numThreads, overlap);
//Create a worker object that executes the operation for every tile
List workers = new ArrayList(numThreads);
//Intitialize the workers with their coresponding tiles
for (BufferedImage curTile : tiles) {
workers.add(new ImageOpWorker(curTile,operation));
}
//Let the executor framework invoke all workers and handle the
//task of coordination: When invokeAll returns, all Futures will
//have been processed
List<future> results=executor.invokeAll(workers);
//Clear the List of source tiles before reusing it for the processed tiles
tiles.clear();
//Transfer the processed tiles to the simple list of BufferedImages
//so we don't have to deal with the nature of Futures during joining
for(Future\ curFuture:results){
tiles.add(curFuture.get());
}
//rejoin the processed tiles to the sourceImage
joinTiles(tiles, sourceImage, overlap);
} catch (InterruptedException ex) {
..
} catch (ExecutionException eE) {
..
}
return sourceImage;
}

The ImageOperation and the image to process are passed into the method. First the image is split into 2 or 4 tiles by the method splitIntoTiles().
This method is actually quite ugly and long at the moment, so I won't show it here (You can check it out in the source code, though).
Next for every tile an instance of ImageOpWorker is constructed.
ImageOpWorker is an inner class implementing Callable:

    public class ImageOpWorker implements Callable {

private volatile BufferedImage sourceTileImage;
private final ImageOperation operation;

public ImageOpWorker(BufferedImage image, final ImageOperation operation) {
this.sourceTileImage = image;
this.operation = operation;
}

public BufferedImage call() throws Exception {
return this.operation.executeOperation(sourceTileImage);
}
}


As you can see it's really a thin wrapper around the code that's supposed to be executed be a separate thread.

Back to the executeOperation() Method:
Now that we constructed a list of ImageOpWorkers for all the tiles we want to process, we just call invokeAll() on the executer and
pass the list of
ImageOpWorkers. What the executer does for us now is the following:
  • For each ImageOpWorker it either creates a new thread or reuses an existing one
  • It executes the call() method on each worker and wraps the returned image into a Future
  • It returns only when all workers have terminated execution

This means that the executor not only caters for the creation of threads, but also does the thread synchronisation for us. Otherwise we
would have to use a CountDownLatch or something similar to wait for all threads to complete.

The returned Future Objects wrap computational results that, well, come from THE FUTURE! Unfortunately it doesn't have anything to
do with Michael J. Fox's flux compensator. Rather it's purpose is simply to return the result of an asynchronous computation through it's
get() method. The trick is, that get() will block until the asynchronous computation is finished and the result is available. In our case however we
already know that all our Futures are ready, anyway.
The next step is to pull the results out of the Future(s) and store the images in a simple list.
We do that because we want to keep the amount of multithreading aware code as small as possible. The joinTiles() method should not
need to deal with Futures. After joining the tiles again we return the manipulated image as a normal BufferedImage again and
we are done!


Remember to try the webstart version of the demo.
Also you can have a closer look at the sourcecode: MultithreadingDemoSrc.zip (1.49Mb)

Friday, 21 March 2008

Paralellizing image filter operations, part 1

On last December's Javapolis I attended the "filthy makeover" track by Chet Haase.
Chet showed some cool performance hacks for Swing, among them a nasty one for speeding up an image blur filter. While the trick he used was a cool way of cutting corners a little bit for performance sake, I kept on wondering:
Isn't image processing exactly the kind of computation that would hugely benefit from a multithreaded approach? And after all parallelizing such an operation should be fairly straight forward: Just split the image up into tiles and process each one independently. Plus: The new (well not that new anymore) concurrency features of Java 5 should make the job quite a bit easier.
I decided to try it out and developed a little demo in form of a Swing application.
For building this demo I used NetBeans 6. Unfortunately I couldn't resist using some the new features of NetBeans for building Swing Applications: I used
the new Swing Application Framework and Beans Binding. I found that these new frameworks around Swing and the support for them in NetBeans show great potential for making a Swing developers life easier. Even if there are still some rough edges. But I leave that for some other blog entry.
For the demo I used a Gaussian blur filter that I sto...hmm...cop...aah.. well that I got from a Filthy Rich Client example by Romain Guy. I sure hope he doesn't mind that I used his code snipped instead of digging it up myself. Anyway, the filter is applied to a really large image, so the benefits from the parallelizing will be clearly visible and the involved overhead plays less of a role.
Here's what the (for now) finished demo looks like:

The image used is a 3836x2560 pixel photo that a friend of mine took at Frankfurt airport. It is shown either in "scale to fit" view, or in a "1:1" view (the view can be switched by double clicking on it).
Also, the radius for blur operation can be changed through the GUI. The larger the radius is, the more the image gets blurred and the more time-consuming the operation becomes.
If the blur is applied with a 25 pixel radius the results are the following when viewed in the "1:1" zoomed in view:

Okay, but what about the performance now?
To be honest, I wasn't 100 percent sure if the multithreading would really boost the performance, considering the added overhead of the tiling.

So, I was delighted to see the following results on my 2.2 Ghz Core Duo 2 Windows XP Laptop:
MultithreadingRadius 10Radius 25Radius 50
None2.75 sec5.986 sec11.121 sec
2 Threads2.082 sec3.515 sec6.597 sec
4 Threads2.202 sec3.601 sec7.151 sec

While the operation benefits just a little for a small radius of 10 pixels, the performance boost is really there for a 25 and 50 pixels radius! Using 2 Threads the performance is 1.7 times higher than using no multithreading at all. A twofold increase in in performance would be the theoretical maximum. Considering that the approach with one thread avoids the overhead of tiling and creating threads all together, the 1.7 fold increase is not so bad, really.
Using 4 threads is slower than using 2 on a machine with 2 cores, but the performance hit seems not so severe.
So, what happens if I let the demo run on my new 2.33 Ghz quad core desktop machine?
Here are the results:
MultithreadingRadius 10Radius 25Radius 50
None2.561 sec5.706 sec10.829 sec
2 Threads1.624 sec3.17 sec5.826 sec
4 Threads1.42 sec2.652 sec4.786 sec


Damn! It is a little bit faster with 4 threads but not as much as I hoped for. I guess I have to investigate that a bit in profiling.
Oh, here are also some results from my old single core 1,5 Ghz Pentium M
laptop running Ubuntu 7.10:
MultithreadingRadius 10Radius 25Radius 50
None8.581 sec18.968 sec35.674 sec
2 Threads9.128 sec19.745 sec36.257 sec
4 Threads9.089 sec19.949 sec36.888 sec


If you want to try the demo out for yourself you can webstart it up directly from here!

The complete source code is available here: MultithreadingDemoSrc.zip (1.49Mb)
It's in the form of NetBeans 6 project.
The rather large download size is mostly due to the contained high resolution photo, that weights in at 1.33 Mb.

In the next post I'll go further into the details of the demo, posting some code fragments and a little bit of design background...

Thursday, 3 January 2008

Welcome

Hello everybody and welcome to my blog!
I decided to call my brand new blog "design, code, debug" because for me, that's the eternal circle of software development.
Or the three seasons of programming, if you will. Or maybe, the cosmic ballet of...all right, all right...you get the picture.
Well, I really do believe that these three activities should be viewed as a whole:
Designing the architecture, hacking the code and debugging the results really seems to be the natural flow of things.

I mean, hey: How often did you come up with a good design, written some code, really nice, really straight forward.
You are really content with your work, then you try it out and...it all fails miserably.
Days later you'll then find out that you had some typo in some of the (way to many) XML config files.
See, that's all part of the experience of developing software process.
In video games you have boss battles, in programming you have debugging ;)

So when you see a post with the "debug" tag in this blog, it will most likely be about configuration issues,
seeing how an application behaves in the profiler or how to overcome common pitfalls.

I also thought the name was a clever choice as it basically allows me to talk about everything that has at least
remotely to do with "computer stuff". Nothing is off topic for me!

At the moment I'm experimenting a bit with Swing Rich Clients and NetBeans 6.
So that's probably, what the next couple of Posts will we about!