Sorting tables with Java
Tuesday, January 09, 2007
One of the main functions of computer applications is the visualization of data. The most common visualization format is a table with rows and column.
I find that the most desirable function in a table is that of sorting by specific column types. Whether sorting alphanumerically, dates, or object types, it helps the user find the information faster.
Not all data needs sorting, though. For example, sorting the following is not really needed:
|Gaius Julius Caesar||2106|
(This is such a silly example as we rarely encounter so little data to analyze.)
The solution to this almost trivial problem depends on how the system has actually being designed: what is the architecture of the system.
After looking back at the sorting algorithms I've used in the past, I started thinking of the different ways table sorting can actually be implemented.
Every case I present here does the same thing (sort a table), however, each case deals with the actual system at hand differently. In other words, the solutions presented are part of an existing web framework.
Note that there is nothing really earth shattering or new about any of these solutions. I just thought they would make a good example of different web architectures, and simple curiosity for some, for an almost trivial but necessary requirement in most web applications: sorting by column type.
page to see the versatility of his solution.
The issue with a client only sorting implementation is that the data set can change on the database at any time and if we don't refresh the page with the new data important decisions could be made with incomplete information. Hence, in theory, every sorting request should check for new data and just keep sorting away.
But what about when the data needs to be refreshed for every sort request?
The simple act of sorting can put quite a load on resources if we have thousands of users doing the same thing at the same time and if every sorting action means a trip to the whole system. Therefore, any solution needs to take into account how it will behave under heavy load in the existing architecture.
As end users, we only care that our applications work. We don't want to know how complex, or simple the solution is. All we want to see is our tables sorting instantaneously. Hence, I present a few solutions (some better than others).Case 1
Go directly from the web page to the database and use SQL to return the sorted items (essentially, using "ORDER BY").
This is how web application were coded circa 1996: C or Perl application used to talk directly to databases. We still find this development pattern being implemented with newer scripting languages like PHP or Ruby.Case 2
Add a middle tier to handle the business requirements of your application.
This is preferable to Case 1 above, however, it is probably not the most scalable solution. Nothing surprising here. All that has been done differently is that I've moved the logic embedded in the web page (JSP) into some middle tier, which is still part of the Web layer.
In this realm, you can find MTS/COM+ components with ASP pages and Servlet with JSP interaction, where you have massive middle tier servers doing all the work, i.e., concurrency, load handling, etc.Case 3
This is a typical MVC pattern implementation.
This way of doing things just separates code into logical chunks to do stuff. Even though it is a full MVC pattern solution, is not a very smart one. Every time sorting needs to be done, the whole system needs to be put into action--could be expensive with thousands of concurrent users.
Note that this is an advantage over the case 1 and 2, as I'm not coupling data to my actual business logic. And this separation is what leads to scalability of Software Development: you can have teams of 50 or 100 developers to work on the same code tree without any worries of duplicate or lost work.Case 4
Still and MVC solution, with some minor caching added.
What I mean by caching is that for every sorting request the set is sorted in some web layer object. In Java, the data set can be stored in a collection object in the HttpSession.
I think you can see how sorting at the Web layer would be much more efficient when thousands of users are hitting the same server, i.e., the ball stops at the Controller layer and business and DB objects are not doing any of the work.Case 5
Now the bleeding edge solution (meant to be funny).
This is what is happening and why this is more efficient than Case 3 or 4:
- The first time the table is displayed, there are no two ways about it, you must hit the DB to get the data set.
Now, what if the data changed on the back end after the first request and you need to have the latest data set at all times?
- This is where a bit of AJAX can be used if you don't want your users to keep refreshing the web page, i.e., Case 3 or 4.
Any solution presented here would end up sorting tables by column, however, each one is different depending on the architecture of the solution and the specific user requirement.
Case 5 is a combination of solutions and may seem a bit over done with so many layers doing stuff. However, if the requirement is part of an existing architecture, we must code to accommodate the design, i.e., the architecture is there for a reason, whether is to take advantage of load balancing or logical layering services.
Your last solution will not work if you have to use paging, because then you cannot sort the data on the client anymore. If you need to show thousands of objects (paged) then the only logical solution is to sort on the business layer.
Markus(java performance blog)
Hi Markus. Thanks for your comment. I wish you had left an email address, but hope you come back.
Actually, the last solution DOES work with paging.
I hope that helps.