Limiting the Number of Query Results With ORDER BY Clauses

Posted on July 14, 2005 by Scott Leberknight

Many times you need to limit the number of results returned by a query, for example if you have the potential for a user-created query to return a very large number of results. Since this pretty much always happens in applications, you have to deal with it somehow. But most books and articles don't really talk about the gory details, since many times the mechanisms you must use are database and vendor-specific. The two databases I've worked most with are MySql and Oracle. Limiting results in MySql is ridiculously simple, and I really wish their syntax was part of standard SQL. For example, suppose you have a person table and you want to search on first and last name, and you need to order by last name then first name. In addition, you need to limit the results to 25 results. In MySql you can issue the following query:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25

The key in the above is the limit clause which MySql provides. MySql applies the limit after it has retrieved the results and applied the order by clause as well. Trivial. Now let's consider how to do this in Oracle.

Oracle provides a pseudo-field called rownum which seems like a promising way to specify limits on result sets. Suppose we issue the following query in Oracle:

select * from person where last_name = 'Smith' and rownum <= 25 order by last_name, first_name

When you run that query, you find much different results than you expect. This is because Oracle performs the query and applies the rownum to each row in the results before applying the order by clause. So the question is how in the world do you use Oracle's rownum in concert with an order by clause and get the correct results. I cheated. I have been using Hibernate for a while on an Oracle database and know it is able to handle this exact situation - that is, limiting the number of results and applying an order by clause to a query. So I set the hibernate.show_sql property to true and looked at what it generated for a query. It turns out to be very simple and makes use of a subselect:

select * from (
  select * from person where last_name = 'Smith' order by last_name, first_name
)
where rownum <= 25

So Hibernate is simply wrapping the original query in another query which limits the results. Since the subselect is executed first, the rownum property of the outer select works properly. Pretty slick the way Hibernate does that but it would be much easier if Oracle had a syntax more like the elegant MySql limit clause.

Unfortunately, limiting the number of results for a query is only part of the equation, since you normally also need to provide some type of paging functionality. There are tools, such as the very popular DisplayTag custom tag written by Fabrizio Giustina, to help with the display of data in a nice table that provides paging and sorting. Unfortunately, most of them want to have all the results completely in memory for the paging to work properly. I hate doing this since you are just taking up memory by putting data in the user session that the user will probably never even access. Thus you are just wasted memory by storing the complete result set in memory. I have always tried to implement paging so that you go back to the database for each page of information and not store anything in the session. You think Google stores a couple million results in memory for each search? Doubt it.

So how can you accomplish paging in the database? Again you normally have to use vendor-specific syntax. For MySql you simply use the limit clause, but you specify two parameters: the start row and the number of rows to retrieve. For example, suppose we want the first page from the person search assuming a page size of 25:

select * from person where last_name = 'Smith' order by last_name, first_name limit 0, 25

Simple. Suppose you want the second and then third pages. The queries are:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25, 25
select * from person where last_name = 'Smith' order by last_name, first_name limit 50, 25

Again, in MySql this is really simple to accomplish. In Oracle, not as simple since you now have to get additional operators involved, specifically row_number() and over() functions. More on that later...for now I suggest you migrate all Oracle databases to MySql. Wouldn't that be nice? :-)

One last thing. The really nice thing about Hibernate is that it understands how to limit result sets and perform paging operations in all the databases it supports. Your code does not need to care. You simply do something like:

Criteria criteria = session.createCriteria(Person.class);
criteria.addOrder(Order.asc("lastName"));
criteria.addOrder(Order.asc("firstName"));
criteria.setFirstResult(50);
criteria.setMaxResults(25);
List results = criteria.list();

Then, Hibernate will perform the appropriate vendor-specific query based on the dialect you configure Hibernate with. Thus, using a tools like Hibernate allows you to write queries at a higher level and completely abstract the actual query that is performed on your specific database at runtime.



Post a Comment:
Comments are closed for this entry.