Using Spring with Hibernate3's Lazy-Load Everything Approach

Posted on October 22, 2005 by Scott Leberknight

I am writing this so that I never, ever, ever forget how the new default lazy-load behavior in Hibernate3 can completely mess with your head for hours and hours of debugging when you are expecting Spring's HibernateTemplate's load method to throw a HibernateObjectRetrievalFailureException in a unit test, only to find the reason is quite simple but subtle! Oh, and of course if anyone else is reading, or more importantly, Googling, then hopefully this will help you too.

I have an implementation of a DAO that extends Spring's HibernateDaoSupport class which has a finder method for an entity given the unique identifier, which is of type Long. That finder method basically does this:

public Entity findEntity(Long id) {
    return (Entity)getHibernateTemplate().load(Entity.class, id);
}

Note specifically that I am using the load() method, which is defined to "Return the persistent instance of the given entity class with the given identifier, throwing an exception if not found." In particular, this method should catch any HibernateException subclass thrown by the internal HibernateCallback and convert it into the appropriate class in the Spring DataAccessException hierarchy; in this case it should be converted to a HibernateObjectRetrievalFailureException if I pass in an identifier for which there is no persistent entity. So far, so good.

Next I have a simple unit test that calls my finder method using an invalid identifier, and then it asserts that the appropriate exception was thrown. Basically it looks something like this:

public void testFindEntityUsingInvalidIdentifier() {
    final Class expectedExceptionType = HibernateObjectRetrievalFailureException.class;
    try {
        dao.findEntity(-9999L);
        fail("Should have thrown an " + expectedExceptionType.getName());
    } catch (Exception e) {
        assertEquals(expectedExceptionType, e.getClass());
    }
}

So I ran this test thinking it was a no-brainer. It failed. In fact, it failed with the message "junit.framework.AssertionFailedError: Should have thrown an org.springframework.orm.hibernate3.HibernateObjectRetrievalFailureException." Um, what? I was about 100% sure there was no such row in the database, since I was using Spring's AbstractTransactionalDataSourceSpringContextTests test class, which ensures transactions are rolled back after each test, and since I knew I didn't put any data in the database with that identifier. So that meant the findEntity method did not throw any exception. I started adding the old fallback System.out.println() statements all over the place to see what exactly was going on. The finder method actually was returning a non-null object, but when I tried to call any method on it, like toString(), it threw a raw Hibernate ObjectNotFoundException, which as of Hibernate3 is unchecked, not checked. Hmmm. I then performed some initial debugging, using a real debugger no less, found that proxy objects were being returned, and saw some CGLIB entries in the stack traces, meaning the entity object was in fact proxied.

Since the object was actually a proxy, that explained why no exception was thrown by HibernateTemplate.load(): since no methods had been called on the proxy yet, Hibernate3 was happily returning a proxy object for an identifier which does not really exist. The second you call any method on that proxy, you get the Hibernate ObjectNotFoundException. I did a bunch of research and finally found that the Hibernate3 reference guide, in Section 11.3 (Loading an object), mentions that "load() will throw an unrecoverable exception if there is no matching database row. If the class is mapped with a proxy, load() just returns an uninitialized proxy and does not actually hit the database until you invoke a method of the proxy." I then did more research and arrived at a resource I probably should have looked at much sooner, as this is my first time using Hibernate3 (I've been using Hibernate2.x for a while now). That resource is the Hibernate3 migration guide, which mentions that Hibernate3 now defaults to lazy="true" for everything, both classes and collections. I even remember reading this a while back, but it didn't occur to me that the load() method in the Hibernate Session would behave like that.
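
To make the behavior concrete, here is a sketch of what was happening (session is a Hibernate Session, and Entity is the same placeholder class as in the finder above):

Entity entity = (Entity) session.load(Entity.class, new Long(-9999));
// no exception yet -- load() just returned an uninitialized CGLIB proxy
entity.toString(); // the proxy tries to initialize and throws ObjectNotFoundException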

In any case, with lazy loading now defaulting to "true" in Hibernate3, the Spring HibernateTemplate.load() method is basically useless for lazily initialized objects, since it will never catch a Hibernate exception and thus won't ever do the conversion into one of the DataAccessException subclasses. There are a few solutions or workarounds or hacks or whatever you want to call them. First, you could set default-lazy="false" on your root hibernate-mapping element, which switches the behavior back to what it was in Hibernate2.x. That is what I did to verify that my test would work properly when there was no proxying involved. Second, you can use the HibernateTemplate.get() method, which delegates to the Hibernate Session.get() method, instead of load(), because get() "hits the database immediately and returns null if there is no matching row"; then if the result is null, you throw the appropriate exception yourself. Third, you can leave Hibernate3's default behavior alone and override it for specific classes and/or collections by setting lazy="false" on those specific mappings. For collections I almost always use lazy loading, but I had never done it for classes.
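
Here is a minimal sketch of the second workaround, assuming Spring's generic ObjectRetrievalFailureException (from org.springframework.orm) is an acceptable substitute for the Hibernate-specific subclass:

public Entity findEntity(Long id) {
    // get() hits the database immediately and returns null for a
    // nonexistent identifier, so no proxy is ever involved
    Entity entity = (Entity) getHibernateTemplate().get(Entity.class, id);
    if (entity == null) {
        // HibernateTemplate never saw a HibernateException to convert,
        // so we throw the Spring data access exception ourselves
        throw new ObjectRetrievalFailureException(Entity.class, id);
    }
    return entity;
}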

I fully understand why you want to lazily initialize your collections, but I am not sure why I would want the default behavior for normal classes to be lazy loading. When I perform a find for a specific object, I pretty much know I want that object, since I am going to do something with it, like display it in a web page. I suppose one use case for lazily initializing classes would be if you perform a query and get back a list of objects, and you don't want to initialize all of them until the user actually needs them. But even in that case I generally am going to display the search results and I will be using those objects. So I am still somewhat at a loss as to when I would realistically want this behavior; even for search results I can retrieve only the results I need by using Hibernate's ability to set the start row and number of rows returned for a query.

So, the main point of all this is that the default lazy loading of everything in Hibernate3 may cause some unexpected problems, and if you start seeing CGLIB in stack traces and weird things are happening, chances are you've got an object proxy rather than the actual object you thought you were getting, and the proxy is causing the weirdness. Whew!

Don't Forget To fail() Your Tests

Posted on October 05, 2005 by Scott Leberknight

Often when writing JUnit tests, I call the fail() method immediately after the statement that should throw an exception, in order to ensure that invalid input or misuse of the method results in the exception I expect. However, earlier this week I was updating a test case and noticed one test method that was expecting an exception to occur but did not have a call to fail(). So this is the code that should have been in the test:

public void testSomethingFailsThatYouExpectToFail() {
    try {
        SomeObject someObject = new SomeObject();
        someObject.someMethodThatShouldFail("invalid input");
        fail("Should have thrown IllegalArgumentException");  // need this line!
    }
    catch (Exception ex) {
        assertEquals(IllegalArgumentException.class, ex.getClass());
    }
}

But the code in the test did not have the fail() method call. Without fail(), this test works as expected only when the method actually throws the exception, in which case the assertion validates the type of exception thrown. If, on the other hand, the exception is not thrown, then the test passes without actually verifying anything!
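
For contrast, here is the broken version; when the method does not throw, the test silently passes:

public void testSomethingFailsThatYouExpectToFail() {
    try {
        SomeObject someObject = new SomeObject();
        someObject.someMethodThatShouldFail("invalid input");
        // no fail() here: if the method does not throw, the try block
        // completes, the catch block never runs, and the test passes
    }
    catch (Exception ex) {
        assertEquals(IllegalArgumentException.class, ex.getClass());
    }
}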

So as a result I did a quick search through all the tests and found a significant number where I or one of the other developers had forgotten to call fail() immediately following the method call where we expected an exception to be thrown. After adding the missing fail() calls, three tests actually failed, so fixing the tests meant fixing the offending code as well. Maybe I can write a Checkstyle or PMD rule to detect this type of error in unit tests, because it is so easy to forget to call fail(), have the test pass, and move on to the next task.

Standard UML Is Preferable...NOT!

Posted on August 04, 2005 by Scott Leberknight

One of the projects I am involved with at work recently had its design review with major stakeholders - the typical dog and pony show nonsense where people who have no business making technical decisions try to impact your project - and one of the outcomes was that an "architect" was concerned that some diagrams were not "standard UML." This person basically stated in an email after the review that we should "prefer" standard UML diagrams, apparently because standard UML is, I suppose, standard. There was no other reason given. He then attached a sample formal UML state diagram from some other project. I looked at it and could not even understand it due to the level of complexity.

I sent out an email to my team telling them to ignore what that "architect" said and to continue producing documentation that fits the intended target audience. In this case that meant not only technical personnel but also functional and business domain experts who do not know UML, have probably never heard of it, and do not care a lick about whether a diagram is UML. The only thing they care about is whether they can understand it in the context of their domain expertise. Period. The diagrams in question were simple screen flow diagrams done using PowerPoint. They consisted of squares with text inside (the screen name) and arrows going between the squares to show the navigation between screens. Nontechnical people can easily glance at this type of diagram and grasp it. Now try shoving a formal UML model in front of them instead. I am guessing most will immediately tune out as you start to explain how the little solid circle is the starting point and the circle with the solid circle inside is the end point. Or is the solid circle the end point? Well, before we can build our system we better make sure we get those correct, or else the code monkeys might just code everything in reverse, right?

The point is this: just as you should prepare a briefing according to your intended audience, you should do the same for software modeling. And you should only model as much as you need, using whatever notation is comfortable for your team, to understand your problem and begin implementation. No more and no less. If you need to show a flow diagram to a business domain expert, please stop trying to educate them on formal UML notations and just use simple notations so they can concentrate on whether your model is accurate, not on trying to remember what symbol means what. And even between developers, how many times do people get bent around the axle when whiteboarding some classes and their relationships? Does anyone really remember which diamond (solid or empty) is aggregation and which is composition? Does it even matter if your colleague understands what you mean? I vote "No." So, resist the formal modeling and resultant analysis paralysis and vote "No" on the formal UML referendum by your local PowerPoint Architect.

Traceability from Requirements to Code

Posted on August 04, 2005 by Scott Leberknight

My question is this: who cares, and why should we care, if we can trace from a functional use case down to one or more classes, or even methods? Recently I have been discussing this issue, as I might be required to worry about this type of traceability on an upcoming project. My main problem with this whole idea stems from the fact that well-designed, well-abstracted OO code typically will not map in any sort of one-to-one fashion from a use case to code, mainly because use cases are a functional breakdown of the system requirements, while the OO (and nowadays AOP) code that implements those requirements does not follow a structured, top-down breakdown. So that being the case, in my experience at least, what good does it do you to be able to show that a specific use case links to a whole mess of classes? Seriously, if someone can show that use case "Edit FooBlah" links to classes as well as other artifacts such as XML files, JSPs, properties files, etc., how does that truly help them? What is the purpose other than simply being able to do the linkage? The "Edit FooBlah" use case in, say, a typical Struts-Spring-Hibernate web application crosses all tiers and touches a whole mess of artifacts ranging from Java classes to HTML, JavaScript, CSS, JSPs, Struts configuration files, resource bundles, Spring configuration files, Hibernate mapping files, utility classes, etc. etc. etc. If I've properly architected the application into well-defined layers, however, I already know exactly what artifacts make up a particular use case, since almost all the typical use cases will be built in the same, consistent way.

I can clearly see the purpose of linking bug fixes, issues, etc. to the code that was altered to perform a fix. For example, if a bug is found, someone enters it into a tool like Atlassian JIRA. It then is reviewed, worked on, and code is checked in against that bug. The affected code tends to be concentrated (hopefully) into a small number of classes, and it is easy to see how a particular issue was resolved. JIRA and similar tools are also useful to document new feature requests, improvements, etc. once a system is released into at least alpha or beta. In addition, tracking issues like this allows release notes to be easily assembled from reports run for a particular release number, and to inform users about fixes, changes, and other things that might directly affect them. People can also search and find whether there is a workaround for a specific issue or whether someone else already reported the bug they found. There are lots of other good reasons why we use issue tracking systems like this.

But as for linking requirements, in whatever form they might be, directly to code, I do not see the tangible benefits. What if your requirements are on index cards in an XP-style development shop? Or in textual use cases in a wiki? Is anyone really going to gain tremendous benefits from being able to tell you that a use case links to the following 27 different files? And the management of this information becomes much more complicated over time, as requirements change throughout the initial development as well as product maintenance. If a feature is fundamentally changed or improved, do I care that I can go back and trace the old requirements to code that was removed from the system long ago? To me the most important thing is to have a set of automated tests that define whether a system does the tasks it is intended to perform. This can be unit tests, functional test suites, user acceptance tests, performance and load tests, and whatever other types of tests you want. But in the end they determine whether the product meets its users' needs. Period. Users do not care to look at some gigantic stack of paper containing system requirements. They simply want a system they can easily use to perform their job. That's it. It's that simple. But for whatever reason many people in our young and immature industry still continue to believe that documentation such as requirements traceability matrices, test plans, system requirements specifications, system design documents, system security plans, data migration plans, and umpteen other documents are the most important facet of developing software.

At JavaOne in 2004 I saw a demo of some products that claimed to easily and seamlessly (of course) provide linkage between requirements and code. Here's how it worked. You go into the requirements tool, select the requirement your class or (as the demo showed) method satisfies, and then drag and drop the requirement directly into your source code. The tool suite, which of course required you to have all the various parts of the suite, then created a code comment "linked" to the requirement and apparently was then able to track exactly what code linked to each requirement. Aside from how brittle this solution was - just go into the source and delete the comment - it was also invasive and ugly. Assuming you had a sizable system, soon your developers would be spending all their time linking code to requirements, and trying to figure out which requirements some little utility class that everyone uses (e.g. StringUtils) maps to, which of course is an exercise in futility.

Limiting the Number of Query Results With ORDER BY Clauses

Posted on July 14, 2005 by Scott Leberknight

Many times you need to limit the number of results returned by a query, for example when a user-created query could potentially return a very large number of results. Since this pretty much always happens in applications, you have to deal with it somehow. But most books and articles don't really talk about the gory details, since many times the mechanisms you must use are database- and vendor-specific. The two databases I've worked with most are MySql and Oracle. Limiting results in MySql is ridiculously simple, and I really wish its syntax were part of standard SQL. For example, suppose you have a person table and you want to search on first and last name, and you need to order by last name then first name. In addition, you need to limit the results to 25 rows. In MySql you can issue the following query:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25

The key in the above is the limit clause which MySql provides. MySql applies the limit after it has retrieved the results and applied the order by clause as well. Trivial. Now let's consider how to do this in Oracle.

Oracle provides a pseudo-field called rownum which seems like a promising way to specify limits on result sets. Suppose we issue the following query in Oracle:

select * from person where last_name = 'Smith' and rownum <= 25 order by last_name, first_name

When you run that query, you get much different results than you expect. This is because Oracle assigns rownum to each row as it is retrieved, before the order by clause is applied. So the question is: how in the world do you use Oracle's rownum in concert with an order by clause and get the correct results? I cheated. I have been using Hibernate for a while on an Oracle database and know it is able to handle this exact situation - that is, limiting the number of results and applying an order by clause to a query. So I set the hibernate.show_sql property to true and looked at what it generated for a query. It turns out to be very simple and makes use of a subselect:

select * from (
  select * from person where last_name = 'Smith' order by last_name, first_name
)
where rownum <= 25

So Hibernate is simply wrapping the original query in another query which limits the results. Since the subselect is executed first, the rownum property of the outer select works properly. Pretty slick the way Hibernate does that but it would be much easier if Oracle had a syntax more like the elegant MySql limit clause.

Unfortunately, limiting the number of results for a query is only part of the equation, since you normally also need to provide some type of paging functionality. There are tools, such as the very popular DisplayTag custom tag written by Fabrizio Giustina, to help with the display of data in a nice table that provides paging and sorting. Unfortunately, most of them want to have all the results completely in memory for the paging to work properly. I hate doing this, since you are putting data in the user session that the user will probably never even access, and thus simply wasting memory on the complete result set. I have always tried to implement paging so that you go back to the database for each page of information and not store anything in the session. You think Google stores a couple million results in memory for each search? Doubt it.

So how can you accomplish paging in the database? Again you normally have to use vendor-specific syntax. For MySql you simply use the limit clause, but you specify two parameters: the start row and the number of rows to retrieve. For example, suppose we want the first page from the person search assuming a page size of 25:

select * from person where last_name = 'Smith' order by last_name, first_name limit 0, 25

Simple. Suppose you want the second and then third pages. The queries are:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25, 25
select * from person where last_name = 'Smith' order by last_name, first_name limit 50, 25

Again, in MySql this is really simple to accomplish. In Oracle it is not as simple, since you now have to get analytic functions involved, specifically row_number() with its over() clause. More on that later...for now I suggest you migrate all Oracle databases to MySql. Wouldn't that be nice? :-)
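
In the meantime, here is a rough sketch of what the third page (rows 51-75) of the person search might look like in Oracle using row_number(); consider it a sketch of the general technique rather than a tuned, tested query:

select *
from (
    select p.*, row_number() over (order by last_name, first_name) rn
    from person p
    where last_name = 'Smith'
)
where rn between 51 and 75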

One last thing. The really nice thing about Hibernate is that it understands how to limit result sets and perform paging operations in all the databases it supports. Your code does not need to care. You simply do something like:

Criteria criteria = session.createCriteria(Person.class);
criteria.addOrder(Order.asc("lastName"));
criteria.addOrder(Order.asc("firstName"));
criteria.setFirstResult(50);
criteria.setMaxResults(25);
List results = criteria.list();

Then Hibernate will perform the appropriate vendor-specific query based on the dialect you configure Hibernate with. Thus, using a tool like Hibernate allows you to write queries at a higher level and completely abstract the actual query that is performed on your specific database at runtime.

"I Find Your Lack of Ant Disturbing"

Posted on June 24, 2005 by Scott Leberknight

Yes, I have been watching Star Wars IV: A New Hope lately, in case you are wondering about the reference to Vader's "I find your lack of faith disturbing" comment. Anyway, recently I've come across a couple of situations where I was doing some consulting work and found that people were using -- no, relying completely on -- their IDE to perform builds of J2EE applications. One even included an EAR, EJB-JAR, and WAR file all being built exclusively by the IDE. Then Novajug launched a poll yesterday on what people were using as their build tool. As of about 10 minutes ago, people using their IDE stood at 20%. Now granted, the number of people who have voted thus far is small, and the 20% only translates to 11 people. But I find even that number of people using only an IDE to build applications alarming and disturbing. As for my recent consulting, I won't mention the IDE by name, but it is a commercial IDE that allows you to define, using mounds of dialog boxes, everything about your build in excruciating detail, which is where they had put everything! And all of those settings were checked into version control, no less.

The reason I was helping them out was that one developer's IDE would simply not build the WAR file, instead giving a single one-line error message: java.lang.ArrayIndexOutOfBoundsException. Nice. Well, with this amount of detail I naturally was unable to help him out, and suggested in passing that they try switching their build system to Ant rather than the IDE. And there really isn't much more to say in this paragraph about it either, so I'll continue to the next one.

I suppose the reason I thought about writing this is because I pretty much assumed everyone nowadays uses Ant to build J2EE applications. Apparently I am wrong. Not the first time. Anyway, another tidbit about that project I was helping -- they had no "test" tree in their project, thus no, nada, zero unit tests. Maybe bad practices run amuck and are concentrated on a small set of projects, such that if someone is using only their IDE to do builds, perhaps it's not all that surprising they aren't unit testing either. But that's just random speculation.

No More Struts?

Posted on May 24, 2005 by Scott Leberknight

At this past weekend's No Fluff conference one thing was missing that has been at every conference I've been to for as long as I can remember...a session on Struts. There was not a single session on Struts, unless you count the session on Shale, which I suppose is possibly the last gasp for Struts before it completely succumbs to JSF. Well, I suppose I don't really know enough (or anything) about Shale, nor am I enthralled about learning it, since I think most people will use straight JSF or Spring MVC or perhaps even Tapestry or WebWork. Though, other than Erik Hatcher, I don't personally know anyone using Tapestry on a production project, and I know only one person using WebWork. Erik loves Tapestry, and though it looks pretty cool I am much more interested in Spring MVC, simply because I am already a heavy Spring user. Everyone else is still on Struts it seems, and a select few people I have talked to are using Spring MVC.

Spring 2005 No Fluff Just Stuff

Posted on May 23, 2005 by Scott Leberknight

Just attended the Reston, VA No Fluff Just Stuff conference this past weekend. As usual there were lots of high-quality speakers and content. This time around there were a bunch of new sessions to choose from, and I attended several on topics that I don't know very much about, like Swing GUI development, cryptography, Java Security and JAAS, and even a session delving into the depths of JavaScript. I never knew how dynamic JavaScript actually is, nor some of the basic rules it uses when it parses and executes code. I plan to document all the sessions I attended over (hopefully) the next week or three.

Impatiently Awaiting...

Posted on May 03, 2005 by Scott Leberknight

IntelliJ 5.0

JSF Conversion Oddities?

Posted on April 14, 2005 by Scott Leberknight

Suppose you have a web application with a field named "amount" on a form. Suppose also the field should convert its input to a number with at least 2 fractional digits. OK, JSF provides a <f:convertNumber> tag with a minFractionDigits attribute. So far so good. We could put the following code in our JSF page:

<h:inputText id="amount" value="#{payment.amount}">
    <f:convertNumber minFractionDigits="2"/>
</h:inputText>

Makes some degree of sense. Whether I want to place this conversion directly in the view layer is debatable. Actually not really. I don't want to put it there, but that's what JSF wants you to do, so for my example I did just that. Then I started playing around, submitting the form with various values. Entering valid numbers works fine, e.g. 75.00, 45, 21. So let's see what it does with something that is not a number. Entering 'abcd' produces a conversion error which gets output to the screen. The default error message is 'Conversion error occurred', and changing it to something like 'The amount could not be converted to a number' turns out to be exceedingly difficult in JSF on a per-field basis, but that will have to wait for another blog. OK, what about a mixture of numbers and letters? Entering '50xyz' produces...50.00. Um, wait a second. Apparently the invalid trailing letters are ignored. Last time I checked, if you try to feed that string to the standard factory method Double.valueOf() you get a NumberFormatException, as you would expect. Why would JSF accept that value and convert it as a valid number? That just doesn't seem to make any sense at all.

Oh, but wait. Before assuming this is purely the fault of JSF, I did a little more digging and wrote a JUnit test to try parsing numbers using the NumberFormat.parse() method. The string "50xyz" is successfully parsed using this method. The JavaDoc for NumberFormat.parse() actually states "Parses text from the beginning of the given string to produce a number. The method may not use the entire text of the given string." That last sentence tells the story. The method doesn't necessarily use all of the input string, which I think is a bit odd. What if the input string were "50?75"? The parsed value is 50! This is not only odd, it seems just plain wrong to me. What if someone meant to type "50.75" but typed "50?75" (since "?" is next to "." on the keyboard)? The parse does not flag an error and would then cause a wrong input value to be accepted. So apparently the culprit here is actually the JDK, not JSF. This just illustrates that you need to thoroughly test your applications using many different forms of invalid data to ensure your application actually considers it invalid. But since JSF (apparently) uses NumberFormat during its conversion phase, this type of data will be converted successfully and not flagged as erroneous input.
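
The test boiled down to something like this (a sketch assuming java.text.NumberFormat and a JUnit 3-style test case; the method name is mine):

public void testNumberFormatIgnoresTrailingGarbage() throws ParseException {
    NumberFormat format = NumberFormat.getInstance();
    // parse() reads from the beginning of the string and silently
    // stops at the first character it cannot use
    assertEquals(50L, format.parse("50xyz").longValue());
    assertEquals(50L, format.parse("50?75").longValue());
}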

Note that NumberFormat.parse() is not the only method with a lenient parse policy; the SimpleDateFormat.parse() method behaves the same way. For example, the string "4/2005edfnnas", when parsed with the pattern "MM/yyyy", is parsed successfully to the date 4/1/2005. And the JSF <f:convertDateTime> tag probably uses SimpleDateFormat as well, since the above input string converts to "4/1/2005" in JSF. So if you are using JSF and its converter tags, be aware of the leniency during parsing.
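
A similar sketch shows the date case (again assuming java.text.SimpleDateFormat, java.util.Date, and a JUnit 3-style test case):

public void testSimpleDateFormatIgnoresTrailingGarbage() throws ParseException {
    SimpleDateFormat format = new SimpleDateFormat("MM/yyyy");
    // parse() succeeds as soon as the pattern is satisfied and
    // silently ignores the trailing characters
    Date lenientDate = format.parse("4/2005edfnnas");
    assertEquals(format.parse("4/2005"), lenientDate); // April 1, 2005
}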