Channel: sql – Java, SQL and jOOQ.

Turn Around. Don’t Use JPA’s loadgraph and fetchgraph Hints. Use SQL Instead.

Thorben Janssen (also known from our jOOQ Tuesdays series) recently published an interesting wrap-up of what’s possible with Hibernate / JPA query hints. The full article can be seen here:
http://www.thoughts-on-java.org/11-jpa-hibernate-query-hints-every-developer-know

Some JPA hints aren’t really hints; they’re full-blown query specifications, just like JPQL queries or SQL queries. They tell JPA how to fetch your entities. Let’s look at javax.persistence.loadgraph and javax.persistence.fetchgraph.

The example given in Oracle’s Java EE 7 tutorial is this:

You have a default entity graph, which is hard-wired to your entity class using annotations (or XML in the old days):

@Entity
public class EmailMessage implements Serializable {
    @Id
    String messageId;
    @Basic(fetch=EAGER)
    String subject;
    String body;
    @Basic(fetch=EAGER)
    String sender;
    @OneToMany(mappedBy="message", fetch=LAZY)
    Set<EmailAttachment> attachments;
    ...
}

Notice how the above entity graph mixes formal graph meta information (@Entity, @Id, @OneToMany, …) with query default information (fetch=EAGER, fetch=LAZY).

EAGER or LAZY?

Now, the problem with the above is that these defaults are hard-wired and cannot be changed for ad-hoc usage (thank you, annotations). Remember, SQL is an ad-hoc query language, with all of the benefits that derive from this ad-hoc-ness. You can materialise new result sets, whose type was not previously known, on the fly. That makes SQL an excellent tool for reporting, but also for ordinary data processing, because it is so easy to change a SQL query when you have new requirements. And if you’re using languages like PL/SQL or libraries like jOOQ, you can even do that in a type safe, precompiled way.

Not so in JPA: its annotations are not “ad-hoc”, just like SQL’s DDL is not “ad-hoc”. Can you ever switch from EAGER to LAZY? Or from LAZY to EAGER? Without breaking half of your application? Truth is: You don’t know!

The problem is: choosing EAGER will prematurely materialise your entire entity graph (even if you needed only an E-Mail message’s subject and body), resulting in too much database traffic (see also “EAGER fetching is a code smell” by Vlad Mihalcea). Choosing LAZY will result in N+1 problems in case you really do need to materialise the relationship, because after the one query that fetches the parents (“1”), you have to run one additional query per parent (“N”) to fetch its children lazily, later on.
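
To make the “N+1” arithmetic concrete, here is a minimal sketch in plain Java. It uses neither JPA nor a real database: the class NPlusOneDemo, its in-memory data and its query counter are hypothetical stand-ins that do nothing but count simulated round trips.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a database. It only counts round trips.
class NPlusOneDemo {
    static final Map<String, List<String>> ATTACHMENTS_BY_MESSAGE = new LinkedHashMap<>();
    static {
        ATTACHMENTS_BY_MESSAGE.put("m1", Arrays.asList("a.pdf", "b.png"));
        ATTACHMENTS_BY_MESSAGE.put("m2", Arrays.asList("c.txt"));
        ATTACHMENTS_BY_MESSAGE.put("m3", Collections.emptyList());
    }

    static int queries = 0;

    // Simulates: SELECT message_id FROM email_message
    static List<String> selectMessageIds() {
        queries++;
        return new ArrayList<>(ATTACHMENTS_BY_MESSAGE.keySet());
    }

    // Simulates: SELECT * FROM email_attachment WHERE message_id = ?
    static List<String> selectAttachments(String messageId) {
        queries++;
        return ATTACHMENTS_BY_MESSAGE.get(messageId);
    }

    // Simulates a single statement that joins both tables
    static Map<String, List<String>> selectMessagesWithAttachments() {
        queries++;
        return new LinkedHashMap<>(ATTACHMENTS_BY_MESSAGE);
    }

    // LAZY style: one query for the parents, then one more query per parent
    static int lazyQueryCount() {
        queries = 0;
        for (String messageId : selectMessageIds()) {
            selectAttachments(messageId);
        }
        return queries;
    }

    // JOIN style: a single round trip
    static int joinQueryCount() {
        queries = 0;
        selectMessagesWithAttachments();
        return queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyQueryCount() + " round trips"); // lazy: 4 round trips
        System.out.println("join: " + joinQueryCount() + " round trips"); // join: 1 round trips
    }
}
```

With three messages, the lazy variant needs 1 + 3 round trips and the join needs one. The gap grows linearly with the number of parents.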

Do SQL people suffer from N+1?

As a SQL person, this sounds ridiculous to me. Imagine specifying in your foreign key constraint whether you always want to auto-fetch your relationship…

ALTER TABLE email_attachment
ADD CONSTRAINT fk_email_attachment_email
FOREIGN KEY (message_id)
REFERENCES email_message(message_id)
WITH FETCH OPTION LAZY -- meh...

Of course you don’t do that. The point of normalising your schema is to have the data sit there without duplicating it. That’s it. It is the query language’s responsibility to help you decide whether you want to materialise the relationship or not. For instance, trivially:

-- Materialise the relationship
SELECT *
FROM email_message m
JOIN email_attachment a
USING (message_id)

-- Don't materialise the relationship
SELECT *
FROM email_message m

Duh, right?

Are JOINs really that hard to type?

Now, obviously, typing all these joins all the time can be tedious, and that’s where JPA seems to offer help. Unfortunately, it doesn’t help, because otherwise, we wouldn’t have tons of performance problems due to the eternal EAGER vs LAZY discussion. It is a GOOD THING to think about your individual joins every time because if you don’t, you will structurally neglect your performance (as if SQL performance wasn’t hard enough already) and you’ll notice this only in production, because on your developer machine, you don’t have the problem. Why?

Works on my machine ಠ_ಠ

One way of solving this with JPA is to use the JOIN FETCH syntax in JPQL, which is essentially the same thing as what you would be doing in SQL, so you don’t win anything over SQL except for automatic mapping (see also this example where the query is run with jOOQ and the mapping is done with JPA).

Another way of solving this with JPA is to use these javax.persistence.fetchgraph or javax.persistence.loadgraph hints, but that’s even worse. Check out the code that is needed in Oracle’s Java EE 7 tutorial just to indicate that you want this and that column / attribute from a given entity:

EntityGraph<EmailMessage> eg = em.createEntityGraph(EmailMessage.class);
eg.addAttributeNodes("body");
...
Properties props = new Properties();
props.put("javax.persistence.fetchgraph", eg);
EmailMessage message = em.find(EmailMessage.class, id, props);

With this graph, you can now indicate to your JPA implementation that in fact you don’t really want to get just a single E-Mail message; you also want all the specified JOINs to be materialised (interestingly, the example doesn’t do that, though).

You can pass this graph specification also to a JPA Query that does a bit more complex stuff than just fetching a single tuple by ID – but then, why not just use JPA’s query language to express that explicitly? Why use a hint?

Let me ask you again, why not just specify a sophisticated query language? Let’s call that language… SQL? The above example is solved trivially as such:

SELECT body
FROM email_message
WHERE message_id = :id

That’s not too much typing, is it? You know exactly what’s going on, everyone can read this, it’s simple to debug, and you don’t need to wrestle with first and second level caches because all the caching that is really needed is buffer caching in your database (i.e. caching frequent data in database memory to prevent excessive I/O).
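
If you want to run that query from Java without any further library, a minimal JDBC sketch could look like the following. Plain JDBC has no named parameters like :id, so the sketch uses a positional ? placeholder instead. The class and method names are made up for illustration, and obtaining the Connection is left to you.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

// Hypothetical helper class, not taken from the article or from any library
class FetchBodyById {
    // Same query as above, with a positional placeholder instead of :id
    static final String SQL =
        "SELECT body FROM email_message WHERE message_id = ?";

    // Returns the body, or an empty Optional if there is no such row
    // (a NULL body is also reported as empty in this sketch)
    static Optional<String> fetchBody(Connection connection, String messageId)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setString(1, messageId);
            try (ResultSet resultSet = statement.executeQuery()) {
                if (resultSet.next()) {
                    return Optional.ofNullable(resultSet.getString("body"));
                }
                return Optional.empty();
            }
        }
    }
}
```

The try-with-resources blocks close the statement and the result set, so there’s no further bookkeeping to do.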

The cognitive overhead of getting everything right and tuning stuff in JPA is big compared to writing just simple SQL statements (and don’t forget: you may know why you put that hint, but your coworker may easily overlook it!). So let me ask you: Are you very sure you actually profit from JPA, i.e. that you really need entity graph persistence, including caching? Or are you wrestling with the above just because JPA is your default choice?

When JPA is good

JPA (and its implementations) is excellent when you have the object graph persistence problem. This means: When you do need to load a big graph, modify it in your client application, possibly in a distributed and cached and long-conversational manner, and then store the whole graph back into the database without having to wrestle with locking, caching, lost updates, and all sorts of other problems, then JPA does help you. A lot. You don’t want to do that with SQL.

Do note that the key aspect here is storing the graph back into the database. 80% of JPA’s value is in writing stuff, not reading stuff.

But frankly, you probably don’t have this problem. You’re doing mostly simple CRUD and probably complex querying. SQL is the best language for that. And Java 8 functional programming idioms help you do the mapping, easily.
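
As a rough illustration of that last point, here is one way to turn the flat rows of a JOIN into a nested structure with nothing but Java 8 streams. The Row class and the file_name column are assumptions made for this sketch; how you get the rows out of the database (JDBC, jOOQ, ...) doesn’t matter here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class JoinRowMapping {
    // One flat row, as a JOIN of email_message and email_attachment would return it.
    // The file_name column is hypothetical.
    static final class Row {
        final String messageId;
        final String fileName;

        Row(String messageId, String fileName) {
            this.messageId = messageId;
            this.fileName = fileName;
        }
    }

    // Groups the flat rows by parent, keeping the order in which parents first appear
    static Map<String, List<String>> attachmentsByMessage(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            row -> row.messageId,
            LinkedHashMap::new,
            Collectors.mapping(row -> row.fileName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("m1", "a.pdf"),
            new Row("m1", "b.png"),
            new Row("m2", "c.txt"));

        // prints {m1=[a.pdf, b.png], m2=[c.txt]}
        System.out.println(attachmentsByMessage(rows));
    }
}
```

Note that an outer join would also produce rows with a NULL attachment for messages without any, which this sketch doesn’t handle.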

Conclusion

Don’t use loadgraph and fetchgraph hints. Chances are very low that you’re really on a good track. Chances are very high that migrating off to SQL will greatly simplify your application.


Filed under: jpa Tagged: EntityGraph, FetchGraph, Hints, jpa, sql
