JPA Id/Object References always, with Null Objects, a Pattern?
An object has references to other objects. A row in a relational database table has foreign keys (FK) to rows in other tables.
This basic "impedance mismatch" between the relational model of the datastores prevalent in the enterprise today (RDBMS) and an object model (OO) is well understood and addressed by today's Object Relational Mapping (ORM) technologies. There is, however, a specific use case which (AFAIK) is not easily addressed by how today's ORMs are typically used: what if, sometimes, you need that FK directly?
Imagine, in pseudo code: class A { long id; B b; ... } and class B { long id; ... }. In certain situations it would be handy, and feel natural from the OO point of view, if you could just ask for a.b.id. Alas, in an ORM this typically triggers "lazy loading", a deferred access with a SELECT ... FROM b. If all you really want is b.id, this deferred RDBMS access makes no sense, of course.
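To make that cost concrete, here is a minimal plain-Java simulation of the situation. It is not JPA code: FakeDatabase, its selectCount counter and the constructor shapes are hypothetical names invented for this sketch, and the lazy getter only imitates what an ORM does behind the scenes.

```java
// Plain-Java simulation of the lazy-loading cost; this is NOT real JPA code.
// FakeDatabase, selectCount and the constructor shapes are hypothetical.

class B {
    final long id;
    final String name;

    B(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class FakeDatabase {
    int selectCount = 0; // counts simulated "SELECT ... FROM b WHERE id = ?" round trips

    B selectB(long id) {
        selectCount++;
        return new B(id, "b#" + id);
    }
}

class A {
    final long id;
    private final long bId;        // FK value, read together with A's own row
    private final FakeDatabase db;
    private B b;                   // stays null until first access

    A(long id, long bId, FakeDatabase db) {
        this.id = id;
        this.bId = bId;
        this.db = db;
    }

    B getB() {
        if (b == null) {
            b = db.selectB(bId);   // lazy load: one extra round trip
        }
        return b;
    }
}

class LazyLoadDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        A a = new A(1L, 42L, db);
        long bId = a.getB().id;    // the FK was already known, yet a SELECT is issued
        System.out.println(bId + " cost " + db.selectCount + " SELECT(s)");
    }
}
```

The point of the sketch is only that A already holds the value 42 in bId, so the round trip made by getB() buys nothing when the caller wants just the id.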
The reason this normally doesn't work with a Java Persistence API (JPA) implementation as the ORM is that relation fields in entity instances returned by a query (or by the EntityManager's "get one" find() method) are normally in one of two states. Either they are fully initialized, because the field was annotated as FetchType.EAGER, because a JOIN FETCH in a JPQL query (standardized by the specification, not implementation specific) requested it, or because some implementation-specific API such as the OpenJPA FetchPlan API asked for it (all of which lead to table JOINs and/or additional SELECT queries); or they are null if none of that is used.
The JPA specification must have had use cases like these in mind, and offers a concept of interest in this context: the probably less well-known <T> T getReference(Class<T> entityClass, Object primaryKey) method of an EntityManager, which returns an instance whose state may be lazily fetched.
Furthermore, it turns out that B's id is in fact typically already available to the ORM internally after it has read an A, even if A's b field is still null. This makes sense, and is AFAIK how lazy loading normally works behind the scenes in all ORMs (it is how they get by with one SELECT statement instead of two when lazy loading).
It occurred to me that it would be really handy if all relationship fields of an Entity were always initialized, either with a full-blown real object if the field really was eagerly fetched, or with whatever kind of "hollow" object getReference() creates efficiently (typically holding only the id fields that compose the object's oid). This would make the a.b.id example from above work for both the "eager" and the "lazy" scenario homogeneously, efficiently & very naturally!
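As a rough illustration of that idea, the following plain-Java sketch always populates the relation field, either with a fully loaded object or with a hollow one that carries only the id. It does not use JPA or OpenJPA internals: RowMapper, the isLoaded() flag and the two factory methods are hypothetical names, and B.reference() merely stands in for what getReference() would hand back.

```java
// Plain-Java sketch of "always initialized" relation fields; NOT real JPA code.
// RowMapper, isLoaded() and the factory methods are hypothetical.

class B {
    private final long id;
    private final String name;     // null while this is only a hollow reference
    private final boolean loaded;

    private B(long id, String name, boolean loaded) {
        this.id = id;
        this.name = name;
        this.loaded = loaded;
    }

    // Hollow instance carrying only the id, comparable in spirit to getReference().
    static B reference(long id) {
        return new B(id, null, false);
    }

    // Fully populated instance, as an eager fetch would produce.
    static B full(long id, String name) {
        return new B(id, name, true);
    }

    long getId() { return id; }
    String getName() { return name; }
    boolean isLoaded() { return loaded; }
}

class A {
    private final long id;
    private final B b;             // always set when the FK is set

    A(long id, B b) {
        this.id = id;
        this.b = b;
    }

    long getId() { return id; }
    B getB() { return b; }
}

class RowMapper {
    // Lazy case: only A's row was read, so just the FK value is available.
    static A lazy(long aId, long bIdFk) {
        return new A(aId, B.reference(bIdFk));
    }

    // Eager case: a JOIN delivered B's columns as well.
    static A eager(long aId, long bId, String bName) {
        return new A(aId, B.full(bId, bName));
    }
}
```

With either RowMapper.lazy(1L, 42L) or RowMapper.eager(1L, 42L, "some name"), the expression a.getB().getId() yields 42 without any further database access; only isLoaded() tells the two cases apart.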
This idea does indeed work, as demonstrated in my example project's test case. It needs a little bit of unfortunately unavoidable hacking to access one JPA implementation's internals (OpenJPA in my example), because the JPA public API (both v1 and v2) does not allow direct access to A's internally held B id or to the loaded state, nor a getReference() from within a @PostLoad without access to an EntityManager. (Other JPA implementations would likely allow similar direct access to their data structures. The only thing in the example that would need to be "ported" to another ORM such as EclipseLink or Hibernate is factored into the JPAHelper class... if anybody is interested in trying this out?)
One interesting side effect I ran into while looking at this and trying to get a running sample was the case of A's b really having to be null, because the FK (say b_id) in the A table actually IS NULL in the DB (if it's optional / nullable). I thought it would be good if you could STILL do a.b.id (always), with that expression (access path) simply returning null (or 0 if the id field is of a primitive type such as int or long instead of an Integer or Long object) in that case, but never causing a NullPointerException.
The initial inspiration for that was how the new Scala programming language appears to prevent NullPointerExceptions altogether (normally? from what I have understood so far; I'm only halfway through my Scala book!). Then a good colleague pointed out that conceptually this is of course nothing new, and not specific to Scala: it's the Null Object pattern at work. So I threw in a bit of Null JPA Entity objects, and the interplay of my Reference object idea above with the Null Object pattern seems really neat.
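A minimal plain-Java sketch of the Null Object part could look like the following. It is only an illustration under stated assumptions: the shared B.NULL instance, isNull(), idOrZero() and A's constructor are hypothetical, and the example project itself uses a dynamic NullEntityFactory rather than a hand-written singleton like this one.

```java
// Plain-Java sketch of the Null Object idea for an optional relation; NOT real JPA code.
// B.NULL, isNull(), idOrZero() and A's constructor are hypothetical.

class B {
    // Shared stand-in used whenever the FK column is NULL.
    static final B NULL = new B(null);

    private final Long id;

    B(Long id) {
        this.id = id;
    }

    Long getId() {
        return id;                 // null for the null object
    }

    long idOrZero() {
        return id == null ? 0L : id; // what a primitive long id field would expose
    }

    boolean isNull() {
        return this == NULL;
    }
}

class A {
    private final B b;

    // bIdFk is the (nullable) b_id column value read from A's row.
    A(Long bIdFk) {
        this.b = (bIdFk == null) ? B.NULL : new B(bIdFk);
    }

    B getB() {
        return b;                  // never returns null
    }
}
```

For an A built with a null FK, a.getB().getId() evaluates to null and a.getB().idOrZero() to 0, and neither call can throw a NullPointerException because getB() always returns an object.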
Download the "JPA Id/Object References always, with Null Objects" example project to have a closer look at running code demonstrating this idea. Do you like this approach? Is this a "pattern"? Could & should a future JPA specification standardize support for such a usage?
Acknowledgments: Thanks to Yann Andenmatten for always inspiring feedback & discussions, and the dynamic/runtime AOP-ish NullEntityFactory contribution to the example project.


2 Comments:
I just came across this blog (nothing new), and linked from there some interesting stuff related to the Null Object part of the idea described above.