Tuesday, February 20, 2007

Domain Driven Design : Use ORM backed Repository for Transparent Data Access

In my last post, I discussed having Repositories as a higher level of abstraction in domain driven design than vanilla DAOs. Many people have asked whether the getOutstationEmployees() method in EmployeeRepositoryImpl could have been more performant using plain old SQL instead of being abstracted behind layers of DAOs and Repositories. Actually, the main idea behind the post was to establish that Repositories are a more natural way to interface domain Aggregates with the underlying database than DAOs. Let us see why ..

  • The Data Access Object pattern evolved as part of the core J2EE design patterns as a means of handling bean managed persistence, where every business object was mapped to a DAO. And for relational databases, every DAO had a natural mapping to the database table. Hence the DAO pattern enforced a stronger coupling with the underlying database structure. This strategy encourages the Transaction Script pattern of modeling a domain, which is definitely not what DDD preaches.


  • Repository provides a more domain centric view of data access, where the client uses the Ubiquitous Language to access the data source. The DAOs, OTOH, provide a more database centric view, which is closer to the implementation than the domain.


  • Repositories provide controlled access to the underlying data in the sense that they expose only the Aggregate roots of the model, which the clients should be using. When we model an Order domain entity, it makes sense to expose LineItems only in the context of the Order, and not as separate abstractions. Repositories are mapped to the Aggregate root level and ensure that the client gets a coherent view of the Order entity.
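To make the last point concrete, here is a minimal sketch of a repository defined at the Aggregate root level. It is not code from the previous post: the names Order, LineItem, OrderRepository and InMemoryOrderRepository are hypothetical, and a plain map stands in for the database. A DAO-per-table design would typically add a separate LineItem DAO; here LineItems are reachable only through their Order.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical child entity: it has no repository of its own.
final class LineItem {
  private final String product;
  private final int quantity;
  private final BigDecimal unitPrice;

  LineItem(String product, int quantity, BigDecimal unitPrice) {
    this.product = product;
    this.quantity = quantity;
    this.unitPrice = unitPrice;
  }

  String product() { return product; }

  BigDecimal subtotal() {
    return unitPrice.multiply(BigDecimal.valueOf(quantity));
  }
}

// Hypothetical Aggregate root: owns its LineItems and guards access to them.
final class Order {
  private final String orderNumber;
  private final List<LineItem> lineItems = new ArrayList<>();

  Order(String orderNumber) { this.orderNumber = orderNumber; }

  String orderNumber() { return orderNumber; }

  void addLineItem(String product, int quantity, BigDecimal unitPrice) {
    lineItems.add(new LineItem(product, quantity, unitPrice));
  }

  // Read-only view, so that changes go through the root.
  List<LineItem> lineItems() { return Collections.unmodifiableList(lineItems); }

  BigDecimal total() {
    BigDecimal sum = BigDecimal.ZERO;
    for (LineItem item : lineItems) {
      sum = sum.add(item.subtotal());
    }
    return sum;
  }
}

// Repository contract in domain terms; note there is no LineItemRepository.
interface OrderRepository {
  void save(Order order);
  Optional<Order> orderByNumber(String orderNumber);
}

// In-memory stand-in for a real persistence mechanism (assumption for the sketch).
final class InMemoryOrderRepository implements OrderRepository {
  private final Map<String, Order> store = new HashMap<>();

  public void save(Order order) { store.put(order.orderNumber(), order); }

  public Optional<Order> orderByNumber(String orderNumber) {
    return Optional.ofNullable(store.get(orderNumber));
  }
}
```

A client saves and loads the Order as a whole, and asks the Order for its items or its total; it never queries LineItems directly.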


Using ORM solutions with DDD

DDD advocates a strong domain model and ORM encourages transparent persistence of domain objects. In an RDBMS backed application, both paradigms aim at decoupling the relational data layer from the object oriented domain layer. Hence it is natural that they complement each other when we think of scalable application architectures with a strong domain model. And when we consider the combination of DDD and ORM, the DAO paradigm looks deprecated, because the domain layer is no longer concerned with bean level persistence - it deals with Aggregate level persistence. We talk about persisting an Order as a whole, not individual LineItems. Hence we talk about data access and persistence in terms of Repositories, which are a higher level of abstraction than DAOs. And when we talk about transaction control, synchronization of a unit of work and transparent persistence of domain entities, naturally we think of Hibernate Sessions or JPA Entity Managers. So, the concept of a Repository fits this deal like a glove - suddenly you feel like programming at your domain level, and relational tables become something that can be managed by the DBA.

Towards a Generic Repository Implementation

How would you like to design a Repository, which can participate in multiple implementations across various ORMs, exposing domain contracts in the Ubiquitous Language ? Clearly the design needs to be extensible on both sides -

  • On the abstraction side, we need to have extensibility for the domain. All domain repositories will be part of this hierarchy.

  • On the implementation side, we need to have extensibility for multiple implementations, e.g. JPA, Hibernate etc.


Think Bridge design pattern, which allows us to decouple an abstraction from its implementation so that the two can vary independently.

On the abstraction side we have


public interface IRepository<T> {
  List<T> read(String query, Object[] params);
}



and a base class, which delegates to the implementation ..



public class Repository<T> implements IRepository<T> {

  private RepositoryImpl repositoryImpl;

  public List<T> read(String query, Object[] params) {
    return repositoryImpl.read(query, params);
  }

  public void setRepositoryImpl(RepositoryImpl repositoryImpl) {
    this.repositoryImpl = repositoryImpl;
  }
}



On the implementation side of the Bridge, we have the following base class


public abstract class RepositoryImpl {
  public abstract <T> List<T> read(String query, Object[] params);
}



and an implementation based on JPA ..



public class JpaRepository extends RepositoryImpl {

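  // injected by Spring through the "factory" property
  private EntityManagerFactory factory;

  // setter required for Spring's property injection
  public void setFactory(EntityManagerFactory factory) {
    this.factory = factory;
  }

  @Override
  public <T> List<T> read(String query, Object[] params) {
    JpaTemplate jpa = new JpaTemplate(factory);

    if (params == null) {
      params = ArrayUtils.EMPTY_OBJECT_ARRAY;
    }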

    try {
      @SuppressWarnings("unchecked")
      List<T> res = jpa.executeFind(new GenericJpaCallback(query, params));
      return res;
    } catch (org.springframework.dao.DataAccessException e) {
      throw new DataAccessException(e);
    }
  }
}



Similarly we can have a Hibernate based implementation ..


public class HibernateRepository extends RepositoryImpl {
  @Override
  public <T> List<T> read(String query, Object[] params) {
    // .. hibernate based implementation
  }
}
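Because the two sides of the Bridge vary independently, a third implementation can be plugged in without touching the abstraction. The following is only a sketch under my own assumptions: InMemoryRepository and its register method are hypothetical, the stub returns canned results keyed by the query string and ignores the parameters, and RepositoryImpl and Repository are repeated in trimmed form so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Implementation side of the Bridge, as in the post.
abstract class RepositoryImpl {
  public abstract <T> List<T> read(String query, Object[] params);
}

// Hypothetical stub implementation: canned results per query string.
class InMemoryRepository extends RepositoryImpl {
  private final Map<String, List<?>> cannedResults = new HashMap<>();

  // Hypothetical helper used to set up the stub.
  void register(String query, List<?> results) {
    cannedResults.put(query, new ArrayList<>(results));
  }

  @Override
  @SuppressWarnings("unchecked")
  public <T> List<T> read(String query, Object[] params) {
    // params are ignored in this sketch; a real implementation would bind them.
    List<?> results = cannedResults.get(query);
    if (results == null) {
      return new ArrayList<>();
    }
    return new ArrayList<>((List<T>) results);
  }
}

// Abstraction side of the Bridge, trimmed from the post.
class Repository<T> {
  private RepositoryImpl repositoryImpl;

  public List<T> read(String query, Object[] params) {
    return repositoryImpl.read(query, params);
  }

  public void setRepositoryImpl(RepositoryImpl repositoryImpl) {
    this.repositoryImpl = repositoryImpl;
  }
}
```

Such a stub can be useful in unit tests of the domain repositories, since the abstraction side only ever sees RepositoryImpl.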



But the client can work based on the contract side of the Bridge, with the implementation being injected through Spring ..
Here's a sample client repository contract based on the domain model of Richardson in POJOs in Action ..


public interface IRestaurantRepository {
  List<Restaurant> restaurantsByName(final String name);
  List<Restaurant> restaurantsByStreetName(final String streetName);
  List<Restaurant> restaurantsByEntreeName(final String entreeName);
  List<Restaurant> restaurantsServingVegEntreesOnly();
}



for the aggregate root Restaurant having the following model :


@Entity
public class Restaurant {
  /**
   * The id.
   */
  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  private long id;

  /**
   * The name.
   */
  private String name;

  /**
   * The {@link Address}.
   */
  @OneToOne(cascade = CascadeType.ALL)
  private Address address;

  /**
   * Set of {@link Entree}.
   */
  @ManyToMany
  @JoinTable(inverseJoinColumns = @JoinColumn(name = "ENTREE_ID"))
  private Set<Entree> entrees;

  // all getters and setters removed for clarity
}



It uses JPA annotations for the object relational mapping. Note the one-to-one relationship with Address table and the many-to-many relationship with Entree. Clearly, here, Restaurant is the Aggregate root, with Entree and Address being part of the domain entity. Hence the repository is designed from the Aggregate root, and exposes the collections of Restaurants based on various criteria.

Provide an implementation of IRestaurantRepository using the abstraction side of the Bridge ..


public class RestaurantRepository extends Repository<Restaurant>
  implements IRestaurantRepository {
  public List<Restaurant> restaurantsByEntreeName(String entreeName) {
    Object[] params = new Object[1];
    params[0] = entreeName;
    // entrees is a collection, so it has to be joined before its name can be tested
    return read(
      "select distinct r from Restaurant r join r.entrees e where e.name like ?1",
      params);
  }
  // .. other methods implemented
}



Finally, the Spring beans configuration that injects the implementation ..


<bean id="repoImpl"
  class="org.dg.inf.persistence.impl.jpa.JpaRepository">
  <property name="factory" ref="entityManagerFactory"/>
</bean>

<bean id="restaurantRepository"
  class="org.dg.repository.impl.jpa.RestaurantRepository"
  lazy-init="true">
  <property name="repositoryImpl" ref="repoImpl"/>
</bean>



Here is the complete implementation model of the Repository pattern :



Note how the above design of RestaurantRepository provides access to the Aggregate root Restaurant as collections without exposing the other entities like Entree and Address. Clearly the Repository pattern, if implemented properly, can actually hide the underlying database structure from the user. Here RestaurantRepository actually deals with 3 tables, but the client is blissfully unaware of any of them. And the underlying ORM makes it even more transparent with all the machinery of automatic synchronization and session management. This would not have been possible with the programming model of DAOs, which map naturally 1:1 with the underlying table. This is what I mean when I say that Repositories allow us to program the domain at a higher level of abstraction.

23 comments:

Stephen Colebourne said...

You haven't separated the DB persistence fully though. The domain model class Restaurant contains annotations which are specific to the DB. How would you suggest reworking your idea to avoid this?

Anonymous said...

jodastephen -- the annotations can be logically thought of as separate because they don't affect the code logic -- they're simply metadata on the properties that already have to be there... I would argue that the mapping of the data itself is more natural in the domain objects because it relates directly to the domain -- the manner of retrieval and storage, on the other hand, is more suited to the Repository...

If that's still not acceptable, there are other ways of doing this mapping using external XML files in Hibernate (not sure about JPA) that would pull this information out of the domain classes...

Dmitriy Kopylenko said...

@anonymous -- JPA also supports XML for mapping config as an alternative to annotations. Again, it's a matter of taste and I personally prefer annotations.

http://dima767.blogspot.com

Colin Jack said...

In my view it is only worth using the bridge style solution if you really need to support multiple implementations.

If you only have one implementation then I think a single class hierarchy is probably enough. You just need to ensure that your IRepository interface is generic enough (so for example it cannot use ICriteria in Hibernate's case).

On another note is there any way you can make the main section of your blog wider as it is quite difficult to read the code currently.

Unknown said...

@Colin:

+1 on your observation that the Bridge is suited when we have multiple implementations. In fact we are having multiple implementations - one with JDBC and the other with the new JPA stuff.

Regarding your suggestion on making the body of the blog wider, I agree with you. Recently I switched over to the new template in blogger - I really have to find out ways of making the code more readable.

Cheers.

Blair Yang said...

Hi Debasish,

It is a nice post, but I am not sure how you would address transaction control across multiple "aggregate roots". For example, if we have an association that restaurants can register with or withdraw from, the association can be an "aggregate root". A registration operation involves a change to the association entity as well as to the restaurant (let's assume this). Then where do you put your transaction control code, supposing you have a RestaurantRepository and an AssociationRepository?

In addition, we are talking about creating a domain-driven model, which should be a rich domain. In your example, the domain object looks like an ordinary value object which does not have any complex business methods. What I really mean is that if you define such a method in the domain object, you will most likely end up duplicating that method in one of the Repository classes, in which the real logic is implemented.

In short, the whole idea of DDD with repository pattern seems attractive, but it is really hard to fully separate domain logic from persistent logic because in reality, persistent layer itself contains certain business logic, if it is not a lot!

Regards,
Zhonghua

Unknown said...

@zhong
regarding transaction control: Transaction control code is typically in the service layer, which offers coarse grained services using all domain level objects. Typically the service methods are annotated with transactions, if you use DI containers like Spring. Hence you can use all domain repositories and entities in the service layer to define your transaction boundaries.

regarding rich domain model: The entity Restaurant is the domain entity, which contains all the business logic. The example shown in the blog is a small one and hence none of the complex domain methods are shown there. And please note that RestaurantRepository is also a domain object of type "Repository" (as defined by Evans). Hence it is logical that domain logic will be there too. The persistence implementation is there in the other side of the Bridge rooted at RepositoryImpl.

Hope this helps.

Blair Yang said...

"regarding transaction control: Transaction control..."
I agree with your point. It's just when you are using both Hibernate and Spring at the same time, and you have only one database, naturally you'll use JDBC native transaction control. But sometimes it gets confusing whether we should implement it with Hibernate or with Spring.

"regarding rich domain model: The entity..."
It seems fair if you put Restaurant Repository as part of domain model. However, look at the code in RestaurantRepository:

return read("select r from Restaurant r where r.entrees.name like ?1",params);

This is definitely persistent logic.

In addition, if the repository is considered to be part of the domain, why did you separate Restaurant and RestaurantRepository into two classes? In my opinion, the method signatures in IRestaurantRepository seem to be part of a richer domain, and should be put together with the entity (operations and data in one place).

Unknown said...

@zhong:
re: code in RestaurantRepository .. return read("select ...")

I agree the query is persistence logic and embedded within a Java String. We can improve upon the separation by fetching the query string from a hash table and using named queries of JPA. But the most important point to note is that the entire domain model will have dependency on the interface IRestaurantRepository and have the implementation RestaurantRepository injected through a DI container. Have a look at the Spring beans configuration towards the end of the post. And IRestaurantRepository does not have any persistence logic in it. Thus we have a complete separation of persistence logic and the domain layer using the IRestaurantRepository as a top level domain contract.
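As a rough sketch of the hash table idea, the query text can live in a small catalog and the repository can look it up by name. QueryCatalog and the query name used below are hypothetical, introduced only for illustration; with JPA, named queries declared through @NamedQuery and created through EntityManager.createNamedQuery play a similar role.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalog that keeps query text out of the repository methods.
final class QueryCatalog {
  private final Map<String, String> queries = new HashMap<>();

  // Registers a query under a logical name; duplicate names are rejected.
  QueryCatalog register(String name, String queryText) {
    if (queries.containsKey(name)) {
      throw new IllegalArgumentException("Duplicate query name: " + name);
    }
    queries.put(name, queryText);
    return this;
  }

  // Returns the query text, failing fast on an unknown name.
  String lookup(String name) {
    String queryText = queries.get(name);
    if (queryText == null) {
      throw new IllegalArgumentException("Unknown query name: " + name);
    }
    return queryText;
  }
}
```

A repository method would then call something like read(catalog.lookup("Restaurant.byEntreeName"), params), so that only the catalog has to change when the query dialect changes.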

re: separation of Restaurant and RestaurantRepository classes

This is separation of concerns. The class Restaurant is the domain entity which is an Aggregate in DDD terms. It is the root of the business object and will contain all business rules related to the Restaurant domain concept. e.g. How a restaurant serves the entrees, how to compute the restaurant's price index etc. all of which are not part of this simple example. The class RestaurantRepository, OTOH, is an implementation of IRestaurantRepository, which is the root of what Evans calls Repository objects in DDD. The role of a Repository is to serve as the interface between the domain aggregates and the persistent database. The domain layer and the service layer will interact with the repositories to read/write data from the database. If, for some reason, tomorrow we change the implementation of the underlying persistence layer, we need to change *only* the repository implementation (RestaurantRepository, in this case). And since the rest of the application depends *only* on IRestaurantRepository, it remains unaffected. Nice separation of concerns .. huh!

HTH.

Unknown said...

I'm a DDD beginner and I'm trying to find practical ways of implementing it in my current work. I found this post very interesting, like others in your blog, but what about a JpaRepository that is not so Spring dependent? I would like to build a common library that offers infrastructure services like these.
Do you think that something like this could work?

public class JpaRepository extends RepositoryImpl {

  private EntityManager em;

  public void setEntityManager(EntityManager em) {
    this.em = em;
  }

  public <T> List<T> read(String query, Object[] params) {
    if (params == null) {
      params = ArrayUtils.EMPTY_OBJECT_ARRAY;
    }
    Query q = em.createQuery(query);
    for (int i = 0; i < params.length; i++) {
      q.setParameter(i + 1, params[i]);
    }
    @SuppressWarnings(value = "unchecked")
    List<T> resList = q.getResultList();
    return resList;
  }
}


I would also prefer to pass parameters as a Map<String, Object> rather than as an Object[], referring to them by name rather than by position. That should be easy, I suppose, or am I completely wrong?

Best Regards

Domenico

Unknown said...

@Domenico:
You have a perfectly OK approach going there. It is very much possible to write plain JPA applications without using any Spring dependencies. The code snippet that you have come up with also looks very much ok .. only a couple of observations :
a) You can use the annotation javax.persistence.PersistenceContext for the injected EntityManager.

@PersistenceContext
private EntityManager em;

This annotation belongs to javax.persistence and does not bind your code to Spring. Also, please note that Spring can understand @PersistenceUnit and @PersistenceContext annotations both at field and method level if a PersistenceAnnotationBeanPostProcessor is enabled.

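b) You can also use the @Repository annotation for exception translation: with a PersistenceExceptionTranslationPostProcessor registered, Spring translates provider-specific persistence exceptions into its own DataAccessException hierarchy. This annotation comes with Spring, but the exception translation is a good value add.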

c) You can very well use the Map parameter and design your APIs around it. Mine was only a sample approach of how to get the implementation of a Repository decoupled from the interfaces.

On the whole, your approach looks perfectly ok. Happy DDDing !

Cheers.

Setya said...

Debasish,

In your example, Restaurant is a root aggregate.

My question :

1. Is it against DDD principle if in RestaurantRepository I provide method to obtain Entrees with Restaurant/RestaurantId as parameter ?

Something like :

List getEntrees(Restaurant restaurant).

or

List getEntrees(String restaurantId).

2. If the answer is yes, since I have to obtain Entrees through Restaurant, how do I add/remove Entrees individually ?

Thanks and Regards,

Setya

Unknown said...

@Setya:
Restaurant is the root aggregate and RestaurantRepository gives me all access to the root. Hence it is perfectly OK to have the following method as part of the RestaurantRepository contract :

List<Entree> getEntrees(final Restaurant r)

This method retrieves the list of entrees which the Restaurant r serves.

Regarding your other confusion, it is also perfect to have the following method which takes away an individual entree being served from a restaurant :

void removeEntree(Restaurant r, final Entree e), which removes Entree e from being served by Restaurant r.

HTH.

Setya said...

Debasish,

Thank you for your explanation.

My confusion originates from statement in DDD that "you can only obtain reference to an object from its aggregate root" which I conclude that you should not have the following method:

List getEntrees(final Restaurant r)

but you can only have the following method:

Restaurant getRestaurant(String id)

then you must retrieve the entry this way:

Restaurant r = getRestaurant(id);
List entrees = r.getEntrees();
Entry entry = entrees.get(0);

From your explanation, my conclusion above is wrong and I'm glad I'm wrong since it feels like too much work just to obtain Entry from a Restaurant.

But is it also OK if I modify your method to this:

void removeEntree(String entryId)

to remove the entry since entryId is unique no matter which restaurants the entry belongs to ?

Unknown said...

for Setya:

In DDD, the basic concept behind repository is that the *root* repository is your entry point to access the aggregate. It does not necessarily mean that you will have to get the root aggregate item first and do the traversal yourself. The repository serves as the facade and provides you controlled access to the whole aggregate. This explains the validity of the method

List getEntrees(final Restaurant r)

as part of the contract for RestaurantRepository.

Regarding removeEntree, I think we are talking about different things. The method

void removeEntree(Restaurant r, final Entree e)

removes the particular entree e from being served by the restaurant r, i.e. it removes the association between the restaurant and the entree. At the database level it removes the association entry from the many-to-many association table. This will not remove the entree from the master table (Entree).

If you are talking about removing the entree altogether, then this is a different use case and we need to have the Entree as the aggregate root for this. This method will be part of the EntreeRepository. One entity can both serve as the aggregate root in one chain and as a non-leaf member in another.
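The difference between the two use cases can be sketched in memory. This is only an illustration under my own assumptions: the classes below are simplified, hypothetical stand-ins (no JPA, no database), with a set playing the role of the Entree master table and the entrees held by a Restaurant playing the role of the association table.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the Entree entity.
final class Entree {
  final String name;
  Entree(String name) { this.name = name; }
}

// Simplified stand-in for the Restaurant aggregate root.
final class Restaurant {
  final String name;
  private final Set<Entree> entrees = new HashSet<>();

  Restaurant(String name) { this.name = name; }

  void serve(Entree e) { entrees.add(e); }
  boolean stopServing(Entree e) { return entrees.remove(e); }
  boolean serves(Entree e) { return entrees.contains(e); }
}

// Stands in for the Entree master table.
final class InMemoryEntreeRepository {
  private final Set<Entree> all = new HashSet<>();

  void add(Entree e) { all.add(e); }
  boolean contains(Entree e) { return all.contains(e); }

  // Removing the entree altogether is this repository's use case; a real
  // implementation would also have to deal with restaurants still linked to it.
  void remove(Entree e) { all.remove(e); }
}

final class InMemoryRestaurantRepository {
  // Removes only the Restaurant-Entree association, never the Entree itself.
  void removeEntree(Restaurant r, Entree e) { r.stopServing(e); }
}
```

After removeEntree(r, e), the entree is still known to the entree repository and can still be served by other restaurants.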

Cheers.

Setya said...

Debasish,

In DDD, the basic concept behind repository is that the *root* repository is your entry point to access the aggregate. It does not necessarily mean that you will have to get the root aggregate item first and do the traversal yourself. The repository serves as the facade and provides you controlled access to the whole aggregate. This explains the validity of the method,

Now I start to see the light, so for each aggregate root we should have 1 repository ?

Regarding removeEntree, I think we are talking about different things. The method

void removeEntree(Restaurant r, final Entree e)

removes the particular entree e from being served by the restaurant r, i.e. it removes the association between the restaurant and the entree. At the database level it removes the association entry from the many-to-many association table. This will not remove the entree from the master table (Entree).

Sorry, I guess I confuse your example with Order & OrderItem relationship.

Using Order & OrderItem as an example I want to clarify my early question about removing OrderItem from OrderRepository using method:

void removeOrderItem(String orderItemId)

instead of

void removeOrderItem(Order o,OrderItem oi)

The orderItemId is unique no matter which Order the OrderItem belongs to.

Regards,

Setya

Unknown said...

Reading about aggregates in DDD, I was tempted to draw a parallel between this concept and the distinction UML makes between Composition and Aggregation associations. I understand that, to manage the children's lifecycle in the case of composition, it is safer to delegate all CRUD operations on the children to the aggregate root.

This is not the case for the Restaurant - Entree association. This is an aggregation, so I can understand your suggestion to manage the association lifecycle through RestaurantRepository, leaving an EntreeRepository to deal with the lifecycle of the Entree entity. As you said, removing the Entree entity is another use case. From a conceptual standpoint the Entree lifecycle is neither nested in nor dependent on the Restaurant one. Also, many Restaurants can serve the same Entree, as expressed through the JPA @ManyToMany annotation. The relationship is owned by the Restaurant entity at the class level and by a many-to-many association table at the database level, so its lifecycle is managed by the Restaurant aggregate root.

But it is the simpler case of Order - OrderItem that troubles me. It is a composition, I think, because an OrderItem makes sense only when there is an Order for it to belong to. It is a OneToMany association in which one Order owns a relation with one or more OrderItems. The relation could also be bidirectional, if useful, so that the item can retrieve its Order through its ManyToOne association.

1) Following a responsibility assignment principle, shouldn't the responsibility for the OrderItem entity lifecycle belong to the Order entity, which acts as the aggregate root?
In this case the root should manage not only the lifecycle of the associations with its children, as in the case of Restaurant, but also the lifecycle of the children themselves. A removeOrderItem use case should then not only remove the association between the parts but also cascade to the child item, deleting it from the OrderItem table. What is the best way to accomplish this is another question, which brings me to the next point.

2) Who is the right owner of this association? The aggregate root, I suppose. This seems right at the class level, but at the E-R level it must be resolved by adding an association table to manage the relation. Again, if the association is a composition, we need to find a way to extend the cascade action from the association table to the child table. Suppose that our Order class looks like:

@Entity
public class Order {

  @OneToMany(cascade = CascadeType.REMOVE)
  private Set<OrderItem> items;

  ...

}

@Entity
public class OrderItem {
...
}


The JPA CascadeType.REMOVE cascade only takes effect when the Order itself is removed; it does not delete an OrderItem that is merely taken out of the collection, so the row stays in the OrderItem table. We could easily manage this by adding a persistence provider specific annotation to our field, like this:

@Cascade(value = {CascadeType.DELETE_ORPHAN})


but in this way we are binding our domain class to a specific implementation (Hibernate in this case). Not so clean :(

But what if we move the responsibility for owning the relation with the Order class to OrderItem?

@Entity
public class Order {
...
}

@Entity
public class OrderItem {

  @ManyToOne
  @JoinColumn(name = "fk_order_id")
  private Order order;

  ...
}


Pros: We remove the provider dependency. We no longer need the association table in the database. We gain scalability, because we don't need to load the full set of items into memory when all we need are simple operations on a small subset of items.
Cons: We need to move all operations on children from the root to the repository class: not only lifecycle operations (this could be fine) but also potential business logic operations like getTotalPrice() that involve the child items (this is very bad). We also constrain OrderItem with a dependency on the Order class (not so bad IMHO if the association is a composition, but not very manageable).
The worst problem could be resolved by injecting a dependency on an OrderRepository into the Order entity, but how can this be done in a clean way, and how big is such a dependency?

Sorry for the big post

Best Regards
Domenico

Anonymous said...

Thanks for the nice post!

Robert Hafner said...

Unfortunately the use of the bridge pattern has not completely removed the dependence on the repository implementor. For instance the code in RestaurantRepository contains a method which hard codes a JPA query. This query will have to be changed for a JDBCRepository implementation. So you will not simply be able to switch the repository implementation in the Spring config.

Anonymous said...

What about exception handling in repositories? As an example, findRestaurantByName(String name) should probably throw a domain exception instead of a DataAccess exception. What do you think?

/S

Unknown said...

@Sigmund: Sure .. repositories are domain artifacts - hence they should always throw domain exceptions. The post does not show this for brevity.
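One possible shape for this, as a sketch only: catch the infrastructure exception at the repository boundary and rethrow it as a domain exception. All the names below are hypothetical; PersistenceFailure merely stands in for whatever the persistence layer throws, and RestaurantFinder is a cut-down contract used to keep the example self-contained.

```java
import java.util.List;

// Hypothetical unchecked domain exception, phrased in domain terms.
class RestaurantLookupException extends RuntimeException {
  RestaurantLookupException(String message, Throwable cause) {
    super(message, cause);
  }
}

// Hypothetical stand-in for an exception thrown by the persistence layer.
class PersistenceFailure extends RuntimeException {
  PersistenceFailure(String message) {
    super(message);
  }
}

// Cut-down repository contract for the sketch.
interface RestaurantFinder {
  List<String> restaurantNamesBy(String name);
}

// Wraps another finder and translates infrastructure failures at the boundary.
final class TranslatingRestaurantFinder implements RestaurantFinder {
  private final RestaurantFinder delegate;

  TranslatingRestaurantFinder(RestaurantFinder delegate) {
    this.delegate = delegate;
  }

  public List<String> restaurantNamesBy(String name) {
    try {
      return delegate.restaurantNamesBy(name);
    } catch (PersistenceFailure e) {
      // Keep the original exception as the cause for diagnostics.
      throw new RestaurantLookupException(
          "Could not look up restaurants named '" + name + "'", e);
    }
  }
}
```

Clients then depend only on the domain exception, while the cause is still available for logging.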

@Robert: I agree. You can however get some more abstraction by taking the queries out of the codebase and having them as NamedQueries of JPA.

Anonymous said...

We have used a Repository abstraction over hibernate for 4 years, and I think it is the best thing since whipped cream.

You shouldn't create specific repositories like RestaurantRepository. We did that for a while, but then we found out there is only one repository: the EntityRepository, the one that keeps all the entities. (assuming you are using a single database.)

If you create generic IDs that, in addition to the Long value, also contain the type of the entity (no, not used in the database), you can have a single generic EntityRepository with this set of interfaces:

interface EntityId {
  Class getType();
  Long getKey();
}

interface Entity {
  EntityId getId();
}

interface EntityRepository {
  Entity get(EntityId id);
  EntityId store(Entity e);
  long count(EntitySpecification spec);
  Collection find(EntitySpecification spec);
}

interface HibernateEntitySpecification extends EntitySpecification {
  void populate(hibernate.Criteria c);
}

We use this a lot, and most of our unit tests actually run against a dummy implementation backed by 3 hashmaps (database, transaction, session) with no use of Hibernate or SQL. These are blazing fast (typical startup time 0.05 secs).

I just gave a talk at JavaZone on how we handle traversal of one-to-too-many relations on top of this fairly simplistic abstraction; the slides should be available any time now (www.javazone.no).
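A dummy implementation along these lines might look like the sketch below. It is a guess at the shape, not our actual code: it uses a single map instead of three, EntityId is a small value class rather than an interface, and the isSatisfiedBy method on EntitySpecification is an assumption added so that the in-memory version has something to evaluate.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Value class combining the entity type and the key (a class here, for brevity).
final class EntityId {
  private final Class<?> type;
  private final Long key;

  EntityId(Class<?> type, Long key) {
    this.type = type;
    this.key = key;
  }

  Class<?> getType() { return type; }
  Long getKey() { return key; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof EntityId)) return false;
    EntityId other = (EntityId) o;
    return type.equals(other.type) && key.equals(other.key);
  }

  @Override
  public int hashCode() { return Objects.hash(type, key); }
}

interface Entity {
  EntityId getId();
}

// isSatisfiedBy is an assumed method, not part of the interfaces shown above.
interface EntitySpecification {
  boolean isSatisfiedBy(Entity e);
}

// In-memory repository backed by one map (a simplification of the three-map setup).
final class InMemoryEntityRepository {
  private final Map<EntityId, Entity> entities = new HashMap<>();

  Entity get(EntityId id) { return entities.get(id); }

  EntityId store(Entity e) {
    entities.put(e.getId(), e);
    return e.getId();
  }

  Collection<Entity> find(EntitySpecification spec) {
    List<Entity> matches = new ArrayList<>();
    for (Entity e : entities.values()) {
      if (spec.isSatisfiedBy(e)) {
        matches.add(e);
      }
    }
    return matches;
  }

  long count(EntitySpecification spec) { return find(spec).size(); }
}
```

Because the id carries the entity type, one repository can serve every entity class, and a specification can filter on the type as easily as on any other property.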

İnanç Gümüş said...

"You shouldn't create specific repositories like ResturantRepository. We did that for a while, but then we found out there is only one repository: the EntityRepository, the one that keeps all the entities. (assuming you are using a single database.)"

So, if you do that, your domain layer will consist of many criteria objects, which the repository pattern also tries to encapsulate. Didn't this harm you?

Also, your code contains an EntitySpecification class which lets clients of the generic entity repository find almost any entity through it. And there is also a Hibernate implementation for that type. It contains a populate method with a Hibernate type Criteria. How do you generalize here? I mean, EntitySpecification.populate(x criteria); what type is x?