Julian Jelfs’ Blog

NHibernate – removing duplicates combined with paging

Posted in NHibernate by julianjelfs on April 3, 2009

The criteria API for NHibernate gives us a really easy way to perform paged queries. Combine this with .Net extension methods and you can get something quite neat. I use the following extension method to page my criteria queries:

public static ICriteria Page(this ICriteria criteria, int pageIndex, int pageSize)
{
    return criteria.SetFirstResult(pageIndex*pageSize)
        .SetMaxResults(pageSize);
}

This means that I can write queries like this:

_session.CreateCriteria(typeof(IResource))
            .Page(pageIndex, pageSize)
            .List<IResource>());

Which is nice. However, when paging a result set you may encounter a problem in that some queries will generate duplicates. In the example above I am returning “resources”. Let’s assume that resources can belong to one or more “groups”. If I want to restrict my query to resources that belong to groups with some characteristic then I am potentially going to get duplicate results. If I don’t mind duplicates then I have no problem, but if I want to remove them then I have to be careful.

The easiest way to remove duplicates in NHibernate is to use a DistinctRootEntityTransformer like this:

_session.CreateCriteria(typeof(IResource))
        .CreateCriteria("Groups")
        .SetResultTransformer(new DistinctRootEntityResultTransformer())
        .List<IResource>();

The problem is that this removes duplicates in the code after the query has been executed. This is problematic if you are combining it with paging because the query will return you a certain number of records (a page full) and then the transformer will potentially remove some of those records and spoil your paging. So if you want to take advantage of the database for paging rather than doing the paging in memory (not always an option for large datasets) then you need a different strategy.

Obviously it is necessary to do the “de-duping” as part of the query. One way to do this is to use the DetachedCriteria capability in NHibernate. The idea is to create a DetachedCriteria which returns a projection of just the distinct ids of the entities that you want to retrieve (in this case resources). All of your restrictions are applied to the DetachedCriteria:

var resIds = DetachedCriteria.For<Resource>()
        .Add(Restrictions.EqProperty("id", "res.id"))
        .CreateCriteria("Groups")
        .SetProjection(Projections.Distinct(Projections.Id()));

So this gets us the list of Ids we want without any duplicates. But how to translate that into a list of domain objects with no duplicates, paged correctly.

_session.CreateCriteria(typeof(IResource), "res")
        .Page(pageIndex, pageSize)
        .Add(Subqueries.PropertyIn("res.id", resIds))                    
        .List<IResource>());

Using a correlated subquery we can restrict the query to just the distinct Ids returned by our DetachedCriteria and perform the paging all as part of the same query. Nice. The resultant sql looks something like this:

SELECT TOP 10 this_.resourceid       AS resourceid13_0_,
              this_.name             AS name13_0_,
              this_.firstname        AS firstname13_0_
FROM   rm_resource this_
WHERE  EXISTS (SELECT DISTINCT this_0_.resourceid AS y0_
               FROM   rm_resource this_0_
                      INNER JOIN rm_resourcegroup resourcegr1_
                        ON this_0_.resourceid = resourcegr1_.resourceid
               WHERE  this_0_.resourceid > @p0
                      AND this_0_.resourceid = this_.resourceid)

I’ve removed some of the bits form the select clause just to save space but you get the idea.

Advertisements
Tagged with:

3 Responses

Subscribe to comments with RSS.

  1. adrian said, on July 20, 2009 at 5:20 pm

    thnkss julian…just what i was looking forr!!! (y)

  2. kevin said, on September 18, 2009 at 2:34 am

    Thanks for this post. Adapted for my use case, with the paging on the detached criteria portion, and results come back as expected.

  3. […] I found a another solution here too- not sure this one works though, ultimately I had to come up with a solution that suited my project. […]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: