Sunday, February 8, 2009

Select Top X from Grouped Results

Every now and then because of a particular design I need to get top X number from grouped results.

In this particular case, each record has a status history with it and the status changes are tracked in a related history table.
In this case we'll call it

I want the most recent status from the JobStatusHistory, but there can be many status changes per record. Luckily the solution is easy (there are others including cursor based ways, but this is a fairly elegant solution)

We can use the little known row_number over(parition by XX) t-sql command
It provides great functionality but you really don't see it too much.

This gets me all results plus a field named 'rownum' (you can call it anything since I'm just aliasing it here as another field)

select id,row_number()over(partition by jobid order by createddate desc) rownum
from jobstatushistory

So having numbered results, we can use this as a where clause to get each result.

select jobid,newstatus from(
select jobid,newstatus,row_number()over(partition by jobid order by createddate desc) row_num
from jobstatushistory) t
where row_num=1

This will number the rows and get only the first one from each group (in this case 'grouped' is ordered by created date, rather than an actual 'group by' clause.

1 comment: