The normal use for the CASE expression in T-SQL is to perform simple modifications to one of the columns in a SELECT statement; the type of example given in MSDN show it being used to expand state abbreviations such as 'CA' to 'California', or to categorise products into 'Cheap' and 'Expensive' depending on their price. However, the CASE expression is a lot more flexible than that, and can be used to implement dynamic sorting and paging without dynamic SQL.
Taking the classic Northwind database, here is a query that can retrieve pages of customers, sorted by either contact name, company name or country, ordered either ascending or descending:
DECLARE @SortType TINYINT, @SortAscending BIT, @FirstRow INT, @MaxRows INT;
SELECT @SortType = 1, @SortAscending = 1, @FirstRow = 10, @MaxRows = 10;
WITH FoundCustomers AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY CASE @SortType
WHEN 0 THEN c.ContactName
WHEN 1 THEN c.CompanyName
WHEN 2 THEN c.Country
END ASC
) AS RowNumberAsc
,ROW_NUMBER() OVER
(
ORDER BY CASE @SortType
WHEN 0 THEN c.ContactName
WHEN 1 THEN c.CompanyName
WHEN 2 THEN c.Country
END DESC
) AS RowNumberDesc
,c.*
FROM
dbo.Customers c
)
SELECT
fc.*
FROM
FoundCustomers fc
WHERE
CASE @SortAscending
WHEN 1 THEN fc.RowNumberAsc
WHEN 0 THEN fc.RowNumberDesc
END
BETWEEN @FirstRow AND @FirstRow + @MaxRows - 1
ORDER BY
CASE @SortAscending
WHEN 1 THEN fc.RowNumberAsc
WHEN 0 THEN fc.RowNumberDesc
END;
Although this is an elegant solution, it does have some issues with performance; this particular query takes approximately twice as long as a statement hard-coded to use a given sort type and order. Looking at the execution plan shows the main cost in the query is the sorting (on my machine it is around 70% of the cost of the hard-coded query) and that even though RowNumberDesc is never used the sort orders are still computed for it, which is where the extra time is going.
In pursuit of performance, we're going to have to repeat most of the statement by breaking the query into separate ascending and descending ones. This isn't so good for maintenance, but as performance is critical in most databases the additional overhead of the combined query is the worse of two evils.
DECLARE @SortType TINYINT, @SortAscending BIT, @FirstRow INT, @MaxRows INT;
SELECT @SortType = 2, @SortAscending = 1, @FirstRow = 10, @MaxRows = 10;
IF @SortAscending = 1
BEGIN
WITH FoundCustomers AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY CASE @SortType
WHEN 0 THEN c.ContactName
WHEN 1 THEN c.CompanyName
WHEN 2 THEN c.Country
END ASC
) AS RowNumber
,c.*
FROM
dbo.Customers c
)
SELECT
fc.*
FROM
FoundCustomers fc
WHERE
fc.RowNumber BETWEEN @FirstRow AND @FirstRow + @MaxRows - 1
ORDER BY
fc.RowNumber;
END
ELSE
BEGIN
WITH FoundCustomers AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY CASE @SortType
WHEN 0 THEN c.ContactName
WHEN 1 THEN c.CompanyName
WHEN 2 THEN c.Country
END DESC
) AS RowNumber
,c.*
FROM
dbo.Customers c
)
SELECT
fc.*
FROM
FoundCustomers fc
WHERE
fc.RowNumber BETWEEN @FirstRow AND @FirstRow + @MaxRows - 1
ORDER BY
fc.RowNumber;
END;
Examining the execution plan for this query shows that it is almost exactly the same as the query hard-coded to a particular sort type and order, with the addition of a Compute Scalar step before the sort. This additional step takes less than 0.001% of the query time, so is a perfectly acceptable addition. We can infer from this that SQL Server can optimise a statement ordered by a CASE expression equally as well as one ordered by a hard-coded column.
The typical approach for dynamic sorting and paging that I've seen in the past has used dynamic SQL with sp_executesql to achieve the results. By using the CASE expression instead you can make your stored procedures verifiable at design time which aids database refactoring immensely, and achieve the same performance as hard-coded queries.
Update: 28 July 08
It turns out you actually don't have to split the query into two to get optimal performance; my colleague Jason was playing around with this some more and came up with a combined query using multiple searched CASE expressions rather than the a single simple CASE expression as shown above. Amazingly this is actually legal syntax, and gives the same optimal execution plan as the simple form:
DECLARE @SortType TINYINT, @SortAscending BIT, @FirstRow INT, @MaxRows INT;
SELECT @SortType = 2, @SortAscending = 0, @FirstRow = 10, @MaxRows = 10;
WITH FoundCustomers AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY
CASE WHEN @SortType = 0 AND @SortAscending = 1 THEN c.ContactName END ASC
,CASE WHEN @SortType = 0 AND @SortAscending = 0 THEN c.ContactName END DESC
,CASE WHEN @SortType = 1 AND @SortAscending = 1 THEN c.CompanyName END ASC
,CASE WHEN @SortType = 1 AND @SortAscending = 0 THEN c.CompanyName END DESC
,CASE WHEN @SortType = 2 AND @SortAscending = 1 THEN c.Country END ASC
,CASE WHEN @SortType = 2 AND @SortAscending = 0 THEN c.Country END DESC
) AS RowNumber
,c.*
FROM
dbo.Customers c
)
SELECT
fc.*
FROM
FoundCustomers fc
WHERE
fc.RowNumber BETWEEN @FirstRow AND @FirstRow + @MaxRows - 1
ORDER BY
fc.RowNumber;
Posted
Jul 21 2008, 11:04 PM
by
Greg Beech