Apply Now  | Knowledge Base  |  About Us | Locations |  Contact Us
  Knowledge Base

About .NET
LINQ
WPF
About SQL Server 2005
About Business Intelligence
Technical Articles
.NET Articles
SQL Server 2005 Articles
SharePoint 2007 Articles
Business Intelligence Articles
Online Resources





 Live chat by LivePerson



Take Five with SQL Server 2005

New T-SQL Capabilities in SQL Server 2005 - Part 3 of 3
Kevin S. Goff, Microsoft MVP 2007 

kick it on DotNetKicks.com 

In last month's article, I covered the new APPLY operator, Ranking functions, and the new TRY...CATCH capabilties in T-SQL 2005.  This month I'll talk about two of the largest enhancements in the SQL 2005 language - Common Table Expressions (CTEs) and Recursive Queries. 

Both of these are part of the SQL-99 standard.  CTEs allow developers to more expressively identify a temporary result set (similar to a derived table or temporary view).  Recursive Queries allow developers to more easily query hierarchies of data, by permitting developers (in a single SQL statement) to recursively query into a CTE that's also being populated).

CTEs and Recursive queries can be a little difficult to initially grasp, so I'm going to show several code examples to demonstrate their value.

1 - Common Table Expressions (first example)

Suppose we wanted to find all the orders in the Northwind Orders database, where the Freight was greater than $500, and the Freight exceeded the order amount. Now, this is a pretty basic example, but will show how we can use CTEs.

        WITH OrderCTE  (OrderID, CustomerID, OrderDate, Amount,Freight) AS
            (SELECT OH.OrderID, CustomerID,  OrderDate, (Unitprice * Quantity) AS Amount, Freight 
                       FROM Orders OH 
                           JOIN [dbo].[Order Details] OD  ON OH.OrderID = OD.OrderID 
                                 WHERE Freight > 500)

        SELECT * FROM OrderCTE WHERE Freight > Amount ORDER BY Freight DESC

In the query above, we declared a CTE called OrderCTE  (which, again, you can think of as a temporary result set) with the following line:

        WITH OrderCTE  (OrderID, CustomerID, OrderDate, Amount,Freight) AS

The remaining part query inside the CTE will populate the CTE with orders greater than $500. The subsequent statement queries the CTE for orders where the freight exceeds the amount.

        SELECT * FROM OrderCTE WHERE Freight > Amount ORDER BY Freight DESC

We certainly could have done this all with one query.  But some would argue that the CTE approach is a little more readable.  This becomes especially true with more complicated derived table queries.

One thing to note: the CTE has a very short shelf-life.  How short?...only for the statement following the creation of the CTE.  But if you want to break up complicated derived table statements for readability, CTEs are very helpful.

2 - Common Table Expressions (second example)

Suppose we wanted to run the same query for only those customers outside the United States that have at least 20 Orders. We can build 2 CTEs as follows, one for a list of Customers (CustomerCTE) and the other for Orders (OrderCTE), and then join them at the end:

        WITH CustomerCTE (CustomerID, CompanyName,OrderCount) AS
                  (SELECT CustomerID, CompanyName, 
                        (SELECT COUNT(*) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS OrderCount 
                                FROM Customers WHERE Country <> 'USA'),

                OrderCTE  (OrderID, CustomerID, OrderDate, Amount,Freight) AS
                        (SELECT OH.OrderID, CustomerID,  OrderDate, (Unitprice * Quantity) AS Amount, Freight 
                               FROM Orders OH 
                               JOIN [dbo].[Order Details] OD  ON OH.OrderID = OD.OrderID 
                                     WHERE Freight > 500)

        SELECT CustomerCTE.CustomerID, CompanyName, OrderID, OrderDate, Amount, Freight, OrderCount
                  FROM OrderCTE JOIN CustomerCTE ON CustomerCTE.CustomerID = OrderCTE.CustomerID
         WHERE Freight > Amount  AND OrderCount >= 20 ORDER BY Freight DESC

In the example above, I've created a 2nd CTE to retrieve the customers outside the US.  The CTE includes the order count for each customer:

        WITH CustomerCTE (CustomerID, CompanyName,OrderCount) AS
                  (SELECT CustomerID, CompanyName, 
                        (SELECT COUNT(*) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS OrderCount 
                                FROM Customers WHERE Country <> 'USA'),

At the end, we've created two CTEs that we can query for the Orders where the freight exceeds the amount, and the order count is greater than 20.

        SELECT CustomerCTE.CustomerID, CompanyName, OrderID, OrderDate, Amount, Freight, OrderCount
                  FROM OrderCTE JOIN CustomerCTE ON CustomerCTE.CustomerID = OrderCTE.CustomerID
         WHERE Freight > Amount  AND OrderCount >= 20 ORDER BY Freight DESC

3 - Recursive Queries (first example)

The first two examples used the basic capabilties of Common Table Expressions.  Now we'll expand our use of CTEs to cover recursive queries.

Let's start with a simple but meaningful example for recursively querying data: suppose we have a Product table that stores hierarchical data, with ParentID pointers to the parent level for each record:

        DECLARE @tProducts TABLE (ID int, Name char(50), ParentID int)

        INSERT INTO @tProducts VALUES (1, 'Brand 1', null)
        INSERT INTO @tProducts VALUES (2, 'Brand 2', null)
        INSERT INTO @tProducts VALUES (3, 'Brand 3', null)

        INSERT INTO @tProducts VALUES (6, 'Brand 1, Group 1', 1)
        INSERT INTO @tProducts VALUES (7, 'Brand 1, Group 2', 1)
        INSERT INTO @tProducts VALUES (8, 'Brand 1, Group 3', 1)

        INSERT INTO @tProducts VALUES ( 9, 'Brand 2, Group 1', 2)
        INSERT INTO @tProducts VALUES (10, 'Brand 2, Group 2', 2)

        INSERT INTO @tProducts VALUES ( 11, 'Brand 3, Group 3', 3)

        INSERT INTO @tProducts VALUES (12, 'Brand 1, Group 1, Item 1', 6)
        INSERT INTO @tProducts VALUES (13, 'Brand 1, Group 1, Item 2', 6) 

        INSERT
INTO @tProducts   VALUES (14, 'Brand 1, Group 1, Item 1, SKU 1', 12)

The table above stores a product hierarchy (Brand, Group, Item, and SKU), with one or more rows at each level, and a ParentID reference to the corresponding parent level.

Now suppose we want to query the table for a particular single record, and also retrieve all the parent records.  To do so, we need two queries: a query for a single record, and then a query that retrieves the parent records for the single record, and then the grandparent records for those parent records, etc.  The full syntax for such a query in SQL 2005 is as follows:

        DECLARE @cSearch char(50)
        SET @cSearch = 'Brand 1, Group 1'

        ; WITH ProductCTE (ID, Name, ParentID) AS 
            -- Anchor query
            (SELECT ID, Name, ParentID 
                    FROM @tProducts
                    WHERE Name = @cSearch 
                UNION ALL
                    -- Recursive query
                    SELECT Prod.ID, Prod.Name, Prod.ParentID 
                        FROM @tProducts Prod 
                            INNER JOIN ProductCTE
                            ON ProductCTE.ID = Prod.parentID )         

        SELECT * FROM ProductCTE

Let's break this query down - there are essentially three pieces to it:

First, we define the CTE:

        ; WITH ProductCTE (ID, Name, ParentID) AS

Second, we perform the main (or "anchor") query, to retrieve the specific row  (for example, "Brand 1, Group 1")

            (SELECT ID, Name, ParentID 
                    FROM @tProducts
                    WHERE Name = @cSearch

Finally (and perhaps most important), we add our recursive query to the anchor query (with a UNION ALL statement).  The recursive query allows us to query against the ProductCTE while we also query against the ProductCTE and the source table for records where the ProductCTE.ID equals the ParentID column of the source table.  This process continues until all searches are exhausted.

                UNION ALL
                    -- Recursive query
                    SELECT Prod.ID, Prod.Name, Prod.ParentID 
                        FROM @tProducts Prod 
                            INNER JOIN ProductCTE
                            ON ProductCTE.ID = Prod.parentID )

This will give us a result of  the following:

            ID            Name                                                    ParentID

              6            Brand 1, Group 1                                              1
            12            Brand 1, Group 1, Item 1                                  6
            13            Brand 1, Group 1, Item 2                                  6
            14            Brand 1, Group 1, Item 1, SKU 1                     12

Before we move on to the next example - suppose that instead of querying for all the parent records, we wanted to query to retrieve all the child records instead?  The answer is simple: we reverse the queries on ID and ParentID, as follows:

        SET @cSearch = 'Brand 1, Group 1, Item 2'

        ; WITH ProductCTE AS 
            -- Anchor query
            (SELECT ID, Name, ParentID   -- Retrieve the ParentID
                    FROM @tProducts
                    WHERE Name = @cSearch 

                UNION ALL
                    -- Recursive query
                    SELECT Prod.ID, Prod.Name, Prod.ParentID 
                        FROM @tProducts Prod 
                            INNER JOIN ProductCTE
                            ON ProductCTE.ParentID = Prod.ID )    -- reverse the join keys

        SELECT * FROM ProductCTE

5 - Recursive Queries (second example)

In my first article in this series, I covered the new XML capabilities in SQL Server 2005, and presented a reusable table-valued UDF to convert an XML string into a table variable.  Many developers need to handle stored procedure queries against a variable number of selection keys, and use either XML strings or comma-separated lists.  We can use recursive queries to build a reusable table-valued UDF from a CSV.

For example, suppose we have a stored procedure that must query a table based on a variable number of user selections.  An end-user selects seven items from a list (though it could be more or less than seven).  Instead of sending up an XML string and using the XML2TableVar function we covered in the first article, we want to send up a CSV instead that looks like the following:

    ( '1, 11, 111, 222, 333, 444, 555' )

Our goal is to create a table variable with those seven key values, so that we can query a table containing those key values. Prior to SQL 2005, we would parse the string, one comma at a time, and retrieve each value between two commas.  In SQL 2005, we can utilize recursive queries to build a list of all the comma positions in the table, and then query the CSV string for each value within the column position pairings.

        CREATE FUNCTION [dbo].[CSVToTable] 
           (@CSVList varchar(8000) )
        RETURNS 
            @tPKList TABLE  ( IntPK int )
        AS
        BEGIN

                  WITH CSVCte (StartPos, EndPos) AS 
                           ( SELECT 1 AS StartPos, CHARINDEX(',' , @CSVList + ',') AS EndPos
                                     UNION ALL
                                              SELECT EndPos + 1 AS StartPos , CHARINDEX(',',@CSVList + ',' , EndPos + 1) AS EndPos FROM CSVCTE 
                                                        WHERE CHARINDEX(',', @CSVList + ',', EndPos + 1) <> 0)

                  INSERT INTO @tPKList  
                           SELECT SUBSTRING(@CSVList, StartPos,EndPos-StartPos) FROM CSVCte

           RETURN 
        END

        SELECT * FROM dbo.CSVToTable('1, 11, 111, 222, 333, 444, 555')

To do this, first we create the CTE (called CSVCTE),  which will store all the comma positions in the string.

        WITH CSVCte (StartPos, EndPos) AS

The main query (anchor query) will retrieve the first comma position pair by using CHARINDEX.

        ( SELECT 1 AS StartPos, CHARINDEX(',' , @CSVList + ',') AS EndPos

The recursive query will pull all subsequent comma positions, by incrementing the End Position by 1 each time, until all comma positions have been found.

        UNION ALL
            SELECT EndPos + 1 AS StartPos , CHARINDEX(',',@CSVList + ',' , EndPos + 1) AS EndPos FROM CSVCTE 
                    WHERE CHARINDEX(',', @CSVList + ',', EndPos + 1) <> 0)

The final result of the CSVCTE CTE is the following start position/end position pairs:

            1         2
            3         6
            7         11
            12       16
            17       21
            22       26
            27       31

Finally, we can use the SUBSTRING function to query each pair combination in the CTE against the original string, to retrieve each substring based on each start/end position:

                  INSERT INTO @tPKList  
                           SELECT SUBSTRING(@CSVList, StartPos,EndPos-StartPos) FROM CSVCte

Hopefully this second example of recursive queries gives you some insight into their power. Let's take a look at one final example that can help solve a common reporting issue.

5 - Recursive Queries (third example)

Recently I had to build a result set of weekly sales orders. Even if a week did not have any orders, I still needed to include a row for each week-ending date, regardless of whether any sales activity occured.  Before SQL Server 2005, I would build a temporary table with a row for each week ending date, and then perform an outer join between that temporary table and my table of orders.  Let's take a look at a version of the code using recursive queries.

Before we start, let's take a look at our result set requirements.   We want to show a row for each week-ending date: our sales weeks are defined as Sunday through Saturday.  Since an order can occur any day of the week, we need to summarize the orders by the week-ending date.  So we need to take a little detour first, by writing a function to receive a date, and return the corresponding week-ending date (Saturday).   The function below (GetEndOfWeek) will take a date and use the SQL DatePart function to determine the date's day of the week, and add the necessary number of additional days to make it a Saturday date.

       CREATE FUNCTION dbo.GetEndOfWeek 
          (@dDate DateTime)

        -- Converts date to the Saturday date for the week

        RETURNS DateTime AS  
            BEGIN 
                  DECLARE @dRetDate DateTime
                  SET @dRetDate = 
                      @dDate + ( 7-DATEPART(WeekDay, @dDate))

                  RETURN @dRetDate

        END

Next, we need to declare some start and end dates, and create some sample data.  Note that we want to run the result set for the first quarter of 2007, but need to adjust the dates accordingly:

        DECLARE @StartDate DATETIME, @EndDate DATETIME
        SET @StartDate = dbo.GetEndOfWeek('1-1-2007')   -- First Sunday
        SET @EndDate = dbo.GetEndOfWeek('3-31-2007')   -- Last Saturday

        DECLARE @Orders TABLE (OrderDate datetime, Amount money)
        INSERT INTO @Orders VALUES ('20070201', 800)
        INSERT INTO @Orders VALUES ('20070202', 1100)
        INSERT INTO @Orders VALUES ('20070302', 2100)
        INSERT INTO @Orders VALUES ('20070301', 600)
        INSERT INTO @Orders VALUES ('20070303', 4100)

Now we need to build a CTE that creates a date entry for each week ending date in the date range.  Our main/anchor query will SELECT the startdate, and the recursive query will select each subsequent week-ending date plus 7 days, until we reach the end date.

        ;WITH DateCTE(WeekEnding)  AS
                   (SELECT @StartDate AS WeekEnding
                             UNION ALL 
                                      SELECT WeekEnding + 7 AS WeekEnding FROM DateCTE WHERE WeekEnding <= @EndDate )

Finally, we can query the DateCTE againist the Orders table, using an outer join. 

        SELECT WeekEnding, SUM(COALESCE(Orders.Amount,0)) AS WeeklySales 
                           FROM DateCTE 
                                     LEFT JOIN @Orders Orders ON dbo.GetEndOfWeek(Orders.OrderDate) = DateCte.WeekEnding 
                   GROUP BY DateCte.WeekEnding ORDER BY DateCte.WeekEnding
                    OPTION (MAXRECURSION 1000)

Note the OPTION MAXRECURSION statement: by default, SQL Server 2005 supports a maximum of 100 recursion levels, but we can increase that.

About the author:  

Kevin S. Goff has been a Microsoft C# .NET MVP for the last three years, from 2005 through 2007. He is a regular columnist for CoDe Magazine, for the Baker’s Dozen Productivity Tips. He is also the author of Pro VS 2005 Reporting using SQL Server and Crystal Reports, published by Apress. He is the principal consultant for Common Ground Solutions, and specializes in solutions in .NET, SQL Server, and Crystal Reports. His main site is www.commongroundsolutions.net, and his blog is www.TheBakersDozen.net .

Email this Author!

Related Links


   Email this pageprinter-friendly version   

Add the above content to:


 Google   Y! MyWeb   Furl   Simpy   Spurl   Blink  Del.icio.us   Digg  add to technorati Technorati


Microsoft Certification | Visual Studio 2005 | SQL Server 2005 |.NET Framework
HomeApply Now  | Master's Program℠  |  Corporate Services  |  .NET Courses  |  .NET Resources  |  Site Map  |  Trademarks