|
Take
Five with SQL Server 2005
New T-SQL Capabilities in SQL Server 2005 Part 1 of 3
Kevin S. Goff, Microsoft MVP 2007
In
last month's article, I presented some new XML capabilities that
developers can use in SQL Server 2005. This month I'll start a three-part
series on new Transact-SQL language features in SQL Server 2005. Microsoft
added many new language features in SQL Server 2005, in response to developer
requests and also in following compliance with the SQL-99 standard.
This month, we'll cover three new language features:
-
The new PIVOT
capability to convert rows of data into columns
-
The new OUTPUT
statement to immediately return data after an INSERT/UPDATE statement
-
A new TOP N enhancement that allows a developer to specify an
expression as the number parameter
1 - PIVOT
One of the most common tasks that application developers face is translating raw
database rows into some type of analytical view. Users often want to see
data summarized by quarter, or by month, or by some other category with
business significance. Making this happen requires converting rows of data into
columns. In SQL Server 2000, developers basically had two choices:
Query raw SQL data down to the application layer and write code in C# or VB or
some other language to further summarize the data into the required format
Use the T-SQL CASE statement to place the raw data into the required column
format, based on the corresponding condition. For example, suppose you
had a table of OpenBalances, with a column for the DueDate and the BalanceOwed,
and wanted to produce an aging-style report to show open balances by the
standard aging brackets (1-30 days, 31-60, 61-90, etc.), based on an "as of"
aging date variable.
SELECT SUM(CASE WHEN DueDate BETWEEN @dAgingDate-30 AND @dAgingDate
THEN BalanceOwed ELSE 0 END) AS Age30 ,
SUM(CASE WHEN DueDate BETWEEN @dAgingDate-60 AND @dAgingDate-31
THEN BalanceOwed ELSE 0 END) AS Age60 ,
SUM(CASE WHEN DueDate BETWEEN @dAgingDate-90 AND @dAgingDate-61
THEN BalanceOwed ELSE 0 END) AS Age90
FROM OpenBalances
While both approaches work, SQL Server 2005 provides a new PIVOT keyword
to more cleanly and easily accomplish this task. Simply stated,
PIVOT allows developers to turn rows of data into columns. Let's
take a look at two code samples using PIVOT. In our first example, let's
take the orders from the Northwind Orders database, and produce a result set
that summarizes order amounts by quarter for each customer, for the year 1997.
sp_dbcmptlevel
Northwind,
90 --
necessary on older databases
USE northwind
-- Create a table variable to hold the results
DECLARE
@tQtrPivotedTable
TABLE
(CustomerID char(25),
M_Q1 Money,
M_Q2 Money,
M_Q3 Money,
M_Q4 Money)
INSERT
INTO
@tQtrPivotedTable
SELECT
*
FROM
(SELECT
CustomerID,
DATEPART
(q,OrderDate)
as
OrderQtr,
-- grab the Quarter of the month, using
DateParts
(UnitPrice
* Quantity)
as
Amount
FROM
orders OH
JOIN [dbo].[Order
Details] OD
on OH.Orderid
=
OD.orderid
WHERE
Year(OrderDate)=1997)
AS
TempOrders
PIVOT
( SUM(Amount)
FOR
OrderQtr
In
( [1],[2],[3],[4]))
As
X
-- Pivot the Sum of the amount, based on the Quarter being 1 of the 4 values
SELECT
*
FROM @tQtrPivotedTable
-- Dump out the results
The code above determines the corresponding quarter for each Order Date, using
the SQL DatePart function. The code then pivots (i.e. converts rows into
columns) the sum of Order Amounts based on the Quarter being one of four
values. If you wanted to summarize the customer orders by Month, you
would do three things: change the table variable to store one column for each
month, use the DartPart of month (m) instead of quarter (q) and change the
references from OrderQtr to OrderMonth, and PIVOT for the OrderMonth in ([1],
[2], [3]......[12]).
It's critical to note that the list of IN values MUST be a pre-determined,
hard-coded value. The value you PIVOT for (in this case, OrderQtr) must equal
one of the enclosed values. If the list of enclosed values is variable, you
must construct the query string manually and use dynamic SQL.
Let's go back to our aging report example, since this is one of the most common
result sets that business application developers must produce. This is a little
more involved, but demonstrates how we can use PIVOT in more detailed
situation.
First, suppose we have a table of Customer Invoices, containing the invoice
date, invoiced amount, and amount received to date...
DECLARE
@tInvoices TABLE (CustomerID
char(15),
InvoiceNo Char(20),
InvoiceDate DateTime,InvoiceAmount
decimal(14,2),
ReceivedAmount decimal(14,2))
INSERT
INTO @tInvoices VALUES
('Customer 1',
'ABC',
'09-01-2005',
1000, 0)
INSERT INTO @tInvoices
VALUES ('Customer 1',
'DEF',
'10-01-2005',
2000, 100)
INSERT INTO @tInvoices
VALUES ('Customer
1',
'GHI',
'11-01-2005',
3000, 3000)
INSERT INTO @tInvoices
VALUES ('Customer
1',
'JKL',
'12-01-2005',
4000, 175)
INSERT INTO @tInvoices
VALUES ('Customer
1',
'MNO',
'12-18-2005',
4000, 175)
INSERT INTO @tInvoices
VALUES ('Customer
2',
'PQR',
'05-01-2005',
500, 250)
INSERT INTO @tInvoices
VALUES ('Customer
2',
'STU',
'08-01-2005',
12000, 0)
INSERT INTO @tInvoices
VALUES ('Customer
2',
'WYX',
'10-01-2005',
7000, 70)
INSERT INTO @tInvoices
VALUES ('Customer
2',
'YYZ',
'12-01-2005',
3200, 1750)
Next, every aging report must have an 'AS OF' date
(e.g. aging report as of the first of the month...)
DECLARE @dAgingDate
DATETIME
SET @dAgingDate
=
CAST('12-1-2005'
AS
DATETIME)
Also, not every aging report has the same date range. While most define
aging brackets as 1-30 days, 31-60, etc, we need to make it table-driven.
Therefore we'll create a table variable that stores aging definitions by using
a day range.
--
create brackets...could be configurable [1-45 days, etc.]
DECLARE @tAgingBrackets
TABLE
(
StartDay
int,
EndDay int,
BracketNumber
int,
BracketLabel char(20))
INSERT
INTO @tAgingBrackets
VALUES
(0,
30,
1,
'< 30 Days')
INSERT
INTO @tAgingBrackets
VALUES
(31,
60,
2,
'31-60 Days')
INSERT
INTO @tAgingBrackets
VALUES
(61,
90,
3,
'61-90 Days')
INSERT
INTO @tAgingBrackets
VALUES
(91,
120,
4,
'91-120 Days'
)
INSERT
INTO @tAgingBrackets
VALUES
(121,
999999,
5,
'> 120 Days')
We'll create a table variable to hold the result set - each customer and
invoice number/invoice date, and five columns for the five aging brackets:
-- create our result set
DECLARE
@tAgingDetails
TABLE (CustomerID
char(15),
InvoiceNo char(20),
InvoiceDate
DateTime,
Bracket1
decimal(14,2),
Bracket2
decimal(14,2),
Bracket3
decimal(14,2),
Bracket4
decimal(14,2),
Bracket5
decimal(14,2))
Finally, we use the PIVOT statement to convert the Amount Owed (invoice
amount - received to date) to one of five aging brackets, by matching the
BracketNumber from the @tAgingBrackets definition based on the number of days
the invoice has aged falling into the corresponding bracket.
INSERT
INTO @tAgingDetails
SELECT
*
FROM
(select
CustomerID,InvoiceNo,invoicedate,
InvoiceAmount-ReceivedAmountAS
AmountOwed,
TBR.BracketNumber
FROM
@tInvoices TI,
@tAgingBrackets TBR
WHERE InvoiceAmount-ReceivedAmount
<>
0 and
DATEDIFF(dd,invoicedate,@dagingdate)
BETWEEN
TBR.StartDay
and
TBR.EndDay
)
as
Temp
PIVOT
(
SUM(AmountOwed)
FOR
BracketNumber
In
( [1],
[2],[3],[4],[5]))
As
X
SELECT
*
FROM @tAgingBrackets
SELECT
*
FROM @tAgingDetails
ORDER
BY Custom rid,
InvoiceNo
If the example above seems a bit too much to digest, try this: we use the
DateDiff function to determine that the second invoice (dated 10/1/05) is 61
days old, based on an aging date of 12-1-2005. The query does a look-up
into the @tAgingBrackets table where the difference (61 days) is between a
specific startday/endday range. In this case, we find it falls into the
third bracket - and we pivot on the Amount Owed for the Bracket Number being
(in this case) the value of 3.
One final note on PIVOT: I've used table variables to temporarily hold result
sets, partly for self-documenting purposes for this article, In a future
article, I'll cover the new Common Table Expression capability in SQL Server
2005, and how we can use CTEs in place of table variables.
2 - OUTPUT
In SQL Server 2000, have you ever wanted to immediately query a table after an
INSERT or UPDATE statement, to retrieve the value of a column affected by the
original statement? It may be the result of a calculated column, or the
value of an identity column. or some other default value. Typically you'd
have to issue a subsequent SELECT statement, either in a stored procedure or
with another round-trip call from your application back to the server.
Also, in SQL Server 2000, have you ever wanted to get immediate 'old value-new
value' feedback when you UPDATE data? Often, you'd need to tap into the
INSERTED and DELETED system tables, which were only visible in a database
trigger.
For example, in SQL Server 2000, if we want to determine the value of an
IDENTITY column after an INSERT, we'd often use the SQL Server SCOPE_IDENTITY
function:
DECLARE
@tTestTable
TABLE
( MainPK [int]
IDENTITY(1,1)
NOT
NULL,
Name
Char(50))
INSERT INTO
@tTestTable
VALUES ('steve
Goff')
SELECT
SCOPE_IDENTITY()
SQL Server 2005 provides a new OUTPUT statement that allows developers to
accomplish these tasks more conveniently and effeciently. Using OUTPUT in
conjunction with an INSERT/UPDATE statement, we can easily read the information
added/changed. Instead of using SCOPE_IDENTITY, we can immediately OUTPUT the
value of the identity column:
DECLARE
@tTestTable
TABLE
( MainPK [int]
IDENTITY(1,1)
NOT
NULL,
Name
Char(50))
INSERT
@tTestTable
OUTPUT Inserted.MainPK
VALUES
('steve
Goff')
If you issue multiple INSERT statements and need a list of the insertions
afterwards, you can direct the OUTPUT to a table:
DECLARE
@tTestTable
TABLE
( MainPK [int]
IDENTITY(1,1)
NOT
NULL,
Name
Char(50))
DECLARE @tTemp
table
(mainpk int)
INSERT @tTestTable
OUTPUT
Inserted.MainPK
into
@tTemp VALUES
('Kevin
Goff')
INSERT @tTestTable
OUTPUT
Inserted.MainPK
into
@tTemp VALUES
('steve
Goff')
SELECT
*
FROM @tTemp
Next - if you issue an UPDATE statement and want immediate access to the
INSERTED and DELETED system tables to see the new value/old value for specific
columns, you can do the following:
DECLARE @tTest
TABLE
(
MainPK [int] IDENTITY(1,1)
NOT
NULL
,Amount
decimal(10,2))
INSERT
INTO @tTest
VALUES
(100)
INSERT
INTO @tTest
VALUES
(200)
INSERT
INTO @tTest
VALUES
(300)
UPDATE @tTest
SET
Amount =
Amount
* 10
OUTPUT
DELETED.MainPK,
DELETED.Amount
AS
OldValue,
INSERTED.Amount
AS
NewValue
And again, if you want to direct the OUTPUT results to a table variable, you can
do the following:
DECLARE
@tTemp
TABLE
(MainPK int,
OldValue Decimal(10,2),
NewValue Decimal(10,2))
UPDATE
@tTest
SET Amount
= Amount
* 10
OUTPUT DELETED.MainPK,
DELETED.Amount
AS
OldValue,
INSERTED.Amount
AS
NewValue
INTO @tTemp
Just a note: don't think of the new OUTPUT capability as a replacement for audit
trail logging using database triggers. Think of OUTPUT as a convenient
means to get immediate feedback without having to resort to additional
statements and/or round-trips back to the server. Database triggers offer
the best means of implementing audit trail functionality.
3 - TOP N
SQL Server 2000 treated the N in SELECT TOP N statements as a literal.
Developers who wanted to issue a SELECT TOP N statement to return the first N
number of rows would need to use Dynamic SQL (or use the ROWCOUNT statement) if
the N was variable. An example might be an application allows an end user to
set the value of N as a runtime option (e.g. see the top 10 sales,
see the top 5 returns, etc.).
SQL Server 2005 now treats the N as a numeric variable, by allowing developers
to place the N in parenthesis. So if we want to return the TOP 5 (or 10,
etc) orders based on Order Amount descending, we can do the following:
use northwind
DECLARE
@nTop int
SET @nTop
= 5
Select
TOP
(@nTop)
customerid,
oh.orderid,
(unitprice
*
quantity)
as
amount
from
orders OH
join
[dbo].[Order
Details] OD
on oh.orderid
=
od.orderid
order
by
Amount
Desc
You can specify TOP (N) PERCENT, and you can also use the optional WITH
TIES clause, to return additional rows with the same value in the ORDER BY
expression.
The TOP (N) can even be the result of any function or expression that returns an
integer. For example, the following query will return the top 3 rows from
the Orders table based on Freight Descending, because there are 3 rows in the
Shippers table:
SELECT TOP(
SELECT COUNT(*)
FROM SHIPPERS)
* FROM ORDERS ORDER
BY Freight DESC
Finally, you can also apply TOP (N) functionality to INSERT and UPDATE
statements. This is especially helpful when doing special processing on
intermediary result sets. The following code creates a table variable
with two rows, and issues an UPDATE statement to update the first two
rows. Then the code creates a second table variable and inserts the first
two rows from the first table variable.
DECLARE
@nTop
int
SET @nTop
= 2
DECLARE @tTest1
TABLE
( Amount decimal(10,2))
INSERT
INTO @ttest1
VALUES
(
100)
INSERT
INTO @ttest1
VALUES
(
200)
INSERT
INTO @ttest1
VALUES
(
300)
UPDATE
TOP(@nTop)
@tTest1 SET
Amount =
Amount *
10
DECLARE
@tTest2
TABLE
( Amount decimal(10,2))
INSERT
TOP(2)
@tTest2 SELECT
*
FROM
@tTest1 order
by
amount
SELECT
*
FROM @ttest2
Next month...
Next month, in part two of this three-part series on T-SQL 2005 enhancements,
we'll look at the following:
-
The new APPLY operator that allows developers to more effectively integrate
table-valued User-Defined functions (UDFs) with queries
-
New error-handling capabilities to handle errors and exceptions
-
New ranking functions that greatly simplify the process of assigning ranking
numbers to query results - a major benefit for reporting applications and
custom web pagination of result sets
Final Thoughts:
For every piece of information that an article covers, there
are always multiple items that aren’t covered. Remember…GOOGLE is your
friend. When you have time, Google some of the bolded keywords in this article
- it will lead to more valuable information than you ever thought possible.
About the Author:
Kevin S. Goff has been a Microsoft C# .NET MVP for the last
three years, from 2005 through 2007. He is a regular columnist for CoDe
Magazine, for the Baker’s Dozen Productivity Tips. He is also the author
of Pro VS 2005 Reporting using SQL Server and
Crystal
Reports, published by Apress. He is the principal consultant
for Common Ground Solutions, and specializes in solutions in .NET, SQL Server,
and Crystal Reports. His main site is
www.commongroundsolutions.net,
and his blog is
www.TheBakersDozen.net.
Email this Author!
Related Links
Want to learn more about Microsoft SQL Server 2005?
|