Building OLAP Date Dimensions
I-Data Weekly - March 31, 2008
Kevin S. Goff, Microsoft MVP 2008
Welcome to I-Data Weekly, a technical blog dedicated to Business
Intelligence with the Microsoft SQL Server family of products.
I-Data stands for Intelligent Data, reflecting the goal of presenting the type of
content you'll find in SetFocus' Master's
Program - Business Intelligence Track .
Each week, we'll present a different technical piece from our
Business Intelligence Courseware - covering items such as data patterns
with SSIS packages, building hierarchical OLAP cube dimensions, custom MDX code,
advanced reporting with SQL Server Reporting Services, dashboard construction with
PerformancePoint Server, and SharePoint integration. As Microsoft is scheduling
SQL Server 2008 for release later this year, we'll also include some coverage of
the new BI features that SQL 2008 has to offer.
This week's installment of I-Data Weekly focuses on building
a custom OLAP Date Dimension. Virtually all businesses need to
summarize data by date range - therefore, it's critical to represent a company's
business calendar in the database system. You need to have a date system in
place, to be able to query and summarize data for important date groupings.
To do this, we'll take a look at the following:
- Defining our end result: an OLAP date dimension hierarchy
- Starting off by creating a base Date Calendar of date definitions in a SQL Server
table
- Creating an SSAS project that uses the Date Calendar table as a Data Source View,
and adds some additional definitions to the Calendar
- Building the OLAP Date Dimension, and defining some properties for the dimension
- Writing some queries against an OLAP sales database to summarize data by month,
quarter, etc.
1 - Defining the end result: an OLAP Date Dimension Hierarchy
The
SetFocus BI training methodology involves starting at the end, to visualize
the final result, and then working backwards (step by step) to build the results.
Our goal here is to build an OLAP date dimension hierarchy that will allow
us to do the following:
- Summarize sales data by Year, by Quarter, by Month, or by individual day
- Establish a fiscal calendar, where sales months may be different than calendar months
(for instance, the last few days of March might actually be part of a business month
for April)
- Build definitions for Month Names and Quarter Names so that you can sort them chronologically
on something other than the display name (for instance, display months as "Jan 2006",
"Feb 2006", etc. while internally storing the key values as sortable values, such
as "200601", "200602", etc.)
- Establish a Date Hierarchy, so that you can query and return sales data in a tree-like
format (Year, Quarter, Month, etc.)
Here is our "end result", a Date dimension called Date Calendar, with dimension
attributes for Year (BzYear), Quarter (BzQuarter),
Month (BzMonth), and Actual Date (Full Date).
The Dimension also contains a hierarchy called Bz Calendar Tree,
for drilldown capabilities.
(Special note: many of these images have been resized - if
you want to see the entire original image, simply right-click on the image and take
the "Save Picture As..." option.)

2 - Creating a base Date Calendar of Date Definitions
So how do we go about building such a Dimension? There are several steps,
but fortunately each step isn't that difficult. The very first thing you need
to do is establish a database table that contains all calendar dates, along with
the associated Business Month, Business Quarter, and Business Year for each date.
Here we have a table called DateCalendar, with some example entries:

While the example above is a very simple one (each Business Month/Quarter/Year is
based on the calendar value), you could just as easily define the entries based
on the company business month/quarter/year. Additionally, you might have two
sets of date attributes, one for the actual calendar dates and one for the business
dates.
You can build this calendar with date values going back to your business' "beginning
of time", and stretch it out as far in advance as you wish.
The next step is to read this data into Analysis Services, so that we can build
an OLAP dimension.
3 - Create an SSAS project that uses the Date Calendar table as a Data Source View,
and adds some additional definitions to the Calendar
Your next step would be to create a SQL Server 2005 Analysis Services project that
uses the database (to which the calendar belongs) as the data source.
Once you create a data source view and include the DateCalendar
table from the first step, you'll want to include some additional calculated expression
columns to represent the "English-like" display for month and quarter. Let's
assume you've agreed that you want to show months in the form "MMM YY",
and you want to show quarters in the form "Q1 YY".
So you'll need to add two columns expressions for those. Additionally, you'll
want to include two expressions to drive the sort order for the Month and Quarter,
since we can't directly sort on the display values (at least not if we want the
sort order to be correct!). So that means we need to add a total of
four column expressions.
Our "end goal" for this step is the following two images, which show the DateCalendar
table in an SSAS Data Source view, along with the four new columns (our new
display columns, MonthDesc and QuarterDesc, along
with their sort values, MonthValue and QuarterValue).


To create the actual calculated expressions, you can right-click on the table in
the Data Source View, and take the option to add a new calculated expression.
For example, the expression for the Month Description would use the T-SQL SUBSTRING,
DATENAME, AND CAST functions to build a display
string.

Similarly, you can use T-SQL to create the calculated expression for MonthValue
(which we'll use as the sort key for Month Desc).

One final note: you could just as easily have built these four calculated expressions
as part of the root DateCalendar table.
4 - Build the OLAP Date Dimension, and define some properties for the dimension
After establishing all of the columns in the Data Source view (the base date and
numeric month/quarter/year columns) and the calculated display expressions, you
can then turn your attention towards building the Date Calendar Dimension.
The next figure represents the SSAS 2005 Date Dimension designer. Note
from the figure that not all of the columns from the Data Source View (on the right)
appear as attributes (on the left). The attributes represent the columns that
you'll use to "slice and dice" the sales data (in this case, by Year, Quarter, or
Month). While the calculated columns for MonthValue
and QuarterValue don't appear as attributes, we still need to use
them as the actual key values for their description counterparts. As you can
see in the figure below, you can right-click on an attribute (e.g. Month
Desc), pull up the property sheet, and set the Key Column to MonthValue.
(You'll then repeat the process for the QuarterDesc/QuarterValue
combination).

Finally, to create the hierarchy, you can drag the dimension attributes into the
hierarchy design area. Also note that we've renamed the attribute columns
(this is optional, and simply based on your naming convention preferences).

Note:
The
SetFocus BI Master's Program devotes an entire week to SQL Server 2005 Analysis
Services, including dimension/hierarchy design.
5 - Writing some queries against an OLAP sales database to summarize data by month,
quarter, etc.
After you've processed the OLAP cube, you can start writing MDX queries to summarize
data by any of the dimension attributes in the Date Calendar.
For example, here's a query to sort dollar sales by BzMonth, in
descending order.

Note that the query uses [BzMonth].CurrentMember.Properties('Key')
for the second parameter of the Sort function. This is necessary because we established
the key property for BzMonth as the calculated expression for MonthValue.
Similarly, if you wanted to sort the sales dollars by Quarter in descending order,
you'd use the following query:
Finally, if you want to use the hierarchy to display sales in hierarchy order (Year,
Quarter, Month, Date), you can use the MDX DESCENDANTS function, as follows:
Note: The
SetFocus BI Master's Program devotes an entire week to MDX Programming.
Next week: One of the most powerful tools in the Microsoft BI stack is SQL
Server 2005 Integration Services. SSIS 2005 is a complete replacement
for DTS and provides tremendous functionality for ETL (extract, transformation,
and load) operations. SSIS 2005 is so powerful, that new users often need
time to grasp just how powerful SSIS tasks can be. Next week we'll take our first
look at how we can use SSIS 2005 to perform ETL processing.
Final thoughts:
This week's article provided a basic walkthrough for creating a date calendar dimension
and hierarchy, along with a few MDX examples for summarizing data by date dimension
attributes. We could have done other things as well, such as implementing
multiple sets of dates, adding additional attributes (week number, half-year, etc.).
Additionally, we could have added other attributes to account for seasonality
, as some companies perform historical trend analysis based on events that occur
on different dates every year (e.g. Easter). Finally, we
could have implemented some MDX named sets, to retrieve information such as the
last 12 months or last 4 quarters of available data. We'll explore those at
a later date.
About the author: