This article describes the ways to optimize star schemas and making your data warehouse run faster

This article is not database specific

Basic Star Schema of a retail shop

Star Schema


In the previous article, we have described the ways of calculating SQL query execution time.

Let's think about what can be done about Query performance.

We have only two options

  • Improve hardware
  • Make the database server do less work
Improving Datawarehouse hardware

There are a lot of ways of improving hardware. You can add memory processors to increase your hard drive size etc. But quite often it does not help

We have seen some configurations with 64 gigs of memory and 32 processors

With all 32 processors doing nothing but the database is running very slow.

All this because the DBA decided to use raid 5 for better protection.
The performance of your disks this the most important part of the data warehouse.

It defines it. It tells how long do you have before the system becomes unusable due to pour performance

HDtach is a very useful utility for checking disk performance, you would be very surprised when compare the laptop with the servers. Modern laptops quite often much faster.


The bottom line is:

  • HDD performance is the key 
  • Do not use RAID 5 use RAID 10
Making the database server do less.


Let's go back to the Star Schema Diagram

STATE CODE field is 5 bytes long
Everybody knows that the US state abbreviation is always two characters long

if we change our definition from nvarchar(2) to nchar(2) field size changes from 5 bytes to 4

Next well know thing, that there are no international characters in US state abbreviation. There is no need to use Unicode and we can change field type from nchar(2) to char(2)

The field length is 2 bytes now.

Everybody knows that there are only 52 states in US

What we can do next, is to move STATE CODE out of the fact table into the dimension and use STATE_ID BYTE in fact table as state identifier.

By using common sense we were able to reduce the field size five times. The same technic can be applied to the rest of the fields.

Conclusion

Make sure your disk fast and fact table record size as small as possible

Direct link, no registration required.