When does it make sense to use a Snowflake Schema vs. Star Schema in database design?

What are some practical real-life examples?
Answer:

A star schema is the basic relational implementation of an OLAP cube. If the fact table in your data warehouse has a one-to-many relationship with each of its dimensions (one dimension row relates to many fact rows), a star schema is appropriate. If the fact table has a many-to-many relationship with a dimension (many fact rows relate to many dimension rows), you must resolve it with a snowflake schema in which a bridge table holds a key to each row in the fact table.

An example of a one-to-many relationship (star schema) is a fact table that contains sales data and a dimension table that contains a list of stores. One store can have many sales, but each sale comes from only one store, i.e. one row in the dimension table can match many rows in the fact table.

To turn this example into a snowflake schema, suppose a store can have many sales and each sale can also come from many stores. That is a many-to-many relationship, and you would need a bridge table to implement the requirement.

Many data warehouse guides (including Kimball's The Data Warehouse Toolkit) recommend limiting the use of snowflake schemas. Ref: http://www.1keydata.com/datawarehousing/concepts.html
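A minimal sketch of the store/sales example, using Python's built-in sqlite3 module. All table names, column names, and sample rows here are made up for illustration. The first pair of tables is the one-to-many star layout; the second pair shows the bridge-table variant for the many-to-many case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: each sale row points at exactly one store row.
    CREATE TABLE dim_store (
        store_key  INTEGER PRIMARY KEY,
        store_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id   INTEGER PRIMARY KEY,
        store_key INTEGER NOT NULL REFERENCES dim_store(store_key),
        amount    REAL NOT NULL
    );
    INSERT INTO dim_store VALUES (1, 'Downtown'), (2, 'Airport');
    INSERT INTO fact_sales VALUES (101, 1, 20.0), (102, 1, 35.0), (103, 2, 10.0);

    -- Many-to-many variant: the fact row carries no store key;
    -- a bridge table pairs sales with stores instead.
    CREATE TABLE fact_shared_sales (
        sale_id INTEGER PRIMARY KEY,
        amount  REAL NOT NULL
    );
    CREATE TABLE bridge_sale_store (
        sale_id   INTEGER NOT NULL REFERENCES fact_shared_sales(sale_id),
        store_key INTEGER NOT NULL REFERENCES dim_store(store_key),
        PRIMARY KEY (sale_id, store_key)
    );
    INSERT INTO fact_shared_sales VALUES (201, 50.0), (202, 15.0);
    INSERT INTO bridge_sale_store VALUES (201, 1), (201, 2), (202, 1);
""")

# Star query: one join from the fact table to the dimension.
star_totals = conn.execute("""
    SELECT s.store_name, COUNT(*) AS n_sales, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_key = f.store_key
    GROUP BY s.store_name
    ORDER BY s.store_name
""").fetchall()

# Bridge query: how many stores each shared sale is attached to.
stores_per_sale = conn.execute("""
    SELECT f.sale_id, COUNT(b.store_key) AS n_stores
    FROM fact_shared_sales AS f
    JOIN bridge_sale_store AS b ON b.sale_id = f.sale_id
    GROUP BY f.sale_id
    ORDER BY f.sale_id
""").fetchall()

print(star_totals)
print(stores_per_sale)
```

Note that summing amounts through the bridge would count a shared sale once per store, so real designs usually add an allocation factor or aggregate with care.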


Other answers

Snowflaking can reduce duplicated or repeated attributes and add some normalization to a star schema. One quick example from the book "Star Schema: The Complete Reference" [1]: a product table might have brand, brand code, and brand manager attributes, but as there will only be a few brands, there is no need to duplicate the brand attributes on every product row; in which case the brand can be a snowflake ... Ref: [1] http://www.amazon.com/Schema-Complete-Reference-Christopher-Adamson/dp/0071744320/
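A small sketch of that brand example in Python with sqlite3. The table layout and sample values are assumptions for illustration, not taken from the book. The brand attributes live in one row, so changing the brand manager touches a single row no matter how many products share the brand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Brand attributes are stored once, in their own (snowflaked) table.
    CREATE TABLE dim_brand (
        brand_key     INTEGER PRIMARY KEY,
        brand_name    TEXT NOT NULL,
        brand_code    TEXT NOT NULL,
        brand_manager TEXT NOT NULL
    );
    -- The product dimension keeps only a key to the brand.
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        brand_key    INTEGER NOT NULL REFERENCES dim_brand(brand_key)
    );
    INSERT INTO dim_brand VALUES (1, 'Acme', 'ACM', 'A. Jones');
    INSERT INTO dim_product VALUES
        (1, 'Acme Cola', 1), (2, 'Acme Juice', 1), (3, 'Acme Water', 1);
""")

def products_with_brand(conn):
    """Return (product_name, brand_name, brand_manager) per product."""
    return conn.execute("""
        SELECT p.product_name, b.brand_name, b.brand_manager
        FROM dim_product AS p
        JOIN dim_brand AS b ON b.brand_key = p.brand_key
        ORDER BY p.product_key
    """).fetchall()

before = products_with_brand(conn)

# One-row update; every product picks up the change through the join.
cur = conn.execute(
    "UPDATE dim_brand SET brand_manager = ? WHERE brand_key = ?", ("B. Lee", 1)
)
rows_updated = cur.rowcount

after = products_with_brand(conn)
print(rows_updated, after)
```

The trade-off is the extra join on every query that needs a brand attribute.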

Krishna Sankar

I would only use a snowflake when I am extremely limited in the memory available to me, which means when I am working in a prior-generation RDBMS on a 32-bit platform. I would never design such a thing, but if I inherited such a beast I might leave it intact. My preference would be to use a 64-bit operating system with a newer columnar store like (ironically) Snowflake Computing, Vertica, or Redshift. In those cases, I would use wide denormalized fact tables, with very little or no performance penalty. Snowflake (http://www.snowflake.net/e), Vertica (http://www.vertica.com/), Amazon Redshift (http://aws.amazon.com/redshift/)

Michael David Cobb Bowen

I would like to add to the many interesting posts: snowflakes can be helpful where analytic apps want users to consume data in a "drill-down" fashion (a.k.a. cubes, hierarchies, etc.).
Date dimensions are easy to understand: Years -> QTR -> Month -> Week -> Day -> Time
Other cubes might be organizational: Global Org -> Regional Org -> Division -> Local
Product data often has a lot of drill-down options: Prod Cat -> Product Group -> Item Size, etc.
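To make the drill-down idea concrete, here is a rough sketch in Python with sqlite3 of a date hierarchy snowflaked into year, quarter, and month tables (week, day, and time are left out to keep it short). All names and figures are invented for the example. The same fact table is summed at whichever level the user drills to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_year    (year_key INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE dim_quarter (quarter_key INTEGER PRIMARY KEY,
                              year_key INTEGER REFERENCES dim_year(year_key),
                              quarter INTEGER);
    CREATE TABLE dim_month   (month_key INTEGER PRIMARY KEY,
                              quarter_key INTEGER REFERENCES dim_quarter(quarter_key),
                              month INTEGER);
    CREATE TABLE fact_sales  (month_key INTEGER REFERENCES dim_month(month_key),
                              amount INTEGER);
    INSERT INTO dim_year    VALUES (1, 2024);
    INSERT INTO dim_quarter VALUES (1, 1, 1), (2, 1, 2);
    INSERT INTO dim_month   VALUES (1, 1, 1), (2, 1, 2), (3, 2, 4);
    INSERT INTO fact_sales  VALUES (1, 10), (2, 20), (3, 30);
""")

# Whitelist of drill levels; only these fixed strings are placed in the SQL.
LEVELS = {
    "year": "y.year",
    "quarter": "y.year, q.quarter",
    "month": "y.year, q.quarter, m.month",
}

def drill_down(conn, level):
    """Total sales grouped at the requested level of the date hierarchy."""
    cols = LEVELS[level]  # raises KeyError for an unknown level
    return conn.execute(f"""
        SELECT {cols}, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_month   AS m ON m.month_key = f.month_key
        JOIN dim_quarter AS q ON q.quarter_key = m.quarter_key
        JOIN dim_year    AS y ON y.year_key = q.year_key
        GROUP BY {cols}
        ORDER BY {cols}
    """).fetchall()

by_year = drill_down(conn, "year")
by_quarter = drill_down(conn, "quarter")
by_month = drill_down(conn, "month")
print(by_year, by_quarter, by_month)
```

A flat date dimension with year, quarter, and month columns would support the same roll-ups with a single join; the snowflaked form mainly makes the hierarchy explicit in the schema.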

Andrew Hansen

In a snowflake schema, you further normalize the dimensions. Ex: a typical Date dimension in a star schema can be further normalized by storing Quarter and Year in separate dimension tables. A snowflake schema is generally used if:
1) You have a set of dimension data that you don't need to query frequently but still need for informational purposes. By storing this data in a separate dimension, you reduce redundancy in the main dimensions.
2) You have a reporting or cube architecture that needs hierarchies or a slicing feature.
3) You have fact tables at different levels of granularity. Ex: you have a sales fact table that tracks sales at the product level, and a budget fact table that tracks budgets by product category.
Snowflaking is, however, generally not recommended, because it increases the number of joins and the complexity of your queries and hence slows down performance.
PS: Bridge tables are not snowflakes; they are bridge tables. The purpose of a bridge table is to resolve a many-to-many relationship. A snowflake dimension holds further (or leaf-level) information about the parent dimension, stored separately for usability and storage.
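Point 3 is probably the easiest to show in code. The sketch below (Python with sqlite3, hypothetical table names and numbers) snowflakes the category out of the product dimension so that a product-level sales fact and a category-level budget fact can both reach it. It assumes exactly one budget row per category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY,
                               category_name TEXT NOT NULL);
    CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY,
                               product_name TEXT NOT NULL,
                               category_key INTEGER NOT NULL
                                   REFERENCES dim_category(category_key));
    -- Sales are tracked per product ...
    CREATE TABLE fact_sales   (product_key INTEGER NOT NULL
                                   REFERENCES dim_product(product_key),
                               amount INTEGER NOT NULL);
    -- ... while budgets are tracked per category.
    CREATE TABLE fact_budget  (category_key INTEGER NOT NULL
                                   REFERENCES dim_category(category_key),
                               budget INTEGER NOT NULL);
    INSERT INTO dim_category VALUES (1, 'Beverages'), (2, 'Snacks');
    INSERT INTO dim_product  VALUES (1, 'Cola', 1), (2, 'Juice', 1), (3, 'Chips', 2);
    INSERT INTO fact_sales   VALUES (1, 40), (2, 30), (3, 25), (1, 10);
    INSERT INTO fact_budget  VALUES (1, 100), (2, 20);
""")

# Roll product-level sales up to category and set them beside the budget.
# Assumes one budget row per category; more rows would need pre-aggregation.
budget_vs_actual = conn.execute("""
    SELECT c.category_name, b.budget, COALESCE(SUM(f.amount), 0) AS actual
    FROM dim_category AS c
    JOIN fact_budget AS b ON b.category_key = c.category_key
    LEFT JOIN dim_product AS p ON p.category_key = c.category_key
    LEFT JOIN fact_sales  AS f ON f.product_key = p.product_key
    GROUP BY c.category_key, c.category_name, b.budget
    ORDER BY c.category_name
""").fetchall()

print(budget_vs_actual)
```

The extra hop from product to category is the join cost mentioned above; the benefit is that both fact tables share one definition of a category.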

Anonymous

Snowflake is a further normalization of the star schema. You would use it to prevent repetition. An example is the location dimension. In a sales data warehouse, for example, you might have the following dimensions: User, Account, Lead, Location. But a User, Account, or Lead could each have its own location. So instead of repeating the location attributes in each of these, you would create a foreign key from each of those dimensions to the Location dimension. An ERD showing this relationship would begin to look like a snowflake. IMO, snowflaking is good practice for a data warehouse in an RDBMS, but not so much in an OLAP database. Feeding an OLAP database denormalized data is the better practice.
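Roughly, the shared Location dimension could look like the following sketch (Python with sqlite3; every table name, column, and row is an assumption made up for the example). Each of the three dimensions stores only a key into the one location table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Location attributes live in one place ...
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY,
                               city TEXT NOT NULL);
    -- ... and each dimension references it instead of repeating them.
    CREATE TABLE dim_user    (user_key INTEGER PRIMARY KEY, name TEXT NOT NULL,
                              location_key INTEGER REFERENCES dim_location(location_key));
    CREATE TABLE dim_account (account_key INTEGER PRIMARY KEY, name TEXT NOT NULL,
                              location_key INTEGER REFERENCES dim_location(location_key));
    CREATE TABLE dim_lead    (lead_key INTEGER PRIMARY KEY, name TEXT NOT NULL,
                              location_key INTEGER REFERENCES dim_location(location_key));
    INSERT INTO dim_location VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO dim_user     VALUES (1, 'Ann', 1), (2, 'Bob', 2);
    INSERT INTO dim_account  VALUES (1, 'Acme', 1);
    INSERT INTO dim_lead     VALUES (1, 'Lead-7', 2);
""")

# List everything located in each city, whichever dimension it comes from.
entities_by_city = conn.execute("""
    SELECT l.city AS city, 'user' AS kind, u.name AS name
    FROM dim_user AS u JOIN dim_location AS l ON l.location_key = u.location_key
    UNION ALL
    SELECT l.city, 'account', a.name
    FROM dim_account AS a JOIN dim_location AS l ON l.location_key = a.location_key
    UNION ALL
    SELECT l.city, 'lead', d.name
    FROM dim_lead AS d JOIN dim_location AS l ON l.location_key = d.location_key
    ORDER BY 1, 2, 3
""").fetchall()

print(entities_by_city)
```

When feeding an OLAP database, the same data could be flattened by joining the city back into each dimension during the load.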

David Badenchini
