
When does it make sense to use a Snowflake Schema vs. Star Schema in database design?

  • What are some practical real-life examples?

  • Answer:

    A star schema is used as a basic implementation of an OLAP cube. If the fact table in your data warehouse schema has a one-to-many relationship with each of its dimensions, then a star schema is appropriate. If instead the fact table has a many-to-many relationship with its dimensions (i.e. many rows in the fact table correspond to many rows in a dimension), then you must resolve this using a snowflake schema in which a bridge table contains a unique key to each row in the fact table.

    An example of a one-to-many relationship (star schema) is a fact table that contains sales data and a dimension table that contains a list of stores. One store can have many sales, but each sale comes from only one store; i.e. one row in the dimension table can correspond to many rows in the fact table.

    To turn the above example into a snowflake schema: a store can have many sales, and each sale can come from many stores. This is a many-to-many relationship, and you would need a bridge table to implement this functional requirement.

    Many data warehouse guides (including Kimball's The Data Warehouse Toolkit) recommend limiting the use of snowflake schemas. Ref: http://www.1keydata.com/datawarehousing/concepts.html

John Cook at Quora
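
The store/sales example above can be tried out with a small in-memory database. This is only a minimal sketch, not something from the original answer: the table names (dim_store, fact_sales, bridge_sale_store), the columns and the sample rows are all made up for illustration, and it uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star schema: each sale row points at exactly one store (one-to-many).
    CREATE TABLE dim_store (
        store_id   INTEGER PRIMARY KEY,
        store_name TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id  INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL REFERENCES dim_store(store_id),
        amount   REAL NOT NULL
    );
    INSERT INTO dim_store VALUES (1, 'Downtown'), (2, 'Airport');
    INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 7.5);

    -- Many-to-many variant: a bridge table pairs sales with stores.
    -- (In that variant the fact row would no longer carry a single store_id;
    -- it is kept here only so both queries run against the same sample data.)
    CREATE TABLE bridge_sale_store (
        sale_id  INTEGER NOT NULL REFERENCES fact_sales(sale_id),
        store_id INTEGER NOT NULL REFERENCES dim_store(store_id),
        PRIMARY KEY (sale_id, store_id)
    );
    INSERT INTO bridge_sale_store VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")

# Star join: one hop from the fact table to the dimension.
star_totals = conn.execute("""
    SELECT s.store_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.store_name
    ORDER BY s.store_name
""").fetchall()

# Bridge join: count how many stores each sale is associated with.
stores_per_sale = conn.execute("""
    SELECT f.sale_id, COUNT(b.store_id)
    FROM fact_sales AS f
    JOIN bridge_sale_store AS b ON b.sale_id = f.sale_id
    GROUP BY f.sale_id
    ORDER BY f.sale_id
""").fetchall()

print(star_totals)
print(stores_per_sale)
```

Note that once a bridge table is involved, summing amounts per store would count a shared sale once for every store it is linked to, so some allocation rule would be needed; the sketch deliberately leaves that out.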


Other answers

Snowflaking can reduce duplicated or repeated attributes and add some normalization to a star schema. One quick example from the book "Star Schema: The Complete Reference" [1]: a product table might have brand, brand code, and brand manager; but as there will only be a few brands, there is no need to duplicate the brand attributes, in which case the brand can be a snowflake ... Ref: [1] http://www.amazon.com/Schema-Complete-Reference-Christopher-Adamson/dp/0071744320/

Krishna Sankar
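
A minimal sketch of the brand example above, assuming hypothetical table names (dim_product, dim_brand) and invented sample rows; it is not taken from the book. The brand attributes live in one row per brand, and each product only carries a brand_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Snowflaked outrigger: brand attributes are stored once per brand.
    CREATE TABLE dim_brand (
        brand_id      INTEGER PRIMARY KEY,
        brand_name    TEXT NOT NULL,
        brand_code    TEXT NOT NULL,
        brand_manager TEXT NOT NULL
    );
    -- The product dimension keeps only a key to the brand.
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        brand_id     INTEGER NOT NULL REFERENCES dim_brand(brand_id)
    );
    INSERT INTO dim_brand VALUES (1, 'Acme', 'ACM', 'Pat');
    INSERT INTO dim_product VALUES (10, 'Anvil', 1), (11, 'Rocket', 1);
""")

# A change of brand manager touches a single row, not every product row.
conn.execute("UPDATE dim_brand SET brand_manager = 'Sam' WHERE brand_id = 1")

# The cost is one extra join whenever brand attributes are needed.
product_managers = conn.execute("""
    SELECT p.product_name, b.brand_manager
    FROM dim_product AS p
    JOIN dim_brand AS b ON b.brand_id = p.brand_id
    ORDER BY p.product_name
""").fetchall()

print(product_managers)
```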

I would only use a snowflake when I am extremely limited in the memory available to me, which means when I am working in a prior-generation RDBMS on a 32-bit platform. I would never design such a thing, but if I inherited such a beast I might leave it intact. My preference would be to use a 64-bit operating system with a new columnar store like (ironically) Snowflake Computing, Vertica, or Redshift. In these cases, I would use wide denormalized fact tables, with very little or no performance penalty. Snowflake (http://www.snowflake.net/e), Vertica (http://www.vertica.com/), Amazon Redshift (http://aws.amazon.com/redshift/)

Michael David Cobb Bowen

To add to the many interesting posts: snowflakes can be helpful where analytic apps want users to consume data in a "drill-down" fashion (a.k.a. cubes, hierarchies, etc.).
Date dimensions are easy to understand: Years -> QTR -> Month -> Week -> Day -> Time
Other cubes might be organizational: Global Org -> Regional Org -> Division -> Local
Product data often has a lot of drill-down options: Prod Cat -> Product Group -> Item Size, etc.

Andrew Hansen
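
As a rough illustration of the drill-down idea above, the sketch below rolls the same rows up at different depths of a date hierarchy. The level names, the roll_up helper and the sample figures are all invented for this example; a real cube engine would do this for you.

```python
from collections import defaultdict

# Hypothetical date hierarchy, coarsest level first.
DATE_LEVELS = ("year", "quarter", "month")

# Invented sample rows at the finest (month) level.
sales = [
    {"year": 2013, "quarter": "Q1", "month": "Jan", "amount": 100},
    {"year": 2013, "quarter": "Q1", "month": "Feb", "amount": 50},
    {"year": 2013, "quarter": "Q2", "month": "Apr", "amount": 70},
]


def roll_up(rows, depth):
    """Sum amounts grouped by the first `depth` levels of the hierarchy."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[level] for level in DATE_LEVELS[:depth])
        totals[key] += row["amount"]
    return dict(totals)


by_year = roll_up(sales, 1)      # top of the hierarchy
by_quarter = roll_up(sales, 2)   # drill down one level
by_month = roll_up(sales, 3)     # drill down to the finest level

print(by_year)
print(by_quarter)
print(by_month)
```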

In a snowflake schema, you further normalize the dimensions. Ex: a typical Date dim in a star schema can be further normalized by storing a Quarter dim and a Year dim in separate dimensions. A snowflake schema is generally used if:
1) You have a set of dimension data that you don't need to query frequently but still need for information purposes. By storing this data in a separate dimension, you reduce redundancy in the main dimensions.
2) You have a reporting or cube architecture that needs hierarchies or a slicing feature.
3) You have fact tables at different levels of granularity. Ex: you have a sales fact table where you track sales at the product level, and a budget fact table where you track budgets by product category.
Snowflaking is, however, generally not recommended, because it increases the number of joins and the complexity of your queries and hence slows down performance.
PS: Bridge tables are not snowflakes; they are bridge tables. The purpose of a bridge table is to resolve a many-to-many (m:m) relationship. A snowflake dimension would have further (or leaf-level) information of the parent dimension stored for usability and storage.

Anonymous
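
Point 3 above (facts at different grains) can be sketched as follows. The schema and numbers are assumptions made up for illustration: sales are recorded per product, budgets per category, and a snowflaked dim_category table gives both fact tables a common level to meet at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    -- Product dimension snowflaked out to its category.
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES dim_category(category_id)
    );
    -- Sales fact at product grain.
    CREATE TABLE fact_sales (
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        amount     REAL NOT NULL
    );
    -- Budget fact at category grain.
    CREATE TABLE fact_budget (
        category_id INTEGER NOT NULL REFERENCES dim_category(category_id),
        budget      REAL NOT NULL
    );
    INSERT INTO dim_category VALUES (1, 'Tools'), (2, 'Toys');
    INSERT INTO dim_product VALUES (10, 'Hammer', 1), (11, 'Wrench', 1), (20, 'Kite', 2);
    INSERT INTO fact_sales VALUES (10, 40.0), (11, 35.0), (20, 20.0), (10, 5.0);
    INSERT INTO fact_budget VALUES (1, 100.0), (2, 15.0);
""")

# Aggregate sales up to category grain first, then compare with the budget.
sales_vs_budget = conn.execute("""
    SELECT c.category_name, s.total_sales, b.budget
    FROM dim_category AS c
    JOIN (
        SELECT p.category_id, SUM(f.amount) AS total_sales
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category_id
    ) AS s ON s.category_id = c.category_id
    JOIN fact_budget AS b ON b.category_id = c.category_id
    ORDER BY c.category_name
""").fetchall()

print(sales_vs_budget)
```

Aggregating the finer-grained fact before joining avoids repeating the category budget once per sales row.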

Snowflake is a further normalization of the star schema. You would use it to prevent repetition. An example where this is used is the location dimension. In a sales data warehouse, for example, you might have the following dimensions: User, Account, Lead, and Location. But a user, account, or lead could each have its own location. So instead of repeating location attributes in each of these, you would create a foreign key from each of those dimensions to the Location dimension. An ERD showing this relationship would begin to look like a snowflake. IMO, snowflake is a good practice for a data warehouse in an RDBMS, but not so much in an OLAP database. Feeding an OLAP database denormalized data is a better practice.

David Badenchini
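
The shared location dimension described above might look roughly like this. All table names, columns and rows are hypothetical; the point is only that dim_user, dim_account and dim_lead each hold a foreign key to a single dim_location instead of repeating city columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One location dimension shared by three other dimensions.
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY,
        city        TEXT NOT NULL
    );
    CREATE TABLE dim_user (
        user_id     INTEGER PRIMARY KEY,
        user_name   TEXT NOT NULL,
        location_id INTEGER NOT NULL REFERENCES dim_location(location_id)
    );
    CREATE TABLE dim_account (
        account_id   INTEGER PRIMARY KEY,
        account_name TEXT NOT NULL,
        location_id  INTEGER NOT NULL REFERENCES dim_location(location_id)
    );
    CREATE TABLE dim_lead (
        lead_id     INTEGER PRIMARY KEY,
        lead_name   TEXT NOT NULL,
        location_id INTEGER NOT NULL REFERENCES dim_location(location_id)
    );
    INSERT INTO dim_location VALUES (1, 'Berlin'), (2, 'Austin');
    INSERT INTO dim_user VALUES (1, 'ana', 1);
    INSERT INTO dim_account VALUES (1, 'Acme', 1);
    INSERT INTO dim_lead VALUES (1, 'Zed', 2);
""")

# Every entity resolves its city through the same dim_location rows.
entities_by_city = conn.execute("""
    SELECT l.city, e.kind, e.name
    FROM (
        SELECT 'user' AS kind, user_name AS name, location_id FROM dim_user
        UNION ALL
        SELECT 'account', account_name, location_id FROM dim_account
        UNION ALL
        SELECT 'lead', lead_name, location_id FROM dim_lead
    ) AS e
    JOIN dim_location AS l ON l.location_id = e.location_id
    ORDER BY l.city, e.kind
""").fetchall()

print(entities_by_city)
```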
