Posts

Showing posts from 2017

Populating data to the Date Dimension table

Date dimension plays an important role in data warehouse designing. It allows you to examine behavior and trends in the data over a period of time. The columns for the DimDate table (date dimension table) should be chosen according to the requirement of the data analysis.  When entering data to the DimDate table, extracting dates from the source database is not a good method because there can be missing days in between as the transactions may not have occurred on each day of the year.  Therefore, the best way is to populate data to the DimDate table separately, as the date is an universal attribute. There are various methods of populating data to the table;     Using an Excel file In this method, an excel spreadsheet which contains the date values is used as the data source and it’s extracted to the data warehouse using SSIS.     Using a SQL script In this method, a SQL script or a stored procedure is used to generate dates between tw...

Dimensional modeling - Star schema vs Fact constellation

Image
Dimensional model of Data warehousing Ralph Kimball introduced the dimensional modeling in 1996 with his book, The Data Warehouse Toolkit. Dimensional modeling is implemented by de-normalizing the database for the reporting purposes, because the normalized databases are too hard to query and extract data for business reports. There are two main approaches for implementing dimensional data warehousing models:           Star schema           Snowflake schema They describe how the dimensions relate to the fact tables.  Facts and fact tables hold the information that the business is concerned about,  that measure the performance of the organization. Dimension tables contain descriptive attributes which gives the ability to the qualify the measures (facts). These two tables are related to each other and are connected by a key. Granularity  is the measure of how t...