Data Warehouse

A system used for reporting and data analysis from various sources to provide business insights

What is a Data Warehouse?

A data warehouse (often abbreviated as DW or DWH) is a system used for reporting and data analysis from various sources to provide business insights. It usually serves as a central repository that connects and integrates data from heterogeneous sources.

 


Summary

  • A data warehouse (often abbreviated as DW or DWH) is a system used for reporting and data analysis from various sources to provide business insights. It operates as a central repository where information arrives from various sources.
  • Once in the data warehouse, the data is ingested, transformed, processed, and made accessible for use in decision-making.
  • The three main types of data warehouses are the enterprise data warehouse (EDW), the operational data store (ODS), and the data mart.

 

History of Data Warehouses

The concept of the data warehouse first came into use in the 1980s, when IBM researchers Paul Murphy and Barry Devlin developed the “business data warehouse.” American computer scientist Bill Inmon is considered the “father” of the data warehouse because of his extensive writing on the subject, including the Corporate Information Factory and other works on building, using, and maintaining data warehouses.

Inmon wrote the first book, held the first conference, and offered the first classes on data warehouses. He is also known for his definition of a data warehouse: “a subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management’s decisions.”

 

Types of Data Warehouses

The three main types of data warehouses are the enterprise data warehouse (EDW), the operational data store (ODS), and the data mart.

 

Enterprise Data Warehouse (EDW)

An enterprise data warehouse (EDW) is a centralized warehouse that provides decision support services across the enterprise. An EDW is usually a collection of databases that offers a unified approach to organizing data and classifying it by subject.

 

Operational Data Store (ODS)

An operational data store (ODS) is a central database used for operational reporting and as a data source for the enterprise data warehouse described above. It complements an EDW and supports operational reporting, controls, and decision-making.

An ODS is refreshed in real time, which makes it preferable for routine activities such as storing employee records. An EDW, on the other hand, is used for tactical and strategic decision support.

 

Data Mart

A data mart is a subset of a data warehouse, usually oriented to a specific team or business line, such as finance or sales. Because it is subject-oriented, it makes specific data available to a defined group of users more quickly and gives them critical insights without requiring them to search through an entire data warehouse.
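
As a rough illustration of the "subset" idea, the sketch below models a data mart as a filtered view over a warehouse table, using Python's built-in sqlite3 module. The table name warehouse_facts, the view name finance_mart, and the sample rows are hypothetical and chosen only for this example. In practice a data mart is often a separate physical database, so a view is just one simple way to picture it.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding facts for every department.
conn.execute(
    "CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("finance", "revenue", 500.0),
        ("sales", "deals_closed", 12.0),
        ("finance", "expenses", 320.0),
        ("hr", "headcount", 48.0),
    ],
)

# The "data mart": a view exposing only the finance team's slice.
conn.execute(
    """
    CREATE VIEW finance_mart AS
    SELECT metric, value
    FROM warehouse_facts
    WHERE department = 'finance'
    """
)

# Finance users query the mart instead of scanning the whole warehouse.
finance_rows = conn.execute(
    "SELECT metric, value FROM finance_mart ORDER BY metric"
).fetchall()
print(finance_rows)
```

The finance team sees only its own two metrics, while the sales and HR rows stay in the underlying warehouse table.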

 

How Do Data Warehouses Work?

A data warehouse operates as a central repository where information arrives from various sources. The data that flows in may be structured, semi-structured, or unstructured and may come from internal applications, customer-facing applications, and external systems.

Once in the data warehouse, the data is ingested, transformed, and processed so that users can access it for decision-making. By merging large quantities of information in the data warehouse, an organization can form a more holistic analysis and ensure that it has considered all the available information before making a decision.
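
The ingest, transform, and access steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the two source systems (crm_rows and web_rows), their field names, and the sales table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical records from two source systems with inconsistent formats.
crm_rows = [{"customer": "Acme Corp ", "amount": "1,200.50", "date": "2024-01-15"}]
web_rows = [{"cust_name": "acme corp", "total_cents": 9950, "day": "2024-01-16"}]


def transform_crm(row):
    """Normalize a CRM record into the common warehouse format."""
    return (
        row["customer"].strip().lower(),
        float(row["amount"].replace(",", "")),
        row["date"],
        "crm",
    )


def transform_web(row):
    """Normalize a web-shop record (amounts arrive in cents)."""
    return (
        row["cust_name"].strip().lower(),
        row["total_cents"] / 100,
        row["day"],
        "web",
    )


def load(conn, rows):
    """Load transformed rows into a single integrated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def build_warehouse():
    """Ingest both sources, transform them, and load the result."""
    conn = sqlite3.connect(":memory:")
    rows = [transform_crm(r) for r in crm_rows]
    rows += [transform_web(r) for r in web_rows]
    load(conn, rows)
    return conn


def total_by_customer(conn):
    """Example analytical query across all integrated sources."""
    query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
    return dict(conn.execute(query))


if __name__ == "__main__":
    print(total_by_customer(build_warehouse()))
```

Because both sources are normalized to the same customer key and currency unit before loading, a single query can combine them, which is the "holistic analysis" the text refers to.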

 

Benefits of Data Warehouses

Data warehouses offer many overarching benefits for the companies that use them. The primary benefit is the ability to store and analyze large amounts of varied data and extract value from it while keeping historical data for record-keeping.

Bill Inmon, the father of data warehousing, identified four defining characteristics of a data warehouse: it is subject-oriented (focused on a particular area), integrated (combining different data types from various sources), non-volatile (stable), and time-variant (able to analyze changes over time).

Well-designed data warehouses perform queries quickly, deliver high-quality data, and allow end users to narrow the volume of data, if they wish, to examine a certain area closely. Faster decision-making is a key benefit, since a data warehouse provides data in consistent formats that are ready to be analyzed.

Data warehouses also provide the analytical power and the complete dataset needed to base decisions on hard facts instead of hunches or incomplete or poor-quality data.
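
To make "non-volatile" and "time-variant" more concrete, here is a small Python sketch of an append-only history that can answer "what was the value on a given date?" The PriceHistory class, its method names, and the sample prices are hypothetical and exist only for this illustration. Real warehouses typically implement the same idea with dated rows in database tables.

```python
from datetime import date


class PriceHistory:
    """Append-only store: rows are added but never updated or deleted."""

    def __init__(self):
        self._rows = []  # each row is (product, price, effective_date)

    def record(self, product, price, effective):
        # Non-volatile: a new price is appended; old rows stay untouched.
        self._rows.append((product, price, effective))

    def price_as_of(self, product, as_of):
        # Time-variant: pick the latest row effective on or before as_of.
        candidates = [
            (eff, price)
            for prod, price, eff in self._rows
            if prod == product and eff <= as_of
        ]
        return max(candidates)[1] if candidates else None


history = PriceHistory()
history.record("widget", 10.0, date(2023, 1, 1))
history.record("widget", 12.0, date(2024, 1, 1))

print(history.price_as_of("widget", date(2023, 6, 1)))
print(history.price_as_of("widget", date(2024, 2, 1)))
```

Because earlier rows are preserved, the same store can report both the current price and the price that applied at any past date.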

 
