
Data Warehousing

Mohd Arman Ali | 18- LSM-14 | GJ6233


CONTENT

 What is Data Warehousing
 Definition
 Keywords
    Subject-Oriented
    Integrated
    Time-variant
    Non-volatile
 Data Warehouse Backend Process
 Conclusion
What is Data Warehousing?

 The term was first coined by William H. Inmon (Bill Inmon) in 1990.
 An environment, not a product.
 Designed for query and analysis rather than for transaction processing.
 Usually contains historical data derived from transaction data, but can
include data from other sources.
 A technique for collecting and managing data from varied sources to
provide meaningful business insights.
 A blend of technologies and components that allows the strategic use of
data.
Definition

“A data warehouse is a subject-oriented, integrated, time-variant,
non-volatile collection of data in support of management's decision-making
process.”

 Some important keywords from the definition need explanation:

 Subject-Oriented
 Integrated
 Time-variant
 Non-volatile
Subject-Oriented
 Organized around major subjects such as customers, products and sales.
Rather than concentrating on the day-to-day operations and transaction
processing of an organization, it focuses on the modeling and analysis of
data for decision makers.
 It typically provides a simple and concise view of particular subject issues
by excluding data that are not useful in the decision support process.
For example, to learn more about your company's sales data, you can
build a data warehouse that concentrates on sales. Using this data
warehouse, you can answer questions such as "Who was our best
customer for this item last year?" or "Who is likely to be our best customer
next year?" This ability to define a data warehouse by subject matter, sales
in this case, makes the data warehouse subject oriented.
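As a minimal sketch of the "best customer for this item last year" question, the snippet below queries a small sales table organized around the sales subject. The table name `sales`, its columns, the helper `best_customer` and the sample rows are all assumptions made for illustration; they are not taken from the presentation.

```python
import sqlite3


def best_customer(conn, item, year):
    """Return (customer, total_amount) for the top buyer of `item` in `year`."""
    return conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM sales
        WHERE item = ? AND year = ?
        GROUP BY customer
        ORDER BY total DESC
        LIMIT 1
        """,
        (item, year),
    ).fetchone()


# Hypothetical sales-subject table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, item TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Asha", "laptop", 2023, 1200.0),
        ("Ben", "laptop", 2023, 900.0),
        ("Asha", "laptop", 2023, 300.0),
        ("Ben", "phone", 2023, 5000.0),
        ("Ben", "laptop", 2022, 4000.0),
    ],
)

top = best_customer(conn, "laptop", 2023)
print(top)
```

Only sales-related columns are stored here, which mirrors the idea of excluding data that does not help decision support.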
Integrated

 It is usually constructed by integrating multiple heterogeneous
sources, such as relational databases, flat files and online
transaction records.
Flat files are simple data files in text or binary format with a
structure known by the data mining algorithm to be applied.
 Data cleaning and data integration techniques are applied to
ensure consistency in naming conventions, encoding structures,
attribute measures and so on.
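The integration step can be sketched as mapping records from differently shaped sources onto one naming convention, one encoding and one unit of measure. The two sources below (a dictionary-style record and a semicolon-separated flat-file line), their field names and their codes are hypothetical and chosen only to show the idea.

```python
def from_relational(record):
    """Hypothetical source A: keys 'cust_name', 'sex' ('M'/'F'), revenue in dollars."""
    return {
        "customer": record["cust_name"].strip().title(),
        "gender": {"M": "male", "F": "female"}[record["sex"]],
        "revenue_usd": float(record["revenue"]),
    }


def from_flat_file(line):
    """Hypothetical source B: 'name;gender_code;revenue_cents', 1 = male, 0 = female."""
    name, code, cents = line.split(";")
    return {
        "customer": name.strip().title(),
        "gender": {"1": "male", "0": "female"}[code.strip()],
        "revenue_usd": int(cents) / 100,  # convert cents to dollars
    }


def integrate(relational_records, flat_file_lines):
    """Combine both sources into one list with consistent names, codes and units."""
    unified = [from_relational(r) for r in relational_records]
    unified += [from_flat_file(line) for line in flat_file_lines]
    return unified


rows = integrate(
    [{"cust_name": " asha rao ", "sex": "F", "revenue": "120.50"}],
    ["BEN LEE;1;9900"],
)
print(rows)
```

Each source gets its own small adapter, so adding another heterogeneous source means adding one more function rather than changing the unified format.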
Time-variant

Data are stored to provide information from a historical perspective (e.g.,
the past 5-10 years). Every key structure in the data warehouse contains,
either implicitly or explicitly, a time element. This is very much in contrast
to online transaction processing (OLTP) systems, where performance
requirements demand that historical data be moved to an archive.
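One way to picture an explicit time element in the key is a history table where each row is identified by a customer ID plus a `valid_from` date, so earlier states stay queryable. The row layout, the sample data and the helper `city_as_of` are assumptions for this sketch, not part of the presentation.

```python
from datetime import date

# Each row is keyed by (customer_id, valid_from): the date is the time element.
history = [
    ("C1", date(2019, 1, 1), "Delhi"),
    ("C1", date(2021, 6, 1), "Mumbai"),
    ("C2", date(2020, 3, 15), "Pune"),
]


def city_as_of(rows, customer_id, as_of):
    """Return the city recorded for the customer on the given date, or None."""
    candidates = [
        (valid_from, city)
        for cid, valid_from, city in rows
        if cid == customer_id and valid_from <= as_of
    ]
    if not candidates:
        return None
    # The most recent row that was already valid on `as_of` wins.
    return max(candidates)[1]


print(city_as_of(history, "C1", date(2020, 1, 1)))
print(city_as_of(history, "C1", date(2022, 1, 1)))
```

An OLTP system would typically keep only the current city; here the older row is retained so that questions about the past can still be answered.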

Non-volatile

 Once entered into the data warehouse, data should not change.
 Purpose: to enable you to analyze what has occurred.
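Non-volatility can be illustrated, under simplifying assumptions, with a store that exposes only loading and reading. The class `AppendOnlyFactTable` below is a hypothetical in-memory stand-in, not a real warehouse API: it offers no update or delete method, and it hands out copies so callers cannot alter stored rows.

```python
class AppendOnlyFactTable:
    """Hypothetical in-memory table that supports only load and query."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Store copies so later changes to the caller's dicts do not leak in.
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate=lambda row: True):
        # Return copies so callers cannot modify what is stored.
        return [dict(row) for row in self._rows if predicate(row)]


table = AppendOnlyFactTable()
table.load([{"customer": "Asha", "amount": 100}])
table.load([{"customer": "Ben", "amount": 250}])

result = table.query()
result[0]["amount"] = 0  # changes only the returned copy

print(table.query(lambda row: row["amount"] > 50))
```

New facts arrive as additional rows, while rows already loaded keep describing what occurred.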
Data warehouse backend process
 Data extraction: gathers data from multiple heterogeneous
and external sources.
 Data cleaning: detects errors in the data and rectifies them
when possible.
 Data transformation: converts data from legacy or host
format to warehouse format.
 Load: sorts, summarizes, consolidates, computes views,
checks integrity, and builds indices and partitions.
 Refresh: propagates the updates from the data sources to
the warehouse.
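The five backend steps above can be sketched as a chain of small functions. This is a toy illustration under stated assumptions: the record fields (`customer`, `amount`), the rule that a record with a missing amount is dropped rather than repaired, and the dictionary used as the "warehouse" are all hypothetical, and the load step shows only consolidation, not indexing or partitioning.

```python
def extract(sources):
    """Gather raw records from every source into one list."""
    return [record for source in sources for record in source]


def clean(records):
    """Drop records with a missing amount and strip stray whitespace."""
    cleaned = []
    for record in records:
        if record.get("amount") in (None, ""):
            continue  # error detected; this sketch drops it instead of repairing
        cleaned.append({**record, "customer": record["customer"].strip()})
    return cleaned


def transform(records):
    """Convert source format to warehouse format (title-case name, float amount)."""
    return [
        {"customer": r["customer"].title(), "amount": float(r["amount"])}
        for r in records
    ]


def load(warehouse, records):
    """Consolidate records into per-customer totals held in the warehouse dict."""
    for r in records:
        warehouse[r["customer"]] = warehouse.get(r["customer"], 0.0) + r["amount"]
    return warehouse


def refresh(warehouse, sources):
    """Propagate new source data into the warehouse through the full pipeline."""
    return load(warehouse, transform(clean(extract(sources))))


warehouse = refresh(
    {},
    [
        [{"customer": " asha ", "amount": "10"}, {"customer": "ben", "amount": ""}],
        [{"customer": "ASHA", "amount": "5.5"}],
    ],
)
print(warehouse)
```

Calling `refresh` again with later source data adds to the existing totals, which is the sense in which updates are propagated.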
Conclusion

 One of the leading and most reliable technologies in use today.
 Can be used for planning, forecasting and management, including resource
planning, financial forecasting and control, and for research on any
aspect.
 Can make available old and new research material such as
periodicals, newspapers, clippings, magazines and any other
resources.
 Much has been done since the concept of data warehousing evolved,
and much still needs to be done, because the field has been growing
rapidly since the early 1990s.
 Even so, it appears very useful for any organization.
A presentation designed by MWA
