
Data warehousing

What is a data warehouse?


A data warehouse is a powerful database model that significantly
enhances the user’s ability to quickly analyze large, multidimensional
data sets.

It cleanses and organizes data to allow users to make business
decisions based on facts.

Data warehousing Definition:


“Data warehousing is the process of gathering data from multiple
sources into a central repository, called a data warehouse.”

“A data warehouse is simply a single, complete, and consistent store of
data obtained from a variety of sources and made available to end users
in a way they can understand and use in a business context.”

Data warehouse processes


Data cleaning

Data integration

Data transformation

Data loading

Periodic data refreshing
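
The five processes above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a definitive implementation: the source rows, field names, the `sales` table, and all function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows from two operational sources (illustrative data only).
SOURCE_A = [
    {"product": " Widget ", "amount": "19.99", "date": "2024-01-05"},
    {"product": "gadget", "amount": None, "date": "2024-01-06"},  # missing amount
]
SOURCE_B = [
    {"item_name": "WIDGET", "total_cents": 2500, "sold_on": "2024-01-07"},
]


def clean(rows):
    """Data cleaning: drop rows with a missing amount and trim whitespace."""
    return [
        {**r, "product": r["product"].strip()}
        for r in rows
        if r["amount"] is not None
    ]


def integrate(rows_a, rows_b):
    """Data integration: map both source layouts onto one shared layout."""
    unified = [
        {"product": r["product"], "amount": r["amount"], "date": r["date"]}
        for r in rows_a
    ]
    unified += [
        {"product": r["item_name"], "amount": r["total_cents"] / 100, "date": r["sold_on"]}
        for r in rows_b
    ]
    return unified


def transform(rows):
    """Data transformation: lowercase names and convert amounts to float."""
    return [(r["product"].lower(), float(r["amount"]), r["date"]) for r in rows]


def load(conn, rows):
    """Data loading: append the prepared rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (product TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def refresh(conn, rows_a, rows_b):
    """Periodic refresh: rerun the whole pipeline on newly arrived rows."""
    load(conn, transform(integrate(clean(rows_a), rows_b)))


conn = sqlite3.connect(":memory:")
refresh(conn, SOURCE_A, SOURCE_B)
totals = conn.execute(
    "SELECT product, ROUND(SUM(amount), 2) FROM sales GROUP BY product ORDER BY product"
).fetchall()
print(totals)
```

Calling `refresh` again with a new batch of rows appends them to the same table, which is one simple way to model periodic refreshing.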

History of data warehousing


The concept of data warehousing dates back to the late 1980s, when
IBM researchers Barry Devlin and Paul Murphy developed the “business
data warehouse.”

In essence, the data warehousing concept was intended to provide an
architectural model for the flow of data from operational systems to
decision support environments.

Facts about data warehousing:


Issues involved in warehousing include techniques for dealing with
errors and techniques for efficient storage and indexing of large
volumes of data.

A data warehouse is used for reporting and data analysis.

It usually contains historical data derived from transaction data.

Data warehousing is not meant for current “live” data.

Components of data warehouse


Data sources (data source interaction)

Data transformation

Data warehouse (data storage)

Reporting (data presentation)

Metadata

Data warehouse advantages:

Complete control over the four main areas of data management systems:

Clean data

Query processing: multiple options

Indexes: multiple types

Security: data and access

Data warehouse disadvantages:


Adding new data sources takes time and carries a high cost.

Data owners lose control over their data, raising ownership, security,
and privacy issues.

Long initial implementation time and associated high cost.

Difficult to accommodate changes in data types and ranges, data source
schemas, indexes, and queries.

Characteristics of data warehousing:


Subject oriented: a data warehouse can be used to analyze a particular
subject area.

For example, “sales” can be a particular subject.

Integrated: a data warehouse integrates data from multiple data
sources.

For example, source A and source B may have different ways of
identifying a product, but in a data warehouse there will be only a
single way of identifying a product.
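
The product example can be illustrated with a short sketch. It assumes a hand-written mapping table from each source's local identifier to one warehouse key. The record layouts, the identifiers, and the `to_warehouse_key` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical product records as two source systems might store them.
source_a = [{"sku": "A-100", "name": "Blue Pen"}]
source_b = [{"product_code": 7731, "title": "Blue Pen"}]

# Assumed mapping from (source, local identifier) to a single warehouse key.
PRODUCT_KEY_MAP = {
    ("A", "A-100"): 1,
    ("B", "7731"): 1,
}


def to_warehouse_key(source, local_id):
    """Translate a source-specific identifier into the shared warehouse key."""
    return PRODUCT_KEY_MAP[(source, str(local_id))]


# Both sources end up in one layout, identified by the same product_key.
integrated = [
    {"product_key": to_warehouse_key("A", r["sku"]), "name": r["name"]}
    for r in source_a
] + [
    {"product_key": to_warehouse_key("B", r["product_code"]), "name": r["title"]}
    for r in source_b
]

unique_keys = {row["product_key"] for row in integrated}
print(unique_keys)
```

Source A's SKU and source B's numeric code both resolve to the same `product_key`, so queries in the warehouse see one product rather than two.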

Data warehouse usage:


Three kinds of data warehouse applications:

Information processing: supports querying, basic statistical analysis,
and reporting using crosstabs, tables, charts, and graphs.

Analytical processing: multidimensional analysis of data warehouse
data. Supports OLAP operations such as slice and dice, drilling, and
pivoting.

Data mining: knowledge discovery from hidden patterns. Supports
finding associations, constructing analytical models, performing
classification and prediction, and presenting the mining results using
visualization tools.
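
The OLAP operations named above can be sketched on a tiny in-memory data set. This is a simplified illustration in plain Python rather than a real OLAP engine. The fact rows, the dimension names, and the helper functions are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales amount).
FACTS = [
    (2023, "East", "pen", 100),
    (2023, "West", "pen", 150),
    (2023, "East", "ink", 80),
    (2024, "East", "pen", 120),
    (2024, "West", "ink", 90),
]
DIMS = {"year": 0, "region": 1, "product": 2}


def slice_(facts, dim, value):
    """Slice: fix one dimension to a single value."""
    return [f for f in facts if f[DIMS[dim]] == value]


def dice(facts, **allowed):
    """Dice: keep rows whose dimensions fall inside the given value sets."""
    return [f for f in facts if all(f[DIMS[d]] in vals for d, vals in allowed.items())]


def roll_up(facts, *dims):
    """Roll-up: total sales grouped by the listed dimensions.

    Listing an extra dimension gives the corresponding drill-down.
    """
    totals = defaultdict(int)
    for f in facts:
        totals[tuple(f[DIMS[d]] for d in dims)] += f[3]
    return dict(totals)


def pivot(facts, row_dim, col_dim):
    """Pivot: cross-tabulate total sales over two dimensions."""
    table = defaultdict(dict)
    for (row, col), total in roll_up(facts, row_dim, col_dim).items():
        table[row][col] = total
    return dict(table)


by_year = roll_up(FACTS, "year")                      # coarse view
by_year_region = roll_up(FACTS, "year", "region")     # drill down by region
east_only = slice_(FACTS, "region", "East")
pens_2023 = dice(FACTS, year={2023}, product={"pen"})
crosstab = pivot(FACTS, "region", "product")
print(by_year, crosstab)
```

Here drilling down is modeled as the same aggregation with one more dimension, and the pivot result has the shape of the crosstabs mentioned under information processing.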
