
Data warehousing is the practice of combining data from multiple, usually varied, sources into one comprehensive and easily manipulated database. Common ways of accessing a data warehouse include queries, analysis, and reporting. Because data warehousing produces a single database in the end, the number of sources can be as large as you like, provided that the system can handle the volume. The final result is homogeneous data, which is easier to manipulate.
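As a rough illustration of how varied sources can end up as homogeneous data, the sketch below maps two differently shaped exports onto one shared record layout. The source names, field names, and units are assumptions made up for this example and do not come from any particular product.

```python
from datetime import date

# Hypothetical source A: a point-of-sale export with amounts in cents.
pos_rows = [
    {"sku": "A100", "sold_on": "2023-01-05", "amount_cents": 1250},
    {"sku": "B200", "sold_on": "2023-01-06", "amount_cents": 899},
]

# Hypothetical source B: a web-shop export with other field names,
# a day/month/year date format, and amounts as decimal strings.
web_rows = [
    {"product": "a100", "date": "05/01/2023", "total": "12.50"},
]


def from_pos(row):
    """Map a point-of-sale row onto the shared warehouse layout."""
    return {
        "product_id": row["sku"].upper(),
        "sale_date": date.fromisoformat(row["sold_on"]),
        "amount": row["amount_cents"] / 100,
        "source": "pos",
    }


def from_web(row):
    """Map a web-shop row onto the shared warehouse layout."""
    day, month, year = (int(part) for part in row["date"].split("/"))
    return {
        "product_id": row["product"].upper(),
        "sale_date": date(year, month, day),
        "amount": float(row["total"]),
        "source": "web",
    }


# The combined list plays the role of the single, homogeneous database.
warehouse = [from_pos(r) for r in pos_rows] + [from_web(r) for r in web_rows]

for record in warehouse:
    print(record)
```

Once every record shares the same field names, types, and units, a single query or report can run over all sources at once.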

Data warehousing is commonly used by companies to analyze trends over time. Companies can use it to view day-to-day operations, but its primary function is to support strategic planning based on long-term overviews of data. From such overviews, business models, forecasts, and other reports and projections can be made.

Because the data stored in a data warehouse is intended for overview-style reporting, it is normally read-only. If the stored data is updated, you'll need to run a new query afterwards to see the change. This is not to say that data warehousing involves data that is never updated. On the contrary, the data stored in data warehouses is updated all the time. It's the reporting and the analysis that take more of a long-term view.

Data warehousing is not the be-all and end-all for storing all of a company's data. Rather, it is used to house the data needed for specific analysis. More comprehensive data storage requires different capacities that are more static and less easily manipulated than those used for data warehousing.

Data warehousing is used mainly by larger companies for analyzing larger sets of data for enterprise purposes. Smaller companies wishing to analyze just one subject, for example, usually access data marts, which are much more specific and targeted in their storage and reporting. A data warehouse often includes smaller amounts of data grouped into data marts. In this way, a larger company might have at its disposal both a data warehouse and data marts, allowing users to choose the source and functionality depending on current needs.

The computerization of business processes, technological advances in the transmission and storage of data, and powerful database management tools have opened up new possibilities for data manipulation and analysis. Business managers are eager to explore the repositories of current and historical data to identify trends and patterns in the warp and woof of business. They hope to mine the data and use it to make intelligent business decisions. In this context, industries are increasingly focusing on data warehousing, Online Analytical Processing (OLAP), and other related technologies.
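The relationship between a warehouse and a data mart described earlier in this section can be sketched with Python's built-in sqlite3 module. Here the data mart is simplified to a view that exposes a single subject. The table name, view name, and sample figures are hypothetical, and a real data mart is often a separate physical store rather than a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding several subjects side by side.
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "subject TEXT, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("marketing", "North", 2022, 100.0),
        ("marketing", "South", 2023, 150.0),
        ("marketing", "North", 2023, 50.0),
        ("finance", "North", 2022, 900.0),
    ],
)

# A data mart, simplified here to a view limited to one subject.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, year, amount FROM warehouse_sales "
    "WHERE subject = 'marketing'"
)

# Users of the mart see only the marketing rows.
mart_rows = conn.execute(
    "SELECT region, year, amount FROM marketing_mart ORDER BY year, region"
).fetchall()

# A long-term overview: total marketing amount per year.
yearly = conn.execute(
    "SELECT year, SUM(amount) FROM marketing_mart GROUP BY year ORDER BY year"
).fetchall()

print(mart_rows)
print(yearly)
```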

What's the Difference?

Data warehouse and OLAP are terms that are often used interchangeably. In fact, they refer to two different components of a decision support system. A data warehouse holds the historical data of the organization, stored for end-user analysis, while OLAP is a technology that enables a data warehouse to be used effectively for online analysis using complex analytical queries. The differences between OLAP and a data warehouse are tabulated below for ease of understanding:

| Data Warehouse | Online Analytical Processing |
| --- | --- |
| Data from different data sources is stored in a relational database for end-user analysis. | A tool to evaluate and analyze the data in the data warehouse using analytical queries. |
| Data is organized in summarized, aggregated, subject-oriented, non-volatile patterns. | A tool that helps organize data in the data warehouse using multidimensional models of data aggregation and summarization. |
| Data in a data warehouse is a consolidated, flexible collection of data. It supports analysis of data but does not support online analysis of data. | Supports the data analyst in real time and enables online analysis of data with speed and flexibility. |

What is Data Warehousing?

Data warehousing is a collection of decision support technologies that enable the knowledge worker, the statistician, the business manager, and the executive to process the information contained in a data warehouse meaningfully and to make well-informed decisions based on the outputs. A data warehousing system includes back-end tools for extracting, cleaning, and loading data from Online Transaction Processing (OLTP) databases and historical repositories of data. It also includes the data storage area, composed of the data warehouse, the data marts, and the data store. It provides tools such as OLAP for organizing, partitioning, and summarizing data in the data warehouse and data marts, and finally it contains front-end tools for mining, querying, and reporting on data. It is important to distinguish between a data warehouse and data warehousing.
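The multidimensional aggregation and summarization attributed to OLAP above can be sketched as a simple roll-up. This is a minimal illustration in plain Python, not how any OLAP product is implemented. The dimension names, the fact rows, and the roll_up helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, year, sales amount).
facts = [
    ("North", "Coffee", 2022, 120),
    ("North", "Tea", 2022, 80),
    ("South", "Coffee", 2022, 100),
    ("North", "Coffee", 2023, 150),
    ("South", "Tea", 2023, 60),
]

DIMENSIONS = ("region", "product", "year")


def roll_up(rows, keep):
    """Sum the sales measure, grouped by the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last field is the measure
    return dict(totals)


# The same facts summarized along different combinations of dimensions.
by_region_year = roll_up(facts, ("region", "year"))
by_product = roll_up(facts, ("product",))
grand_total = roll_up(facts, ())

print(by_region_year)
print(by_product)   # {('Coffee',): 370, ('Tea',): 140}
print(grand_total)  # {(): 510}
```

Dropping a dimension from `keep` gives a coarser summary, and adding one gives a finer breakdown of the same data.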

A data warehouse is a component of the data warehousing system. It is a facility that provides a consolidated, flexible, and accessible collection of data for end-user reporting and analysis. A data warehouse has been defined by Inmon (considered one of the founders of the data warehouse concept) as a subject-oriented, integrated, time-variant, non-volatile collection of data that is primarily used in organizational decision making.

1. The data in a data warehouse is categorized by subject area, and hence it is subject-oriented.
2. Universal naming conventions, measurements, classifications, and so on provide an enterprise-wide consolidated view of the data, and therefore it is designated as integrated.
3. Once loaded, the data can only be read. Users cannot make changes to it, and this makes it non-volatile.
4. Finally, data is stored for long periods of time, measured in years, and bears a time and date stamp, and therefore it is described as time-variant.
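The four properties listed above can be made concrete with a toy structure. The WarehouseTable class below is a hypothetical sketch written for this illustration. It is not drawn from Inmon's work or from any real product, and it reduces each property to its simplest possible form.

```python
from datetime import datetime, timezone


class WarehouseTable:
    """Hypothetical append-only table: rows are loaded and read, never edited."""

    def __init__(self, subject):
        # Subject-oriented: each table covers one subject area.
        self.subject = subject
        self._rows = []

    def load(self, record, loaded_at=None):
        # Time-variant: every row carries the moment it was loaded.
        stamp = loaded_at or datetime.now(timezone.utc)
        # Integrated: field names follow one convention (lower case here).
        clean = {key.lower(): value for key, value in record.items()}
        self._rows.append((stamp, clean))

    def rows(self):
        # Non-volatile: readers receive copies, so stored rows stay unchanged.
        return [(stamp, dict(record)) for stamp, record in self._rows]


sales = WarehouseTable("sales")
sales.load({"Customer": "C1", "Amount": 40}, datetime(2022, 1, 1, tzinfo=timezone.utc))
sales.load({"customer": "C1", "AMOUNT": 55}, datetime(2023, 1, 1, tzinfo=timezone.utc))

snapshot = sales.rows()
snapshot[0][1]["amount"] = 0  # alters only the reader's copy

for stamp, record in sales.rows():
    print(stamp.date(), record)
```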

Data Mining Overview

Data mining is a system of searching through large amounts of data for patterns. It is a relatively new concept that is directly related to computer science. Despite this, it can be used with a number of older techniques such as pattern recognition and statistics. The goal of data mining is to extract important, previously unknown information from data.

Data mining has a large number of applications in a wide variety of fields. However, it is most commonly used by businesses or organizations that need to recognize certain patterns or trends. For example, the sales department of a company may use data mining to track the type of items a customer buys. As it observes the customer's purchases, it may notice that the customer buys a large quantity of coffee on a regular basis. The data mining program will automatically make a connection between that specific customer and a certain brand of coffee. The sales department can then use this information to launch a direct mail campaign that informs the customer of a sale on the brand of coffee they enjoy buying. Because the company knows that the customer likes that brand, the customer is likely to purchase large quantities of it while it is on sale. In addition, the company can market other products or services to the customer.
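The coffee example above amounts to linking each customer to the item they buy most often. A minimal sketch of that idea follows. The purchase log, the favorite_items helper, and the min_count threshold are hypothetical, and real data mining tools use far richer models than a frequency count.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (customer id, item bought).
purchases = [
    ("c1", "coffee"), ("c1", "coffee"), ("c1", "milk"), ("c1", "coffee"),
    ("c2", "tea"), ("c2", "bread"), ("c2", "tea"),
    ("c3", "soap"),
]


def favorite_items(log, min_count=2):
    """Return each customer's most-bought item if bought at least min_count times."""
    per_customer = defaultdict(Counter)
    for customer, item in log:
        per_customer[customer][item] += 1

    favorites = {}
    for customer, counts in per_customer.items():
        item, count = counts.most_common(1)[0]
        if count >= min_count:
            favorites[customer] = (item, count)
    return favorites


# Candidates for a targeted mailing about their preferred item.
targets = favorite_items(purchases)
print(targets)
```

Customer c3 is left out because a single purchase falls below the threshold and is too little evidence of a preference.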

Data mining is profitable in a situation like this because it allows the company to find out detailed information about its customers that would have been difficult to determine otherwise. Data mining can also be used to track behaviors within a system over a long period of time. For example, a large retail chain may use data mining to analyze the type and number of items purchased over a five-year period. The company may find that a certain brand of toothpaste and toothbrushes are often purchased together in large quantities. Based on this information, the chain could place the toothpaste and toothbrushes next to each other, which may increase its profits. The name for this method is Market Basket Analysis.

The increasing popularity of parallel computing has made it possible to search through massive amounts of data without the need for a theoretical framework. One important feature of data mining is that it is often used to analyze information from a variety of perspectives. The information gained can be used to increase profits or lower costs. There are a number of software products designed for those who wish to use data mining techniques. Once you're able to search through large amounts of information, you can analyze it in many different ways, and then draw conclusions and make decisions based on logic.

While the term data mining is new, the concept of searching through data for patterns is not. Many large companies have powerful computers that allow them to search through information and analyze reports over a given period of time. What sets data mining apart from these older research methods is that it is a result of advances in computer processing power. In addition, the storage capabilities of contemporary computers have allowed data mining to be much more accurate than the techniques used in the past. Because most data mining tools come in the form of software, the costs involved in searching and analyzing information have dropped greatly.

For example, if a retail store noticed that a large number of its customers were purchasing alcoholic beverages on Thursdays, this would suggest that the drinks are being purchased for the upcoming weekend. The company could use this information by selling the drinks at full price on Thursdays in order to increase its profits.
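The Market Basket Analysis mentioned earlier in this section can be sketched by counting how often pairs of items appear in the same basket. The baskets below are made-up sample data. Support and confidence are the standard measures used in this kind of analysis, but the helper functions are only an illustration and not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: the set of items bought in one transaction.
baskets = [
    {"toothpaste", "toothbrush", "soap"},
    {"toothpaste", "toothbrush"},
    {"toothpaste", "shampoo"},
    {"soap", "shampoo"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def support(item_a, item_b):
    """Fraction of all baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)


def confidence(antecedent, consequent):
    """Fraction of baskets with `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(support("toothpaste", "toothbrush"))     # 0.5
print(confidence("toothbrush", "toothpaste"))  # 1.0
print(confidence("toothpaste", "toothbrush"))
```

A pair with high support and high confidence, such as toothbrush and toothpaste in this sample, is a candidate for being shelved side by side.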

Data Mining Applications

Data mining is a relatively new technology that has not fully matured. Despite this, a number of industries already use it on a regular basis, including retail stores, hospitals, banks, and insurance companies.
