
Data Mining

R.K. Dwivedi, Dept.of IT, KIET

Introduction
Extraction of knowledge from data.
Exploration of large quantities of data to discover meaningful patterns.
Inference of new information from data that has already been collected.


Data Mining
Data mining is the entire process of applying computer-based methodology, including new techniques, to discover knowledge from data.
Data mining techniques involve sophisticated algorithms, including decision tree classification and clustering.
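
To make the clustering technique named above concrete, here is a minimal sketch of one common clustering method, k-means, on one-dimensional data. The slides do not say which clustering algorithm is meant, so the choice of k-means, the function name `kmeans_1d`, the sample spending values, and the starting centroids are all assumptions made for illustration.

```python
from statistics import mean


def kmeans_1d(values, centroids, iterations=10):
    """Group 1-D values around the nearest centroid (illustrative k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical customer spending values with two obvious groups.
spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = kmeans_1d(spend, centroids=[0, 100])
print(clusters)  # low spenders and high spenders end up in separate groups
```

The algorithm is never told which values are "low" or "high". The grouping emerges from the data, which is the sense in which clustering discovers patterns.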

Data Mining (Continued)
Some data mining systems, such as neural networks, are inherently geared towards prediction and pattern recognition rather than knowledge discovery. These include AI applications that support decision making.


Data Mining Components


Knowledge Discovery: concrete information taken from stored data. This is information you may not have known, but which is supported by recorded facts.
Knowledge Prediction: uses known data to forecast future trends, events, etc. (e.g., stock market predictions).

Whether the goal is Knowledge Discovery or Knowledge Prediction, data mining takes information that was once quite difficult to detect and presents it in an easily understandable format (e.g., graphical or statistical).

Data Mining vs. Data Analysis


In terms of software and the marketing thereof, Data Mining != Data Analysis.
Data Mining implies that the software applies some intelligence, beyond simple grouping and partitioning of data, to infer new information.
Data Analysis is more in line with standard statistical software, which usually presents information about subsets and relations within the recorded data set (e.g., search engine usage, average visit time).


Kinds of Data to be mined:


Data mining is applicable to any kind of information repository, such as:
Flat files
Relational databases
Data warehouses
Advanced databases: spatial databases, multimedia databases, time series databases


Early Steps of Data Mining


Data Preprocessing: handles incomplete, noisy, and uncertain data.
Data Selection: selects the data suitable for mining purposes.
Data Representation: transforms data into values suitable for the mining algorithm to find patterns.
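
The three early steps can be sketched on a toy data set. This is only an illustration under stated assumptions: the records, the field names, the choice of mean imputation for incomplete data, and min-max scaling as the representation step are all hypothetical, and the slides do not prescribe any of them.

```python
from statistics import mean

# Hypothetical raw records; record 2 has an incomplete "age" value.
raw = [
    {"id": 1, "age": 25, "income": 30000, "name": "a"},
    {"id": 2, "age": None, "income": 50000, "name": "b"},
    {"id": 3, "age": 35, "income": 70000, "name": "c"},
]


def preprocess(records, field):
    """Preprocessing: fill missing values of `field` with the mean of the known ones."""
    fill = mean(r[field] for r in records if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


def select(records, fields):
    """Selection: keep only the fields that are useful for mining."""
    return [{f: r[f] for f in fields} for r in records]


def represent(records, field):
    """Representation: min-max scale `field` into the range [0, 1]."""
    lo = min(r[field] for r in records)
    hi = max(r[field] for r in records)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    return [{**r, field: (r[field] - lo) / span} for r in records]


cleaned = preprocess(raw, "age")
selected = select(cleaned, ["age", "income"])
prepared = represent(represent(selected, "age"), "income")
print(prepared)
```

After these steps every remaining field is complete and on the same scale, so a mining algorithm that compares values across fields is not dominated by the field with the largest raw numbers.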

Uses of Data Mining


AI/Machine Learning
  Game Data Mining: helps analyze game strategies and thus develop intelligent AI opponents (e.g., chess).
Business Strategies
  Market Basket Analysis: identifies customer preferences and purchasing patterns.
Risk Analysis
  Product Defect Analysis: analyzes product defect rates for given plants and predicts possible complications.
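
As a rough illustration of market basket analysis, the sketch below counts how often pairs of items are bought together and keeps the pairs whose support (the share of baskets containing both items) meets a threshold. The function name `frequent_pairs`, the baskets, and the threshold are made up for this example. Production systems typically use algorithms such as Apriori or FP-Growth, which this sketch does not implement.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs whose support is at least `min_support`."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# Hypothetical purchase baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
]
pairs = frequent_pairs(baskets, min_support=0.5)
print(pairs)  # {('bread', 'milk'): 0.6}
```

Here bread and milk appear together in 3 of 5 baskets, while the other pairs appear in only 2 of 5 and fall below the threshold.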

Uses of Data Mining (Continued)


User Behavior Validation
  Fraud Detection: with credit cards, compares new purchases with historical purchases; can detect activity with stolen cards.
Health and Science
  Protein Folding: predicts protein interactions and functionality within biological cells.
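
One simple way to compare a purchase with historical purchases is a z-score check: flag an amount that lies many standard deviations away from the cardholder's usual spending. This is a minimal sketch, not how any particular card issuer works. The function name `is_suspicious`, the threshold of 3, and the sample amounts are assumptions.

```python
from statistics import mean, stdev


def is_suspicious(history, amount, threshold=3.0):
    """Flag `amount` if it deviates strongly from past purchase amounts."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 values
    if sigma == 0:
        # All past purchases were identical, so any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical purchase history for one card.
history = [20.0, 25.0, 22.0, 30.0, 27.0]
print(is_suspicious(history, 24.0))   # a typical purchase
print(is_suspicious(history, 900.0))  # far outside the usual range
```

A real system would look at more than the amount (location, merchant, time of day), but the principle of measuring distance from historical behavior is the same.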

Sources of Data for Mining


Databases
Text documents
Computer simulations
Social networks


Prevalence of Data Mining


Data is already being mined, whether you like it or not.
Many web services require that you allow access to your information [for data mining] in order to use the service.
Google mines email data in Gmail accounts to present account owners with ads.
Facebook requires users to allow access to information from non-Facebook pages.
Facebook privacy policy: "We may use information about you that we collect from other sources, including but not limited to newspapers and Internet sources such as blogs, instant messaging services and other users of Facebook, to supplement your profile."
This allows access to your blogs, as well as to information obtained through partner sites (eBay, Amazon).
