Dr. P.S.R. Chandra Murthy
Assistant Professor
Dept. of CSE,
University College of Engineering & Technology,
ANU, Guntur.
By
Mrs. Bhargavi P.
Roll No: Y15CSER008
Objective
The purpose of this paper is to survey the applications,
issues, and challenges of data streams, the various
techniques developed for mining frequent patterns from
data streams, and a comparative analysis of those
techniques.
Introduction
Large volumes of data can be mined for interesting and
relevant information in a wide variety of applications.
For example, simple everyday transactions such as using a
credit card or a phone, or browsing the web, lead to
automated data storage.
When the volume of the underlying data is very large, it
leads to a number of computational and mining
challenges.
Currently, a large class of data-intensive applications, in
which data is in the form of continuous streams, has
been widely recognized.
Issues and Challenges
• Frequent information in the data may not hold for a
long time.
• Unexpected information that is not considered
frequent may become frequent.
• Some patterns may still appear dominant even though
they have not been active recently.
• Some patterns may be treated as infrequent even
though they have sufficient appearances in recent
transactions.
• Redundancy leads to ambiguous decisions.
Proposed Model
Dynamic models should be developed for managing
both the evolving tuples and the candidate frequent
item sets.
Appropriate models are designed to find maximal
frequent item sets, closed frequent item sets, and
generators.
Efficient data structures are required to handle data
coming from various sources.
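To make the terms above concrete, here is a minimal Python sketch that picks out the closed and maximal item sets from a table of frequent item sets. The function name closed_and_maximal and the support counts are hypothetical, chosen only for illustration. The slides do not specify how the proposed model computes these sets, and generators are not covered here.

```python
def closed_and_maximal(frequent):
    """Split frequent item sets into closed and maximal ones.

    frequent maps frozenset -> support count (illustrative input).
    Closed: no proper superset has the same support.
    Maximal: no proper superset is frequent at all.
    """
    closed, maximal = set(), set()
    for itemset, support in frequent.items():
        # Proper frequent supersets of this item set.
        supersets = [s for s in frequent if itemset < s]
        if not supersets:
            maximal.add(itemset)
        # Support never grows with size, so "strictly smaller" means "different".
        if all(frequent[s] < support for s in supersets):
            closed.add(itemset)
    return closed, maximal


# Hypothetical support counts, assumed for this example only.
frequent = {
    frozenset({"a"}): 3,
    frozenset({"b"}): 2,
    frozenset({"a", "b"}): 2,
}
closed, maximal = closed_and_maximal(frequent)
```

Here {a} and {a, b} are closed, while {b} is not, because {a, b} has the same support. Only {a, b} is maximal. Every maximal item set is also closed, which is why maximal sets give the most compact summary and closed sets keep the exact supports.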
Frequent Item set Mining
Frequent Item set Mining (FIM) is the process of
analyzing the transactions of a database to derive hidden
knowledge, namely sets of items that occur together in
many transactions.
FIM plays an important role in association rule mining,
which has led to several techniques.
FIM techniques are classified into two categories:
candidate generate-and-test approaches and
Frequent Pattern Growth (FPG) approaches.
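The candidate generate-and-test idea can be sketched in a few lines of Python. This is an Apriori-style illustration under assumed inputs, not a method taken from the slides: the function name frequent_itemsets, the toy transactions, and the threshold min_support=2 are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Generate: join frequent (k-1)-item sets into k-item candidates,
        # keeping only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Test: count each candidate against the transactions.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


# Toy database, assumed for this example only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = frequent_itemsets(transactions, min_support=2)
```

With these four transactions, each single item and each pair is frequent, but {bread, butter, milk} appears only once and is rejected in the test step. Each level needs another scan of the database, which is the cost that pattern growth approaches try to avoid.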
Contd..
Due to the continuous flow of data, previously discovered
patterns may no longer hold for incoming data.
Traditional data mining techniques designed for static
databases fail to extract knowledge from data streams.
This has motivated researchers to adapt frequent pattern
mining concepts and to develop new techniques aimed at
data stream applications.
Mining Frequent Patterns
Landmark window model
Sliding window model
Damped window model
Tilted-time window model
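The difference between the first three models can be shown by counting single items over the same small stream. The Python sketch below is illustrative only: the class names, the window size of 2, and the decay factor of 0.5 are assumptions, and the tilted-time window model (which keeps recent data at finer granularity than older data) is not covered.

```python
from collections import Counter, deque


class LandmarkCounter:
    """Counts everything seen since a fixed landmark (here, stream start)."""

    def __init__(self):
        self.counts = Counter()

    def add(self, transaction):
        self.counts.update(set(transaction))


class SlidingWindowCounter:
    """Counts only the most recent `size` transactions."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def add(self, transaction):
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.size:
            # The oldest transaction expires and its items are uncounted.
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for item in expired:
                if self.counts[item] == 0:
                    del self.counts[item]


class DampedCounter:
    """Weights every transaction, with older ones decaying geometrically."""

    def __init__(self, decay):
        self.decay = decay
        self.weights = {}

    def add(self, transaction):
        for item in self.weights:
            self.weights[item] *= self.decay
        for item in set(transaction):
            self.weights[item] = self.weights.get(item, 0.0) + 1.0


# Toy stream, assumed for this example only.
stream = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}]
landmark = LandmarkCounter()
sliding = SlidingWindowCounter(size=2)
damped = DampedCounter(decay=0.5)
for transaction in stream:
    landmark.add(transaction)
    sliding.add(transaction)
    damped.add(transaction)
```

After the four transactions, the landmark counter still reports a, b, and c as equally frequent. The sliding window has forgotten a entirely, and the damped counter ranks c above b above a, because recent occurrences weigh more.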
Research Issues
A data stream is an ordered sequence of transactions.
The challenging issue is to track how frequent item sets
vary over the transactions.
Data streams are unbounded, so it is challenging to use
memory efficiently when maintaining a large amount of
frequent information and the data structures needed to
compute cumulative support.
The incoming data rate is high, so it is challenging for
the proposed approaches to keep up with the stream.
Hence an approach is required that makes as few passes
as possible over the data stream.
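One way to see how a single pass with bounded memory is possible is the Misra-Gries summary for frequent items. It is used here purely as an illustration and is not named in the slides. It tracks single items rather than item sets, and the function name, the toy stream, and the parameter k=3 are assumptions.

```python
def misra_gries(stream, k):
    """One pass over the stream using at most k - 1 counters.

    Any item occurring more than n / k times (n = stream length) is
    guaranteed to remain in the summary. Stored counts are lower
    bounds on the true counts.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Toy stream of single items, assumed for this example only.
stream = list("aabacadaab")
summary = misra_gries(stream, k=3)
```

In this stream, a occurs 6 times out of 10, which exceeds 10 / 3, so it is guaranteed to survive in the summary. The stored count may underestimate the true count, so an exact answer would need a second pass, which a stream usually does not allow.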
Applications
Network Intrusion Detection
Sensor Network Analysis
Environmental and Weather Data
Future Directions
Data structure
Incremental Mining
Multiple resources
Multidimensional Mining
Visualization of rules and patterns
Tuple Evaluation
Rare patterns
Outlier detection
Thank You
Questions?