Neural network
Clustering
Similarity search
ODBS
Object, object identifier
Inheritance
10/15/08 Jyotsna Chauhan 1
Sequential Pattern
A sequence of actions or events is sought.
Example: if a patient underwent cardiac surgery for blocked
arteries and developed high blood urea within a year of
surgery, he or she is likely to suffer kidney failure within
approximately the next 18 months.
Detecting sequential patterns is equivalent to detecting
associations among events with certain temporal relationships.
Example
A sequential rule A → B says that event A
will be immediately followed by event B
with a certain confidence.
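The confidence of a sequential rule can be estimated by counting. The sketch below assumes confidence means the fraction of occurrences of A that are immediately followed by B; the function name `rule_confidence` and the event logs are invented for illustration and do not come from the slides.

```python
def rule_confidence(sequences, a, b):
    """Fraction of occurrences of `a` that are immediately followed by `b`."""
    a_count = 0
    ab_count = 0
    for seq in sequences:
        for i, event in enumerate(seq):
            if event == a:
                a_count += 1
                # Count the pair only when b is the very next event.
                if i + 1 < len(seq) and seq[i + 1] == b:
                    ab_count += 1
    return ab_count / a_count if a_count else 0.0


# Made-up event logs, one list per patient (illustrative only).
logs = [
    ["surgery", "high_urea", "kidney_failure"],
    ["surgery", "recovery"],
    ["checkup", "surgery", "high_urea"],
    ["surgery", "high_urea", "recovery"],
]
conf = rule_confidence(logs, "surgery", "high_urea")
```

With these invented logs, "surgery" occurs four times and is immediately followed by "high_urea" three times, so the estimated confidence is 0.75.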
⇒ P.credit = excellent
∀ person P, P.degree = bachelors and
(P.income ≥ 25,000 and P.income ≤ 75,000)
⇒ P.credit = good
Rules are not necessarily exact: there may be some
misclassifications
Classification rules can be shown compactly as a decision tree.
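A classification rule of this kind maps directly onto a conditional. The sketch below encodes only the rule stated in full above (bachelors degree with income between 25,000 and 75,000 gives good credit); the function name `credit_class` and the `None` result for the uncovered cases are assumptions made for illustration.

```python
def credit_class(degree, income):
    """Apply one rule: bachelors and 25,000 <= income <= 75,000 => good.

    Returns None when this rule does not fire; a full rule set
    (or a decision tree) would cover the remaining cases.
    """
    if degree == "bachelors" and 25_000 <= income <= 75_000:
        return "good"
    return None
```

A decision tree expresses the same idea as nested tests: each internal node checks one attribute, and each leaf assigns a class.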
Decision Tree
Fraud Detection
Goal: Predict fraudulent cases in credit card
transactions.
Approach:
Use credit card transactions and information about the
account holder as attributes.
For example: when a customer buys, what he or she buys, how
often he or she pays on time, etc.
Label past transactions as fraud or fair transactions. This
forms the class attribute.
Learn a model for the class of the transactions.
Use this model to detect fraud by observing credit card
transactions on an account.
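The approach above (label past transactions, learn a model, apply it to new transactions) can be sketched with a very small learner. This is a one-level decision tree (a "decision stump"), chosen here only for brevity; the slides do not name a specific model. The two attributes and all data values are invented.

```python
def learn_stump(rows, labels):
    """Pick the (feature, threshold) that misclassifies the fewest labeled
    transactions when predicting 'fraud' for values above the threshold."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            errors = sum(
                ("fraud" if r[f] > threshold else "fair") != y
                for r, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, f, threshold)
    _, feature, threshold = best
    return feature, threshold


def predict(model, row):
    """Classify one transaction with the learned stump."""
    feature, threshold = model
    return "fraud" if row[feature] > threshold else "fair"


# Invented toy data: [purchase amount, purchases in the last hour]
history = [[20, 1], [35, 1], [900, 6], [15, 2], [1200, 8], [60, 1]]
labels = ["fair", "fair", "fraud", "fair", "fraud", "fair"]  # class attribute
model = learn_stump(history, labels)
```

A real system would use many more attributes and a stronger classifier, and would evaluate the model on transactions it was not trained on.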
Regression
Predict a value of a given continuous valued variable
based on the values of other variables, assuming a
linear or nonlinear model of dependency.
Extensively studied in statistics and in the neural network field.
Examples:
Predicting the sales of a new product based on
advertising expenditure.
Predicting wind velocities as a function of temperature,
humidity, air pressure, etc.
Time series prediction of stock market indices.
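For the linear case, the model can be fitted by ordinary least squares. The sketch below fits a single-variable line y = slope * x + intercept; the function name `fit_line` and the advertising and sales figures are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: advertising expenditure vs. sales (arbitrary units).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predicted sales at expenditure 5.0
```

Because the invented points lie exactly on a line, the fit recovers slope 2 and intercept 1; real data would leave residual error.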
Cluster analysis
Grouping a set of data objects into clusters
Clustering is unsupervised classification: no
predefined classes
Typical applications
As a stand-alone tool to get insight into data distribution
As a preprocessing step for other algorithms
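One common way to form clusters without predefined classes is k-means. The slides do not name an algorithm, so this is only one possible illustration: a minimal one-dimensional version with caller-supplied starting centers. The function name `kmeans_1d` and the data are assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Minimal k-means for one-dimensional data.

    Repeats two steps: assign each point to its nearest center,
    then move each center to the mean of its assigned points.
    """
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            groups[nearest].append(p)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Invented data with two visible groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(points, centers=[0.0, 5.0])
```

No class labels are supplied; the grouping emerges from the distances between the points alone.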
Similarity search over sequences
Information is stored in a database in a particular sequence.
We assume that the user specifies a “query sequence” and
wants to retrieve all data sequences that are similar to the query
sequence.
Let us begin by describing sequences and similarity between
sequences.
A data sequence X is a series of numbers X = <X_1, ..., X_k>.
X is also called a time series, and k is the length of the sequence.
A subsequence Z = <Z_1, ..., Z_j> is obtained from another
sequence X by deleting numbers from the front and back of X.
Z is a subsequence of X if Z_1 = X_i, Z_2 = X_{i+1}, ..., Z_j = X_{i+j-1}
for some i = 1, ..., k-j+1.
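The subsequence definition and a simple similarity query can be sketched as follows. The slides do not fix a similarity measure, so Euclidean distance with a tolerance `epsilon` is an assumption here, as are the function names. Offsets in the code are 0-based, whereas the definition above counts from 1.

```python
import math


def is_subsequence(z, x):
    """True if z is a contiguous run of x, i.e. x with numbers
    deleted only from its front and back."""
    j, k = len(z), len(x)
    return any(x[i:i + j] == z for i in range(k - j + 1))


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def similar_subsequences(x, query, epsilon):
    """0-based start offsets of the windows of x (same length as query)
    whose distance to the query is at most epsilon."""
    j = len(query)
    return [i for i in range(len(x) - j + 1)
            if euclidean(x[i:i + j], query) <= epsilon]


# Invented data sequence and query.
x = [1, 2, 3, 4, 5, 3, 4]
exact = similar_subsequences(x, [3, 4], 0.0)
near = similar_subsequences(x, [3, 4], 1.5)
```

Setting `epsilon` to 0 returns only exact matches; a larger tolerance also returns windows that are close to the query. This linear scan compares the query with every window, so it is practical only for short sequences.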
Multimedia; financial applications and forecasting
Geographic information systems
CAD/CAM
Network management
employee customer
Eliminates redundancies
A modular program is
Easier to write
Easier to read
Easier to modify
• Information hiding
– Hides certain implementation details within a module
– Makes these details inaccessible from outside the module
• Data abstraction
– Asks you to think about what you can do to a collection of data
independently of how you do it
– Allows you to develop each data structure in relative
isolation from the rest of the solution
– A natural extension of procedural abstraction
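Both ideas can be illustrated with a small class. In the sketch below, a stack is described by what can be done to it (push, pop, test for emptiness), while the list that stores the items is an internal detail. The class and its method names are illustrative, and Python hides the `_items` attribute by naming convention only, not by enforcement.

```python
class Stack:
    """A stack defined by its operations, not by how items are stored."""

    def __init__(self):
        # Implementation detail: callers should not touch this directly.
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


s = Stack()
s.push(1)
s.push(2)
top = s.pop()
```

Code that uses only `push`, `pop`, and `is_empty` keeps working if the internal list is later replaced by another structure, which is the practical payoff of information hiding.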
Good Day