
Advanced Database


DATA
• In general, data is any set of characters that has been gathered and translated for
some purpose, usually analysis. It can take any form, including text, numbers,
pictures, sound, statistics, or video.
• E.g.: 0143 0157 0164 0145 0162 0040 0150 0157 0160 0145
• E.g.: @ : > ? ~ # " * ( ) { / | \

Information
• Processed form of data that bears some meaning.
• Managed data upon which a decision can be made.
• Information is any entity or form that resolves uncertainty or provides
the answer to a question of some kind.

Terms
• Table
• Row (record)
• Column
• Redundancy
• Inconsistency
• Integrity

Supervised data classification


• In supervised data classification, data are taken from an entity and then labeled manually.
• The labeled data are then fed into a classification algorithm.
• The algorithm classifies the data into groups and generates models; this is called the training phase.
• In the recognition (prediction) phase, unseen data are given to the algorithm, which matches them
against the models built in the training phase (see the sketch below).
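A minimal Python sketch of these two phases; the toy data, labels, and "nearest labeled mean" model below are illustrative assumptions, not the course's classifier:

def train(samples, labels):
    """Training phase: manually labeled data go in, a model (one mean per label) comes out."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def predict(model, x):
    """Prediction (recognition) phase: an unseen sample is matched against the trained model."""
    return min(model, key=lambda label: abs(x - model[label]))

model = train([1, 2, 10, 12], ["small", "small", "large", "large"])
print(predict(model, 11))  # -> 'large'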

Unsupervised data (clustering)


• No labels are given to the data for unsupervised learning algorithms, leaving it on its own to find structure in its input.
Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end.

Unsupervised ML

Euclidean distance
• The Euclidean distance between points p and q is the length of the line segment connecting them (pq).
• Two dimensions:
• In the Euclidean plane, if p = (p1, p2) and q = (q1, q2), then the distance is
  d(p, q) = √((q1 − p1)² + (q2 − p2)²)
• Example: for p = (1, 2) and q = (4, 6), d(p, q) = √(3² + 4²) = 5 (see the sketch below).
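A small Python sketch of the two-dimensional formula above (standard library only):

import math

def euclidean_distance(p, q):
    # Length of the line segment connecting p = (p1, p2) and q = (q1, q2).
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

print(euclidean_distance((1, 2), (4, 6)))  # 3-4-5 right triangle -> 5.0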

The K-Means Clustering

Given k, the k-means algorithm is implemented in 4 steps (see the sketch below):

1. Partition objects into k nonempty subsets.

2. Compute seed points as the centroids of the clusters of the current partition. The
centroid is the center (mean point) of the cluster.

3. Assign each object to the cluster with the nearest seed point.

4. Go back to Step 2; stop when there are no more new assignments.
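A minimal Python sketch of these steps for one-dimensional data; the helper name kmeans and the stopping check are my own illustration:

def kmeans(points, means, max_iters=100):
    """Simple 1-D k-means: assign each point to the nearest mean, recompute means, repeat until stable."""
    for _ in range(max_iters):
        # Assign each object to the cluster with the nearest seed point (mean).
        clusters = [[] for _ in means]
        for x in points:
            nearest = min(range(len(means)), key=lambda i: abs(x - means[i]))
            clusters[nearest].append(x)
        # Recompute each centroid as the mean of its cluster (keep the old mean if a cluster is empty).
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:  # stop when there are no more new assignments
            break
        means = new_means
    return clusters, means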



Architecture Diagram for K-Means



The K-Means Clustering Method

[Figure: K-means iterations on a 2-D scatter plot with K = 2 — arbitrarily choose K objects as the
initial cluster centers, assign each object to the most similar center, update the cluster means,
and reassign; repeat until the assignments no longer change.]

K-Means Example
• Given: {2,4,10,12,3,20,30,11,25}, k=2
• Randomly assign means: m1=3,m2=4
• K1={2,3}, K2={4,10,12,20,30,11,25}, m1=2.5,m2=16
• K1={2,3,4},K2={10,12,20,30,11,25}, m1=3,m2=18
• K1={2,3,4,10},K2={12,20,30,11,25}, m1=4.75,m2=19.6
• K1={2,3,4,10,11,12},K2={20,30,25}, m1=7,m2=25
• Stop as the clusters with these means are the same.
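The iterations above can be checked with the kmeans() sketch shown after the algorithm steps (that helper is an illustration of these notes, not part of the slides):

points = [2, 4, 10, 12, 3, 20, 30, 11, 25]
clusters, means = kmeans(points, means=[3, 4])
print(clusters)  # [[2, 4, 10, 12, 3, 11], [20, 30, 25]]  i.e. K1 = {2,3,4,10,11,12}, K2 = {20,25,30}
print(means)     # [7.0, 25.0]                            i.e. m1 = 7, m2 = 25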

K-Nearest Neighbors Algorithm


• K-nearest neighbors is a supervised learning algorithm (classification) in which a new query instance is classified based on
the majority category of its K nearest neighbors.
• The purpose of this algorithm is to classify a new object based on its attributes and the training samples.
• Given a query point, we find the K objects (training points) closest to the query point.
• The classification uses a majority vote among the classifications of those K objects.

1. Determine parameter K = number of nearest neighbors.


2. Calculate the distance between the query instance and all the training samples.
3. Sort the distances and determine the nearest neighbors based on the K-th minimum distance.
4. Gather the categories of the nearest neighbors.
5. Use the simple majority of the categories of the nearest neighbors as the prediction value of
the query instance (see the sketch below).
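A minimal Python sketch of these five steps; the function and variable names are my own, and two numeric attributes per sample are assumed:

from collections import Counter

def knn_classify(query, training_samples, k=3):
    """training_samples: list of ((x1, x2), label). Returns the majority label of the k nearest."""
    # Steps 2-3: compute squared distances and sort to find the k nearest neighbors.
    by_distance = sorted(
        training_samples,
        key=lambda s: (s[0][0] - query[0]) ** 2 + (s[0][1] - query[1]) ** 2,
    )
    # Steps 4-5: gather the neighbors' categories and take the simple majority.
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]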

KNN Example
• We have data from a questionnaire survey (asking people's opinion) and objective testing with two attributes (acid durability and
strength) to classify whether a special paper tissue is good or not. Here are four training samples.
• Now the factory produces a new paper tissue that passes the laboratory test with X1 = 3 and X2 = 7. Without another expensive survey,
can we guess what the classification of this new tissue is?

X1 = Acid Durability   X2 = Strength        Y = Classification
(seconds)              (kg/square meter)
7                      7                    Bad
7                      4                    Bad
3                      4                    Good
1                      4                    Good

1. Determine parameter K = number of nearest neighbors.


Suppose we use K = 3.
2. Calculate the distance between the query instance and all the training samples.
The coordinate of the query instance is (3, 7). Instead of calculating the distance, we compute the squared distance, which is faster
to calculate (no square root is needed).

X1 = Acid Durability   X2 = Strength        Squared distance to
(seconds)              (kg/square meter)    query instance (3, 7)
7                      7                    (7-3)² + (7-7)² = 16
7                      4                    (7-3)² + (4-7)² = 25
3                      4                    (3-3)² + (4-7)² = 9
1                      4                    (1-3)² + (4-7)² = 13
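The squared distances in the table can be computed directly, for example:

query = (3, 7)
samples = [(7, 7), (7, 4), (3, 4), (1, 4)]
sq_dist = [(x1 - query[0]) ** 2 + (x2 - query[1]) ** 2 for x1, x2 in samples]
print(sq_dist)  # [16, 25, 9, 13]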

3. Sort the distances and determine the nearest neighbors based on the K-th minimum distance.

X1 = Acid Durability   X2 = Strength        Squared distance to     Rank (minimum   Included in the
(seconds)              (kg/square meter)    query instance (3, 7)   distance)       3 nearest neighbors?
7                      7                    16                      3               Yes
7                      4                    25                      4               No
3                      4                    9                       1               Yes
1                      4                    13                      2               Yes
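Continuing the small snippet above (it assumes samples and sq_dist from that snippet), sorting the squared distances gives the ranks and the 3 nearest neighbors:

# Rank the training samples by squared distance to the query instance.
ranked = sorted(zip(samples, sq_dist), key=lambda pair: pair[1])
nearest_3 = ranked[:3]
print(nearest_3)  # [((3, 4), 9), ((1, 4), 13), ((7, 7), 16)] -> the (7, 4) sample is excluded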

4. Gather the categories of the nearest neighbors. Notice in the second row of the last column that the category of the nearest
neighbor (Y) is not included, because the rank of this sample is greater than 3 (= K).

X1 = Acid Durability   X2 = Strength        Squared distance to     Rank (minimum   Included in the        Y = Category of
(seconds)              (kg/square meter)    query instance (3, 7)   distance)       3 nearest neighbors?   nearest neighbor
7                      7                    16                      3               Yes                    Bad
7                      4                    25                      4               No                     -
3                      4                    9                       1               Yes                    Good
1                      4                    13                      2               Yes                    Good

5. Use the simple majority of the categories of the nearest neighbors as the prediction value of the query instance.

We have 2 Good and 1 Bad; since 2 > 1, we conclude that the new paper tissue that passed the laboratory test with X1 = 3 and X2 = 7 is
classified as Good.
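Putting the whole worked example together with the knn_classify() sketch given after the algorithm steps (that helper is an assumption of these notes, not from the slides):

training = [((7, 7), "Bad"), ((7, 4), "Bad"), ((3, 4), "Good"), ((1, 4), "Good")]
print(knn_classify((3, 7), training, k=3))  # majority of {Good, Good, Bad} -> 'Good'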

ANN Introduction
• Inspired by the biological brain.
• ANNs imitate the human brain's ability to make decisions and draw conclusions when presented with complex, noisy, irrelevant, and/or
partial information.
• They go by many names, such as connectionist models and parallel distributed processing models.
• The models are based on dense interconnection of simple computational elements.

ANN Applications
• Aerospace: autopilot, flight path simulation, fault detection
• Banking : fraud detection, credit application evaluation
• Defense: weapon steering, target tracking, image and signal processing, data
compression
• Electronics: voice synthesis, IC chip layout, process control
• Entertainment: animation
• Financial: market forecasting
• Insurance: Policy application evaluation
• Manufacturing: planning and management, process control, machine diagnosis
• Medical: ECG, Cancer cells analysis
• Oil and Gas : Exploration
• Robotics: Manipulator controller, vision system
• Speech: Speech recognition, compression, vowel classification
• Telecommunications : Image and data compression
• Transportation: Vehicle scheduling, routing system

Assignment
• How could ANNs be utilized in the areas mentioned on the previous slide?
• Each person should select a different area based upon his/her class number.
• Submission Deadline:
▫ Before Final-term

Biological Neuron
• Neuron: a basic building block of the nervous system
• The major components of a neuron include: a central body, dendrites, and an axon
• The signal flow goes from the dendrites, through the cell body, and out through the axon. The signal from one neuron is passed on
to another by means of a connection (synapse) between the axon of the first and a dendrite of the second.

Biological Neuron

ANN General Model (Perceptron)



ANN General Model


• Consists of a data-processing structure containing processing elements, called neurons,
• fully interconnected with one-way signal channels, called connections.
• Each neuron can take in data from many sources but can send data out in only one direction.
• An output connection can branch into a number of other connections
▫ Each connection then carries the same signal.
• Each input is multiplied by a corresponding weight.
• All of the weighted inputs are then summed to determine the activation level of the neuron.
• All neural network architectures are based on this model.

ANN Example
• Invented in 1957 by Frank Rosenblatt at the Cornell Aeronautical Laboratory, a perceptron is the simplest neural network
possible: a computational model of a single neuron.
• A perceptron consists of one or more inputs, a processor, and a single output.

Figure: The perceptron



• A perceptron follows the "feed-forward" model, meaning inputs are sent into the neuron, are processed, and
result in an output. In the previous diagram, this means the network (one neuron) reads from left to right: inputs
come in, the result goes out.
• Let’s follow each of these steps in more detail.
Step 1: Receive inputs.
Say we have a perceptron with two inputs—let’s call them x1 and x2.
Input 0: x1 = 12
Input 1: x2 = 4
Step 2: Weight inputs.
Each input that is sent into the neuron must first be weighted, i.e. multiplied by some value (often a number
between -1 and 1). When creating a perceptron, we’ll typically begin by assigning random weights. Here,
let’s give the inputs the following weights:
Weight 0: 0.5
Weight 1: -1
We take each input and multiply it by its weight.
Input 0 * Weight 0 ⇒ 12 * 0.5 = 6
Input 1 * Weight 1 ⇒ 4 * -1 = -4

Step 3: Sum inputs.


The weighted inputs are then summed.
Sum = 6 + -4 = 2
Step 4: Generate output.
The output of a perceptron is generated by passing that sum through an activation function.
In the case of a simple binary output, the activation function is what tells the perceptron whether to "fire" or not. You can envision
an LED connected to the output signal: if it fires, the light goes on; if not, it stays off.
Activation functions can get a little complex; if you start digging into them, you may soon find yourself in a calculus
textbook. However, we're going to do something really easy.
Let’s make the activation function the sign of the sum. In other words, if the sum is a positive number, the output is 1; if it is
negative, the output is -1.
output = sign(sum) ⇒ sign(2) ⇒ +1

The Perceptron Algorithm:

1. For every input, multiply that input by its weight.


2. Sum all of the weighted inputs.
3. Compute the output of the perceptron based on that sum passed through an activation function (the sign of the sum).
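A minimal Python sketch of this algorithm, reusing the example numbers from the previous slides (sign() and perceptron_output() are illustrative names of these notes):

def sign(value):
    """Activation function: +1 for a positive sum, -1 otherwise."""
    return 1 if value > 0 else -1

def perceptron_output(inputs, weights):
    """1. Multiply each input by its weight.  2. Sum them.  3. Pass the sum through sign()."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return sign(total)

print(perceptron_output([12, 4], [0.5, -1]))  # 12*0.5 + 4*(-1) = 2 -> +1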

Feed Forward Artificial Neural Networks

• Sigmoid function / logistic function: σ(z) = 1 / (1 + e^(−z))


• Activation function
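A one-function Python sketch of the sigmoid activation:

import math

def sigmoid(z):
    """Logistic function: squashes any real z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))  # 0.5
print(sigmoid(2))  # ~0.88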
[Figure series: a feed-forward network with an input layer (bias unit x0 = +1 and inputs x1 … xn),
two hidden layers of units a(2) and a(3) (each with its own bias unit a0(2), a0(3)), and an output
layer of units a(4), with the layers labeled L1–L4. Successive slides highlight one unit at a time,
showing that each unit in a layer is computed from the previous layer's activations together with
the corresponding weights W (including a bias weight), passed through the activation function.
A forward-pass sketch follows below.]
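A minimal Python sketch of the forward pass that the figures illustrate, assuming sigmoid activations; the layer sizes and random weights are illustrative, not the ones from the slides:

import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(activations, weights):
    """Each unit of the next layer: sigmoid(bias weight * 1 + sum of weight * activation)."""
    with_bias = [1.0] + activations  # prepend the bias unit (+1)
    return [sigmoid(sum(w * a for w, a in zip(row, with_bias))) for row in weights]

def feed_forward(x, all_weights):
    """Propagate the input x through every layer; returns the output-layer activations."""
    a = x
    for weights in all_weights:  # one weight matrix per layer transition
        a = layer_forward(a, weights)
    return a

# Illustrative network: 3 inputs -> 4 hidden -> 4 hidden -> 2 outputs, random weights.
random.seed(0)
sizes = [3, 4, 4, 2]
weights = [
    [[random.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]
    for n_in, n_out in zip(sizes, sizes[1:])
]
print(feed_forward([0.5, -1.2, 3.0], weights))  # two output activations, each between 0 and 1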
