
OVERVIEW OF MACHINE LEARNING

Types of Machine Learning Systems

Machine learning systems can be usefully classified into broad categories based on :

- Whether or not they are trained with human supervision (supervised, unsupervised,
semisupervised, and reinforcement learning)
- Whether or not they can learn incrementally on the fly (online vs. batch learning)
- Whether they work by simply comparing new data points to known data points, or instead detect
patterns in the training data and build a predictive model, much like scientists do (instance-based vs.
model-based learning)

Supervised Learning

In supervised learning, the training data you feed to the algorithm includes the desired solutions, called
labels.

Figure : a labeled training set for supervised learning (e.g. spam classification)

- A typical supervised learning task is classification. As in the figure : the system is trained on emails labeled spam or ham
- Predicting a numeric target is another supervised learning task. The target is a numeric value, such as the price
of a car, given a set of features (mileage, age, brand, etc.) called predictors. This task is called regression
- Regression analysis : a statistical methodology that is most often used for numeric prediction
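
The regression task described above can be sketched with ordinary least squares on a single predictor. This is a minimal illustration, not a method taken from these notes: the mileage and price values are made up, and the names fit_line and predict are hypothetical helpers.

```python
# Simple linear regression by ordinary least squares (standard library only).
# All data below is invented for illustration: mileage in 1000 km -> price.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted line to a new predictor value."""
    slope, intercept = model
    return slope * x + intercept

mileage = [10, 20, 30, 40]             # predictor (feature)
price = [18000, 16000, 14000, 12000]   # numeric target (label)

model = fit_line(mileage, price)
estimated_price = predict(model, 25)   # price estimate for an unseen mileage
```

The target here is a number rather than a class, which is what separates regression from classification.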

Classification vs Prediction

Classification :

- Predicts categorical class labels
- Constructs a model from the training set and the values (class labels) of a
classifying attribute, and uses that model to classify new data

Prediction :

- Models continuous-valued functions, e.g. predicts unknown or missing values

Classification : A Two-Step Process

Model Construction : describing a set of predetermined classes

- Each sample is assumed to belong to a predefined class, as determined by the class label
attribute
- The set of records/tuples used for model construction is the training set
- The model is represented as classification rules, decision trees, or mathematical formulas

Model Usage : for classifying future or unknown objects

- Estimate the accuracy of the model
o The known label of each test sample is compared with the classification result from the model
o The accuracy rate is the percentage of test set samples that are correctly classified by the model
o The test set must be independent of the training set; otherwise the accuracy estimate is overly optimistic, because the model may have overfitted the training data
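
The two steps can be sketched as follows: a model is built from a labeled training set, then its accuracy is estimated on a separate test set. This is a minimal sketch under stated assumptions: the single feature (a count of suspicious words), the data values, and the threshold rule are all hypothetical and stand in for whatever classifier is actually used.

```python
# Step 1: model construction from a labeled training set.
# Step 2: model usage, with accuracy estimated on an independent test set.
# The feature, the data, and the threshold rule are invented for illustration.

def build_model(training_set):
    """Learn a threshold: the midpoint between the mean feature value of each class."""
    spam = [x for x, label in training_set if label == "spam"]
    ham = [x for x, label in training_set if label == "ham"]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2

def classify(threshold, x):
    """Assumes spam emails have the higher feature values."""
    return "spam" if x > threshold else "ham"

def accuracy(threshold, test_set):
    """Fraction of test samples whose known label matches the model's output."""
    correct = sum(1 for x, label in test_set if classify(threshold, x) == label)
    return correct / len(test_set)

# Each sample: (number of suspicious words, class label).
training_set = [(0, "ham"), (1, "ham"), (2, "ham"),
                (7, "spam"), (8, "spam"), (9, "spam")]
test_set = [(1, "ham"), (3, "ham"), (6, "spam"), (4, "spam")]

threshold = build_model(training_set)           # step 1
test_accuracy = accuracy(threshold, test_set)   # step 2
```

The test samples do not appear in the training set, so the accuracy reflects how the model handles data it has not seen.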

Classification Process : (1) Model Construction

Classification Process : (2) Use the Model in Prediction

Note :

In machine learning, an attribute is a data type (e.g. “Mileage”), while a feature has several meanings
depending on the context but generally means an attribute plus its value (e.g. mileage = 15,000). Many
people use the words attribute and feature interchangeably, though.

Figure : regression

Unsupervised Learning

The training data is unlabeled. The system tries to learn without a teacher.

Figure : an unlabeled training set for unsupervised learning

A typical unsupervised learning task is cluster analysis, also called clustering.

Clustering is learning by observation rather than learning by examples.

Unsupervised Methods : Clustering Types

- Partitioning
o Finds mutually exclusive clusters of spherical shape
o Distance-based
o May use a mean or medoid (etc.) to represent the cluster center
o Effective for small to medium-sized datasets
- Hierarchical
o Clustering is a hierarchical decomposition (i.e. multiple levels)
o Cannot correct erroneous merges or splits
o May incorporate other techniques like microclustering or consider object “linkages”
- Density-based
o Can find arbitrarily shaped clusters
o Clusters are dense regions of objects in space that are separated by low-density regions
o Cluster density : each point must have a minimum number of points within its
“neighborhood”
o May filter out outliers
- Grid-based
o Uses a multiresolution grid data structure
o Fast processing time (typically independent of the number of data objects, yet dependent on
grid size)
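
As an illustration of the partitioning type, here is a minimal k-means sketch: a distance-based method that represents each cluster by its mean. The points and the initial centers are made up, and the helper names are hypothetical. Practical implementations usually pick initial centers more carefully and stop when the assignments no longer change.

```python
# Minimal k-means (partitioning, distance-based, mean as cluster center).
# Points and initial centers are invented for illustration.

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def assign(points, centers):
    """Put each point in the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centers, iterations=10):
    centers = list(centers)
    for _ in range(iterations):
        clusters = assign(points, centers)
        # Recompute each center as the mean of its cluster; keep it if the cluster is empty.
        centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, assign(points, centers)

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centers, clusters = kmeans(points, centers=[(1, 1), (9, 8)])
```

No labels are supplied anywhere: the two groups are found from the distances between points alone.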

Semisupervised Learning

Some algorithms can deal with partially labeled training data, usually a lot of unlabeled data and a little bit of
labeled data. This is called semisupervised learning.

Figure : semisupervised learning

Most semisupervised learning algorithms are combinations of unsupervised and supervised algorithms
(example : deep belief networks, DBNs). DBNs are based on unsupervised components called restricted
Boltzmann machines (RBMs) stacked on top of one another. RBMs are trained sequentially in an
unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.
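
To make the idea of combining a few labels with many unlabeled samples concrete, here is a minimal self-training sketch. It is not a deep belief network; it swaps in a much simpler technique (a nearest-centroid classifier that pseudo-labels the unlabeled points and refits). The data and all function names are assumptions made for illustration.

```python
# Self-training with a nearest-centroid classifier (a simple semisupervised scheme,
# used here in place of a deep belief network). All data is invented.

def fit_centroids(labeled):
    """Supervised step: mean feature value per class."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def self_train(labeled, unlabeled, rounds=3):
    centroids = fit_centroids(labeled)
    for _ in range(rounds):
        # Pseudo-label the unlabeled points with the current model, then refit on everything.
        pseudo = [(x, predict(centroids, x)) for x in unlabeled]
        centroids = fit_centroids(labeled + pseudo)
    return centroids

labeled = [(1.0, "A"), (9.0, "B")]             # a little labeled data
unlabeled = [0.5, 1.5, 2.0, 8.0, 8.5, 9.5]     # a lot of unlabeled data

centroids = self_train(labeled, unlabeled)
new_label = predict(centroids, 4.0)
```

The unlabeled points shift the class centroids away from the two labeled examples, so the final model reflects the shape of the whole dataset.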

Reinforcement Learning

In reinforcement learning, the learning system observes the environment, selects and performs actions, and gets rewards in
return (or penalties in the form of negative rewards).

Figure : reinforcement learning
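
The act-and-get-rewarded loop can be sketched with a tiny epsilon-greedy agent. This is a toy illustration under assumptions: the environment has no state to observe, only two actions with fixed rewards (one of them a penalty), and the names REWARDS and run_agent are hypothetical.

```python
import random

# Hypothetical environment: action 0 is penalized (negative reward), action 1 is rewarded.
REWARDS = {0: -1.0, 1: 1.0}

def run_agent(steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that keeps a running average reward per action."""
    rng = random.Random(seed)
    values = [0.0, 0.0]   # estimated reward of each action
    counts = [0, 0]       # how often each action was performed
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)                         # explore: random action
        else:
            action = max(range(2), key=lambda a: values[a])   # exploit: best estimate so far
        reward = REWARDS[action]                              # reward or penalty
        counts[action] += 1
        # Incremental update of the average reward observed for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = run_agent()
best_action = max(range(2), key=lambda a: values[a])
```

After a penalty on action 0, the agent's estimate for it drops below that of action 1, so it mostly performs action 1 from then on and only revisits action 0 when exploring.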
