PROJECT REPORT
ON
TRAFFIC SIGN RECOGNITION USING MACHINE LEARNING
Submitted by
to
TRAFFIC SIGN RECOGNITION USING MACHINE LEARNING PROJECT REPORT
CERTIFICATE
Certified that this report entitled ‘Traffic Sign Recognition Using Machine Learning’ is the report of the project presented by Anusha Ravindran, KSD15IT005, during the year 2018-2019 in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering of the University of Kerala.
SANDEEP CHANDRAN
Assistant Professor
Dept. of CSE, LBSCEK
SARITH DIVAKAR M
Assistant Professor
Dept. of CSE, LBSCEK
SMITHAMOL
Dept. of CSE, LBSCEK
DECLARATION
I, Anusha Ravindran, hereby declare that this project report entitled TRAFFIC SIGN RECOGNITION USING MACHINE LEARNING is a bona fide work of mine, carried out under the supervision of Mr. Sandeep Chandran, Asst. Professor, Department of Computer Science and Engineering, LBS College of Engineering. I declare that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion to any other candidate. The content of this project is not being presented by any other student to this or any other university for the award of a degree.
Signature:
Signature:
ACKNOWLEDGEMENT
I take this opportunity to express my deep sense of gratitude and sincere thanks to all who helped me to complete the project successfully.
I sincerely thank our principal, Dr. MUHAMMED SHEKOOR, for providing the facilities needed to go ahead with the project.
I express my sincere gratitude to Mrs. SMITHA MOL M B, Head of the Department, Computer Science and Engineering, LBS College of Engineering, for supporting me with the necessary facilities, which were essential for the successful completion and presentation of this work.
I express my sincere gratitude to Mr. SANDEEP CHANDRAN for supporting and guiding me throughout the work.
I also express my heartiest gratitude to the project coordinator, Mr. SARITH DIVAKAR M, for the timely suggestions and encouragement given for the successful completion of this work.
Finally, yet importantly, I would like to express my heartfelt thanks to my beloved parents for their blessings and to my classmates for their help and wishes for the successful completion of this work.
ABSTRACT
Traffic sign recognition is an important but challenging task, especially for automated driving and driver assistance. It is a technology by which a vehicle is able to recognize the traffic signs put on the road, using image processing techniques to detect them. Detection methods can generally be divided into colour-based, shape-based and learning-based methods.
Recognition accuracy depends on two aspects: the feature extractor and the classifier. Current popular algorithms mainly use convolutional neural networks (CNN) to execute both feature extraction and classification. Such methods can achieve impressive results, but usually on the basis of an extremely large and complex network. Moreover, since the fully connected layers in a CNN form a classical neural network classifier trained by gradient descent-based implementations, its generalization ability is limited and sub-optimal. In the proposed scheme, the CNN first learns deep and robust features; the fully connected layers are then removed, turning the CNN into a feature extractor.
CONTENTS
List of Figures iv
List of Tables v
Chapter-1 Introduction 1
Chapter-2 Literature survey 4
Chapter-3 Methodology 14
4.1.3.2 Grayscaling 19
4.1.3.3 Local histogram equalization 19
4.1.3.4 Normalization 19
4.1.4 Designing model architecture 19
4.1.4.1 LeNet 20
4.1.4.1.1 LeNet architecture 20
4.1.5 Model training and evaluation 20
4.1.6 Testing the model using the testset 21
Chapter-5 Code 22
Chapter-6 Results 33
Chapter-7 Conclusion 39
References 40
List of Figures
1. Figure 1 CNN 03
2. Figure 2 Classification using CNN 15
3. Figure 3 Architecture of proposed model 17
CHAPTER 1
INTRODUCTION
A real-time and robust automatic traffic sign recognition system can support and disburden the driver and thus significantly increase driving safety and comfort. For instance, it can remind the driver of the current speed limit and prevent him from performing inappropriate actions such as entering a one-way street, passing another car in a no-passing zone, or unwanted speeding. The aim of this project is to lessen many of these restrictions. Identification of traffic signs is a demanding function for safe driving, for the driver as well as for the vehicles following. One can recognize a traffic sign using its shape, colour and orientation, and the various features of the image dataset can be used for classification.
1.2 OBJECTIVE
Traffic sign recognition is a technology which identifies traffic signs from a fair distance. In this contribution, we describe a real-time system for vision-based traffic sign detection and recognition. We focus on an important and practically relevant subset of (Indian) traffic signs, namely speed-signs and no-passing-signs, and their corresponding end-signs. The problem of traffic sign recognition has some beneficial characteristics. First, the design of traffic signs is unique; thus, object variations are small. Further, sign colors often contrast very well against the environment. Moreover, signs are rigidly positioned relative to the environment (contrary to vehicles) and are often set up in clear sight of the driver.
Nevertheless, a number of challenges remain for successful recognition. First, weather and lighting conditions vary significantly in traffic environments, diminishing the advantage of the above-claimed object uniqueness. Additionally, as the camera is moving, additional image distortions, such as motion blur and abrupt contrast changes, occur frequently. Further, the
sign installation and surface material can physically change over time, influenced by
accidents and weather, hence resulting in rotated signs and degenerated colors.
1.3 SCHEME
CNN is one of the neural network models for deep learning, characterized by three specific properties, namely locally connected neurons, shared weights, and spatial or temporal sub-sampling. Generally, a CNN can be considered to be made up of two main parts. The first contains alternating convolutional and max-pooling layers, where the input of each layer is the output of its previous layer. As a result, this forms a hierarchical feature extractor that maps the original input images into feature vectors. The extracted feature vectors are then classified by the second part, that is, the fully connected layers, which form a typical feedforward neural network.
[Figure 1: CNN. An input image passes through the input layer and convolutional layers to produce CNN features.]
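The two-part structure described above, a convolution-and-pooling feature extractor followed by a feedforward classifier, can be sketched in a few lines of numpy. This is an illustrative toy, not the project's trained network: it uses a single random filter and random classifier weights, and the class count of 43 matches the GTSRB dataset used later.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution with a single filter on a single channel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2d(fmap, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h2, w2 = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h2 * size, :w2 * size].reshape(h2, size, w2, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((32, 32))           # toy 32x32 input image
kernel = rng.random((5, 5))            # one filter (random here, learned in a real CNN)

# Part 1: hierarchical feature extractor (conv -> ReLU -> pool)
features = maxpool2d(np.maximum(conv2d(image, kernel), 0.0))
vector = features.flatten()            # feature vector handed to the classifier

# Part 2: feedforward classifier (one fully connected layer + softmax)
W = rng.random((vector.size, 43))      # 43 classes, as in GTSRB
logits = vector @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

A 32×32 input shrinks to 28×28 after the 5×5 valid convolution and to 14×14 after 2×2 pooling, so the classifier here sees a 196-dimensional feature vector.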
CHAPTER 2
LITERATURE SURVEY
In traffic environments, signs regulate traffic, warn the driver, and command or prohibit
certain actions. A real-time and robust automatic traffic sign recognition can support and
disburden the driver, and thus, significantly increase driving safety and comfort. For instance,
it can remind the driver of the current speed limit, prevent him from performing inappropriate
actions such as entering a one-way street, passing another car in a no passing zone, unwanted
speeding, etc. Further, it can be integrated into an adaptive cruise control (ACC) for a less
stressful driving. In a more global context, it can contribute to the scene understanding of
traffic context (e.g., if the car is driving in a city or on a freeway).
In this contribution, a real-time system for vision-based traffic sign detection and recognition is described. The main focus is on an important and practically relevant subset of traffic signs, namely speed-signs and no-passing-signs, and their corresponding end-signs.
The problem of traffic sign recognition has some beneficial characteristics. First, the design of traffic signs is unique; thus, object variations are small. Further, sign colors often contrast very well against the environment (note that the data are available in color). Moreover, signs are rigidly positioned relative to the environment (contrary to vehicles) and are often set up in clear sight of the driver.
The vast majority of published traffic sign recognition approaches utilize at least two steps, one aiming at detection and the other at classification, that is, the task of mapping the detected sign image into its semantic category.
Nevertheless, a number of challenges remain for a successful recognition. First, weather and
lighting conditions vary significantly in traffic environments, diminishing the advantage of
the above claimed object uniqueness. Additionally, as the camera is moving, additional image
distortions, such as motion blur and abrupt contrast changes, occur frequently. Further, the
sign installation and surface material can physically change over time, influenced by
accidents and weather, hence resulting in rotated signs and degenerated colors. Finally, the
constraints given by the area of application require inexpensive systems (i.e., low-quality
sensor, slow hardware), high accuracy and real-time computation.
The drawback of this sequential application of color and shape detection is as follows. Regions
that have falsely been rejected by the color segmentation cannot be recovered in the further
processing. A joint modeling of color and shape can overcome this problem. Additionally,
color segmentation requires the fixation of thresholds, mostly obtained from a time-consuming and error-prone manual tuning.
Colour and shape are basic characteristics of traffic signs which are used both by the driver
and to develop artificial traffic sign recognition systems. However, these sign features have
not been represented robustly in the earlier developed recognition systems, especially in
disturbed viewing conditions. In this study, this information is represented by using a human vision colour appearance model and by further developing an existing behaviour model of vision. The colour appearance model CIECAM97 has been applied to extract colour information
and to segment and classify traffic signs, whilst shape features are extracted by the development of the FOSTS model, an extension of the behaviour model of vision. The recognition rate is very high for signs under artificial transformations that imitate possible real-world sign distortion (up to 50% noise level, distances to signs of up to 50 m, and perspective disturbances of up to 5°) for still images. For British traffic signs (n = 98) obtained under various viewing conditions, the recognition rate is up to 95%. Colour and shape are dominant visual features of traffic signs with distinguishing characteristics and are key information for drivers to process when driving along the road. Therefore, to develop a driver assistant system for recognition of traffic signs, this information should be utilised effectively and efficiently, even in the knowledge that colour and shape vary with changes of lighting conditions and viewing angles. Colour is regulated not only for the traffic sign category (red = stop, yellow = danger, etc.) but also for the tint of the paint that covers the sign, which should correspond, within a tolerance, to a specific wavelength in the visible spectrum. However, most colour-based techniques in computer vision run into problems if the illumination source varies not only in intensity but also in colour. This is because the spectral composition, and therefore the colour, of daylight changes depending on weather conditions (e.g., sky with or without clouds), the time of day, and at night, when all sorts of artificial lights are present. Many authors have therefore developed various techniques to make use of the colour information of traffic signs.
Tominaga developed a clustering method in a colour space, whilst Ohlander used a recursive region-splitting method to achieve colour segmentation. The colour spaces they applied are HSI (Hue, Saturation, Intensity) and the L*a*b* space. These colour spaces are normally limited to only one lighting condition, namely D65. Hence, the range of each colour attribute, such as hue, will be narrowed down, given that weather conditions change with colour temperatures ranging from 5000 to 7000 K.
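To illustrate why hue is the attribute these colour spaces expose for segmentation, the sketch below converts RGB to HSV with Python's standard colorsys module and applies a crude red-hue test of the kind a colour-based detector might use. The thresholds are hypothetical, not values taken from any of the systems cited above.

```python
import colorsys

def is_red_sign_pixel(r, g, b, tol=0.05):
    """Crude hue-based test for the saturated red of prohibition signs.
    r, g, b are in [0, 1]; tol is a hypothetical hue tolerance."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Red sits at hue 0, which wraps around, so accept both ends of the hue circle.
    return (h < tol or h > 1.0 - tol) and s > 0.5 and v > 0.2

print(is_red_sign_pixel(0.8, 0.1, 0.1))  # saturated red -> True
print(is_red_sign_pixel(0.2, 0.2, 0.8))  # blue -> False
```

A fixed tolerance like this is exactly the kind of manually tuned threshold criticised above, which is why appearance models such as CIECAM97 are attractive.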
Shape is another powerful visual feature for recognition of signs [4–10]. However, when signs appear in cluttered scenes, many objects may appear similar to road signs. Also, when the viewing angles differ, the signs will appear differently, with some degree of distortion, sometimes with torn corners and occluded parts. Furthermore, signs vary in scale, getting bigger as a vehicle moves toward them, yet appear relatively small overall, about 40–50 pixels wide at the most. Another difficulty is linked to the way the signs are captured by the acquisition system. It is stated that all road signs will be seen with a non-zero
angle between the optical axis of each camera and the normal vector to the sign surface. This angle can be as large as 30°, depending on the distance between the sign and the cameras.
Piccioli and Campani concentrated on geometrical reasoning for the detection of triangular and circular signs. For the triangular shapes, they segmented edges that are horizontal or that have a slope in the ranges [60 − e, 60 + e] and [−60 − e, −60 + e] degrees, where e is the deviation from 60° calculated from samples. The Hough Transform was applied to detect the circles. However, only two types of shape were studied. Miura et al. extracted sign candidates as white circular regions by using binarization with area filtering, which only keeps white regions whose areas are within a predetermined range. Due to the dust of the road, the white regions sometimes may not be the areas with the highest intensity values, which will result in many false candidates.
More recently, Escalera has developed a driver support system which employs a genetic algorithm for the detection of signs and a neural network for the classification. But the neural network needs to be re-trained whenever a new case is included, which is very time consuming. Owing to adaptation to the environment, humans can correctly identify traffic signs regardless of lighting conditions and viewing angles. Therefore, invariant features can be extracted using vision models. In this study, two vision models have been applied and developed. One model is CIECAM97, for measuring colour appearance invariant to lighting conditions, which is utilised to extract colour features. The other vision model, the foveal system for traffic signs (FOSTS), is developed based on the behaviour model of vision (BMV), imitating some mechanisms of the real visual system for perceiving shapes [14–16].
CIECAM97 is a standard colour appearance model recommended by the CIE (International Commission on Illumination) in 1997 for measuring colour appearance under various viewing conditions [17,18]. This model can estimate a colour appearance as accurately as an average observer. It takes weather conditions into account and simulates human perception of colours under various viewing conditions and for different media, such as reflective colours, transmissive colours, etc. For human perception, the most common terms used for colour appearance are lightness, chroma, and hue, which can be predicted using the model. The input parameters are the viewing conditions, including the lighting source, the reference white, and the background.
Road sign detection and recognition can be an important aid to the driver, letting him concentrate on driving; such a system can remember signs encountered, even those that go unnoticed or neglected, thus reducing the impact of these events on driving comfort and also decreasing the possibility of related road accidents. Road signs are designed to be easily
readable, with high
contrast and saturated colors, and are installed according to a strict regulation; however,
environmental light, weather conditions, paint degradation, dirt, shadows and occlusions
make automatic traffic sign recognition a challenging task.
The main goal of this paper is to describe the classification stage of a traffic sign recognition system. Since traffic signs follow strict shape formats, the classification system is driven by the information provided by the shape detection stage, whose output is used to route the input pattern to a specialized classifier. Several road sign classification techniques are described in the literature. One of the simplest methods is cross-correlation with models, in which model signs, resampled to 16×16 pixels and roto-translated, are used to find the best match. Random forests, an ensemble learning technique, are used to classify signs, and a comparison is made between this technique and SVM and AdaBoost. Support vector machines (SVM) are widely adopted to classify the inner part of road signs. Linear SVMs and SVMs with Gaussian kernels are used to recognize the symbol contained in the resampled inner part of road signs: only significant pixels inside the region are used to train the SVM, and each object is only compared with signs of the same shape and color. Since high-dimensional inputs are hard to analyze, it can be useful for a classifier to reduce its input; principal component analysis (PCA) and linear discriminant analysis (LDA) techniques can be used for this task. Neural
networks are also largely adopted, and this technique is also the one chosen to provide the
classification stage described and evaluated in this paper.
A comparative study between networks with one or two hidden layers has already been made, demonstrating that better performance can be achieved using networks with two hidden layers. Tests are also available on the use of Resilient Backpropagation or Scaled Conjugate Gradient to train neural networks. Neural networks have also recently been used in embedded systems for traffic sign recognition, and tests have been made on how to train neural networks using both synthetic and real images. Since a large road sign database can be easily collected, this paper presents exhaustive benchmarks to provide a tested and effective indication of how to train neural networks for a road sign classification system. The paper first presents the system architecture and then briefly explains the detection stage presented in a previous article.
Each stage of a ConvNet operates on a set of arrays called feature maps. For a colour image, each input feature map is a 2D array holding one channel of the input image (for an audio input each feature map would be a 1D array, and for a video or volumetric image, it would be a 3D array). At the output, each feature map represents a particular feature extracted at all locations on the input. Each stage is composed of three layers: a filter bank layer, a non-linearity layer, and a feature pooling layer.
A typical ConvNet is composed of one, two or three such 3-layer stages, followed by a
classification module. Each layer type is now described for the case of image recognition.
A central question in vision, natural and artificial, is how to produce good internal representations of the visual world. What sort of internal representation would allow an artificial vision system to detect and classify objects into categories, independently of pose, scale, illumination, conformation, and clutter? More interestingly, how could an artificial vision system learn appropriate internal representations automatically, the way animals and humans seem to learn by simply looking at the world? In
the time-honored approach to computer vision (and to pattern recognition in general), the
question is avoided: internal representations are produced by a hand-crafted feature extractor,
whose output is fed to a trainable classifier. While the issue of learning features has been a
topic of interest for many years, considerable progress has been achieved in the last few years
with the development of so-called deep learning methods. Good internal representations are
hierarchical.
In vision, pixels are assembled into edglets, edglets into motifs, motifs into parts, parts into
objects, and objects into scenes. This suggests that recognition architectures for vision (and
for other modalities such as audio and natural language) should have multiple trainable stages
stacked on top of each other, one for each level in the feature hierarchy. This raises two new
questions: what to put in each stage and how to train such deep, multi-stage architectures?
Convolutional Networks (ConvNets) are an answer to the first question. Until recently, the
answer to the second question was to use gradient-based supervised learning, but recent
research in deep learning has produced a number of unsupervised methods which greatly
reduce the need for labeled samples.
By using simple non-linearities such as rectification and contrast normalization, and by using unsupervised pre-training of each filter bank, the need for labeled samples is considerably reduced. Because of their
applicability to a wide range of tasks, and because of their relatively uniform architecture,
ConvNets are perfect candidates for hardware implementations, and embedded applications,
as demonstrated by the increasing amount of work in this area. We expect to see many new
embedded vision systems based on ConvNets in the next few years.
Despite the recent progress in deep learning, one of the major challenges of computer vision,
machine learning, and AI in general in the next decade will be to devise methods that can
automatically learn good features hierarchies from unlabeled and labeled data in an integrated
fashion. Current and future research will focus on performing unsupervised learning on
multiple stages simultaneously, on the integration of unsupervised and supervised learning, and on using the feedback path implemented by the decoders to perform visual inference, such as pattern completion and disambiguation.
Various architectures and training procedures are compared to determine which non-linearities are preferable and which training protocol makes a difference. Generic object recognition using the Caltech 101 dataset: we use a two-stage system in which the first stage is composed of a filter-bank layer with 64 filters of size 9×9, followed by different combinations of non-linearities and pooling. The second-stage feature extractor is fed with the output of the first stage and extracts 256 output feature maps, each of which combines a random subset of 16 feature maps from the previous stage using 9×9 kernels. Hence the total number of convolution kernels is 256 × 16 = 4096.
1. Excellent accuracy of 65.5% is obtained using unsupervised pre-training and supervised refinement with abs and normalization non-linearities. The result is on par with the popular model based on SIFT and the pyramid match kernel SVM. It is clear that abs and normalization are crucial for achieving good performance. This is an extremely important fact for users of convolutional networks, which traditionally only use tanh().
2. Astonishingly, random filters without any filter learning whatsoever achieve decent performance (62.9% for R), as long as abs and normalization are present (Rabs − N − PA). A more detailed study of this particular case is available in the literature.
3. Comparing experiments from rows R vs. R+ and U vs. U+, we see that supervised fine-tuning consistently improves the performance, particularly with weak non-linearities.
4. It seems that unsupervised pre-training (U, U+) is crucial when the newly proposed non-linearities are not in place. Handwritten digit classification using the MNIST dataset: using the evidence gathered in the previous experiments, we used a two-stage system with a two-layer fully connected classifier. The two convolutional stages were pre-trained unsupervised and refined supervised. An error rate of 0.53% was achieved on the test set. To our knowledge, this is the lowest error rate ever reported on the original MNIST dataset, without distortions or preprocessing; the best previously reported error rate was 0.60%. Experiments on the German traffic sign recognition benchmark (GTSRB) demonstrate that the proposed method can obtain results competitive with state-of-the-art algorithms with much less complexity.
CHAPTER 3
METHODOLOGY
Current popular algorithms mainly use convolutional neural networks to execute both feature extraction and classification. Such methods can achieve impressive results, but often on the basis of an extremely large and complex network or ensemble learning, together with massive amounts of data. To make full use of the advantages of CNNs, we propose a novel traffic sign recognition architecture. Before the images are sent to the CNN for feature extraction, the average image of the traffic signs is subtracted to ensure illumination invariance to some extent.
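The mean-image subtraction step can be sketched with numpy; the array shapes below are illustrative, not the project's actual data.

```python
import numpy as np

rng = np.random.default_rng(42)
train_images = rng.random((100, 32, 32)).astype(np.float32)  # toy training set

mean_image = train_images.mean(axis=0)   # average traffic-sign image
centered = train_images - mean_image     # subtracted before the CNN sees the data

# Every pixel position now has (approximately) zero mean across the training set.
print(float(np.abs(centered.mean(axis=0)).max()))
```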
Let us consider the use of CNN for image classification in more detail.
[Figure 2: Classification using CNN. A 32×32 input image passes through Conv(32) and Conv(32) layers with ReLU activations and 2×2 pooling.]
The image is passed through a series of convolutional, non-linear, pooling and fully connected layers, which then generate the output.
The network will consist of several convolutional layers mixed with non-linear and pooling layers. When the image passes through one convolutional layer, the output of that layer becomes the input of the next, and this happens with every further convolutional layer.
After the series of convolutional, non-linear and pooling layers, it is necessary to attach a fully connected layer. This layer takes the output information from the convolutional layers. Attaching a fully connected layer to the end of the network results in an N-dimensional vector, where N is the number of classes from which the model selects the desired class.
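The step from the N-dimensional output vector to a predicted class is conventionally a softmax followed by an argmax; a minimal numpy sketch with hypothetical scores:

```python
import numpy as np

def softmax(logits):
    """Turn an N-dimensional score vector into class probabilities."""
    z = logits - np.max(logits)   # shift by the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # hypothetical outputs for N = 3 classes
probs = softmax(scores)
print(probs.argmax())  # index of the selected class -> 0
```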
[Figure 3: Architecture of the proposed model, covering the training, validation, testing and evaluation stages.]
CHAPTER 4
4.1.3.1 Shuffling
We shuffle the training data to increase randomness and variety in the training dataset, in order for the model to be more stable. We will use sklearn to shuffle our data.
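sklearn's shuffle utility keeps images and labels aligned by applying one shared permutation to both arrays. An equivalent numpy sketch (the helper name is ours):

```python
import numpy as np

def shuffle_dataset(X, y, seed=0):
    """Shuffle images and labels together with one shared permutation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    return X[perm], y[perm]

X = np.arange(10).reshape(5, 2)   # 5 toy "images": [0,1], [2,3], ...
y = np.array([0, 1, 2, 3, 4])     # label i belongs to image i
Xs, ys = shuffle_dataset(X, y)

# After shuffling, each image still carries its own label.
print(all(Xs[i, 0] // 2 == ys[i] for i in range(5)))  # True
```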
4.1.3.2 Grayscaling
Using grayscale images instead of color improves the ConvNet's accuracy. We use OpenCV to convert the training images into grayscale.
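cv2.cvtColor with COLOR_RGB2GRAY computes a luminosity-weighted average of the three channels. A self-contained numpy sketch using the same BT.601 weights:

```python
import numpy as np

def rgb_to_gray(image):
    """Luminosity-weighted grayscale (the BT.601 weights used by OpenCV)."""
    weights = np.array([0.299, 0.587, 0.114])
    return image @ weights   # contracts the trailing channel axis

rgb = np.zeros((2, 2, 3))
rgb[0, 0] = [1.0, 1.0, 1.0]   # one white pixel
gray = rgb_to_gray(rgb)
print(gray.shape)             # (2, 2): one channel instead of three
print(float(gray[0, 0]))      # white maps to 1.0
```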
4.1.3.3 Local histogram equalization
This technique simply spreads out the most frequent intensity values in an image, enhancing images with low contrast. Applying this technique will be very helpful in our case, since the dataset in hand has real-world images, and many of them have low contrast. We use skimage to apply local histogram equalization to the training images.
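The project applies skimage's local (rank) equalization; the global version below is a simpler numpy sketch that still shows the core idea of spreading out frequent intensity values via the cumulative histogram.

```python
import numpy as np

def hist_equalize(image):
    """Global histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                       # normalized cumulative histogram
    return (cdf[image] * 255).astype(np.uint8)

# Four intensities squeezed into a narrow band get spread over the full range.
low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
print(hist_equalize(low_contrast))
```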
4.1.3.4 Normalization
Normalization is a process that changes the range of pixel intensity values.
Usually the image data should be normalized so that the data has mean zero and equal
variance.
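A minimal numpy sketch of this zero-mean, unit-variance normalization:

```python
import numpy as np

def normalize(images):
    """Scale pixel values to zero mean and unit variance."""
    images = images.astype(np.float64)
    return (images - images.mean()) / images.std()

batch = np.array([[0, 64], [128, 255]], dtype=np.uint8)
out = normalize(batch)
print(float(out.mean()), float(out.std()))  # approximately 0 and 1
```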
We'll use Convolutional Neural Networks to classify the images in this dataset. The reason
behind choosing ConvNets is that they are designed to recognize visual patterns directly from
pixel images with minimal preprocessing. They automatically learn hierarchies of invariant
features at every level from data. We will implement two of the most famous ConvNets. Our goal is to reach an accuracy above 95% on the validation set.
1. We specify a learning rate of 0.001, which tells the network how quickly to
update the weights.
2. We minimize the loss function using the Adaptive Moment Estimation (Adam)
algorithm, an optimizer that computes adaptive learning rates for each parameter.
3. We run the minimize() function on the optimizer, which uses backpropagation to
update the network and minimize the training loss.
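The adaptive step Adam takes can be illustrated with a minimal numpy version of its update rule (a toy sketch of the algorithm, not the project's TensorFlow training code):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: running averages of the gradient (m) and its square (v)."""
    m = b1 * m + (1 - b1) * g          # first-moment estimate
    v = b2 * v + (1 - b2) * g ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = np.array([1.0, -2.0]), np.zeros(2), np.zeros(2)
g = np.array([0.5, -3.0])
w, m, v = adam_step(w, g, m, v, t=1)
# On the very first step the update size is ~lr for every weight,
# regardless of the raw gradient magnitude.
assert np.allclose(np.abs(np.array([1.0, -2.0]) - w), 0.001, atol=1e-6)
```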
4.1.4.1 LeNet
LeNet-5 is a convolutional network designed for handwritten and machine-printed
character recognition.
Input => Convolution => ReLU => Pooling => Convolution => ReLU => Pooling =>
FullyConnected => ReLU => FullyConnected
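Tracing the tensor shapes through that pipeline for a 32x32 grayscale input shows how each stage shrinks the image. The filter sizes below are the classic LeNet-5 choices (5x5 "valid" convolutions, 2x2 pooling), assumed rather than quoted from the report:

```python
def conv_valid(size, k):
    """Output size of a 'valid' (no padding) convolution with a k x k filter."""
    return size - k + 1

def pool(size, k=2):
    """Output size of non-overlapping k x k pooling."""
    return size // k

size, depth = 32, 1                      # 32x32 grayscale input
size, depth = conv_valid(size, 5), 6     # Conv 5x5 -> 28x28x6
size = pool(size)                        # Pool     -> 14x14x6
size, depth = conv_valid(size, 5), 16    # Conv 5x5 -> 10x10x16
size = pool(size)                        # Pool     -> 5x5x16
flat = size * size * depth               # Flatten  -> 400
assert flat == 400
# Fully connected stages then map 400 -> 120 -> 84 -> N classes.
```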
CHAPTER 5
5.1 CODE
training_file = "./traffic-signs-data/train.p"
validation_file = "./traffic-signs-data/valid.p"
testing_file = "./traffic-signs-data/test.p"
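These .p files are Python pickles. Assuming each holds a dict with 'features' and 'labels' arrays (the usual layout of this dataset), loading looks like the sketch below; the tiny file written here only stands in for the real train.p:

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in for the real train.p: a dict of image features and labels.
data = {'features': np.zeros((2, 32, 32, 3), dtype=np.uint8),
        'labels': np.array([13, 17])}
path = os.path.join(tempfile.mkdtemp(), 'train.p')
with open(path, 'wb') as f:
    pickle.dump(data, f)

# Loading mirrors how training_file would be read.
with open(path, 'rb') as f:
    train = pickle.load(f)
X_train, y_train = train['features'], train['labels']
assert X_train.shape == (2, 32, 32, 3)
assert list(y_train) == [13, 17]
```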
next(signnames, None)
for row in signnames:
    signs.append(row[1])
csvfile.close()
Parameters:
    images: An np.array compatible with plt.imshow.
    label (Default = No label): A string to be used as a label for each image.
    cmap (Default = None): Used to display gray images.
"""
plt.figure(figsize=(15, 16))
for i in range(6):
    plt.subplot(1, 6, i+1)
    indx = random.randint(0, len(dataset) - 1)
    # Use gray scale color map if there is only one channel
    cmap = 'gray' if len(dataset[indx].shape) == 2 else cmap
    plt.imshow(dataset[indx], cmap=cmap)
    plt.xlabel(signs[dataset_y[indx]])
    plt.ylabel(ylabel)
    plt.xticks([])
    plt.yticks([])
plt.tight_layout(pad=0, h_pad=0, w_pad=0)
plt.show()
def gray_scale(image):
    """
    Convert images to gray scale.
    Parameters:
        image: An np.array compatible with plt.imshow.
    """
    return cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
def local_histo_equalize(image):
    """
    Apply local histogram equalization to grayscale images.
    Parameters:
        image: A grayscale image.
    """
    kernel = morp.disk(30)
    img_local = rank.equalize(image, selem=kernel)
    return img_local
def preprocess(data):
    """
    Applying the preprocessing steps to the input data.
    Parameters:
        data: An np.array compatible with plt.imshow.
    """
    gray_images = list(map(gray_scale, data))
    equalized_images = list(map(local_histo_equalize, gray_images))
    n_training = data.shape
    normalized_images = np.zeros((n_training[0], n_training[1], n_training[2]))
    for i, img in enumerate(equalized_images):
        normalized_images[i] = image_normalize(img)
    normalized_images = normalized_images[..., None]
    return normalized_images
class LaNet:
    # Activation:
    self.conv1 = tf.nn.relu(self.conv1)
    # Activation:
    self.conv2 = tf.nn.relu(self.conv2)
    self.fully_connected1 = tf.add(tf.matmul(self.fully_connected0, self.connected1_weights), self.connected1_bias)
    # Activation:
    self.fully_connected1 = tf.nn.relu(self.fully_connected1)
    # Activation:
    self.fully_connected2 = tf.nn.relu(self.fully_connected2)
    # Training operation
    self.one_hot_y = tf.one_hot(y, n_out)
    self.cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=self.logits, labels=self.one_hot_y)
    self.loss_operation = tf.reduce_mean(self.cross_entropy)
    self.optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    self.training_operation = self.optimizer.minimize(self.loss_operation)
    # Accuracy operation
    self.correct_prediction = tf.equal(tf.argmax(self.logits, 1), tf.argmax(self.one_hot_y, 1))
    self.accuracy_operation = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
EPOCHS = 30
BATCH_SIZE = 64
DIR = 'Saved_Models'
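These hyperparameters drive a standard mini-batch training loop; a schematic version of that loop in plain numpy (the placeholder arrays stand in for the real training set, and the TensorFlow session call is elided):

```python
import numpy as np

EPOCHS, BATCH_SIZE = 30, 64
X_train = np.zeros((1000, 32, 32, 1))   # placeholder training images
y_train = np.zeros(1000, dtype=int)     # placeholder labels

batches_seen = 0
for epoch in range(EPOCHS):
    for offset in range(0, len(X_train), BATCH_SIZE):
        batch_x = X_train[offset:offset + BATCH_SIZE]
        batch_y = y_train[offset:offset + BATCH_SIZE]
        # ...run the training operation on (batch_x, batch_y)...
        batches_seen += 1

# ceil(1000 / 64) = 16 batches per epoch, over 30 epochs.
assert batches_seen == 16 * 30
```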
CHAPTER 6
RESULTS
We use the testing set to measure the accuracy of the model over unknown examples.
[Training output: per-epoch validation accuracy, ending with "Model saved"]
As we can see, we reached a maximum accuracy of 95.3% on the validation set over
30 epochs, using a learning rate of 0.001 to train the LeNet model.
path = './traffic-signs-data/new_test_images/'
for image in os.listdir(path):
    img = cv2.imread(path + image)
    img = cv2.resize(img, (32, 32))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    new_test_images.append(img)
new_IDs = [13, 3, 14, 27, 17]
print("Number of new testing examples: ", len(new_test_images))
plt.figure(figsize=(15, 16))
for i in range(len(new_test_images)):
    plt.subplot(3, 5, i+1)
    plt.imshow(new_test_images[i])
    plt.xlabel(signs[new_IDs[i]])
    plt.ylabel("New testing image")
    plt.xticks([])
    plt.yticks([])
plt.tight_layout(pad=0, h_pad=0, w_pad=0)
plt.show()
Parameters:
    Input_data: Input data.
    top_k (Default = 5): The number of top softmax probabilities to be generated.
"""
num_examples = len(Input_data)
y_pred = np.zeros((num_examples, top_k), dtype=np.int32)
y_prob = np.zeros((num_examples, top_k))
with tf.Session() as sess:
    LeNet_Model.saver.restore(sess, os.path.join(DIR, "LeNet"))
    y_prob, y_pred = sess.run(tf.nn.top_k(tf.nn.softmax(LeNet_Model.logits), k=top_k),
                              feed_dict={x: Input_data, keep_prob: 1, keep_prob_conv: 1})
return y_prob, y_pred
test_accuracy = 0
for i, _ in enumerate(new_test_images_preprocessed):
    accu = new_IDs[i] == np.asarray(y_pred[i])[0]
    if accu:
        test_accuracy += 0.2
print("New Images Test Accuracy = {:.1f}%".format(test_accuracy * 100))
plt.figure(figsize=(15, 16))
for i in range(len(new_test_images_preprocessed)):
    plt.subplot(5, 2, 2*i+1)
    plt.imshow(new_test_images[i])
    plt.title(signs[y_pred[i][0]])
    plt.axis('off')
    plt.subplot(5, 2, 2*i+2)
    plt.barh(np.arange(1, 6, 1), y_prob[i, :])
    labels = [signs[j] for j in y_pred[i]]
    plt.yticks(np.arange(1, 6, 1), labels)
plt.show()
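The 0.2 increment above hard-codes the fact that there are exactly five test images. The same accuracy generalizes to any number of images as the mean of exact top-1 matches; a small sketch with the same names (the predictions here are hypothetical):

```python
import numpy as np

new_IDs = [13, 3, 14, 27, 17]              # ground-truth class IDs
top_preds = np.array([13, 3, 14, 27, 14])  # hypothetical top-1 predictions
test_accuracy = np.mean(top_preds == np.array(new_IDs))
assert test_accuracy == 0.8                # 4 of 5 correct
```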
[Figure: new test images with their top-5 softmax probabilities — predicted classes include Yield, Keep right, Road work, Stop, General caution, speed-limit signs, and No entry]
CHAPTER 7
CONCLUSION
In this work I explored what deep learning is. I assembled and trained a CNN model to
classify photographs of traffic signs, and measured how the accuracy depends on the
number of epochs in order to detect potential overfitting.
In this process of traffic sign recognition, the first step is feature extraction, followed by
image classification over a variety of traffic signs using a CNN classifier. This project thus
demonstrates the fundamental ideas of the CNN algorithm needed to accomplish image
classification for traffic sign recognition.
My next step would be to try this model on more datasets and apply it to practical tasks.
I would also like to experiment with the neural network design to see how higher accuracy
can be achieved on various problems, with the aim of arriving at a traffic sign recognition
system that allows convolutional networks to be trained with very few labeled samples.
Using LeNet, we have been able to reach a high accuracy rate. We can observe that the
model saturates after nearly 10 epochs, so we can save some computational resources by
reducing the number of epochs to 10. We can also try other preprocessing techniques to
further improve the model's accuracy, or use hierarchical CNNs that first identify broader
groups (like speed signs) and then classify finer features (such as the actual speed limit).
This model only works on input examples where the traffic sign is centered in the image;
it cannot detect signs in the image corners.