IEEE REAL TIME PROJECTS & TRAINING GUIDE
SOFTWARE & EMBEDDED
www.makefinalyearproject.com
IGTHP003 Phishing Web Sites Features Classification Based on Extreme Learning Machine (IEEE-2018)
ABSTRACT
Phishing is one of the most common and most dangerous attacks among
cybercrimes. The aim of these attacks is to steal the information used by
individuals and organizations to conduct transactions. Phishing websites
contain various hints among their contents and web browser-based
information. The purpose of this study is to perform Extreme Learning
Machine (ELM) based classification for 30 features including Phishing
Websites Data in UC Irvine Machine Learning Repository database. For
results assessment, ELM was compared with other machine learning
methods such as Support Vector Machine (SVM) and Naïve Bayes (NB), and was found to have the highest accuracy, 95.34%.
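The ELM idea above, a single hidden layer with random untrained input weights and a least-squares readout, can be sketched in a few lines. The two-cluster data below is a hypothetical stand-in for the 30-feature phishing dataset, not the UCI data itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the phishing data: two separable clusters
X = np.vstack([rng.normal(-1.0, 0.3, (50, 4)), rng.normal(1.0, 0.3, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

def elm_train(X, y, n_hidden=20, seed=1):
    """ELM: random (untrained) hidden layer, least-squares output weights."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = r.normal(size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                     # hidden-layer activations
    beta = np.linalg.pinv(H) @ y               # closed-form readout
    return W, b, beta

def elm_predict(X, W, b, beta):
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)

W, b, beta = elm_train(X, y)
acc = (elm_predict(X, W, b, beta) == y).mean()
```

Because only `beta` is solved for, training is a single pseudo-inverse, which is what makes ELM fast compared with iteratively trained networks.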
IGTHP004 Prediction of Bitcoin Prices with Machine Learning Methods Using Time
Series Data (IEEE-2018)
ABSTRACT
Healthcare is a domain of paramount importance where analytics can be applied to gain insights about patients, identify bottlenecks and enhance business efficiency. Readmission rates reflect the quality of treatment provided by hospitals. Readmission results from improper medication, early discharge, unmonitored discharge, or inadequate care by hospital staff. Identifying patients at high risk of readmission through data analytics enables healthcare providers to develop programs that improve the quality of care and to institute targeted interventions. Proper implementation of these analytic methods aids the utilization of hospital resources, thus reducing the readmission rate and the cost incurred due to re-hospitalization. Developing predictive modeling solutions for recognizing readmission risk is highly challenging in healthcare informatics. The procedure involves integrating numerous factors such as clinical factors, socio-demographic factors, health conditions, disease parameters, hospital quality parameters and various other parameters that can be specific to the requirements of each individual health provider. Big data consists of large data sets that require high computational processing to extract patterns, trends and associations. This research examines the effectiveness of big data analytics in predicting the risk of readmission in diabetic patients. The aim of this project is to determine the risk predictors that can cause readmission among diabetic patients, and a detailed analysis has been performed to predict the risk of readmission for these patients.
ABSTRACT
Carpooling taxicab services hold the promise of providing additional
transportation supply, especially in extreme weather or rush hour when
regular taxicab services are insufficient. Although many recommendation
systems about regular taxicab services have been proposed recently, little
research, if any, has been done to assist passengers to find a successful
taxicab ride with carpooling. In this paper, we present the first systematic
work to design a unified recommendation system for both regular and
carpooling services, called CallCab, based on a data-driven approach. In response to a passenger's request, CallCab aims to recommend either (i) a vacant taxicab for a regular service with no detour, or (ii) an occupied taxicab heading in a similar direction for a carpooling service with less detour, without assuming any knowledge of the destinations of passengers already on occupied taxicabs. To analyze these unknown destinations, CallCab generates and refines taxicab trip distributions based on GPS datasets and context information collected in the existing taxicab infrastructure. To improve CallCab's efficiency in processing such a big dataset, we augment the MapReduce model with a Measure phase tailored for our application. We evaluate CallCab with a real-world dataset of taxicabs.
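The augmented MapReduce pipeline can be illustrated with a toy in-memory version. The zone names and trips below are hypothetical, and the Measure phase here simply turns grouped drop-off counts into per-pickup-zone distributions, a sketch of the paper's tailored phase rather than its actual implementation:

```python
from collections import defaultdict

# Hypothetical GPS trip records: (pickup_zone, dropoff_zone)
trips = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]

def map_phase(trip):
    # Emit key-value pairs keyed by pickup zone (a Hadoop-style map task)
    return trip[0], trip[1]

def reduce_phase(pairs):
    # Group drop-off zones by pickup zone
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def measure_phase(groups):
    # Turn grouped observations into per-pickup-zone drop-off distributions
    dists = {}
    for zone, dests in groups.items():
        n = len(dests)
        dists[zone] = {d: dests.count(d) / n for d in set(dests)}
    return dists

dists = measure_phase(reduce_phase(map_phase(t) for t in trips))
# dists["A"] estimates where passengers picked up in zone A are heading
```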
ABSTRACT
Belt conveyors are the main equipment in any large-scale industrial complex, and their normal operation is of great significance. At present, belt conveyor fault detection and health monitoring methods are still imperfect, and both the discovery and the identification of belt conveyor faults are not timely. This paper focuses on roller faults, exploring a new kind of automatic fault detection and identification method based on Artificial Neural Network technology, extracting fault features through a convolutional train-test method to improve fault recognition accuracy. Field verification in a mine achieves good results, which proves the method's feasibility.
IGTHP010 Machine Learning Methods for Classifying Human Physical Activity from
On-Body Accelerometers
ABSTRACT
IGTHP012 Automatically Mining Facets for Queries from Their Search Results
ABSTRACT:
Our system addresses the problem of finding query facets which are
multiple groups of words or phrases that explain and summarize the
content covered by a query. We make use of a semi-supervised Machine Learning algorithm, Multivariate Regression analysis (a variant of
SVM) to determine the important aspects of a query which are usually
presented and repeated in the query’s top retrieved documents in the style
of lists, and query facets can be mined out by aggregating these significant
lists. We propose a systematic solution, which we refer to as QDMiner, to
automatically mine query facets by extracting and grouping frequent lists
from free text, HTML tags, and repeat regions within top search results. To
achieve our goal, we extract the queries from various sources and train our
ML model to automatically identify and group the facets of top queries. Our
initial results show that a large number of lists do exist and useful query
facets can be mined by QDMiner. We further analyze the problem of list
duplication, and find that better query facets can be mined by modeling fine-grained similarities between lists and penalizing the duplicated lists.
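Aggregating frequent lists into facets can be sketched as a greedy grouping by item overlap. The lists below are invented, and the Jaccard threshold is an illustrative choice, not QDMiner's actual similarity model:

```python
# Hypothetical lists extracted from top search results for one query
lists = [
    ["price", "battery", "camera"],
    ["price", "battery", "screen"],
    ["red", "blue", "black"],
    ["price", "camera", "screen"],
]

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def group_lists(lists, threshold=0.25):
    """Greedy single-link grouping of lists by item overlap."""
    facets = []
    for lst in lists:
        for facet in facets:
            if any(jaccard(lst, member) >= threshold for member in facet):
                facet.append(lst)
                break
        else:
            facets.append([lst])
    return facets

facets = group_lists(lists)
# Two facets emerge: one about phone attributes, one about colours
```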
ABSTRACT
Two purposes of this study are 1) to select a data mining model to predict learners' academic performance in a computer programming subject, in order to group learners for cooperative learning, by comparing the efficiency of models created from data mining with classification techniques, and 2) to develop a model for cooperative learning via the web using the selected data mining model to group learners. The efficiency of three models created
from data mining with classification technique by using three algorithms
that are K-Nearest Neighbor, Support Vector Machines (SVM) and
Decision Trees are compared and a machine learning model with best
efficiency is selected. The accuracy of the model is measured by taking the
subjects which have a major impact on learning computer programming
subject. Therefore, this model is selected to group learners with STAD
technique for cooperative learning through web. The result also shows
that ID3 is inappropriate for predicting learners' performance. The data mining model created from our proposed system shows that math GPA is the most influential factor for academic performance in the computer programming subject. The model for cooperative learning via the web using our proposed system to group learners consists of 5 components: a data management module, a prediction and grouping module, learning resources, a cooperative community and a quiz module. The results also show that when the selected model is used to group learners, the learning progress score is higher than when learners are grouped by the lecturers.
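One of the compared classifiers, K-Nearest Neighbor, is simple enough to sketch directly. The learner features and labels below are hypothetical:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Vote among the k nearest training points (Euclidean distance)."""
    distances = np.linalg.norm(X_train - x, axis=1)
    nearest_labels = y_train[np.argsort(distances)[:k]]
    return np.bincount(nearest_labels).argmax()

# Hypothetical learner records: [math GPA, prior programming score]
X_train = np.array([[3.8, 80], [3.6, 75], [2.0, 40],
                    [2.2, 35], [3.9, 90], [1.8, 30]])
y_train = np.array([1, 1, 0, 0, 1, 0])  # 1 = predicted high performer

pred = knn_predict(X_train, y_train, np.array([3.7, 85]))
```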
IGTHP014 Towards Effective Bug Triage with Software Data Reduction Techniques
ABSTRACT
IGTHP016 Analyzing and Scripting Indian Election strategies using Big Data via
Apache Hadoop framework
ABSTRACT
When data grows beyond the capacity of existing database tools, it begins to be referred to as Big Data. Big Data poses a grand challenge for both data analytics and databases: the data is very huge in volume and is created at very high speed. Our work is concerned with handling the huge amount of data associated with the different formats of elections that have been contested in India. We have created a structured database which includes thirteen different attributes providing information on the candidates who contested MP elections from different districts of Punjab. We have opted for the Apache Hadoop framework, which makes use of Map-Reduce technology, for mining and extracting relations from the database. The objective of this paper is to assist common electorates of Punjab state to make the best decision on the basis of the previous track record of a politician or political party, and to decide whom to vote for to get better governance.
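The Map-Reduce style of aggregation that Hadoop performs can be illustrated with a toy in-memory version. The candidate rows and vote counts below are invented for illustration:

```python
from collections import defaultdict

# Hypothetical election rows: (candidate, party, district, votes)
rows = [
    ("A", "P1", "Amritsar", 40000),
    ("B", "P2", "Amritsar", 35000),
    ("C", "P1", "Ludhiana", 50000),
    ("D", "P2", "Ludhiana", 52000),
]

def mapper(row):
    # Emit (party, votes) -- the role of a Hadoop map task
    return row[1], row[3]

def reducer(pairs):
    # Sum votes per party -- the role of a Hadoop reduce task
    totals = defaultdict(int)
    for party, votes in pairs:
        totals[party] += votes
    return dict(totals)

totals = reducer(mapper(r) for r in rows)
```

In Hadoop the same mapper and reducer run distributed across the cluster, with the framework handling the shuffle between them.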
ABSTRACT
The aim is to predict the actual reasons for employee turnover from past data and to design a model which allows HR to predict future occurrences of the event. The company wants to understand what factors contributed most to employee turnover and to create a model that can predict whether a certain employee will leave the company. The goal is to create or improve retention strategies targeted at those employees. Overall, the implementation of this model will allow management to make better decisions.
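A minimal turnover model can be sketched with logistic regression trained by gradient descent. The two features and the synthetic employee data below are assumptions for illustration, not the company's actual HR data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical HR features: [satisfaction score, years at company]; 1 = left
X = np.vstack([rng.normal([0.8, 3.0], 0.1, (40, 2)),   # stayed
               rng.normal([0.2, 6.0], 0.1, (40, 2))])  # left
y = np.array([0] * 40 + [1] * 40)

def fit_logreg(X, y, lr=0.5, steps=500):
    """Logistic regression via batch gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted leave probability
        w -= lr * Xb.T @ (p - y) / len(y)       # average gradient step
    return w

w = fit_logreg(X, y)
Xb = np.hstack([X, np.ones((len(X), 1))])
acc = (((1.0 / (1.0 + np.exp(-Xb @ w))) > 0.5).astype(int) == y).mean()
```

The learned weights also indicate which factor drives turnover most, which is exactly the interpretability HR needs from such a model.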
PROBLEM STATEMENT:
The goal of this project is to detect vehicles from a dashboard video using Deep Learning implementations like YOLO and SSD that utilize convolutional neural networks.
OBJECTIVE
The basic objective of this project is to apply the concepts of HOG and
Machine Learning to detect a Vehicle from a dashboard video.
DATASET:
The most important thing for any machine learning problem is a labelled data set, and here we need two sets of data: Vehicle and Non-Vehicle images. The images were taken from already available datasets such as GTI and KITTI Vision.
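The HOG descriptor mentioned in the objective reduces an image patch to a histogram of gradient orientations. The single-cell sketch below is a simplified version (no block normalisation or cell grid), on a synthetic patch:

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    """Histogram of oriented gradients for one cell (simplified sketch)."""
    gx = np.zeros_like(patch, dtype=float)
    gy = np.zeros_like(patch, dtype=float)
    gx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]    # horizontal central differences
    gy[1:-1, :] = patch[2:, :] - patch[:-2, :]    # vertical central differences
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180    # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-6)   # L2 normalisation

# Synthetic patch with a purely horizontal intensity ramp
patch = np.tile(np.arange(8, dtype=float), (8, 1))
h = hog_cell(patch)
# All gradient energy falls in the 0-degree orientation bin
```

A full pipeline concatenates such cell histograms over a sliding window and feeds them to the classifier.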
ABSTRACT
Annually, 1.25 million people die and 20-50 million people suffer non-fatal injuries due to road traffic accidents (WHO report, 2015). According to the road traffic accident data provided by states, Maharashtra records the third highest number of fatal accidents (13,212) (NHAI report, 2016). However, this trend can change in the future; it is hard to predict the rate at which road traffic accidents occur, as they can occur in any situation.
Therefore, we need to investigate the hidden pattern that influences the
traffic accident severity levels using data mining techniques.
DATASET:
The Dataset has 36240 entries and 10 features.
PROBLEM STATEMENT:
To determine the severity of accidents based on features of the data collected in the past, and to design a model which can predict the severity of accidents and also report the accuracy of the different models used.
ABSTRACT:
Data is the basic means of representation of facts in an understandable
manner which can be globally accepted. Data is required for the flow of
information such that, it provides a platform for expressing the views and
specific perspectives of an individual. It is the essential unit of
communication over the generations which has facilitated in the need for
preserving and utilizing the data. Over the years, there has been a rapid
escalation in the information gradient supported by convenient methods
of data storage and representation based on the underlying application.
From this abundant data, the main challenge a user encounters is capturing data of interest. Retrieving part of the data, or extracting a relevant section from a given document, is an active area of research and has led to the introduction of search techniques based on key terms and pattern matching. Information Extraction (IE) is the process of extracting useful data from existing data by employing the statistical techniques of Natural Language Processing (NLP). It is defined as the act of identifying, collecting and regularizing relevant information from the given text and producing the same in a suitable output structure. Although the extraction process has been automated over the years, the need to train the system to keep pace with rapid changes within the specified time range is very important.
PROBLEM STATEMENT
The goal of this project is to detect vehicles from a dashboard video using Deep Learning implementations like YOLO and SSD that utilize convolutional neural networks.
OBJECTIVE
The basic objective of this project is to apply the concepts of HOG and
Machine Learning to detect a Vehicle from a dashboard video.
DATASET:
The most important thing for any machine learning problem is a labelled data set, and here we need two sets of data: Vehicle and Non-Vehicle images. The images were taken from already available datasets such as GTI and KITTI Vision.
ABSTRACT
Social Computing is an innovative and growing computing exemplar for the
analysis and modeling of social activities taking place on various platforms.
It is used to produce intellectual and interactive applications to derive
efficient results. The wide availability of social media sites enables individuals to share their sentiments or opinions about a particular event, product or issue. Mining such informal data is highly useful for drawing conclusions in various fields. However, the highly unstructured format of the opinion data available on the web makes the mining process challenging. Textual information present on the web is majorly classified into one of two categories: fact data and sentiment data. Fact data are the objective terminologies concerning different entities, issues or events, whereas sentiment data are the subjective terms that define an individual's opinions or beliefs about a particular entity, product or event. Sentiment analysis is the process of recognizing and classifying different sentiments conveyed online by individuals, to determine whether the writer's attitude towards a specific product, topic or event is positive, negative or neutral. Sentiment analysis has three major components of study: the sentiment holder, i.e. the subject; the sentiment itself, i.e. the belief; and the object, i.e. the topic about which the subject has shared the sentiment. An object is an entity that represents a definite person, item, product, issue, event, topic or organization. Sentiment analysis is carried out at different levels, ranging from coarse to fine. Coarse-level sentiment analysis determines the sentiment of a whole manuscript or document, whereas fine-level sentiment analysis focuses on attributes. Sentiment analysis of Twitter data is carried out at the sentence level, which lies between the coarse and fine levels.
DATASET:
The dataset was taken from http://cs.stanford.edu/people/alecmgo/. It has 1.6 million entries.
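Sentence-level classification can be illustrated in its simplest, lexicon-based form. The tiny word lists below are invented; real systems trained on this dataset would use machine-learned features instead:

```python
# Tiny invented lexicons; real systems learn weights from the labelled corpus
POSITIVE = {"good", "great", "love", "happy", "best"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "worst"}

def sentence_sentiment(text):
    """Classify one sentence by counting opinion words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label = sentence_sentiment("I love this great phone")
```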
IGTHP026 Predictive Analytics Model to Diagnose Breast Cancer Tissues using SVM classifier (IEEE-2017)
ABSTRACT
Breast cancer is the second largest cause of cancer deaths among women.
At the same time, it is also among the most curable cancer types if it can be
diagnosed early. Research efforts have reported with increasing
confirmation that the support vector machines (SVM) have greater accurate
diagnosis ability. In this paper, breast cancer diagnosis based on a SVM-
based method combined with feature selection has been proposed.
Experiments have been conducted on different training-test partitions of
the Wisconsin breast cancer dataset (WBCD), which is commonly used
among researchers who use machine learning methods for breast cancer
diagnosis. The performance of the method is evaluated using classification
accuracy, sensitivity, specificity, positive and negative predictive values,
receiver operating characteristic (ROC) curves and confusion matrix. The
results show that the highest classification accuracy (99.51%) is obtained for
the SVM model that contains five features, and this is very promising
compared to the previously reported results.
DATASET:
The Dataset is published by Kaggle and taken from the University of
California Irvine (UCI) machine learning repository. The data is taken from
the Breast Cancer Wisconsin Center. It has 570 entries of the data and 32
features in total.
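The feature-selection step paired with the SVM can be sketched as ranking features by their correlation with the label. The synthetic data below (5 informative plus 27 noise features, echoing the 32-feature dataset) is a stand-in, not WBCD itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: 5 informative features (class means differ) + 27 noise
n = 200
shift = np.outer(np.repeat([0.0, 2.0], n // 2), np.ones(5))
X = np.hstack([rng.normal(size=(n, 5)) + shift, rng.normal(size=(n, 27))])
y = np.repeat([0, 1], n // 2)

def select_features(X, y, k=5):
    """Rank features by absolute correlation with the class label."""
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(corr)[::-1][:k]

top = select_features(X, y)
# The 5 informative columns (indices 0-4) should rank highest
```

Only the selected columns are then passed to the SVM, which is how the reported five-feature model is obtained.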
IGTHP027 A Diagnosis System Framework for the Medical decision making in the
field of Heart disease detection.
ABSTRACT:
The goal of this project is to build a model that can predict the heart disease
occurrence, based on the features that describe the disease. To achieve this goal, we used a data set collected by the Cleveland Clinic Foundation. The dataset used in this project is part of a database containing 14 features from the Cleveland Clinic Foundation for heart disease. The dataset shows different levels of heart disease presence from 1 to 4, with 0 for the absence of the disease. We have 303 rows of patient data with 13 continuous observations of different symptoms. In this study, we look into different classic machine learning models and their discoveries of disease risks. We have developed two algorithms, using linear regression and decision trees, on the Cleveland dataset.
DATASET:
The dataset used in this project is part of a database containing 14 features from the Cleveland Clinic Foundation for heart disease. The dataset shows different levels of heart disease presence from 1 to 4, with 0 for the absence of the disease. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0). We have 303 rows of patient data with 13 continuous observations of different symptoms.
IGTHP028 Automatic detection of plant disease for paddy leaves using Image
Processing Techniques
ABSTRACT
Agricultural productivity is something on which the economy highly depends. This is one of the reasons that disease detection in plants plays an important role in agriculture, as diseases in plants are quite natural. If proper care is not taken in this area, there are serious effects on plants, and the quality, quantity or productivity of the respective product is affected. For instance, little leaf disease is a hazardous disease found in pine trees in the United States. Detecting plant disease through an automatic technique is beneficial, as it reduces the large work of monitoring big crop farms and detects the symptoms of diseases at a very early stage, i.e. when they appear on plant leaves.
This proposed work presents an algorithm for image segmentation
technique which is used for automatic detection and classification of plant
leaf diseases. It also covers survey on different diseases classification
techniques that can be used for plant leaf disease detection. Image
segmentation, which is an important aspect for disease detection in plant
leaf disease, is done by using genetic algorithm.
ABSTRACT
License Plate recognition is one of the techniques used for vehicle
identification purposes. The sole intention of this project is to find the most
efficient way to recognize the registration information from the digital
image (obtained from the camera). This process usually comprises three steps. The first step is license plate localization, regardless of the license-plate size and orientation. The second step is segmentation of the characters, and the last step is recognition of the characters from the license plate. We propose a unified ConvNet-RNN model to recognize real-world
captured license plate photographs. By using a Convolutional Neural
Network (ConvNet) to perform feature extraction and using a Recurrent
Neural Network (RNN) for sequencing, we address the problem of sliding
window approaches being unable to access the context of the entire image
by feeding the entire image as input to the ConvNet. This has the added
benefit of being able to perform end-to-end training of the entire model on
labelled, full license plate images. Experimental results comparing the ConvNet-RNN architecture to a sliding window-based approach are demonstrated in the proposed work.
ABSTRACT:
In this project, you'll implement SLAM (Simultaneous Localization and Mapping) for a two-dimensional world. You'll combine what you know about robot sensor measurements and movement to create a map of an environment from only sensor and motion data gathered by a robot over time. SLAM gives
you a way to track the location of a robot in the world in real-time and identify
the locations of landmarks such as buildings, trees, rocks, and other world
features. This is an active area of research in the fields of robotics and
autonomous systems.
DATASET:
We'll be localizing a robot in a 2D grid world. The basis for simultaneous
localization and mapping (SLAM) is to gather information from a robot's sensors
and motions over time, and then use information about measurements and
motion to re-construct a map of the world.
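The "information from sensors and motions over time" is typically encoded as constraints in a linear system (Graph SLAM). The 1-D example below, with one landmark and hypothetical distances, shows the mechanics:

```python
import numpy as np

# 1-D Graph SLAM sketch with hypothetical distances.
# Variables: x0, x1 (robot poses) and L (one landmark).
# Constraints: x0 = 0 (prior), x1 - x0 = 5 (motion),
#              L - x0 = 8 and L - x1 = 3 (landmark measurements).
omega = np.zeros((3, 3))  # information matrix
xi = np.zeros(3)          # information vector

def add_constraint(i, j, d):
    """Encode x_j - x_i = d with unit confidence."""
    omega[i, i] += 1; omega[j, j] += 1
    omega[i, j] -= 1; omega[j, i] -= 1
    xi[i] -= d; xi[j] += d

omega[0, 0] += 1          # prior anchors x0 at 0
add_constraint(0, 1, 5)   # motion step
add_constraint(0, 2, 8)   # sense landmark from x0
add_constraint(1, 2, 3)   # sense landmark from x1
mu = np.linalg.solve(omega, xi)  # best estimate [x0, x1, L]
```

In the 2-D world, the same matrix is simply built with x and y components for every pose and landmark.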
Head Office: No.1 Rated company in Bangalore for all
software courses and Final Year Projects
IGEEKS Technologies
No:19, MN Complex, 2nd Cross,
Sampige Main Road, Malleswaram, Bangalore
Karnataka (560003) India. Above HOP Salon,
Opp. Joyalukkas, Malleswaram,
Landmark: Near Mantri Mall, Malleswaram,
Bangalore.
Email: nanduigeeks2010@gmail.com ,
nandu@igeekstechnologies.com
Office Phone:
9590544567 / 7019280372
Contact Person:
Mr. Nandu Y,
Director-Projects,
Mobile: 9590544567,7019280372
E-mail: nandu@igeekstechnologies.com
nanduigeeks2010@gmail.com
Partners Address:
RAJAJINAGAR:
#531, 63rd Cross, 12th Main,
after Sevabhai Hospital, 5th Block,
Rajajinagar, Bangalore-10.
Landmark: Near Bashyam Circle.

JAYANAGAR:
No 346/17, Manandi Court,
3rd Floor, 27th Cross,
Jayanagar 3rd Block East,
Bangalore - 560011.
Landmark: Near BDA Complex.