IEEE REAL TIME PROJECTS & TRAINING GUIDE
SOFTWARE & EMBEDDED
www.makefinalyearproject.com
IGTHP003 Phishing Web Sites Features Classification Based on Extreme Learning Machine (IEEE-2018)
ABSTRACT
Phishing is one of the most common and most dangerous attacks among
cybercrimes. The aim of these attacks is to steal the information used by
individuals and organizations to conduct transactions. Phishing websites
contain various hints among their contents and web browser-based
information. The purpose of this study is to perform Extreme Learning
Machine (ELM) based classification for 30 features including Phishing
Websites Data in UC Irvine Machine Learning Repository database. For
results assessment, ELM was compared with other machine learning
methods such as Support Vector Machine (SVM) and Naïve Bayes (NB), and was found to have the highest accuracy, 95.34%.
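The ELM idea above, a single hidden layer with random untrained input weights and a least-squares readout, can be sketched in a few lines. The two-cluster data below is a hypothetical stand-in for the 30-feature phishing dataset, not the UCI data itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the phishing data: two separable clusters
X = np.vstack([rng.normal(-1.0, 0.3, (50, 4)), rng.normal(1.0, 0.3, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

def elm_train(X, y, n_hidden=20, seed=1):
    """ELM: random (untrained) hidden layer, least-squares output weights."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = r.normal(size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                     # hidden-layer activations
    beta = np.linalg.pinv(H) @ y               # closed-form readout
    return W, b, beta

def elm_predict(X, W, b, beta):
    return (np.tanh(X @ W + b) @ beta > 0.5).astype(int)

W, b, beta = elm_train(X, y)
acc = (elm_predict(X, W, b, beta) == y).mean()
```

Because only `beta` is solved for, training is a single pseudo-inverse, which is what makes ELM fast compared with iteratively trained networks.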
IGTHP004 Prediction of Bitcoin Prices with Machine Learning Methods Using Time
Series Data (IEEE-2018)
ABSTRACT
Healthcare is a domain of paramount importance where analytics can be applied to gain insights about patients, identify bottlenecks and enhance business efficiency. Readmission rates reflect the quality of treatment provided by hospitals. Readmission results from improper medication, early discharge, unmonitored discharge, or inadequate care by hospital staff. Identifying patients at high risk of readmission through data analytics enables healthcare providers to develop programs that improve the quality of care and to institute targeted interventions. Proper implementation of these analytic methods aids the utilization of hospital resources, thus reducing the readmission rate and the cost incurred due to re-hospitalization. Developing predictive modeling solutions for recognizing readmission risk is highly challenging in healthcare informatics. The procedure involves integrating numerous factors such as clinical factors, socio-demographic factors, health conditions, disease parameters, hospital quality parameters and various other parameters that can be specific to the requirements of each individual health provider. Big data consists of large data sets that require high computational processing to extract patterns, trends and associations. This research examines the effectiveness of big data analytics in predicting the risk of readmission in diabetic patients. The aim of this project is to determine the risk predictors that can cause readmission among diabetic patients, and a detailed analysis has been performed to predict the risk of readmission for these patients.
ABSTRACT
Carpooling taxicab services hold the promise of providing additional
transportation supply, especially in extreme weather or rush hour when
regular taxicab services are insufficient. Although many recommendation
systems about regular taxicab services have been proposed recently, little
research, if any, has been done to assist passengers to find a successful
taxicab ride with carpooling. In this paper, we present the first systematic
work to design a unified recommendation system for both regular and
carpooling services, called CallCab, based on a data-driven approach. In response to a passenger's request, CallCab aims to recommend either (i) a vacant taxicab for a regular service with no detour, or (ii) an occupied taxicab heading in a similar direction for a carpooling service with less detour, without assuming any knowledge of the destinations of passengers already on occupied taxicabs. To analyze these unknown destinations, CallCab generates and refines taxicab trip distributions based on GPS datasets and context information collected in the existing taxicab infrastructure. To improve CallCab's efficiency in processing such a big dataset, we augment the MapReduce model with a Measure phase tailored for our application. We evaluate CallCab with a real-world dataset of taxicabs.
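The augmented MapReduce pipeline can be illustrated with a toy in-memory version. The zone names and trips below are hypothetical, and the Measure phase here simply turns grouped drop-off counts into per-pickup-zone distributions, a sketch of the paper's tailored phase rather than its actual implementation:

```python
from collections import defaultdict

# Hypothetical GPS trip records: (pickup_zone, dropoff_zone)
trips = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]

def map_phase(trip):
    # Emit key-value pairs keyed by pickup zone (a Hadoop-style map task)
    return trip[0], trip[1]

def reduce_phase(pairs):
    # Group drop-off zones by pickup zone
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def measure_phase(groups):
    # Turn grouped observations into per-pickup-zone drop-off distributions
    dists = {}
    for zone, dests in groups.items():
        n = len(dests)
        dists[zone] = {d: dests.count(d) / n for d in set(dests)}
    return dists

dists = measure_phase(reduce_phase(map_phase(t) for t in trips))
# dists["A"] estimates where passengers picked up in zone A are heading
```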
ABSTRACT
Belt conveyors are the main equipment in any large-scale industrial complex, and their normal operation is of great significance. At present, belt conveyor fault detection and health monitoring methods are still imperfect, and both the discovery and the identification of belt conveyor faults are not timely. This paper focuses on roller faults, exploring a new kind of automatic fault detection and identification method based on Artificial Neural Network technology, extracting fault features through a convolutional train-test method to improve fault recognition accuracy. Field verification in a mine achieves good results, which proves the method's feasibility.
IGTHP010 Machine Learning Methods for Classifying Human Physical Activity from
On-Body Accelerometers
ABSTRACT
IGTHP012 Automatically Mining Facets for Queries from Their Search Results
ABSTRACT:
Our system addresses the problem of finding query facets which are
multiple groups of words or phrases that explain and summarize the
content covered by a query. We make use of a semi-supervised Machine Learning algorithm, Multivariate Regression analysis (a variant of
SVM) to determine the important aspects of a query which are usually
presented and repeated in the query’s top retrieved documents in the style
of lists, and query facets can be mined out by aggregating these significant
lists. We propose a systematic solution, which we refer to as QDMiner, to
automatically mine query facets by extracting and grouping frequent lists
from free text, HTML tags, and repeat regions within top search results. To
achieve our goal, we extract the queries from various sources and train our
ML model to automatically identify and group the facets of top queries. Our
initial results show that a large number of lists do exist and useful query
facets can be mined by QDMiner. We further analyze the problem of list
duplication, and find that better query facets can be mined by modeling fine-grained similarities between lists and penalizing the duplicated lists.
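Aggregating frequent lists into facets can be sketched as a greedy grouping by item overlap. The lists below are invented, and the Jaccard threshold is an illustrative choice, not QDMiner's actual similarity model:

```python
# Hypothetical lists extracted from top search results for one query
lists = [
    ["price", "battery", "camera"],
    ["price", "battery", "screen"],
    ["red", "blue", "black"],
    ["price", "camera", "screen"],
]

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def group_lists(lists, threshold=0.25):
    """Greedy single-link grouping of lists by item overlap."""
    facets = []
    for lst in lists:
        for facet in facets:
            if any(jaccard(lst, member) >= threshold for member in facet):
                facet.append(lst)
                break
        else:
            facets.append([lst])
    return facets

facets = group_lists(lists)
# Two facets emerge: one about phone attributes, one about colours
```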
ABSTRACT
Two purposes of this study are 1) to select a data mining model to predict learners' academic performance in a computer programming subject, in order to group learners for cooperative learning, by comparing the efficiency of models created from data mining with classification techniques, and 2) to develop a model for cooperative learning via the web using the selected data mining model to group learners. The efficiency of three models created
from data mining with classification technique by using three algorithms
that are K-Nearest Neighbor, Support Vector Machines (SVM) and
Decision Trees are compared and a machine learning model with best
efficiency is selected. The accuracy of the model is measured by taking the
subjects which have a major impact on learning computer programming
subject. Therefore, this model is selected to group learners with STAD
technique for cooperative learning through web. The result also shows
that ID3 is inappropriate for predicting learners' performance. The data mining model created from our proposed system shows that math GPA is the most influential factor for academic performance in the computer programming subject. The model for cooperative learning via the web using our proposed system to group learners consists of 5 components: a data management module, a prediction and grouping module, learning resources, a cooperative community and a quiz module. The results also show that when the selected model is used to group learners, the learning progress score is higher than when learners are grouped by the lecturers.
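One of the compared classifiers, K-Nearest Neighbor, is simple enough to sketch directly. The learner features and labels below are hypothetical:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Vote among the k nearest training points (Euclidean distance)."""
    distances = np.linalg.norm(X_train - x, axis=1)
    nearest_labels = y_train[np.argsort(distances)[:k]]
    return np.bincount(nearest_labels).argmax()

# Hypothetical learner records: [math GPA, prior programming score]
X_train = np.array([[3.8, 80], [3.6, 75], [2.0, 40],
                    [2.2, 35], [3.9, 90], [1.8, 30]])
y_train = np.array([1, 1, 0, 0, 1, 0])  # 1 = predicted high performer

pred = knn_predict(X_train, y_train, np.array([3.7, 85]))
```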
IGTHP014 Towards Effective Bug Triage with Software Data Reduction Techniques
ABSTRACT
IGTHP016 Analyzing and Scripting Indian Election strategies using Big Data via
Apache Hadoop framework
ABSTRACT
When data grows beyond the capacity of existing database tools, it begins to be referred to as Big Data. Big Data poses a grand challenge for both data analytics and databases: the data is very huge in volume and is created at very high speed. Our work is concerned with handling the huge amount of data associated with the different formats of elections that have been contested in India. We have created a structured database which includes thirteen different attributes providing information on the candidates who contested MP elections from different districts of Punjab. We have opted for the Apache Hadoop framework, which makes use of Map-Reduce technology, for mining and extracting relations from the database. The objective of this paper is to assist common electorates of Punjab state to make the best decision on the basis of the previous track record of a politician or political party, and to decide whom to vote for to get better governance.
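The Map-Reduce style of aggregation that Hadoop performs can be illustrated with a toy in-memory version. The candidate rows and vote counts below are invented for illustration:

```python
from collections import defaultdict

# Hypothetical election rows: (candidate, party, district, votes)
rows = [
    ("A", "P1", "Amritsar", 40000),
    ("B", "P2", "Amritsar", 35000),
    ("C", "P1", "Ludhiana", 50000),
    ("D", "P2", "Ludhiana", 52000),
]

def mapper(row):
    # Emit (party, votes) -- the role of a Hadoop map task
    return row[1], row[3]

def reducer(pairs):
    # Sum votes per party -- the role of a Hadoop reduce task
    totals = defaultdict(int)
    for party, votes in pairs:
        totals[party] += votes
    return dict(totals)

totals = reducer(mapper(r) for r in rows)
```

In Hadoop the same mapper and reducer run distributed across the cluster, with the framework handling the shuffle between them.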
ABSTRACT
The aim is to predict the actual reasons for employee turnover from past data and to design a model which allows HR to predict future occurrences of the event. The company wants to understand what factors contributed most to employee turnover and to create a model that can predict whether a certain employee will leave the company. The goal is to create or improve retention strategies targeted at those employees. Overall, the implementation of this model will allow management to make better decisions.
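A minimal turnover model can be sketched with logistic regression trained by gradient descent. The two features and the synthetic employee data below are assumptions for illustration, not the company's actual HR data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical HR features: [satisfaction score, years at company]; 1 = left
X = np.vstack([rng.normal([0.8, 3.0], 0.1, (40, 2)),   # stayed
               rng.normal([0.2, 6.0], 0.1, (40, 2))])  # left
y = np.array([0] * 40 + [1] * 40)

def fit_logreg(X, y, lr=0.5, steps=500):
    """Logistic regression via batch gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted leave probability
        w -= lr * Xb.T @ (p - y) / len(y)       # average gradient step
    return w

w = fit_logreg(X, y)
Xb = np.hstack([X, np.ones((len(X), 1))])
acc = (((1.0 / (1.0 + np.exp(-Xb @ w))) > 0.5).astype(int) == y).mean()
```

The learned weights also indicate which factor drives turnover most, which is exactly the interpretability HR needs from such a model.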
PROBLEM STATEMENT:
The goal of this project is to detect vehicles from a dashboard video using Deep Learning implementations like YOLO and SSD that utilize convolutional neural networks.
OBJECTIVE
The basic objective of this project is to apply the concepts of HOG and
Machine Learning to detect a Vehicle from a dashboard video.
DATASET:
The most important thing for any machine learning problem is a labelled data set, and here we need two sets of data: Vehicle and Non-Vehicle images. The images were taken from already available datasets such as GTI and KITTI Vision.
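The HOG descriptor mentioned in the objective reduces an image patch to a histogram of gradient orientations. The single-cell sketch below is a simplified version (no block normalisation or cell grid), on a synthetic patch:

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    """Histogram of oriented gradients for one cell (simplified sketch)."""
    gx = np.zeros_like(patch, dtype=float)
    gy = np.zeros_like(patch, dtype=float)
    gx[:, 1:-1] = patch[:, 2:] - patch[:, :-2]    # horizontal central differences
    gy[1:-1, :] = patch[2:, :] - patch[:-2, :]    # vertical central differences
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180    # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-6)   # L2 normalisation

# Synthetic patch with a purely horizontal intensity ramp
patch = np.tile(np.arange(8, dtype=float), (8, 1))
h = hog_cell(patch)
# All gradient energy falls in the 0-degree orientation bin
```

A full pipeline concatenates such cell histograms over a sliding window and feeds them to the classifier.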
ABSTRACT
Annually, 1.25 million people die and 20-50 million people suffer non-fatal injuries due to road traffic accidents (WHO report, 2015). According to the road traffic accident data provided by states, Maharashtra records the third highest number of fatal accidents (13,212) (NHAI report, 2016). However, this trend can change in the future; it is hard to predict the rate at which road traffic accidents occur, as they can occur in any situation.
Therefore, we need to investigate the hidden pattern that influences the
traffic accident severity levels using data mining techniques.
DATASET:
The Dataset has 36240 entries and 10 features.
PROBLEM STATEMENT:
To determine the severity of accidents based on features of the data collected in the past, and to design a model which can predict the severity of accidents and also report the accuracy of the different models used.
ABSTRACT:
Data is the basic means of representation of facts in an understandable
manner which can be globally accepted. Data is required for the flow of
information such that, it provides a platform for expressing the views and
specific perspectives of an individual. It is the essential unit of
communication over the generations which has facilitated in the need for
preserving and utilizing the data. Over the years, there has been a rapid
escalation in the information gradient supported by convenient methods
of data storage and representation based on the underlying application.
From this abundant data, the main challenge a user encounters is capturing data of interest. Retrieving part of the data, or extracting a relevant section from a given document, is an active area of research and has led to the introduction of search techniques based on key terms and pattern matching. Information Extraction (IE) is the process of extracting useful data from existing data by employing the statistical techniques of Natural Language Processing (NLP). It is defined as the act of identifying, collecting and regularizing relevant information from the given text and producing the same in a suitable output structure. Although the extraction process has been automated over the years, the need to train the system to keep pace with rapid changes within the specified time range is very important.
PROBLEM STATEMENT
The goal of this project is to detect vehicles from a dashboard video using Deep Learning implementations like YOLO and SSD that utilize convolutional neural networks.
OBJECTIVE
The basic objective of this project is to apply the concepts of HOG and
Machine Learning to detect a Vehicle from a dashboard video.
DATASET:
The most important thing for any machine learning problem is a labelled data set, and here we need two sets of data: Vehicle and Non-Vehicle images. The images were taken from already available datasets such as GTI and KITTI Vision.
ABSTRACT
Social Computing is an innovative and growing computing exemplar for the
analysis and modeling of social activities taking place on various platforms.
It is used to produce intellectual and interactive applications to derive
efficient results. The wide availability of social media sites enables individuals to share their sentiments or opinions about a particular event, product or issue. Mining such informal data is highly useful for drawing conclusions in various fields. However, the highly unstructured format of the opinion data available on the web makes the mining process challenging. Textual information present on the web is majorly classified into one of two categories: fact data and sentiment data. Fact data are the objective terminologies concerning different entities, issues or events, whereas sentiment data are the subjective terms that define an individual's opinions or beliefs about a particular entity, product or event. Sentiment analysis is the process of recognizing and classifying different sentiments conveyed online by individuals, to determine whether the writer's attitude towards a specific product, topic or event is positive, negative or neutral. Sentiment analysis has three major components of study: the sentiment holder, i.e. the subject; the sentiment itself, i.e. the belief; and the object, i.e. the topic about which the subject has shared the sentiment. An object is an entity that represents a definite person, item, product, issue, event, topic or organization. Sentiment analysis is carried out at different levels, ranging from coarse to fine. Coarse-level sentiment analysis determines the sentiment of a whole manuscript or document, whereas fine-level sentiment analysis focuses on attributes. Sentiment analysis of Twitter data is carried out at the sentence level, which lies between the coarse and fine levels.
DATASET:
The dataset was taken from http://cs.stanford.edu/people/alecmgo/. It has 1.6 million entries.
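Sentence-level classification can be illustrated in its simplest, lexicon-based form. The tiny word lists below are invented; real systems trained on this dataset would use machine-learned features instead:

```python
# Tiny invented lexicons; real systems learn weights from the labelled corpus
POSITIVE = {"good", "great", "love", "happy", "best"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "worst"}

def sentence_sentiment(text):
    """Classify one sentence by counting opinion words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label = sentence_sentiment("I love this great phone")
```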
IGTHP026 Predictive Analytics Model to Diagnose Breast Cancer Tissues using SVM classifier (IEEE-2017)
ABSTRACT
Breast cancer is the second largest cause of cancer deaths among women.
At the same time, it is also among the most curable cancer types if it can be
diagnosed early. Research efforts have reported with increasing
confirmation that the support vector machines (SVM) have greater accurate
diagnosis ability. In this paper, breast cancer diagnosis based on a SVM-
based method combined with feature selection has been proposed.
Experiments have been conducted on different training-test partitions of
the Wisconsin breast cancer dataset (WBCD), which is commonly used
among researchers who use machine learning methods for breast cancer
diagnosis. The performance of the method is evaluated using classification
accuracy, sensitivity, specificity, positive and negative predictive values,
receiver operating characteristic (ROC) curves and confusion matrix. The
results show that the highest classification accuracy (99.51%) is obtained for
the SVM model that contains five features, and this is very promising
compared to the previously reported results.
DATASET:
The Dataset is published by Kaggle and taken from the University of
California Irvine (UCI) machine learning repository. The data is taken from
the Breast Cancer Wisconsin Center. It has 570 entries of the data and 32
features in total.
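The feature-selection step paired with the SVM can be sketched as ranking features by their correlation with the label. The synthetic data below (5 informative plus 27 noise features, echoing the 32-feature dataset) is a stand-in, not WBCD itself:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: 5 informative features (class means differ) + 27 noise
n = 200
shift = np.outer(np.repeat([0.0, 2.0], n // 2), np.ones(5))
X = np.hstack([rng.normal(size=(n, 5)) + shift, rng.normal(size=(n, 27))])
y = np.repeat([0, 1], n // 2)

def select_features(X, y, k=5):
    """Rank features by absolute correlation with the class label."""
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.argsort(corr)[::-1][:k]

top = select_features(X, y)
# The 5 informative columns (indices 0-4) should rank highest
```

Only the selected columns are then passed to the SVM, which is how the reported five-feature model is obtained.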
IGTHP027 A Diagnosis System Framework for the Medical decision making in the
field of Heart disease detection.
ABSTRACT:
The goal of this project is to build a model that can predict the heart disease
occurrence, based on the features that describe the disease. To achieve this goal, we used a data set collected by the Cleveland Clinic Foundation. The dataset used in this project is part of a database containing 14 features from the Cleveland Clinic Foundation for heart disease. The dataset shows different levels of heart disease presence from 1 to 4, with 0 for the absence of the disease. We have 303 rows of patient data with 13 continuous observations of different symptoms. In this study, we look into different classic machine learning models and their discoveries of disease risks. We have developed two algorithms, using linear regression and decision trees, on the Cleveland dataset.
DATASET:
The dataset used in this project is part of a database containing 14 features from the Cleveland Clinic Foundation for heart disease. The dataset shows different levels of heart disease presence from 1 to 4, with 0 for the absence of the disease. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0). We have 303 rows of patient data with 13 continuous observations of different symptoms.
IGTHP028 Automatic detection of plant disease for paddy leaves using Image
Processing Techniques
ABSTRACT
Agricultural productivity is something on which the economy highly depends. This is one of the reasons that disease detection in plants plays an important role in agriculture, as diseases in plants are quite natural. If proper care is not taken in this area, there are serious effects on plants, and the quality, quantity or productivity of the respective product is affected. For instance, little leaf disease is a hazardous disease found in pine trees in the United States. Detecting plant disease through an automatic technique is beneficial, as it reduces the large work of monitoring big crop farms and detects the symptoms of diseases at a very early stage, i.e. when they appear on plant leaves.
This proposed work presents an algorithm for image segmentation
technique which is used for automatic detection and classification of plant
leaf diseases. It also covers survey on different diseases classification
techniques that can be used for plant leaf disease detection. Image
segmentation, which is an important aspect for disease detection in plant
leaf disease, is done by using genetic algorithm.
ABSTRACT
License Plate recognition is one of the techniques used for vehicle
identification purposes. The sole intention of this project is to find the most
efficient way to recognize the registration information from the digital
image (obtained from the camera). This process usually comprises three steps. The first step is license plate localization, regardless of the license-plate size and orientation. The second step is segmentation of the characters, and the last step is recognition of the characters from the license plate. We propose a unified ConvNet-RNN model to recognize real-world
captured license plate photographs. By using a Convolutional Neural
Network (ConvNet) to perform feature extraction and using a Recurrent
Neural Network (RNN) for sequencing, we address the problem of sliding
window approaches being unable to access the context of the entire image
by feeding the entire image as input to the ConvNet. This has the added
benefit of being able to perform end-to-end training of the entire model on
labelled, full license plate images. Experimental results comparing the ConvNet-RNN architecture to a sliding window-based approach are demonstrated in the proposed work.
ABSTRACT:
In this project, you'll implement SLAM (Simultaneous Localization and Mapping) for a two-dimensional world. You'll combine what you know about robot sensor measurements and movement to create a map of an environment from only sensor and motion data gathered by a robot over time. SLAM gives
you a way to track the location of a robot in the world in real-time and identify
the locations of landmarks such as buildings, trees, rocks, and other world
features. This is an active area of research in the fields of robotics and
autonomous systems.
DATASET:
We'll be localizing a robot in a 2D grid world. The basis for simultaneous
localization and mapping (SLAM) is to gather information from a robot's sensors
and motions over time, and then use information about measurements and
motion to re-construct a map of the world.
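The "information from sensors and motions over time" is typically encoded as constraints in a linear system (Graph SLAM). The 1-D example below, with one landmark and hypothetical distances, shows the mechanics:

```python
import numpy as np

# 1-D Graph SLAM sketch with hypothetical distances.
# Variables: x0, x1 (robot poses) and L (one landmark).
# Constraints: x0 = 0 (prior), x1 - x0 = 5 (motion),
#              L - x0 = 8 and L - x1 = 3 (landmark measurements).
omega = np.zeros((3, 3))  # information matrix
xi = np.zeros(3)          # information vector

def add_constraint(i, j, d):
    """Encode x_j - x_i = d with unit confidence."""
    omega[i, i] += 1; omega[j, j] += 1
    omega[i, j] -= 1; omega[j, i] -= 1
    xi[i] -= d; xi[j] += d

omega[0, 0] += 1          # prior anchors x0 at 0
add_constraint(0, 1, 5)   # motion step
add_constraint(0, 2, 8)   # sense landmark from x0
add_constraint(1, 2, 3)   # sense landmark from x1
mu = np.linalg.solve(omega, xi)  # best estimate [x0, x1, L]
```

In the 2-D world, the same matrix is simply built with x and y components for every pose and landmark.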
Head Office: No.1 Rated company in Bangalore for all
software courses and Final Year Projects
IGEEKS Technologies
No:19, MN Complex, 2nd Cross,
Sampige Main Road, Malleswaram, Bangalore
Karnataka (560003) India. Above HOP Salon,
Opp. Joyalukkas, Malleswaram,
Landmark: Near Mantri Mall, Malleswaram,
Bangalore.
Email: nanduigeeks2010@gmail.com ,
nandu@igeekstechnologies.com
Office Phone:
9590544567 / 7019280372
Contact Person:
Mr. Nandu Y,
Director-Projects,
Mobile: 9590544567,7019280372
E-mail: nandu@igeekstechnologies.com
nanduigeeks2010@gmail.com
Partners Address:
RAJAJINAGAR:
#531, 63rd Cross, 12th Main,
after Sevabhai Hospital, 5th Block,
Rajajinagar, Bangalore-10.
Landmark: Near Bashyam Circle.

JAYANAGAR:
No 346/17, Manandi Court,
3rd Floor, 27th Cross,
Jayanagar 3rd Block East,
Bangalore - 560011.
Landmark: Near BDA Complex.