1) Overview:-
2) Introduction:-
It often happens that you or someone close to you needs a doctor's help
immediately, but no doctor is available for some reason. The Heart Disease
Prediction application is an end-user support and online consultation project.
Here, we propose a web application that allows users to get instant guidance on
their heart disease through an intelligent online system.
© Copyright to SLC: Illegal copies of this material is prohibited.
Softpro Learning Center Development Group
3) Objective:-
The prime objective of this project is to construct a working model capable of
predicting the value of houses. To do so, we separate the dataset into features
and the target variable. The features, ‘RM’, ‘LSTAT’, and ‘PTRATIO’, give us
quantitative information about each data point. The target variable, ‘MEDV’, is
the variable we seek to predict. These are stored in features and prices,
respectively.
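The split into features and prices can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample rows are made-up values, and the helper name split_features_target is an assumption introduced here, not part of the original project.

```python
# Hypothetical sample rows; the numbers are invented for illustration only.
rows = [
    {"RM": 6.5, "LSTAT": 5.0, "PTRATIO": 15.0, "MEDV": 24.0},
    {"RM": 6.0, "LSTAT": 9.0, "PTRATIO": 18.0, "MEDV": 21.5},
    {"RM": 7.0, "LSTAT": 4.0, "PTRATIO": 17.5, "MEDV": 34.5},
]

FEATURE_COLUMNS = ["RM", "LSTAT", "PTRATIO"]
TARGET_COLUMN = "MEDV"


def split_features_target(rows, feature_columns, target_column):
    """Return (features, prices): one list of feature values per row,
    and the list of target values in the same row order."""
    features = [[row[name] for name in feature_columns] for row in rows]
    prices = [row[target_column] for row in rows]
    return features, prices


features, prices = split_features_target(rows, FEATURE_COLUMNS, TARGET_COLUMN)
```

Keeping the target out of the feature lists prevents the model from being handed the value it is supposed to predict.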
This project aims to construct a mathematical model using multiple regression
to estimate the selling price of a house from a set of predictor variables.
• Analysis Software Used – SAS (Statistical Analysis System)
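To make the idea of multiple regression concrete, here is a small sketch that fits the coefficients by solving the normal equations. It is written in plain Python rather than SAS, so it is not the procedure used in this project; the function names and the toy data (generated from y = 2 + 3*x1 - x2) are assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(features, target):
    """Least-squares fit; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in features]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Predicted value for one row of predictor values."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Toy data generated exactly from y = 2 + 3*x1 - 1*x2 (illustrative only).
toy_features = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
toy_target = [3, 7, 7, 11, 12]
coefficients = fit_multiple_regression(toy_features, toy_target)
```

Because the toy target is an exact linear function of the predictors, the fitted coefficients should recover the intercept and slopes up to floating-point error. Real housing data will leave residuals.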
4) Scope:-
In future, we can also include the latitude, longitude, and elevation of the house
in the model to predict the house price more accurately. Future work can also
include demographic variables such as income, number of children, education, and
age of the family group, to explain the variability in house prices and to predict
them more effectively.
The dataset (Delhi NCR Housing Price) was taken from the Delhi Public Library, a
national depository library in the Indian state of Delhi, and is freely available
for download from the Delhi State Website Repository. The dataset consists of 506
observations of 14 attributes. The median value of house price in $1000s, denoted
by MEDV, is the outcome, or dependent variable, in our model. Below is a brief
description of each attribute and the outcome in our dataset:
Let us visualize the distribution and density of the outcome, MEDV. The black curve represents the
density. A boxplot is also plotted to give an additional perspective. We see that the distribution
of the median housing price is skewed to the right, with a number of outliers on the right. It may
therefore be useful to transform the ‘MEDV’ column with a function such as the natural logarithm
when modeling the hypothesis for regression analysis.
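The effect of a log transform on right skew can be checked numerically. The sketch below computes the population skewness (third standardized moment) before and after taking the natural logarithm; the price values are hypothetical and are not taken from the project dataset.

```python
import math


def skewness(values):
    """Population skewness: the third standardized moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical right-skewed prices in $1000s (illustrative only).
medv = [12.0, 15.0, 18.0, 20.0, 21.0, 22.0, 23.0, 25.0, 36.0, 50.0]
log_medv = [math.log(v) for v in medv]

skew_before = skewness(medv)
skew_after = skewness(log_medv)
```

A positive skewness indicates a longer right tail. For this sample the value after the transform is positive but noticeably smaller, which is the behaviour the paragraph above describes.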
Summary of Architecture:
LIST OF DEPENDENT AND INDEPENDENT VARIABLES
- We have 8 independent variables and 1 dependent variable. We screen variables
based on their correlation coefficient with price and the amount of variability
explained by the model (R-square).
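Screening by correlation with price can be sketched as below. The Pearson coefficient is computed by hand so the example needs only the standard library. The column names, the data, and the 0.5 cutoff are all assumptions chosen for illustration; the source does not state which threshold was used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def screen_by_correlation(columns, price, threshold):
    """Return (scores, kept): the correlation of every column with price,
    and the names whose absolute correlation is at least `threshold`."""
    scores = {name: pearson(values, price) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return scores, kept


# Hypothetical columns and prices (illustrative only).
price = [20, 22, 25, 30, 35]
columns = {
    "rooms": [4, 5, 5, 6, 7],
    "poverty_pct": [15, 12, 10, 7, 4],
    "noise": [3, 1, 4, 1, 5],
}
scores, kept = screen_by_correlation(columns, price, threshold=0.5)
```

The absolute value is used so that strongly negative predictors are retained alongside strongly positive ones.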
STATISTICAL APPROACH
1) The housing price is transformed using the natural log, after which it appears
very close to a normal distribution. This helps make the relationship between
housing price and the other predictor variables closer to linear.
2) The distribution is much less skewed than it was before the transformation.
ANOVA TABLE:
5) Conclusion:-
We are able to predict house prices with around 90% accuracy in most cases, and
we have a good R-square of 0.83, which means that 83% of the variability is
explained by the model. We are also able to interpret the estimates of the model.
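For reference, R-square is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that calculation on made-up actual and predicted values; it does not reproduce the 0.83 reported above, which comes from the project's own model output.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted prices (illustrative only).
actual = [20, 22, 25, 30, 35]
predicted = [21, 21, 26, 29, 35]
score = r_squared(actual, predicted)
```

A value of 1.0 means the predictions match the actual values exactly, while a value near 0 means the model explains little more than the mean of the data would.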