
Group members:
AINIL AFIQAH BT. MOHD HAMDAN
AUWATIF BT. U-UDEEB MAHFOUZ JAZOULY
NUR HAYATUL NISAB BT. MAT SARIP
Prepared for :
MR. MOHD NOOR AZAM B. NAFI
Group :
D2 CS221 5A

SAS Programming

STA 610
Introduction

Introduction :
Source of Dataset : Journal of Statistics Education - Data Archive
(http://www.amstat.org/publications/jse/jse_data_archive.htm)
This data set contains information on individual residential properties sold in Ames, Iowa from 2006 to 2010.

Methodology

Result &
Analysis

Conclusion

Description of Dataset : See table


Reference

Objectives :
In order to conduct this project, we have set several objectives:
a) To check model adequacy: the normality, homogeneity of variance and independence assumptions.
b) To check the significance of the model.
c) To identify whether there is a relationship between the dependent variable and the independent variables.
d) To check for the existence of multicollinearity.
e) To find the best model.
f) To compare the group means and the overall mean.
g) To study the correlation between the variables and the sale price.

Methodology :
I. ONE-WAY ANOVA
II. ONE SAMPLE T-TEST
III. TWO INDEPENDENT SAMPLES T-TEST
IV. LINEAR REGRESSION
V. PIE AND BAR CHART
VI. CORRELATION COEFFICIENT

Result & Analysis :
Descriptive Statistics :
Total sale price by zoning:
Residential Low Density (1): $6,161,600
Floating Village Residential (2): $3,360,710
Residential Medium Density (3): $3,675,339
Commercial (4): $1,327,476
Residential High Density (0): $1,189,400

By building type:
Single-family Detached (0): 62.72%
Townhouse Inside Unit (2): 1.83%


Inferential Statistics :
To study the effect of the Zoning, LotArea, FirstFloor, SecondFloor, GarageArea, LivArea, KitQual and BldgType variables on SalePrice.
CORR procedure : used to compute the Pearson correlation.
See output
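As a rough illustration of what the Pearson correlation measures, the sketch below computes the coefficient by hand in Python. It is not the SAS code used in the project, and the two short lists are hypothetical values chosen only to show the calculation; they are not taken from the Ames data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not from the data set).
liv_area = [1000, 1500, 2000, 2500]
sale_price = [100000, 150000, 200000, 250000]
r = pearson_r(liv_area, sale_price)
```

A coefficient near +1 or -1 indicates a strong linear relationship with SalePrice; a value near 0 indicates little linear relationship.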

REG procedure : used to study the relationship between the variables.

REG procedure : also used to check for multicollinearity in the model.
See output
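The slide does not say which diagnostic was read from the REG output. One common choice, assumed here purely for illustration, is the variance inflation factor, where R_j^2 is the coefficient of determination obtained by regressing predictor X_j on the remaining predictors:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

A value close to 1 means X_j is nearly unrelated to the other predictors. Large values (10 is a frequently quoted rule of thumb) suggest that X_j is close to a linear combination of them.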

REG procedure : to check the model adequacy.
Normality Assumption :

Homogeneity Assumption :

Independence Assumption :

REG procedure : used to select the best predictor variables to be included in the model. The main approach is Backward Selection.
Final model (best parameters) :
\hat{Y} = 13172 + 2.17109 X_{2} + 55.71420 X_{3} + 30.91042 X_{5} + 39.99168 X_{6} + 40049 X_{11} + 55296 X_{12} + 18631 X_{13} + 26700 X_{71} - 21846 X_{80} - 34193 X_{81}
The regression model is significant. See hypothesis.
Coefficient of Multiple Determination, R^2 = 0.8396
Model adequacy is checked again.
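Once a final model is chosen, a fitted sale price is obtained by plugging predictor values into the equation. The sketch below does this in Python under two assumptions that should be checked against the original slide: each garbled term is read as a coefficient followed by a one- or two-digit subscript (for example 40049 with subscript 11), and the last two terms, whose operators are missing in the slide text, are taken as negative. The function name and the example property are hypothetical.

```python
# Coefficients read from the final model on the slide.
# ASSUMPTION: the signs of X80 and X81 are negative (operators lost in the text).
INTERCEPT = 13172.0
COEFFICIENTS = {
    "X2": 2.17109,
    "X3": 55.71420,
    "X5": 30.91042,
    "X6": 39.99168,
    "X11": 40049.0,
    "X12": 55296.0,
    "X13": 18631.0,
    "X71": 26700.0,
    "X80": -21846.0,
    "X81": -34193.0,
}


def predict_sale_price(values):
    """Evaluate the fitted equation; predictors missing from `values` count as 0."""
    unknown = set(values) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown predictors: {sorted(unknown)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * v for name, v in values.items())


# Hypothetical property: lot area of 8,000 sqft, every other predictor left at 0.
example = predict_sale_price({"X2": 8000})
```

Keeping the coefficients in one dictionary makes it easy to correct a sign or a value if the original slide shows something different.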

ANOVA procedure : used to perform the analysis of variance for these data.
From the output given :
Hypothesis :
H_0 : The regression model is not significant
H_1 : The regression model is significant
Test statistic : p-value < 0.0001
Decision : Since the p-value (< 0.0001) < \alpha = 0.05, reject H_0.
Conclusion : The regression model is significant.
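The p-value quoted above belongs to the overall F test in the analysis of variance table. As a reminder of its standard form, with symbols that do not appear on the slide and are assumed here (n observations, p regression parameters including the intercept):

```latex
H_0 : \beta_1 = \beta_2 = \dots = \beta_{p-1} = 0
\qquad
F^{*} = \frac{MSR}{MSE} = \frac{SSR/(p-1)}{SSE/(n-p)}
```

H_0 is rejected at level \alpha when the p-value is below \alpha, or equivalently when F^{*} exceeds F(1-\alpha; p-1, n-p).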

TTEST procedure : used to compare the mean sale price between the zonings.
The two-sample t test compares the mean of the sample from Floating Village Residential (2) minus the mean of the sample from Commercial (4). See the other comparisons.
From the output given :
Hypothesis : H_0 : \mu_2 = \mu_4 , H_1 : \mu_2 \neq \mu_4 (\alpha = 0.05)
Test statistic : t = 9.10
Decision rule : t_{0.025, 28.443} = 2.04667. Since t = 9.10 > t_{0.025, 28.443} = 2.04667, reject H_0.
Conclusion : The mean sale prices of Floating Village Residential (2) and Commercial (4) are significantly different.
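The t value and the Satterthwaite degrees of freedom for this comparison can be checked from the summary statistics in the TTEST output in the Output section. The sketch below does so in Python. The group sizes are not printed in the extracted output, so n = 17 per group is an assumption, inferred from the pooled DF of 32 and the Cochran DF of 16. The function name is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unequal-variance (Satterthwaite) two-sample t test."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Means and standard deviations from the TTEST output (Zoning 2 minus Zoning 4).
# ASSUMPTION: 17 observations per group (not shown in the output).
t_stat, df = welch_t(197689, 44560.9, 17, 78086.8, 30792.8, 17)

T_TABULATED = 2.04667  # tabulated value quoted on the slide
reject_h0 = abs(t_stat) > T_TABULATED
```

Under that assumption the results agree with the output: t is about 9.10 with roughly 28.443 degrees of freedom, so H_0 is rejected.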

TTEST procedure : used to study the overall mean sale price across the zonings (one-sample t test).
From the output given :
Hypothesis : H_0 : \mu = $200,000 , H_1 : \mu \neq $200,000 (\alpha = 0.05)
Test statistic : p-value < .0001
Decision rule : t_{0.025, 228} ≈ 1.970 (interpolation). Since p-value < .0001 < \alpha = 0.05, reject H_0.
Conclusion : The mean sale price is significantly different from $200,000.
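The t value behind this p-value can be reproduced from the summary statistics in the one-sample TTEST output (N = 229, mean 127594, standard deviation 57758.0). The Python sketch below is only a check of the arithmetic, and the function name is hypothetical.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """Return (standard error, t) for a one-sample t test from summary statistics."""
    std_err = sd / math.sqrt(n)
    return std_err, (mean - mu0) / std_err


# Summary statistics from the TTEST output; mu0 is the hypothesized $200,000.
std_err, t_stat = one_sample_t(127594, 57758.0, 229, 200000)
```

The standard error comes out near 3816.8 and t near -18.97, matching the output. The statistic is far beyond any two-sided 5% critical value, which is consistent with rejecting H_0.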

Conclusion :
Since the data are normally distributed, parametric tests were used.
The model adequacy assumptions are all satisfied.
Several factors significantly affect the sale price.
The best model is :
\hat{Y} = 13172 + 2.17109 X_{2} + 55.71420 X_{3} + 30.91042 X_{5} + 39.99168 X_{6} + 40049 X_{11} + 55296 X_{12} + 18631 X_{13} + 26700 X_{71} - 21846 X_{80} - 34193 X_{81}

Reference :
Buchecker, M., Calhoun, S. and Repole, W. (2004). SAS Programming I: Essentials. USA: SAS Institute Inc.
Kutner, M. H., Nachtsheim, C. J., Neter, J. and Li, W. (2005). Applied Linear Statistical Models. New York: McGraw Hill.
Daniel, W. W. (1990). Applied Nonparametric Statistics. USA: Brooks/Cole Cengage Learning.

The End

Question & Answer

Output

Description of Dataset :

Variable          Description                                     Unit / Coding
X1 = Zoning       The general zoning classification of the sale   0 = Residential High Density, 1 = Residential Low Density, 2 = Floating Village Residential, 3 = Residential Medium Density, 4 = Commercial
X2 = LotArea      Lot size                                        Square feet (sqft)
X3 = FirstFloor   Area of first floor                             Square feet (sqft)
X4 = SecondFloor  Area of second floor                            Square feet (sqft)
X5 = GarageArea   Area of garage                                  Square feet (sqft)
X6 = LivArea      Size of living area                             Square feet (sqft)
X7 = KitQual      Kitchen quality                                 0 = Excellent, 1 = Good, 2 = Typical/Average
X8 = BldgType     Type of building                                0 = Single-family Detached, 1 = Townhouse End Unit, 2 = Townhouse Inside Unit, 3 = Duplex, 4 = Two-family Conversion
Y = SalePrice     Selling price                                   Dollar ($)

Hypothesis : H_0 : \beta_j = 0 for all j , H_1 : \beta_j \neq 0 for at least one j
Test statistic : p-value < 0.0001
Decision : Since the p-value (< 0.0001) < \alpha = 0.05, reject H_0.
Conclusion : The regression model is significant.

The SAS System
The TTEST Procedure
Variable: SalePrice (SalePrice)

Zoning      Method         Mean     95% CL Mean        Std Dev  95% CL Std Dev
2                          197689   174778   220600    44560.9  33187.6  67818.6
4                          78086.8  62254.6  93919.0   30792.8  22933.5  46864.4
Diff (1-2)  Pooled         119602   92842.8  146361    38300.6  30800.9  50659.9
Diff (1-2)  Satterthwaite  119602   92711.0  146493

Method         Variances  DF      t Value  Pr > |t|
Pooled         Equal      32      9.10     <.0001
Satterthwaite  Unequal    28.443  9.10     <.0001
Cochran        Unequal    16      9.10     <.0001

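Two-sample t tests of sale price between zonings (\alpha = 0.05)

Zoning    t value  df      t tabulated  Decision        Conclusion
0 and 1   -4.44    17.86   -2.10226     Reject          There are significant differences
0 and 2   -4.95    21.99   -2.07406     Reject          There are significant differences
0 and 3    0.03    15.642   2.12394     Fail to reject  There are no significant differences
0 and 4    2.95    16.359   2.11641     Reject          There are significant differences
1 and 2   -1.24    32.656  -2.03642     Fail to reject  There are no significant differences
1 and 3    6.06    62.535   1.99916     Reject          There are significant differences
1 and 4    9.56    44.307   2.01648     Reject          There are significant differences
2 and 3    6.20    28.793   2.04562     Reject          There are significant differences
2 and 4    9.10    28.443   2.04667     Reject          There are significant differences
3 and 4    4.01    39.196   2.02269     Reject          There are significant differences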
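The decision for each pair of zonings follows from comparing the magnitude of the t value with the magnitude of the tabulated value. The Python sketch below applies that two-sided rule to the t values and tabulated values listed in the pairwise table. The names `COMPARISONS` and `decide` are hypothetical.

```python
# (pair of zonings, t value, tabulated value) as listed in the pairwise table.
COMPARISONS = [
    ("0 and 1", -4.44, -2.10226),
    ("0 and 2", -4.95, -2.07406),
    ("0 and 3", 0.03, 2.12394),
    ("0 and 4", 2.95, 2.11641),
    ("1 and 2", -1.24, -2.03642),
    ("1 and 3", 6.06, 1.99916),
    ("1 and 4", 9.56, 2.01648),
    ("2 and 3", 6.20, 2.04562),
    ("2 and 4", 9.10, 2.04667),
    ("3 and 4", 4.01, 2.02269),
]


def decide(t_value, t_tabulated):
    """Two-sided rule: reject H0 when |t| exceeds the tabulated magnitude."""
    return "Reject" if abs(t_value) > abs(t_tabulated) else "Fail to reject"


decisions = {pair: decide(t, crit) for pair, t, crit in COMPARISONS}
not_significant = [pair for pair, d in decisions.items() if d == "Fail to reject"]
```

With these values, only the pairs "0 and 3" and "1 and 2" fail to reject H_0. All other pairs show significantly different mean sale prices.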

One - Mean Comparison of Sale Price Given BY Zoning
The TTEST Procedure
Variable: SalePrice (SalePrice)
Frequency: Zoning Zoning

N    Mean    Std Dev  Std Err  Minimum  Maximum
229  127594  57758.0  3816.8   12789.0  289000

Mean    95% CL Mean      Std Dev  95% CL Std Dev
127594  120073  135114   57758.0  52908.2  63594.2

DF   t Value  Pr > |t|
228  -18.97   <.0001
