STA-201 HW-2

SID-999459827

Problem 2.1: Roman Numerals

Obs  name     party  start      end        age  roman_numerals_age
1    Ford     R      MCMLXXIV   MCMLXXVII  61   MCMXIII
2    Carter   D      MCMLXXVII  MCMLXXXI   52   MCMXXV
3    Reagan   R      MCMLXXXI   MCMLXXXIX  69   MCMXII
4    Bush41   R      MCMLXXXIX  MCMXCIII   64   MCMXXV
5    Clinton  D      MCMXCIII   MMI        46   MCMXLVII
6    Bush43   R      MMI        MMIX       54   MCMXLVII
7    Obama    D      MMIX       .          47   MCMLXII

Problem 2.2: Change in CPI relative to the previous month (first 5 observations only)

Obs  month  date   cpi    change_in_cpi
1    1      15341  177.1  .
2    2      15372  177.8  0.7
3    3      15400  178.8  1.0
4    4      15431  179.8  1.0
5    5      15461  179.8  0.0
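
The change_in_cpi column is the first difference of cpi, which is what the DIF function in the code section computes. As a quick check against the printed rows (observation 1 has no previous month, so its change is missing):

```latex
\begin{align*}
\Delta\mathrm{cpi}_t &= \mathrm{cpi}_t - \mathrm{cpi}_{t-1} \\
\Delta\mathrm{cpi}_2 &= 177.8 - 177.1 = 0.7 \\
\Delta\mathrm{cpi}_3 &= 178.8 - 177.8 = 1.0 \\
\Delta\mathrm{cpi}_4 &= 179.8 - 178.8 = 1.0 \\
\Delta\mathrm{cpi}_5 &= 179.8 - 179.8 = 0.0
\end{align*}
```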

Problem 2.3: Testing the normality assumption for the change in the Consumer Price Index.

The SAS System

The UNIVARIATE Procedure
Variable: change_in_cpi

Moments
N                145          Sum Weights       145
Mean             0.3978       Sum Observations  57.681
Std Deviation    0.87823553   Variance          0.77129765
Skewness         -1.1336079   Kurtosis          4.74175215
Uncorrected SS   134.012363   Corrected SS      111.066861
Coeff Variation  220.773135   Std Error Mean    0.07293349

Basic Statistical Measures
Location            Variability
Mean    0.397800    Std Deviation        0.87824
Median  0.400000    Variance             0.77130
Mode    1.000000    Range                6.54800
                    Interquartile Range  1.09000


Tests for Location: Mu0=0
Test         Statistic     p Value
Student's t  t  5.454284   Pr > |t|   <.0001
Sign         M  32.5       Pr >= |M|  <.0001
Signed Rank  S  2914       Pr >= |S|  <.0001

Quantiles (Definition 5)
Quantile    Estimate
100% Max    2.400
99%         2.183
95%         1.809
90%         1.334
75% Q3      1.000
50% Median  0.400
25% Q1      -0.090
10%         -0.477
5%          -0.878
1%          -2.210
0% Min      -4.148

Extreme Observations
Lowest         Highest
Value   Obs    Value  Obs
-4.148  83     1.853  63
-2.210  82     1.886  134
-2.197  84     2.158  111
-1.600  47     2.183  78
-1.100  58     2.400  45

Missing Values
Missing Value  Count  Percent Of All Obs  Percent Of Missing Obs
.              11     7.05                100.00

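Conclusion: The distribution is left-skewed (skewness -1.13) and heavy-tailed (excess kurtosis 4.74, compared with 0 for a normal distribution). Hence the change in CPI does not appear to follow a normal distribution.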
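
The conclusion above rests on the moments and the plots. As a hedged sketch, formal normality tests could also be requested by adding the NORMAL option to PROC UNIVARIATE, which adds a "Tests for Normality" table (Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling) to the output. The sketch assumes the dataset amruta.calculate_cpi built in the code section; it is not part of the submitted code and no results are reported here.

```sas
/* Sketch only: formal normality tests for the monthly change in CPI.  */
/* Assumes amruta.calculate_cpi exists as created in the CODE section. */
proc univariate data=amruta.calculate_cpi normal;
    var change_in_cpi;
    /* overlay a fitted normal density on the histogram */
    histogram change_in_cpi / normal;
    /* Q-Q plot with a reference line from the sample mean and std deviation */
    qqplot change_in_cpi / normal(mu=est sigma=est);
run;
```

Small p-values in that table would support the conclusion drawn from the skewness and kurtosis.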

Problem 2.4 a) Average age of presidents, weighted by years in office.

The MEANS Procedure

Analysis Variable : age
N  Mean        Std Dev     Minimum     Maximum
6  57.1142857  22.5190713  46.0000000  69.0000000
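
The weighted mean can be checked by hand. The weights are the years in office (end minus start) from the code section: Ford 3, Carter 4, Reagan 8, Bush41 4, Clinton 8, Bush43 8. Obama has a missing end year, so his weight is missing and he is excluded, which is why N is 6.

```latex
\begin{align*}
\bar{x}_w &= \frac{\sum_i w_i x_i}{\sum_i w_i} \\
&= \frac{3(61) + 4(52) + 8(69) + 4(64) + 8(46) + 8(54)}{3 + 4 + 8 + 4 + 8 + 8} \\
&= \frac{1999}{35} \approx 57.1143
\end{align*}
```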

b) The data were first sorted by party, and the weighted average age was then calculated separately for each party, again using years in office as weights.
party=D
Analysis Variable : age
N  Mean        Std Dev     Minimum     Maximum
2  48.0000000  9.7979590   46.0000000  52.0000000

party=R
Analysis Variable : age
N  Mean        Std Dev     Minimum     Maximum
4  61.8695652  17.5367110  54.0000000  69.0000000
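
The same hand check works within each party, using the years in office as weights. Democrats are Carter (4 years) and Clinton (8 years); Republicans are Ford (3), Reagan (8), Bush41 (4) and Bush43 (8).

```latex
\begin{align*}
\bar{x}_w^{(D)} &= \frac{4(52) + 8(46)}{4 + 8} = \frac{576}{12} = 48 \\
\bar{x}_w^{(R)} &= \frac{3(61) + 8(69) + 4(64) + 8(54)}{3 + 8 + 4 + 8} = \frac{1423}{23} \approx 61.8696
\end{align*}
```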

Problem 2.5: The average percentage of Californians who ride a bike to work.

Analysis Variable : Bike_Share_of_Commuters Bike Share of Commuters
N    Mean       Std Dev      Minimum  Maximum
439  1.0104833  405.1797391  0        16.6000000
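
The Std Dev above (about 405.18) is far larger than the range of the variable (0 to 16.6). This is consistent with how PROC MEANS treats a WEIGHT statement under its default VARDEF=DF setting, where the weights are not normalized in the variance. A sketch of that formula, checked against the party=D output of Problem 2.4 b (ages 52 and 46, weights 4 and 8, weighted mean 48):

```latex
\begin{align*}
s_w &= \sqrt{\frac{\sum_i w_i (x_i - \bar{x}_w)^2}{n - 1}} \\
s_w^{(D)} &= \sqrt{\frac{4(52 - 48)^2 + 8(46 - 48)^2}{2 - 1}} = \sqrt{96} \approx 9.798
\end{align*}
```

Because Total_Workers takes large values, the same formula inflates the Std Dev for the bike share. The weighted mean of about 1.01 is unaffected by the scale of the weights.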

CODE

/* Creating your own library (LIBNAME is a global statement, so no RUN is needed) */
libname amruta "C:\Users\amrmad\Documents\amruta";
/* Problem 2.1 */
data amruta.presidents;
input name $ party $ start end age ;
datalines;
Ford R 1974 1977 61
Carter D 1977 1981 52
Reagan R 1981 1989 69
Bush41 R 1989 1993 64
Clinton D 1993 2001 46
Bush43 R 2001 2009 54
Obama D 2009 . 47
;
/* Problem 2.1 (continued): start year minus age, printed as Roman numerals */
data amruta.romans;
set amruta.presidents;
roman_numerals = start- age;
format start end roman_numerals ROMAN10.;
run;
ods rtf file= 'amruta.romans.rtf';
proc print data = amruta.romans;
title 'Problem 2.1:Roman Numerals';
run;
ods rtf close;
/* Problem 2.2 */
proc import datafile ="C:\Users\amrmad\Documents\amruta\cpidata2.xls" out=amruta.cpi;
run;
data amruta.cpi_new; /* Converting the data into long form: one row per month */
set amruta.cpi;
drop year--half2;
array monthly{12} jan--dec;
do month = 1 to 12;
date=mdy(month,1,year); /* first day of the month as a SAS date value */
cpi = monthly{month};
output;
end;
run;
data amruta.calculate_cpi; /* Calculate the change in CPI with the DIF function */
set amruta.cpi_new;
/*change_in_cpi = (cpi-lag( cpi ))/cpi;*/
change_in_cpi= dif(cpi); /* cpi minus the previous month's cpi */
run;
ods rtf file= 'amruta.calculate_cpi.rtf';
proc print data = amruta.calculate_cpi (obs=5);
title 'Problem 2.2:Change in CPI relative to previous month';

run;
ods rtf close;
/* Problem 2.3 */
PROC UNIVARIATE DATA=amruta.calculate_cpi;
VAR change_in_cpi; /* restrict the analysis to the monthly change */
QQPLOT change_in_cpi;
HISTOGRAM change_in_cpi;
RUN;
proc print;
run;
/* Problem 2.4*/
data amruta.presidents1;
set amruta.presidents;
year=end-start;
run;
ods rtf file= 'amruta.presidents1.rtf';
proc means data = amruta.presidents1;
weight year;
var age;
run;
ods rtf close;
proc sort data= amruta.presidents OUT=amruta.democrats ;
BY party ;
RUN ;
data amruta.democrats;
set amruta.democrats;
year= end-start;
run;
proc means data =amruta.democrats;
by party;
weight year;
var age;
run;
/*Problem 2.5*/
proc import datafile ="C:\Users\amrmad\Documents\amruta\bikecommuters.xlsx" out=amruta.bike;
sheet='sheet1';
run;
ods rtf file= 'amruta.bike.rtf';
proc means data = amruta.bike;
weight Total_Workers;
var Bike_Share_of_Commuters;
run;
ods rtf close;
/*end of code*/