Square, Q-Q Plots & Normality

Psych 494
Fall 2001
Solution 4
1)
Variable=X1
W:Normal 0.827419 Pr<W 0.0007
Normal Probability Plot
[plot of X1 (vertical axis, 2.5 to 27.5) against normal quantiles (horizontal axis, -2 to +2)]
Variable=X2
W:Normal 0.924988 Pr<W 0.0697

Normal Probability Plot
[plot of X2 (vertical axis, 1 to 17) against normal quantiles (horizontal axis, -2 to +2)]
Variable=X3
W:Normal 0.967717 Pr<W 0.6204

Normal Probability Plot
[plot of X3 (vertical axis, 3 to 17) against normal quantiles (horizontal axis, -2 to +2)]
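The Q-Q plots show that only X1 departs grossly from normality. The Shapiro-Wilk tests confirm this: at the .05 level, normality is
rejected for X1 (W = 0.827, p = .0007) but not for X2 (W = 0.925, p = .0697) or X3 (W = 0.968, p = .6204).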
2)
Chi-square Plot
[plot of squared statistical distance (vertical axis, Distance, 0 to 16) against chi-square quantiles (horizontal axis, ChiSq, 0 to 10)]
Except for observation 9, the observations lie close to a straight line. This suggests that the assumption of multivariate
normality is tenable.
TOPLOT
1.4236522 0.8807247
4.0135936 3.8113869
3.0763677 2.8907255
3.6648708 3.7549258
1.0878828 0.6533776
1.9635806 1.143333
0.7558302 0.5598385
4.4149844 3.8723804
9.8374093 15.262769
0.1848318 0.0355722
2.3659739 1.3338642
1.7767839 1.019959
0.5843744 0.4707491
1.2543524 0.6645781
3.3554449 3.7399316
2.5857116 2.3258664
0.922479 0.6167535
2.8213339 2.3763783
0.4011734 0.1499929
6.2513886 6.8316784
7.40688 8.3442448
1.5973096 0.9699873
4.8904101 3.9779303
2.1593473 1.2612803
5.4773439 5.0517725
CHI80
4.6416277
Under trivariate normality, about 80% of the 25 observations (that is, 20) are expected to have a squared distance below 4.64, the
80th percentile of the chi-square distribution with 3 degrees of freedom. The actual count is 21, which is very close to the
expected count. Based on the chi-square plot and the probability contour plot, we can say that it is reasonable to assume
trivariate normality.
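The count can be checked directly from the printed output. The following is a minimal Python sketch, not part of the original SAS solution: it copies the distances from the second column of TOPLOT (row i is observation i) and the CHI80 value, counts how many distances fall at or below the quantile, and confirms the quantile with the closed-form chi-square CDF for 3 degrees of freedom. The function name `chi2_cdf_3df` is introduced here for illustration only.

```python
import math

# Squared statistical distances, copied from the second column of TOPLOT
# (row i corresponds to observation i).
distances = [
    0.8807247, 3.8113869, 2.8907255, 3.7549258, 0.6533776,
    1.143333, 0.5598385, 3.8723804, 15.262769, 0.0355722,
    1.3338642, 1.019959, 0.4707491, 0.6645781, 3.7399316,
    2.3258664, 0.6167535, 2.3763783, 0.1499929, 6.8316784,
    8.3442448, 0.9699873, 3.9779303, 1.2612803, 5.0517725,
]
CHI80 = 4.6416277  # CHI80 from the SAS output


def chi2_cdf_3df(x):
    """Closed-form CDF of the chi-square distribution with 3 df (illustrative helper)."""
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


n = len(distances)
expected_inside = 0.8 * n                               # expected count under normality
observed_inside = sum(d <= CHI80 for d in distances)    # actual count
outside = [i + 1 for i, d in enumerate(distances) if d > CHI80]

print(n, expected_inside, observed_inside)   # 25 20.0 21
print(outside)                               # [9, 20, 21, 25]
print(round(chi2_cdf_3df(CHI80), 4))         # 0.8
```

The four observations outside the 80% contour include observations 9 and 21, the two with the largest distances.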
3)
[Scatter plots of X1*X2, X1*X3 and X2*X3; plotting symbols: A = 1 observation, B = 2 observations, etc.]
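It can be argued that at least one of the observations (9) is an outlier: it stands apart in the scatter plots and has by far the
largest squared statistical distance (15.26). Observation 21 (distance 8.34) is also a likely candidate. Both observations are
excluded from the analyses in problem 4, which leaves n = 23.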
4) a) H0: µ' = [10, 10, 10]
H1: µ' ≠ [10, 10, 10]
T_SQ T_SQCRIT PVALUE

30.373968 10.224691 0.0004974
T² = 30.37 exceeds the critical value of 10.22 (p = .0005). At an alpha level of .05, we reject the null hypothesis that the means are all equal to 10.
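The decision can also be read on the F scale, since under H0 the statistic T² is distributed as (n-1)p/(n-p) times an F variable with p and n-p degrees of freedom (the same scaling the SAS program uses for T_SQCRIT). The short Python sketch below is an illustration rather than part of the original solution; it only rescales the two printed values and does not recompute the F quantile or the p-value.

```python
# Values taken from the SAS output and program in this solution
n, p = 23, 3             # observations kept after dropping 9 and 21; number of variables
t_sq = 30.373968         # T_SQ
t_sq_crit = 10.224691    # T_SQCRIT at alpha = .05

scale = (n - 1) * p / (n - p)   # T-square = scale * F(p, n - p) under H0
f_stat = t_sq / scale           # observed statistic on the F scale
f_crit = t_sq_crit / scale      # implied F critical value with (3, 20) df
reject = t_sq > t_sq_crit

print(round(scale, 2), round(f_stat, 3), round(f_crit, 3), reject)
# 3.3 9.204 3.098 True
```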
b)
VAL
24.898543
6.757645
4.6820876
VEC
0.458362 0.0873568 0.884462
0.7521083 -0.568348 -0.333637
0.473537 0.8181376 -0.326211
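Each axis of the 95% confidence ellipsoid has full length $2\sqrt{\lambda_i}\,\sqrt{c/n}$, where $\lambda_i$ is an eigenvalue of S, $c = 10.22$ is the
critical value of T² and $n = 23$. The longest axis has length $2\sqrt{24.90}\,\sqrt{10.22/23} = 6.65$, the next longest axis has length
$2\sqrt{6.76}\,\sqrt{10.22/23} = 3.47$, and the shortest axis has length $2\sqrt{4.68}\,\sqrt{10.22/23} = 2.89$. The directions of the axes are given by
the corresponding eigenvectors (the columns of VEC).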
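The three lengths follow from the printed eigenvalues and the T² critical value. A minimal Python sketch of the arithmetic, assuming the values are copied from VAL and T_SQCRIT above (the helper name `axis_length` is hypothetical and not part of the SAS program):

```python
import math

eigenvalues = [24.898543, 6.757645, 4.6820876]   # VAL from the SAS output
t_sq_crit = 10.224691                             # T_SQCRIT from part a)
n = 23                                            # observations used in problem 4


def axis_length(eigenvalue, crit, n):
    """Full axis length of the confidence ellipsoid: 2 * sqrt(lambda) * sqrt(crit / n)."""
    return 2 * math.sqrt(eigenvalue) * math.sqrt(crit / n)


lengths = [round(axis_length(lam, t_sq_crit, n), 2) for lam in eigenvalues]
print(lengths)   # [6.65, 3.47, 2.89]
```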
c) Simultaneous confidence intervals for µ1, µ2, µ3 and µ1 + µ2 + µ3:
T-Square Simultaneous Confidence Intervals
SCLM
9.2549748 13.243286
4.7246222 10.188421
8.4131009 12.755595
23.647662 34.932338
Bonferroni Simultaneous Confidence Intervals
SCLM
9.552743 12.945518
5.1325507 9.7804928
8.7373125 12.431383
24.490179 34.089821
All four Bonferroni intervals are shorter than the corresponding T² intervals. In most situations, at the same confidence level,
Bonferroni intervals are more precise. Although one may be inclined to always use the Bonferroni method, some situations call for
simultaneous confidence intervals based on T². When no particular contrast is of interest before the study is carried out, the T²
method is ideal, since it maintains the same confidence level even if data snooping is involved. When the number of contrasts is
relatively large, one may also be better off with the T² method, regardless of when the contrasts are formulated.
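To see how much shorter the Bonferroni intervals are here, the widths can be compared row by row. The Python sketch below is only an illustration using the limits printed in the two SCLM tables; the row labels are added for readability and do not appear in the SAS output.

```python
# (lower, upper) limits copied from the two SCLM tables
labels = ["mu1", "mu2", "mu3", "mu1+mu2+mu3"]
t2 = [(9.2549748, 13.243286), (4.7246222, 10.188421),
      (8.4131009, 12.755595), (23.647662, 34.932338)]
bonf = [(9.552743, 12.945518), (5.1325507, 9.7804928),
        (8.7373125, 12.431383), (24.490179, 34.089821)]

ratios = []
for label, (lo_t, hi_t), (lo_b, hi_b) in zip(labels, t2, bonf):
    width_t = hi_t - lo_t          # T-square interval width
    width_b = hi_b - lo_b          # Bonferroni interval width
    ratios.append(width_b / width_t)
    print(f"{label}: T2 width {width_t:.3f}, Bonferroni width {width_b:.3f}, "
          f"ratio {width_b / width_t:.3f}")
```

The ratio is about 0.85 in every row. This is expected, because both methods use the same standard error for each linear combination and differ only in the critical multiplier.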
SAS Program for Homework 4:
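options ls=78;

/* Read the three variables from the raw data file */
data milk;
  infile 'milk.dat';
  input x1-x3;
run;

/* Univariate tests for normality (Shapiro-Wilk) and normal probability plots */
proc univariate data=milk normal plot;
run;

/* Multivariate normality and chi-square plot */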
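proc iml;
  /* Squared statistical distance of each row of X from the mean vector,
     paired with the matching chi-square quantiles */
  start distance(x);
    n = nrow(x);
    p = ncol(x);
    one = j(n, 1);                            /* n x 1 vector of ones */
    xbar = t(x[:,]);                          /* p x 1 mean vector */
    sscp = x` * (i(n) - one*t(one)/n) * x;    /* centered SSCP matrix */
    s = sscp/(n-1);                           /* sample covariance matrix */
    d = j(n, 1);
    means = one*t(xbar);
    do i = 1 to n;
      d[i] = (x[i,] - means[i,]) * inv(s) * t(x[i,] - means[i,]);
    end;
    y = ranktie(d);                           /* ranks of the distances */
    chisq = cinv((y - .5)/n, p);              /* chi-square quantiles with p df */
    toplot = chisq || d;
    call pgraf(toplot, '*', 'ChiSq', 'Distance', 'Chi-square Plot');
    print toplot;
  finish distance;

  use milk;
  read all var _num_ into x;
  run distance(x);

  /* 80th percentile of the chi-square distribution with 3 df */
  chi80 = cinv(.8, 3);
  print chi80;
quit;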
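/* Pairwise scatter plots */
proc plot data=milk hpercent=50 vpercent=50;
  plot x1*x2;
  plot x1*x3;
  plot x2*x3;
run;

/* Drop the two suspected outliers (observations 9 and 21) */
data milk1;
  set milk;
  if _n_ ne 9 and _n_ ne 21;
run;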
proc iml;
  use milk1;
  read all var {x1 x2 x3} into x;

  /* Mean vector and sample covariance matrix of X */
  start stat(x, xbar, s);
    n = nrow(x);
    one = j(n, 1);
    xbar = t(x[:,]);
    sscp = x` * (i(n) - one*t(one)/n) * x;
    s = sscp/(n-1);
  finish stat;

  /* Hotelling's T-square test of H0: mean vector = mu */
  start test(x, xbar, s, mu, alpha);
    n = nrow(x);
    p = ncol(x);
    t_sq = n * (xbar-mu)` * inv(s) * (xbar-mu);
    t_sqcrit = (n-1)*p/(n-p) * finv(1-alpha, p, n-p);
    pvalue = 1 - probf((n-p)*t_sq/((n-1)*p), p, n-p);
    print t_sq t_sqcrit pvalue;
  finish test;

  mu = {10, 10, 10};
  alpha = .05;
  run stat(x, xbar, s);
  run test(x, xbar, s, mu, alpha);

  /* Eigenvalues and eigenvectors of S give the axes of the confidence ellipsoid */
  call eigen(val, vec, s);
  print val, vec;
  /* T-square simultaneous confidence intervals for the rows of A */
  start sclm_t2(xbar, s, n, a, alpha);
    p = nrow(s);
    nclm = nrow(a);
    sclm = j(nclm, 2);
    crit = p*(n-1) * finv(1-alpha, p, n-p) / (n*(n-p));
    do i = 1 to nclm;
      me = sqrt(crit * a[i,]*s*t(a[i,]));     /* margin of error */
      sclm[i,1] = a[i,]*xbar - me;
      sclm[i,2] = a[i,]*xbar + me;
    end;
    print , 'T-Square Simultaneous Confidence Intervals', sclm;
  finish sclm_t2;

  /* Rows of A: mu1, mu2, mu3 and mu1 + mu2 + mu3 */
  a = {1 0 0,
       0 1 0,
       0 0 1,
       1 1 1};
  n = 23;
  alpha = .05;
  run stat(x, xbar, s);
  run sclm_t2(xbar, s, n, a, alpha);
  /* Bonferroni simultaneous confidence intervals for the rows of A */
  start sclm_b(xbar, s, n, a, alpha);
    nclm = nrow(a);
    sclm = j(nclm, 2);
    crit = tinv(1 - alpha/(2*nclm), n-1);
    do i = 1 to nclm;
      me = crit * sqrt(a[i,]*s*t(a[i,])/n);   /* margin of error */
      sclm[i,1] = a[i,]*xbar - me;
      sclm[i,2] = a[i,]*xbar + me;
    end;
    print , 'Bonferroni Simultaneous Confidence Intervals', sclm;
  finish sclm_b;
  run sclm_b(xbar, s, n, a, alpha);
quit;
