Fall 2001
Solution 4
1)
Variable=X1
W:Normal 0.827419 Pr<W 0.0007
Variable=X2
W:Normal 0.924988 Pr<W 0.0697
Variable=X3
W:Normal 0.967717 Pr<W 0.6204
The Q-Q plots show that only X1 departs grossly from normality. The Shapiro-Wilk statistics confirm this observation: at the .05 level, normality is rejected only for X1 (p = 0.0007).
2)
Chi-square Plot
[Plot of the squared distances (vertical axis: Distance, 0 to 16) against the chi-square quantiles (horizontal axis: ChiSq, 0 to 10). The plotted pairs are listed under TOPLOT.]
Most of the observations (except for observation 9) lie along the line. This suggests that the assumption of multivariate
normality is tenable.
TOPLOT
1.4236522 0.8807247
4.0135936 3.8113869
3.0763677 2.8907255
3.6648708 3.7549258
1.0878828 0.6533776
1.9635806 1.143333
0.7558302 0.5598385
4.4149844 3.8723804
9.8374093 15.262769
0.1848318 0.0355722
2.3659739 1.3338642
1.7767839 1.019959
0.5843744 0.4707491
1.2543524 0.6645781
3.3554449 3.7399316
2.5857116 2.3258664
0.922479 0.6167535
2.8213339 2.3763783
0.4011734 0.1499929
6.2513886 6.8316784
7.40688 8.3442448
1.5973096 0.9699873
4.8904101 3.9779303
2.1593473 1.2612803
5.4773439 5.0517725
CHI80
4.6416277
Under normality, about 80% of the 25 observations (that is, about 20) are expected to have squared distances below the 0.80 chi-square quantile, 4.64. The actual count is 21, which is very close to what is
expected. Based on the chi-square plot and the probability contour check, we can say that it is reasonable to assume
trivariate normality.
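The count can be checked directly from the second column of TOPLOT (the squared distances, in observation order). The following is a minimal Python sketch, not part of the original SAS solution; the function names are illustrative, and the quantile is obtained by bisection on the closed-form chi-square CDF for 3 degrees of freedom rather than by SAS's `cinv`.

```python
import math

def chi2_cdf_3df(x):
    """Chi-square CDF for 3 degrees of freedom (closed form)."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_quantile_3df(prob, lo=0.0, hi=100.0, tol=1e-10):
    """Invert the CDF by bisection (illustrative stand-in for cinv(prob, 3))."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_3df(mid) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Squared distances copied from the second column of TOPLOT.
distances = [
    0.8807247, 3.8113869, 2.8907255, 3.7549258, 0.6533776,
    1.143333, 0.5598385, 3.8723804, 15.262769, 0.0355722,
    1.3338642, 1.019959, 0.4707491, 0.6645781, 3.7399316,
    2.3258664, 0.6167535, 2.3763783, 0.1499929, 6.8316784,
    8.3442448, 0.9699873, 3.9779303, 1.2612803, 5.0517725,
]

chi80 = chi2_quantile_3df(0.80)
expected_below = 0.80 * len(distances)
observed_below = sum(d < chi80 for d in distances)
most_extreme = distances.index(max(distances)) + 1  # 1-based observation number

print(f"chi80 = {chi80:.4f}")
print(f"expected below: {expected_below:.0f}, observed below: {observed_below}")
print(f"largest distance belongs to observation {most_extreme}")
```

The same list also shows which observations have the largest squared distances, which is useful when looking for outliers.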
3)
[Scatter plots of X1*X2, X1*X3 and X2*X3. Legend: A = 1 observation, B = 2 observations, etc.]
It can be argued that at least one of the observations (9) is an outlier. This is evident from the scatter plots and the large
statistical distance. Observation 21 is also a likely candidate.
a) With observations 9 and 21 removed (n = 23), T² = 30.37 with a p-value of .0005. At an alpha level of .05, we reject the null hypothesis that the means are all equal to 10.
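For reference, the test can be written out as follows. This is a sketch that assumes n = 23 and p = 3, as in the accompanying SAS/IML code; the critical value 10.22 is the one used in the axis-length computation of part b.

```latex
\[
T^2 = n\,(\bar{x}-\mu_0)'\,S^{-1}\,(\bar{x}-\mu_0) = 30.37,
\qquad
\frac{(n-1)p}{n-p}\,F_{p,\,n-p}(\alpha)
  = \frac{22\cdot 3}{20}\,F_{3,20}(0.05) \approx 10.22 .
\]
\[
F = \frac{n-p}{(n-1)p}\,T^2 = \frac{20}{66}\times 30.37 \approx 9.20,
\qquad
p\text{-value} = P\!\left(F_{3,20} > 9.20\right) \approx 0.0005 .
\]
```

Since 30.37 exceeds 10.22, the null hypothesis is rejected at the .05 level.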
b)
VAL
24.898543
6.757645
4.6820876
VEC
0.458362 0.0873568 0.884462
0.7521083 -0.568348 -0.333637
0.473537 0.8181376 -0.326211
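The length of the longest axis is $2\sqrt{24.90}\,\sqrt{10.22/23} = 6.65$. The next longest axis has length $2\sqrt{6.76}\,\sqrt{10.22/23} = 3.47$, while the
shortest axis has a length of $2\sqrt{4.68}\,\sqrt{10.22/23} = 2.89$. Here 10.22 is the T² critical value $\frac{(n-1)p}{n-p}F_{3,20}(.05)$ and n = 23. The axes of the ellipsoid point in the directions of the eigenvectors (the columns of VEC).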
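The axis lengths can be reproduced from the eigenvalues in VAL with a few lines of Python. This is a sketch for checking the arithmetic, not the original SAS code. It assumes n = 23 and uses the critical value rounded to 10.22, so the last digit may differ slightly from values computed with the unrounded critical value.

```python
import math

# Eigenvalues of S, copied from VAL.
eigenvalues = [24.898543, 6.757645, 4.6820876]

n = 23        # sample size after dropping observations 9 and 21 (assumed from the SAS code)
crit = 10.22  # rounded T-square critical value used in the text

# Full length of each axis: 2 * sqrt(lambda_i) * sqrt(crit / n)
axis_lengths = [2 * math.sqrt(lam) * math.sqrt(crit / n) for lam in eigenvalues]

for lam, length in zip(eigenvalues, axis_lengths):
    print(f"eigenvalue {lam:10.4f} -> axis length {length:.3f}")
```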
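c) Simultaneous confidence intervals for µ1, µ2, µ3 and µ1 + µ2 + µ3 (one row per combination; lower and upper limits). The first SCLM listing is based on T², the second on the Bonferroni method: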
SCLM
9.2549748 13.243286
4.7246222 10.188421
8.4131009 12.755595
23.647662 34.932338
SCLM
9.552743 12.945518
5.1325507 9.7804928
8.7373125 12.431383
24.490179 34.089821
The 4 intervals based on the Bonferroni method are shorter than those based on T². In most situations, at the same
confidence level, Bonferroni intervals are more precise. Although one may be inclined to always use the Bonferroni method, some
situations dictate that simultaneous confidence intervals be based on T². When no particular contrast is of interest
before the study is carried out, the T² method is ideal, since it maintains the same confidence level even if data
snooping is involved. When the number of contrasts is relatively large, one may be better off with the T²
method, regardless of when the contrasts are formulated.
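The comparison of interval lengths can be verified from the two SCLM listings. The sketch below is illustrative Python, not part of the original solution. It assumes that the first listing holds the T² intervals and the second the Bonferroni intervals, in the row order µ1, µ2, µ3, µ1 + µ2 + µ3.

```python
# Interval limits copied from the two SCLM listings.
t2_intervals = [
    (9.2549748, 13.243286),
    (4.7246222, 10.188421),
    (8.4131009, 12.755595),
    (23.647662, 34.932338),
]
bonferroni_intervals = [
    (9.552743, 12.945518),
    (5.1325507, 9.7804928),
    (8.7373125, 12.431383),
    (24.490179, 34.089821),
]
labels = ["mu1", "mu2", "mu3", "mu1+mu2+mu3"]

# Ratio of Bonferroni width to T-square width for each combination.
ratios = []
for label, (t_lo, t_hi), (b_lo, b_hi) in zip(labels, t2_intervals, bonferroni_intervals):
    t_width = t_hi - t_lo
    b_width = b_hi - b_lo
    ratios.append(b_width / t_width)
    print(f"{label:12s} T2 width {t_width:7.3f}  Bonferroni width {b_width:7.3f}  "
          f"ratio {b_width / t_width:.4f}")
```

The ratio is the same for every row because both methods use the same standard error for each combination and differ only in the critical constant.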
options ls=78;
data milk;
infile 'milk.dat';
input x1-x3;
proc iml;
START distance(X);
n=nrow(x);
p=ncol(x);
one=J(N,1);
xbar=t(X[:,]);
I=I(n);
SSCP=X`*(I-one*t(one)*1/n)*X;
s = SSCP/(N-1);
d=j(nrow(x),1);
means=one*t(xbar);
do i=1 to nrow(x);
d[i]= (x[i,]- means[i,])*inv(s)*t(x[i,]-means[i,]);
end;
y=ranktie(d);
chisq=cinv((y-.5)/n,p);
toplot=chisq||d;
print toplot;
FINISH distance;
use milk;
read all var _num_ into X;
run distance(x);
chi80=cinv(.8,3);
print chi80;
data milk1;
set milk;
if _n_ ne 9 and _n_ ne 21;
proc iml;
use milk1;
read all var{x1 x2 x3} into x;
START stat(X,Xbar,S);
N=nrow(X);
one=J(N,1);
Xbar=t(X[:,]);
I=I(n);
SSCP=X`*(I-one*t(one)*1/n)*X;
S = SSCP/(N-1);
FINISH stat;
START test(x,xbar,s,mu,alpha);
n=nrow(x);
p=ncol(x);
t_sq=n*(xbar-mu)`*inv(s)*(xbar-mu);
t_sqcrit=(n-1)*p/(n-p)*finv(1-alpha,p,n-p);
pvalue=1-probf((n-p)*t_sq/((n-1)*p),p,n-p);
print t_sq t_sqcrit pvalue;
FINISH test;
mu={10,10,10};
alpha=.05;
run stat(x,xbar,s);
run test(x,xbar,s,mu,alpha);
call eigen(val,vec,s);
print val, vec;
start sclm_t2(xbar,s,n,a,alpha);
p=nrow(s);
i=i(p);
nclm=nrow(a);
sclm=j(nclm,2);
crit=p*(n-1)*finv(1-alpha,p,n-p)/(n*(n-p));
do i=1 to nclm;
me=sqrt(crit*a[i,]*s*t(a[i,]));
sclm[i,1]=a[i,]*xbar-me;
sclm[i,2]=a[i,]*xbar+me;
end;
print sclm;
finish sclm_t2;
a={1 0 0,
0 1 0,
0 0 1,
1 1 1};
n=23;
alpha=.05;
run stat(x,xbar,s);
run sclm_t2(xbar,s,n,a,alpha);
start sclm_b(xbar,s,n,a,alpha);
p=nrow(s);
i=i(p);
nclm=nrow(a);
sclm=j(nclm,2);
crit=tinv(1-alpha/(2*nclm),n-1);
do i=1 to nclm;
me=crit*sqrt(a[i,]*s*t(a[i,])/n);
sclm[i,1]=a[i,]*xbar-me;
sclm[i,2]=a[i,]*xbar+me;
end;
print , 'Bonferroni Simultaneous Confidence Intervals', sclm;
finish sclm_b;
run sclm_b(xbar,s,n,a,alpha);