MTL 766 (Multivariate Statistical Analysis)

Tutorial Sheet No. 1


1. The largest US industrial corporations yield the following data:

Company          Sales ($x_1$)   Profits ($x_2$)   Assets ($x_3$)
GM               126974          4224              173297
Ford             96933           3835              160893
Exxon            86656           3510              83219
IBM              63438           3758              77734
GE               55264           3939              128344
Mobil            50976           1809              39080
Philip Morris    39069           2946              38528
Chrysler         36156           359               51038
Du Pont          35209           2480              34715
Texaco           32416           2413              25636

All figures are in millions of dollars.


(a) Compute $\bar{x}$, $S$ and $R$ for $(x_1, x_2, x_3)$.
(b) Use the distance measure $d(x, y) = (x - y)^T S^{-1} (x - y)$ to find the company that is nearest to the mean vector $\bar{x}$.
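
One way to check a hand computation for this question is a short NumPy sketch. It assumes the divisor $n - 1$ for $S$ (the sheet does not state the convention) and uses the squared form of the distance exactly as defined in (b); the variable names are illustrative only.

```python
import numpy as np

companies = ["GM", "Ford", "Exxon", "IBM", "GE", "Mobil",
             "Philip Morris", "Chrysler", "Du Pont", "Texaco"]

# Columns: sales, profits, assets (millions of dollars), copied from the table.
X = np.array([
    [126974, 4224, 173297],
    [96933, 3835, 160893],
    [86656, 3510, 83219],
    [63438, 3758, 77734],
    [55264, 3939, 128344],
    [50976, 1809, 39080],
    [39069, 2946, 38528],
    [36156, 359, 51038],
    [35209, 2480, 34715],
    [32416, 2413, 25636],
], dtype=float)

xbar = X.mean(axis=0)              # sample mean vector
S = np.cov(X, rowvar=False)        # sample covariance, divisor n - 1 (assumed)
R = np.corrcoef(X, rowvar=False)   # sample correlation matrix

diff = X - xbar
# d(x_i, xbar) = (x_i - xbar)^T S^{-1} (x_i - xbar), one value per company.
d = np.einsum("ij,ij->i", diff @ np.linalg.inv(S), diff)
nearest = companies[int(np.argmin(d))]
print(xbar, nearest)
```

A useful sanity check: with the $n - 1$ divisor, the distances in `d` always sum to $(n - 1)p = 27$ for this data set.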
2. Are the following valid statistical distance functions for distance from the origin? Explain.
(a) $x_1^2 + 4x_2^2 + x_1 x_2 = (\text{distance})^2$
(b) $x_1^2 - 2x_2^2 = (\text{distance})^2$
(c) $13x_1^2 + 13x_2^2 + 10x_3^2 - 8x_1 x_2 + 4x_1 x_3 - 4x_2 x_3 = (\text{distance})^2$
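
A quadratic form $x^T M x$ can serve as a squared distance from the origin only if it is positive for every nonzero $x$, that is, if the symmetric matrix $M$ is positive definite. The sketch below tests this through the eigenvalues of $M$; the helper name `is_valid_distance` is hypothetical, and only part (a) is worked as an illustration.

```python
import numpy as np

def is_valid_distance(M, tol=1e-12):
    """Return True if x^T M x > 0 for every nonzero x (M symmetric positive definite)."""
    M = np.asarray(M, dtype=float)
    if not np.allclose(M, M.T):
        raise ValueError("coefficient matrix must be symmetric")
    # A symmetric matrix is positive definite iff all eigenvalues are positive.
    return bool(np.all(np.linalg.eigvalsh(M) > tol))

# Part (a): x1^2 + 4 x2^2 + x1 x2. The cross-term coefficient is split
# evenly between the two off-diagonal entries.
M_a = np.array([[1.0, 0.5],
                [0.5, 4.0]])
print(is_valid_distance(M_a))
```

The same function applies to (b) and (c) once their coefficient matrices are written down in the same way.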

3. Show that the sample covariance matrix $S$ of data on $p$ variables is a positive semidefinite matrix. It is positive definite unless the observations on one variable are a linear function of the observations on the remaining $p - 1$ variables.
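
A numerical illustration (not a proof) of both statements, using made-up random data: $a^T S a$ equals the sample variance of the scores $Xa$, so it cannot be negative, and $S$ acquires a zero eigenvalue as soon as one column is an exact linear function of the others.

```python
import numpy as np

rng = np.random.default_rng(0)     # made-up data, for illustration only
n, p = 20, 3
X = rng.normal(size=(n, p))

S = np.cov(X, rowvar=False)
# a^T S a is the sample variance of the linear combination X a.
a = rng.normal(size=p)
quad = a @ S @ a
var_scores = np.var(X @ a, ddof=1)

# Make the third variable an exact linear function of the first two.
X_dep = X.copy()
X_dep[:, 2] = 2.0 * X[:, 0] - X[:, 1] + 5.0
S_dep = np.cov(X_dep, rowvar=False)

eig_full = np.linalg.eigvalsh(S)       # all strictly positive here
eig_dep = np.linalg.eigvalsh(S_dep)    # smallest one is numerically zero
print(eig_full, eig_dep)
```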

4. Two different visual stimuli S1 and S2 produced responses in both the left eye (L) and right eye (R) of subjects having Multiple Sclerosis. The following data are on 3 variables, viz. $x_1$ = age, $x_2$ = total response of both eyes to S1, $x_3$ = total response of both eyes to S2, for 8 subjects:

Subject x1 x2 x3
1 23 148.0 205.4
2 25 195.2 262.8
3 25 158.0 209.8
4 38 190.2 243.8
5 57 165.6 229.2
6 58 238.4 304.4
7 58 164.0 216.8
8 59 199.8 250.2

(a) Write the above data in standardized units.


(b) Suppose a distance measure of a standardized data point $P(x_1, x_2, x_3)$ from the center of the standardized data in (a) is defined as

$$d(O, P) = x^T A x, \quad \text{where } A = \begin{pmatrix} 4 & 2 & 0 \\ 2 & 4 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Find a transformation $x \to y$, using the eigenvectors of $A$, that makes the transformed variables $y = (y_1, y_2, y_3)^T$ uncorrelated.
(c) What is the transformed distance of a point $y$ from the center?
(d) Find the three principal axes of the largest hyper-ellipsoid that covers all data points in the standardized data as in (a).
(e) Can you propose a better distance measure for the transformed, supposedly uncorrelated, data obtained on the variable $y$? You may need to use the transformed $y$ values of the 8 standardized $x$ as in (a).
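
Parts (a) to (c) can be explored numerically with the sketch below. It assumes the sample standard deviation with divisor $n - 1$ for the standardization and takes the entries of $A$ as they appear in the sheet. It only demonstrates that rotating to the eigenvector basis of $A$ removes the cross-product terms of the quadratic form, so that $x^T A x = \sum_i \lambda_i y_i^2$; whether the resulting $y$ are uncorrelated for this data set is left to the exercise.

```python
import numpy as np

# Columns: age, total response to S1, total response to S2 (from the table).
X = np.array([
    [23, 148.0, 205.4],
    [25, 195.2, 262.8],
    [25, 158.0, 209.8],
    [38, 190.2, 243.8],
    [57, 165.6, 229.2],
    [58, 238.4, 304.4],
    [58, 164.0, 216.8],
    [59, 199.8, 250.2],
])

# Standardize each column: subtract the mean, divide by the sample standard
# deviation (divisor n - 1 is an assumption, not stated in the sheet).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Matrix of the quadratic form, entries as they appear in the sheet.
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 4.0, 0.0],
              [0.0, 0.0, 1.0]])

lam, P = np.linalg.eigh(A)     # columns of P are orthonormal eigenvectors of A
Y = Z @ P                      # row i holds y = P^T x for observation i

d_x = np.einsum("ij,jk,ik->i", Z, A, Z)   # x^T A x for every observation
d_y = (Y ** 2) @ lam                      # sum_i lambda_i * y_i^2
print(lam)
print(np.allclose(d_x, d_y))
```

The sample covariance of the rows of `Y`, `np.cov(Y, rowvar=False)`, is a natural starting point for parts (d) and (e).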
5. Prove the following properties for the square root of a symmetric positive definite matrix $A$ of order $k$, denoted by $A^{1/2}$.
(a) $A^{1/2}$ is a symmetric matrix.
(b) $A^{1/2} A^{1/2} = A$
(c) $(A^{1/2})^{-1} = \sum_{i=1}^{k} \frac{1}{\sqrt{\lambda_i}} e_i e_i^T = P \Lambda^{-1/2} P^T$ (denoted by $A^{-1/2}$), where $P = [e_1, e_2, \ldots, e_k]$, $\Lambda^{-1/2} = \operatorname{diag}(1/\sqrt{\lambda_1}, 1/\sqrt{\lambda_2}, \ldots, 1/\sqrt{\lambda_k})$ and $(\lambda_i, e_i)$, $i = 1, 2, \ldots, k$, are the eigenvalue, normalized eigenvector pairs of $A$.
(d) $A^{1/2} A^{-1/2} = I$, $A^{-1/2} A^{-1/2} = A^{-1}$
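
Before attempting the proofs, it can help to see the four properties hold numerically. The sketch below builds $A^{1/2}$ and $A^{-1/2}$ from the spectral decomposition of a small hypothetical matrix (the matrix is an arbitrary example, not part of the exercise).

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])     # hypothetical symmetric positive definite matrix

lam, P = np.linalg.eigh(A)                             # A = P diag(lam) P^T
A_half = P @ np.diag(np.sqrt(lam)) @ P.T               # A^{1/2}
A_neg_half = P @ np.diag(1.0 / np.sqrt(lam)) @ P.T     # A^{-1/2}

checks = {
    "(a) A_half is symmetric": np.allclose(A_half, A_half.T),
    "(b) A_half @ A_half == A": np.allclose(A_half @ A_half, A),
    "(c) inv(A_half) == A_neg_half": np.allclose(np.linalg.inv(A_half), A_neg_half),
    "(d) A_half @ A_neg_half == I": np.allclose(A_half @ A_neg_half, np.eye(2)),
    "(d) A_neg_half @ A_neg_half == inv(A)": np.allclose(
        A_neg_half @ A_neg_half, np.linalg.inv(A)),
}
for name, ok in checks.items():
    print(name, ok)
```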
6. Prove
(a) E(X + Y ) = E(X) + E(Y )
(b) E(AXB) = AE(X)B
for any random matrices X, Y and constant matrices A, B.
7. Let $X^T = (X_1, X_2, X_3)$ be a random vector with covariance matrix $\Sigma$. If $X_1$ and $X_2$ are independent, find the covariance matrix of $Z^T = (Z_1, Z_2, Z_3, Z_4)$, where $Z_1 = X_1 - 2X_2$, $Z_2 = X_1 + X_2 + X_3$, $Z_3 = X_1 + 2X_2 - X_3$ and $Z_4 = 3X_1 - 4X_2$.
8. Show that $\operatorname{cov}(a_1 X_1 + \ldots + a_p X_p, \; b_1 X_1 + \ldots + b_p X_p) = a^T \Sigma b$, where $a^T = (a_1, \ldots, a_p)$, $b^T = (b_1, \ldots, b_p)$ and $\Sigma$ is the covariance matrix of $X^T = (X_1, \ldots, X_p)$.
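
The sample analogue of this identity holds exactly, with the sample covariance matrix $S$ in place of $\Sigma$, which gives a quick numerical check. The data and the coefficient vectors below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)          # made-up data, for illustration only
X = rng.normal(size=(50, 4))            # 50 observations on p = 4 variables
a = np.array([1.0, -2.0, 0.5, 3.0])     # hypothetical coefficient vectors
b = np.array([2.0, 1.0, -1.0, 0.0])

S = np.cov(X, rowvar=False)
lhs = np.cov(X @ a, X @ b)[0, 1]        # sample covariance of the two combinations
rhs = a @ S @ b                         # bilinear form a^T S b
print(np.isclose(lhs, rhs))
```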
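9. Let $X = \begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}$, where $X^{(1)} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ and $X^{(2)} = \begin{pmatrix} X_3 \\ X_4 \\ X_5 \end{pmatrix}$. Let

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}.$$

If $X$ has mean $\mu_X^T = [2, 4, 1, 3, 0]$ and

$$\operatorname{cov}(X) = \begin{pmatrix} 4 & 1 & 1/2 & 1/2 & 0 \\ 1 & 3 & 1 & 1 & 0 \\ 1/2 & 1 & 6 & 1 & 1 \\ 1/2 & 1 & 1 & 4 & 0 \\ 0 & 0 & 1 & 0 & 2 \end{pmatrix},$$

find
(a) $E(X^{(1)})$
(b) $E(AX^{(1)})$
(c) $\operatorname{cov}(X^{(1)})$
(d) $\operatorname{cov}(AX^{(1)})$
(e) $E(BX^{(2)})$
(f) $\operatorname{cov}(X^{(1)}, X^{(2)})$
(g) $\operatorname{cov}(AX^{(1)}, BX^{(2)})$
(h) the covariance matrix of $\begin{pmatrix} AX^{(1)} \\ BX^{(2)} \end{pmatrix}$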
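
All parts of this question follow from the partition $\mu = (\mu_1, \mu_2)$, $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$ together with the rules $E(CX) = C\mu$ and $\operatorname{cov}(CX, DY) = C \operatorname{cov}(X, Y) D^T$. The sketch below shows the mechanics on hypothetical inputs, deliberately not the values of the exercise; the helper name `partition_moments` is also hypothetical.

```python
import numpy as np

def partition_moments(mu, Sigma, A, B, q):
    """Moments of A X1 and B X2, where X1 holds the first q components of X."""
    mu1, mu2 = mu[:q], mu[q:]
    S11, S12, S22 = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]
    return {
        "E(A X1)": A @ mu1,
        "E(B X2)": B @ mu2,
        "cov(A X1)": A @ S11 @ A.T,
        "cov(B X2)": B @ S22 @ B.T,
        "cov(A X1, B X2)": A @ S12 @ B.T,
    }

# Hypothetical inputs, NOT the values of the exercise.
mu = np.array([1.0, 0.0, 2.0, -1.0, 3.0])
L = np.tril(np.ones((5, 5)))
Sigma = L @ L.T                     # symmetric positive definite by construction
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

out = partition_moments(mu, Sigma, A, B, q=2)

# The stacked vector (A X1, B X2) equals C X with a block-diagonal C,
# so its covariance matrix is C Sigma C^T.
C = np.block([[A, np.zeros((2, 3))],
              [np.zeros((2, 2)), B]])
joint = C @ Sigma @ C.T
print(joint)
```

The blocks of `joint` reproduce the separate results: the diagonal blocks are the covariances of $AX^{(1)}$ and $BX^{(2)}$, and the off-diagonal block is their cross-covariance.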
10. In question 1 of this tutorial sheet, the sample mean and covariance matrix were computed in part (a). Find the sample covariance matrix for the variables (Sales $-$ Profits, Sales $-$ 2 Profits, Assets $+$ Profits). What is the sample correlation between Assets and (Sales $-$ Profits)?
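
Since the new variables are linear combinations $Cx$ of the original ones, their sample covariance matrix is $C S C^T$, and no recomputation from the raw data is needed. A minimal sketch, assuming the divisor $n - 1$ for $S$ and with the rows of the hypothetical matrix `C` encoding the three combinations:

```python
import numpy as np

# Columns: sales, profits, assets (millions of dollars), from question 1.
X = np.array([
    [126974, 4224, 173297],
    [96933, 3835, 160893],
    [86656, 3510, 83219],
    [63438, 3758, 77734],
    [55264, 3939, 128344],
    [50976, 1809, 39080],
    [39069, 2946, 38528],
    [36156, 359, 51038],
    [35209, 2480, 34715],
    [32416, 2413, 25636],
], dtype=float)

S = np.cov(X, rowvar=False)        # divisor n - 1 (assumed)

# Rows of C: sales - profits, sales - 2*profits, assets + profits.
C = np.array([[1.0, -1.0, 0.0],
              [1.0, -2.0, 0.0],
              [0.0, 1.0, 1.0]])
S_new = C @ S @ C.T

# Correlation between assets and (sales - profits) from two coefficient vectors.
a = np.array([0.0, 0.0, 1.0])
b = np.array([1.0, -1.0, 0.0])
corr = (a @ S @ b) / np.sqrt((a @ S @ a) * (b @ S @ b))
print(S_new)
print(corr)
```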