
3. The Multivariate Normal Distribution


3.1 Introduction
A generalization of the familiar bell-shaped normal density to several dimensions plays a fundamental role in multivariate analysis.

While real data are never exactly multivariate normal, the normal density is often a useful approximation to the true population distribution because of a central limit effect.

One advantage of the multivariate normal distribution is that it is mathematically tractable, and many nice results can be obtained.
To summarize, many real-world problems fall naturally within the framework
of normal theory. The importance of the normal distribution rests on its dual
role as both population model for certain natural phenomena and approximate
sampling distribution for many statistics.
3.2 The Multivariate Normal Density and Its Properties

Recall that the univariate normal distribution, with mean $\mu$ and variance $\sigma^2$, has the probability density function

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-[(x-\mu)/\sigma]^2/2}, \qquad -\infty < x < \infty$$

The term

$$\left(\frac{x-\mu}{\sigma}\right)^2 = (x-\mu)(\sigma^2)^{-1}(x-\mu)$$

in the exponent measures the squared distance from $x$ to $\mu$ in standard deviation units. This can be generalized for a $p \times 1$ vector $x$ of observations on several variables as

$$(x-\mu)'\Sigma^{-1}(x-\mu)$$

The $p \times 1$ vector $\mu$ represents the expected value of the random vector $X$, and the $p \times p$ matrix $\Sigma$ is the variance-covariance matrix of $X$.
A $p$-dimensional normal density for the random vector $X' = [X_1, X_2, \ldots, X_p]$ has the form

$$f(x) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-(x-\mu)'\Sigma^{-1}(x-\mu)/2}$$

where $-\infty < x_i < \infty$, $i = 1, 2, \ldots, p$. We shall denote this $p$-dimensional normal density by $N_p(\mu, \Sigma)$.
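As a quick check of the density formula, the following minimal sketch evaluates $f(x)$ directly and compares it with SciPy's built-in implementation; the numbers for $\mu$, $\Sigma$, and $x$ are illustrative, not taken from the text.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([1.0, 1.0])

p = len(mu)
dev = x - mu
quad = dev @ np.linalg.solve(Sigma, dev)      # (x - mu)' Sigma^{-1} (x - mu)
dens = np.exp(-0.5 * quad) / ((2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma)))

print(dens)                                    # direct evaluation of the formula
print(multivariate_normal(mu, Sigma).pdf(x))   # same value from SciPy
```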
Example 3.1 (Bivariate normal density) Let us evaluate the $p = 2$ variate normal density in terms of the individual parameters $\mu_1 = E(X_1)$, $\mu_2 = E(X_2)$, $\sigma_{11} = \operatorname{Var}(X_1)$, $\sigma_{22} = \operatorname{Var}(X_2)$, and $\rho_{12} = \sigma_{12}/(\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}) = \operatorname{Corr}(X_1, X_2)$.
Result 3.1 If $\Sigma$ is positive definite, so that $\Sigma^{-1}$ exists, then

$$\Sigma e = \lambda e \quad\text{implies}\quad \Sigma^{-1} e = \frac{1}{\lambda} e$$

so $(\lambda, e)$ is an eigenvalue-eigenvector pair for $\Sigma$ corresponding to the pair $(1/\lambda, e)$ for $\Sigma^{-1}$. Also, $\Sigma^{-1}$ is positive definite.
Constant probability density contour

$$= \{\text{all } x \text{ such that } (x-\mu)'\Sigma^{-1}(x-\mu) = c^2\} = \text{surface of an ellipsoid centered at } \mu.$$

Contours of constant density for the $p$-dimensional normal distribution are ellipsoids defined by $x$ such that

$$(x-\mu)'\Sigma^{-1}(x-\mu) = c^2$$

These ellipsoids are centered at $\mu$ and have axes $\pm c\sqrt{\lambda_i}\, e_i$, where $\Sigma e_i = \lambda_i e_i$ for $i = 1, 2, \ldots, p$.
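The axes can be computed from the eigendecomposition of $\Sigma$, as in this sketch (the covariance matrix is an illustrative assumption):

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])     # illustrative covariance matrix
c = 1.0

lam, E = np.linalg.eigh(Sigma)     # eigenvalues lam[i], eigenvectors E[:, i]
for i in range(len(lam)):
    half_axis = c * np.sqrt(lam[i]) * E[:, i]
    print(f"axis {i + 1}: +/- {half_axis}")
```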
Example 3.2 (Contours of the bivariate normal density) Obtain the axes of constant probability density contours for a bivariate normal distribution when $\sigma_{11} = \sigma_{22}$.
The solid ellipsoid of $x$ values satisfying

$$(x-\mu)'\Sigma^{-1}(x-\mu) \le \chi_p^2(\alpha)$$

has probability $1-\alpha$, where $\chi_p^2(\alpha)$ is the upper $(100\alpha)$th percentile of a chi-square distribution with $p$ degrees of freedom.
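A small Monte Carlo sketch of this fact: choosing $c^2 = \chi_p^2(\alpha)$ should capture a proportion $1-\alpha$ of draws. All numbers below are illustrative.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
alpha = 0.05
c2 = chi2.ppf(1 - alpha, df=2)     # chi^2_p(alpha), the upper percentile

X = rng.multivariate_normal(mu, Sigma, size=100_000)
d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(Sigma), X - mu)
print((d2 <= c2).mean())           # close to 0.95
```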
Additional Properties of the Multivariate Normal Distribution

The following are true for a random vector $X$ having a multivariate normal distribution:

1. Linear combinations of the components of $X$ are normally distributed.
2. All subsets of the components of $X$ have a (multivariate) normal distribution.
3. Zero covariance implies that the corresponding components are independently distributed.
4. The conditional distributions of the components are (multivariate) normal.
Result 3.2 If $X$ is distributed as $N_p(\mu, \Sigma)$, then any linear combination of variables $a'X = a_1X_1 + a_2X_2 + \cdots + a_pX_p$ is distributed as $N(a'\mu, a'\Sigma a)$. Also, if $a'X$ is distributed as $N(a'\mu, a'\Sigma a)$ for every $a$, then $X$ must be $N_p(\mu, \Sigma)$.

Example 3.3 (The distribution of a linear combination of the components of a normal random vector) Consider the linear combination $a'X$ of a multivariate normal random vector determined by the choice $a' = [1, 0, \ldots, 0]$.
Result 3.3 If $X$ is distributed as $N_p(\mu, \Sigma)$, the $q$ linear combinations

$$A_{(q\times p)} X_{(p\times 1)} = \begin{bmatrix} a_{11}X_1 + \cdots + a_{1p}X_p \\ a_{21}X_1 + \cdots + a_{2p}X_p \\ \vdots \\ a_{q1}X_1 + \cdots + a_{qp}X_p \end{bmatrix}$$

are distributed as $N_q(A\mu, A\Sigma A')$. Also $X_{(p\times 1)} + d_{(p\times 1)}$, where $d$ is a vector of constants, is distributed as $N_p(\mu + d, \Sigma)$.
Example 3.4 (The distribution of two linear combinations of the components of a normal random vector) For $X$ distributed as $N_3(\mu, \Sigma)$, find the distribution of

$$\begin{bmatrix} X_1 - X_2 \\ X_2 - X_3 \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} = AX$$
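A sketch of the computation behind this example: by Result 3.3, $AX$ is $N_2(A\mu, A\Sigma A')$. Since the example leaves $\mu$ and $\Sigma$ symbolic, the numerical values below are hypothetical placeholders.

```python
import numpy as np

A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
mu = np.array([1.0, 2.0, 0.0])               # hypothetical mean vector
Sigma = np.array([[3.0, 1.0, 1.0],
                  [1.0, 2.0, 0.0],
                  [1.0, 0.0, 1.0]])          # hypothetical covariance matrix

print(A @ mu)            # mean vector of [X1 - X2, X2 - X3]'
print(A @ Sigma @ A.T)   # covariance matrix of the two differences
```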
Result 3.4 All subsets of $X$ are normally distributed. If we respectively partition $X$, its mean vector $\mu$, and its covariance matrix $\Sigma$ as

$$X_{(p\times 1)} = \begin{bmatrix} X_1 \;\; (q\times 1) \\ X_2 \;\; ((p-q)\times 1) \end{bmatrix}, \qquad \mu_{(p\times 1)} = \begin{bmatrix} \mu_1 \;\; (q\times 1) \\ \mu_2 \;\; ((p-q)\times 1) \end{bmatrix}$$

and

$$\Sigma_{(p\times p)} = \begin{bmatrix} \Sigma_{11} \;\; (q\times q) & \Sigma_{12} \;\; (q\times(p-q)) \\ \Sigma_{21} \;\; ((p-q)\times q) & \Sigma_{22} \;\; ((p-q)\times(p-q)) \end{bmatrix}$$

then $X_1$ is distributed as $N_q(\mu_1, \Sigma_{11})$.

Example 3.5 (The distribution of a subset of a normal random vector) If $X$ is distributed as $N_5(\mu, \Sigma)$, find the distribution of $[X_2, X_4]'$.
Result 3.5

(a) If $X_1$ and $X_2$ are independent, then $\operatorname{Cov}(X_1, X_2) = 0$, a $q_1 \times q_2$ matrix of zeros, where $X_1$ is a $q_1 \times 1$ random vector and $X_2$ is a $q_2 \times 1$ random vector.

(b) If $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ is $N_{q_1+q_2}\!\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$, then $X_1$ and $X_2$ are independent if and only if $\Sigma_{12} = \Sigma_{21}' = 0$.

(c) If $X_1$ and $X_2$ are independent and are distributed as $N_{q_1}(\mu_1, \Sigma_{11})$ and $N_{q_2}(\mu_2, \Sigma_{22})$, respectively, then $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ has the multivariate normal distribution

$$N_{q_1+q_2}\!\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{bmatrix}\right)$$
Example 3.6 (The equivalence of zero covariance and independence for normal variables) Let $X_{(3\times 1)}$ be $N_3(\mu, \Sigma)$ with

$$\Sigma = \begin{bmatrix} 4 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$

Are $X_1$ and $X_2$ independent? What about $(X_1, X_2)$ and $X_3$?
Result 3.6 Let $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ be distributed as $N_p(\mu, \Sigma)$ with $\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}$, $\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$, and $|\Sigma_{22}| > 0$. Then the conditional distribution of $X_1$, given that $X_2 = x_2$, is normal and has

$$\text{Mean} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$$

and

$$\text{Covariance} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$$

Note that the covariance does not depend on the value $x_2$ of the conditioning variable.
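A minimal sketch of Result 3.6, computing the conditional mean and covariance for an assumed partition; all numbers are illustrative.

```python
import numpy as np

mu1, mu2 = np.array([1.0]), np.array([0.0, 2.0])
S11 = np.array([[2.0]])
S12 = np.array([[0.5, 0.3]])
S22 = np.array([[1.0, 0.2],
                [0.2, 1.5]])
x2 = np.array([0.5, 1.0])

cond_mean = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)   # mu1 + S12 S22^{-1}(x2 - mu2)
cond_cov = S11 - S12 @ np.linalg.solve(S22, S12.T)       # S11 - S12 S22^{-1} S21
print(cond_mean, cond_cov)
```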
Example 3.7 (The conditional density of a bivariate normal distribution) Obtain the conditional density of $X_1$, given that $X_2 = x_2$, for the bivariate normal distribution.

Result 3.7 Let $X$ be distributed as $N_p(\mu, \Sigma)$ with $|\Sigma| > 0$. Then

(a) $(X-\mu)'\Sigma^{-1}(X-\mu)$ is distributed as $\chi_p^2$, where $\chi_p^2$ denotes the chi-square distribution with $p$ degrees of freedom.

(b) The $N_p(\mu, \Sigma)$ distribution assigns probability $1-\alpha$ to the solid ellipsoid $\{x : (x-\mu)'\Sigma^{-1}(x-\mu) \le \chi_p^2(\alpha)\}$, where $\chi_p^2(\alpha)$ denotes the upper $(100\alpha)$th percentile of the $\chi_p^2$ distribution.
Result 3.8 Let $X_1, X_2, \ldots, X_n$ be mutually independent with $X_j$ distributed as $N_p(\mu_j, \Sigma)$. (Note that each $X_j$ has the same covariance matrix $\Sigma$.) Then

$$V_1 = c_1X_1 + c_2X_2 + \cdots + c_nX_n$$

is distributed as $N_p\!\left(\sum_{j=1}^{n} c_j\mu_j,\; \Bigl(\sum_{j=1}^{n} c_j^2\Bigr)\Sigma\right)$. Moreover, $V_1$ and $V_2 = b_1X_1 + b_2X_2 + \cdots + b_nX_n$ are jointly multivariate normal with covariance matrix

$$\begin{bmatrix} \Bigl(\sum_{j=1}^{n} c_j^2\Bigr)\Sigma & (b'c)\Sigma \\ (b'c)\Sigma & \Bigl(\sum_{j=1}^{n} b_j^2\Bigr)\Sigma \end{bmatrix}$$

Consequently, $V_1$ and $V_2$ are independent if $b'c = \sum_{j=1}^{n} c_jb_j = 0$.
Example 3.8 (Linear combinations of random vectors) Let $X_1, X_2, X_3$, and $X_4$ be independent and identically distributed $3 \times 1$ random vectors with

$$\mu = \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix} \quad\text{and}\quad \Sigma = \begin{bmatrix} 3 & -1 & 1 \\ -1 & 1 & 0 \\ 1 & 0 & 2 \end{bmatrix}$$

(a) Find the mean and variance of the linear combination $a'X_1$ of the three components of $X_1$, where $a = [a_1\; a_2\; a_3]'$.

(b) Consider the two linear combinations of random vectors

$$\tfrac{1}{2}X_1 + \tfrac{1}{2}X_2 + \tfrac{1}{2}X_3 + \tfrac{1}{2}X_4 \quad\text{and}\quad X_1 + X_2 + X_3 - 3X_4.$$

Find the mean vector and covariance matrix for each linear combination of vectors and also the covariance between them.
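A sketch of part (b) using Result 3.8. It assumes the reconstruction of $\mu$ and $\Sigma$ above (the extraction dropped the minus signs, so those entries are a labeled assumption).

```python
import numpy as np

mu = np.array([3.0, -1.0, 1.0])               # assumed reconstruction
Sigma = np.array([[3.0, -1.0, 1.0],
                  [-1.0, 1.0, 0.0],
                  [1.0, 0.0, 2.0]])           # assumed reconstruction
c = np.array([0.5, 0.5, 0.5, 0.5])            # coefficients of V1
b = np.array([1.0, 1.0, 1.0, -3.0])           # coefficients of V2

print(c.sum() * mu, (c @ c) * Sigma)   # mean and covariance of V1
print(b.sum() * mu, (b @ b) * Sigma)   # mean and covariance of V2
print(b @ c)                           # b'c = 0, so V1 and V2 are independent
```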
3.3 Sampling from a Multivariate Normal Distribution and Maximum Likelihood Estimation

The Multivariate Normal Likelihood

The joint density function of all $p \times 1$ observed random vectors $X_1, X_2, \ldots, X_n$ is

$$\begin{aligned}
\begin{Bmatrix}\text{Joint density of}\\ X_1, X_2, \ldots, X_n\end{Bmatrix}
&= \prod_{j=1}^{n}\left\{\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}}\, e^{-(x_j-\mu)'\Sigma^{-1}(x_j-\mu)/2}\right\} \\
&= \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, e^{-\sum_{j=1}^{n}(x_j-\mu)'\Sigma^{-1}(x_j-\mu)/2} \\
&= \frac{1}{(2\pi)^{np/2}|\Sigma|^{n/2}}\, \exp\!\left\{-\operatorname{tr}\!\left[\Sigma^{-1}\Bigl(\sum_{j=1}^{n}(x_j-\bar{x})(x_j-\bar{x})' + n(\bar{x}-\mu)(\bar{x}-\mu)'\Bigr)\right]\Big/2\right\}
\end{aligned}$$
Likelihood

When the numerical values of the observations become available, they may be substituted for the $x_j$ in the equation above. The resulting expression, now considered as a function of $\mu$ and $\Sigma$ for the fixed set of observations $x_1, x_2, \ldots, x_n$, is called the likelihood.

Maximum likelihood estimation

One meaning of "best" is to select the parameter values that maximize the joint density evaluated at the observations. This technique is called maximum likelihood estimation, and the maximizing parameter values are called maximum likelihood estimates.

Result 3.9 Let $A$ be a $k \times k$ symmetric matrix and $x$ be a $k \times 1$ vector. Then

(a) $x'Ax = \operatorname{tr}(x'Ax) = \operatorname{tr}(Axx')$

(b) $\operatorname{tr}(A) = \sum_{i=1}^{k} \lambda_i$, where the $\lambda_i$ are the eigenvalues of $A$.
Maximum Likelihood Estimators of $\mu$ and $\Sigma$

Result 3.10 Given a $p \times p$ symmetric positive definite matrix $B$ and a scalar $b > 0$, it follows that

$$\frac{1}{|\Sigma|^{b}}\, e^{-\operatorname{tr}(\Sigma^{-1}B)/2} \le \frac{1}{|B|^{b}}\,(2b)^{pb}\, e^{-bp}$$

for all positive definite $\Sigma_{(p\times p)}$, with equality holding only for $\Sigma = (1/2b)B$.

Result 3.11 Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with mean $\mu$ and covariance $\Sigma$. Then

$$\hat{\mu} = \bar{X} \quad\text{and}\quad \hat{\Sigma} = \frac{1}{n}\sum_{j=1}^{n}(X_j - \bar{X})(X_j - \bar{X})' = \frac{n-1}{n}S$$

are the maximum likelihood estimators of $\mu$ and $\Sigma$, respectively. Their observed values $\bar{x}$ and $(1/n)\sum_{j=1}^{n}(x_j - \bar{x})(x_j - \bar{x})'$ are called the maximum likelihood estimates of $\mu$ and $\Sigma$.
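A short simulation sketch of Result 3.11, computing $\hat{\mu}$ and $\hat{\Sigma} = \frac{n-1}{n}S$ from generated data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)

x_bar = X.mean(axis=0)                  # mu_hat
S = np.cov(X, rowvar=False)             # unbiased S, divisor n - 1
Sigma_hat = (n - 1) / n * S             # maximum likelihood estimate of Sigma
print(x_bar)
print(Sigma_hat)
```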
Invariance Property of Maximum Likelihood Estimators

Let $\hat{\theta}$ be the maximum likelihood estimator of $\theta$, and consider the parameter $h(\theta)$, which is a function of $\theta$. Then the maximum likelihood estimate of $h(\theta)$ is given by $h(\hat{\theta})$.

For example:

1. The maximum likelihood estimator of $\mu'\Sigma^{-1}\mu$ is $\hat{\mu}'\hat{\Sigma}^{-1}\hat{\mu}$, where $\hat{\mu} = \bar{X}$ and $\hat{\Sigma} = \frac{n-1}{n}S$ are the maximum likelihood estimators of $\mu$ and $\Sigma$, respectively.

2. The maximum likelihood estimator of $\sqrt{\sigma_{ii}}$ is $\sqrt{\hat{\sigma}_{ii}}$, where $\hat{\sigma}_{ii} = \frac{1}{n}\sum_{j=1}^{n}(X_{ij} - \bar{X}_i)^2$ is the maximum likelihood estimator of $\sigma_{ii} = \operatorname{Var}(X_i)$.
Sufficient Statistics

Let $X_1, X_2, \ldots, X_n$ be a random sample from a multivariate normal population with mean $\mu$ and covariance $\Sigma$. Then

$$\bar{X} \quad\text{and}\quad S = \frac{1}{n-1}\sum_{j=1}^{n}(X_j - \bar{X})(X_j - \bar{X})'$$

are sufficient statistics.

The importance of sufficient statistics for normal populations is that all of the information about $\mu$ and $\Sigma$ in the data matrix $X$ is contained in $\bar{X}$ and $S$, regardless of the sample size $n$. This generally is not true for nonnormal populations.

Since many multivariate techniques begin with sample means and covariances, it is prudent to check the adequacy of the multivariate normal assumption. If the data cannot be regarded as multivariate normal, techniques that depend solely on $\bar{X}$ and $S$ may be ignoring other useful sample information.
3.4 The Sampling Distribution of $\bar{X}$ and $S$

The univariate case ($p = 1$)

$\bar{X}$ is normal with mean $\mu$ (the population mean) and variance

$$\frac{1}{n}\sigma^2 = \frac{\text{population variance}}{\text{sample size}}$$

For the sample variance, recall that $(n-1)s^2 = \sum_{j=1}^{n}(X_j - \bar{X})^2$ is distributed as $\sigma^2$ times a chi-square variable having $n-1$ degrees of freedom (d.f.). The chi-square is the distribution of a sum of squares of independent standard normal random variables. That is, $(n-1)s^2$ is distributed as $\sigma^2(Z_1^2 + \cdots + Z_{n-1}^2) = (\sigma Z_1)^2 + \cdots + (\sigma Z_{n-1})^2$. The individual terms $\sigma Z_i$ are independently distributed as $N(0, \sigma^2)$.
Wishart distribution

$$W_m(\cdot \mid \Sigma) = \text{Wishart distribution with } m \text{ d.f.} = \text{distribution of } \sum_{j=1}^{m} Z_jZ_j'$$

where the $Z_j$ are each independently distributed as $N_p(0, \Sigma)$.

Properties of the Wishart Distribution

1. If $A_1$ is distributed as $W_{m_1}(A_1 \mid \Sigma)$ independently of $A_2$, which is distributed as $W_{m_2}(A_2 \mid \Sigma)$, then $A_1 + A_2$ is distributed as $W_{m_1+m_2}(A_1 + A_2 \mid \Sigma)$. That is, the degrees of freedom add.

2. If $A$ is distributed as $W_m(A \mid \Sigma)$, then $CAC'$ is distributed as $W_m(CAC' \mid C\Sigma C')$.
The Sampling Distribution of $\bar{X}$ and $S$

Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a $p$-variate normal distribution with mean $\mu$ and covariance matrix $\Sigma$. Then

1. $\bar{X}$ is distributed as $N_p(\mu, \frac{1}{n}\Sigma)$.
2. $(n-1)S$ is distributed as a Wishart random matrix with $n-1$ d.f.
3. $\bar{X}$ and $S$ are independent.
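A small sketch consistent with items 1 and 2: one draw of $(n-1)S$ from normal data set against a direct Wishart draw with the same degrees of freedom. The covariance matrix is illustrative.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(2)
n, p = 10, 2
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
A = (n - 1) * np.cov(X, rowvar=False)                     # one draw of (n-1)S

W = wishart.rvs(df=n - 1, scale=Sigma, random_state=0)    # one W_{n-1}(Sigma) draw
print(A)
print(W)    # both matrices scatter around (n-1) * Sigma
```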
3.5 Large-Sample Behavior of $\bar{X}$ and $S$

Result 3.12 (Law of large numbers) Let $Y_1, Y_2, \ldots, Y_n$ be independent observations from a population with mean $E(Y_i) = \mu$. Then

$$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_n}{n}$$

converges in probability to $\mu$ as $n$ increases without bound. That is, for any prescribed accuracy $\varepsilon > 0$, $P[-\varepsilon < \bar{Y} - \mu < \varepsilon]$ approaches unity as $n \to \infty$.

Result 3.13 (The central limit theorem) Let $X_1, X_2, \ldots, X_n$ be independent observations from any population with mean $\mu$ and finite covariance $\Sigma$. Then

$$\sqrt{n}(\bar{X} - \mu) \text{ has an approximate } N_p(0, \Sigma) \text{ distribution}$$

for large sample sizes. Here $n$ should also be large relative to $p$.
Large-Sample Behavior of $\bar{X}$ and $S$

Let $X_1, X_2, \ldots, X_n$ be independent observations from a population with mean $\mu$ and finite (nonsingular) covariance $\Sigma$. Then

$$\sqrt{n}(\bar{X} - \mu) \text{ is approximately } N_p(0, \Sigma)$$

and

$$n(\bar{X} - \mu)'S^{-1}(\bar{X} - \mu) \text{ is approximately } \chi_p^2$$

for $n - p$ large.
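A sketch of the large-sample statistic $n(\bar{X}-\mu)'S^{-1}(\bar{X}-\mu)$ compared against a $\chi_p^2$ cutoff; the data are simulated under the hypothesized mean.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n, p = 500, 4
mu0 = np.zeros(p)
X = rng.multivariate_normal(mu0, np.eye(p), size=n)

x_bar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
T = n * (x_bar - mu0) @ np.linalg.solve(S, x_bar - mu0)
print(T, chi2.ppf(0.95, df=p))   # T below the cutoff is consistent with mu = mu0
```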
3.6 Assessing the Assumption of Normality

Most of the statistical techniques discussed assume that each vector observation $X_j$ comes from a multivariate normal distribution. In situations where the sample size is large and the techniques depend solely on the behavior of $\bar{X}$, or on distances involving $\bar{X}$ of the form $n(\bar{X} - \mu)'S^{-1}(\bar{X} - \mu)$, the assumption of normality for the individual observations is less crucial.

But to some degree, the quality of inferences made by these methods depends on how closely the true parent population resembles the multivariate normal form.
Therefore, we address these questions:

1. Do the marginal distributions of the elements of $X$ appear to be normal? What about a few linear combinations of the components $X_j$?
2. Do the scatter plots of observations on different characteristics give the elliptical appearance expected from a normal population?
3. Are there any "wild" observations that should be checked for accuracy?
Evaluating the Normality of the Univariate Marginal Distributions

Dot diagrams for smaller $n$ and histograms for $n > 25$ or so help reveal situations where one tail of a univariate distribution is much longer than the other.

If the histogram for a variable $X_i$ appears reasonably symmetric, we can check further by counting the number of observations in certain intervals. For example, a univariate normal distribution assigns probability 0.683 to the interval

$$(\mu_i - \sqrt{\sigma_{ii}},\; \mu_i + \sqrt{\sigma_{ii}})$$

and probability 0.954 to the interval

$$(\mu_i - 2\sqrt{\sigma_{ii}},\; \mu_i + 2\sqrt{\sigma_{ii}})$$

Consequently, with a large sample size $n$, we expect the observed proportion $\hat{p}_{i1}$ of the observations lying in the interval $(\bar{x}_i - \sqrt{s_{ii}},\; \bar{x}_i + \sqrt{s_{ii}})$ to be about 0.683, and the observed proportion $\hat{p}_{i2}$ lying in $(\bar{x}_i - 2\sqrt{s_{ii}},\; \bar{x}_i + 2\sqrt{s_{ii}})$ to be about 0.954.
Using the normal approximation to the sampling distribution of $\hat{p}_i$, observe that either

$$|\hat{p}_{i1} - 0.683| > 3\sqrt{\frac{(0.683)(0.317)}{n}} = \frac{1.396}{\sqrt{n}}$$

or

$$|\hat{p}_{i2} - 0.954| > 3\sqrt{\frac{(0.954)(0.046)}{n}} = \frac{0.628}{\sqrt{n}}$$

would indicate departures from an assumed normal distribution for the $i$th characteristic.
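A sketch of this interval-count check for a single variable, using simulated stand-in data:

```python
import numpy as np

x = np.random.default_rng(4).normal(loc=10.0, scale=2.0, size=100)  # stand-in data
n = len(x)
xb, s = x.mean(), x.std(ddof=1)

p1 = np.mean(np.abs(x - xb) < s)         # proportion within one std dev
p2 = np.mean(np.abs(x - xb) < 2 * s)     # proportion within two std devs
print(abs(p1 - 0.683) > 1.396 / np.sqrt(n))   # True would flag non-normality
print(abs(p2 - 0.954) > 0.628 / np.sqrt(n))
```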
Plots are always useful devices in any data analysis. Special plots called Q-Q plots can be used to assess the assumption of normality.

Let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ represent the observations after they are ordered according to magnitude. For a standard normal distribution, the quantiles $q_{(j)}$ are defined by the relation

$$P[Z \le q_{(j)}] = \int_{-\infty}^{q_{(j)}} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = p_{(j)} = \frac{j - \frac{1}{2}}{n}$$

Here $p_{(j)}$ is the probability of getting a value less than or equal to $q_{(j)}$ in a single drawing from a standard normal population.

The idea is to look at the pairs of quantiles $(q_{(j)}, x_{(j)})$ with the same associated cumulative probability $(j - \frac{1}{2})/n$. If the data arise from a normal population, the pairs $(q_{(j)}, x_{(j)})$ will be approximately linearly related, since $\sigma q_{(j)} + \mu$ is nearly the expected sample quantile.
Example 3.9 (Constructing a Q-Q plot) A sample of $n = 10$ observations gives the values in the following table. The steps leading to a Q-Q plot are as follows:

1. Order the original observations to get $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ and their corresponding probability values $(1 - \frac{1}{2})/n, (2 - \frac{1}{2})/n, \ldots, (n - \frac{1}{2})/n$;
2. Calculate the standard normal quantiles $q_{(1)}, q_{(2)}, \ldots, q_{(n)}$; and
3. Plot the pairs of observations $(q_{(1)}, x_{(1)}), (q_{(2)}, x_{(2)}), \ldots, (q_{(n)}, x_{(n)})$, and examine the straightness of the outcome.
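A sketch of the three steps using SciPy's standard normal quantile function; the ten values below are placeholders standing in for the table's observations, which are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

# Placeholder sample standing in for the n = 10 table values.
x = np.array([1.2, 0.5, -0.3, 2.1, 0.9, 1.5, -1.0, 0.1, 0.7, 1.8])

x_sorted = np.sort(x)                         # step 1: order the observations
n = len(x)
probs = (np.arange(1, n + 1) - 0.5) / n       # probability levels (j - 1/2)/n
q = norm.ppf(probs)                           # step 2: standard normal quantiles
print(np.c_[q, x_sorted])                     # step 3: plot these pairs
```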
Example 3.10 (A Q-Q plot for radiation data) The quality-control department of a manufacturer of microwave ovens is required by the federal government to monitor the amount of radiation emitted when the doors of the ovens are closed. Observations of the radiation emitted through the closed doors of $n = 42$ randomly selected ovens were made. The data are listed in the following table.
The straightness of the Q-Q plot can be measured by calculating the correlation coefficient of the points in the plot. The correlation coefficient for the Q-Q plot is defined by

$$r_Q = \frac{\sum_{j=1}^{n}(x_{(j)} - \bar{x})(q_{(j)} - \bar{q})}{\sqrt{\sum_{j=1}^{n}(x_{(j)} - \bar{x})^2}\,\sqrt{\sum_{j=1}^{n}(q_{(j)} - \bar{q})^2}}$$

and a powerful test of normality can be based on it. Formally, we reject the hypothesis of normality at level of significance $\alpha$ if $r_Q$ falls below the appropriate value in the following table.
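A self-contained sketch of $r_Q$; for normal data the value should be near 1.

```python
import numpy as np
from scipy.stats import norm

def r_Q(x):
    # Correlation coefficient of the Q-Q plot points for a univariate sample.
    x_sorted = np.sort(np.asarray(x, dtype=float))
    n = len(x_sorted)
    q = norm.ppf((np.arange(1, n + 1) - 0.5) / n)
    xd, qd = x_sorted - x_sorted.mean(), q - q.mean()
    return (xd @ qd) / np.sqrt((xd @ xd) * (qd @ qd))

print(r_Q(np.random.default_rng(5).normal(size=42)))   # close to 1 for normal data
```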
Example 3.11 (A correlation coefficient test for normality) Let us calculate the correlation coefficient $r_Q$ from the Q-Q plot of Example 3.9 and test for normality.
Linear combinations of more than one characteristic can be investigated. Many statisticians suggest plotting

$$\hat{e}_1'x_j \quad\text{where}\quad S\hat{e}_1 = \hat{\lambda}_1\hat{e}_1$$

in which $\hat{\lambda}_1$ is the largest eigenvalue of $S$. Here $x_j' = [x_{j1}, x_{j2}, \ldots, x_{jp}]$ is the $j$th observation on the $p$ variables $X_1, X_2, \ldots, X_p$. The linear combination $\hat{e}_p'x_j$ corresponding to the smallest eigenvalue is also frequently singled out for inspection.
Evaluating Bivariate Normality

By Result 3.7, the set of bivariate outcomes $x$ such that

$$(x - \mu)'\Sigma^{-1}(x - \mu) \le \chi_2^2(0.5)$$

has probability 0.5. Thus we should expect roughly the same percentage, 50%, of sample observations to lie in the ellipse given by

$$\{\text{all } x \text{ such that } (x - \bar{x})'S^{-1}(x - \bar{x}) \le \chi_2^2(0.5)\}$$

where $\mu$ is replaced by $\bar{x}$ and $\Sigma^{-1}$ by its estimate $S^{-1}$. If not, the normality assumption is suspect.

Example 3.12 (Checking bivariate normality) Although not a random sample, data consisting of the pairs of observations ($x_1$ = sales, $x_2$ = profits) for the 10 largest companies in the world are listed in the following table. Check whether $(x_1, x_2)$ follows a bivariate normal distribution.
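A sketch of the 50% ellipse check with simulated stand-in data:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(6)
X = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 2.0]], size=10)

x_bar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
d2 = np.einsum('ij,jk,ik->i', X - x_bar, np.linalg.inv(S), X - x_bar)
print(np.mean(d2 <= chi2.ppf(0.5, df=2)))   # should be roughly 0.5
```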
A somewhat more formal method for judging normality of a data set is based on the squared generalized distances

$$d_j^2 = (x_j - \bar{x})'S^{-1}(x_j - \bar{x})$$

When the parent population is multivariate normal and both $n$ and $n - p$ are greater than 25 or 30, each of the squared distances $d_1^2, d_2^2, \ldots, d_n^2$ should behave like a chi-square random variable.

Although these distances are not independent or exactly chi-square distributed, it is helpful to plot them as if they were. The resulting plot is called a chi-square plot or gamma plot, because the chi-square distribution is a special case of the more general gamma distribution. To construct the chi-square plot:

1. Order the squared distances in the equation above from smallest to largest as $d_{(1)}^2 \le d_{(2)}^2 \le \cdots \le d_{(n)}^2$.
2. Graph the pairs $(q_{c,p}((j - \frac{1}{2})/n), d_{(j)}^2)$, where $q_{c,p}((j - \frac{1}{2})/n)$ is the $100(j - \frac{1}{2})/n$ quantile of the chi-square distribution with $p$ degrees of freedom.
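A sketch of both steps with simulated stand-in data; plotting the printed pairs gives the chi-square plot.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
n, p = 30, 4
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)    # stand-in data

x_bar = X.mean(axis=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.sort(np.einsum('ij,jk,ik->i', X - x_bar, S_inv, X - x_bar))   # step 1
q = chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=p)                   # step 2
print(np.c_[q, d2])   # near-linearity of these pairs supports normality
```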
Example 3.13 (Constructing a chi-square plot) Let us construct a chi-square plot of the generalized distances given in Example 3.12. The ordered distances and the corresponding chi-square percentiles for $p = 2$ and $n = 10$ are listed in the following table.
Example 3.14 (Evaluating multivariate normality for a four-variable data set) The data in Table 4.3 were obtained by taking four different measures of stiffness, $x_1, x_2, x_3$, and $x_4$, of each of $n = 30$ boards. The first measurement involves sending a shock wave down the board, the second measurement is determined while vibrating the board, and the last two measurements are obtained from static tests. The squared distances $d_j^2 = (x_j - \bar{x})'S^{-1}(x_j - \bar{x})$ are also presented in the table.
3.7 Detecting Outliers and Cleaning Data

Outliers are best detected visually whenever this is possible.

For a single random variable, the problem is one-dimensional, and we look for observations that are far from the others.

In the bivariate case, the situation is more complicated. Figure 4.10 shows a situation with two unusual observations.

In higher dimensions, there can be outliers that cannot be detected from the univariate plots or even the bivariate scatter plots. Here a large value of $(x_j - \bar{x})'S^{-1}(x_j - \bar{x})$ will suggest an unusual observation, even though it cannot be seen visually.
Steps for Detecting Outliers

1. Make a dot plot for each variable.
2. Make a scatter plot for each pair of variables.
3. Calculate the standardized values $z_{jk} = (x_{jk} - \bar{x}_k)/\sqrt{s_{kk}}$ for $j = 1, 2, \ldots, n$ and each column $k = 1, 2, \ldots, p$. Examine these standardized values for large or small values.
4. Calculate the generalized squared distances $(x_j - \bar{x})'S^{-1}(x_j - \bar{x})$. Examine these distances for unusually large values. In a chi-square plot, these would be the points farthest from the origin. (A sketch of steps 3 and 4 follows this list.)
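As referenced above, a sketch of steps 3 and 4 for a data matrix whose rows are observations (simulated stand-in data):

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=20)   # stand-in data

x_bar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
Z = (X - x_bar) / np.sqrt(np.diag(S))                          # step 3
d2 = np.einsum('ij,jk,ik->i', X - x_bar, np.linalg.inv(S), X - x_bar)  # step 4

print(np.abs(Z).max(axis=0))    # flag standardized values around 3 or more
print(d2.argmax(), d2.max())    # the most distant observation
```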
Example 3.15 (Detecting outliers in the data on lumber) Table 4.4 contains the data in Table 4.3, along with the standardized observations. These data consist of four different measurements of stiffness, $x_1, x_2, x_3$, and $x_4$, on each of $n = 30$ boards. Detect outliers in these data.
3.8 Transformations to Near Normality

If normality is not a viable assumption, what is the next step?

One option is to ignore the findings of a normality check and proceed as if the data were normally distributed. (Not recommended.)

Another is to make nonnormal data more "normal looking" by considering transformations of the data. Normal-theory analyses can then be carried out with the suitably transformed data.

Appropriate transformations are suggested by

1. theoretical considerations, or
2. the data themselves.
Helpful Transformations to Near Normality

Original Scale → Transformed Scale
1. Counts, $y$ → $\sqrt{y}$
2. Proportions, $\hat{p}$ → $\operatorname{logit}(\hat{p}) = \frac{1}{2}\log\left(\frac{\hat{p}}{1-\hat{p}}\right)$
3. Correlations, $r$ → Fisher's $z(r) = \frac{1}{2}\log\left(\frac{1+r}{1-r}\right)$

Box and Cox transformation

$$x^{(\lambda)} = \begin{cases} \dfrac{x^{\lambda}-1}{\lambda}, & \lambda \neq 0 \\[4pt] \ln x, & \lambda = 0 \end{cases}
\qquad\text{or, scaled by the geometric mean,}\qquad
y_j^{(\lambda)} = \frac{x_j^{\lambda}-1}{\lambda\left[\left(\prod_{i=1}^{n} x_i\right)^{1/n}\right]^{\lambda-1}}, \quad j = 1, \ldots, n$$

Given the observations $x_1, x_2, \ldots, x_n$, the Box-Cox choice of an appropriate power $\lambda$ is the solution that maximizes the expression

$$\ell(\lambda) = -\frac{n}{2}\ln\left[\frac{1}{n}\sum_{j=1}^{n}\bigl(x_j^{(\lambda)} - \overline{x^{(\lambda)}}\bigr)^2\right] + (\lambda - 1)\sum_{j=1}^{n}\ln x_j$$

where

$$\overline{x^{(\lambda)}} = \frac{1}{n}\sum_{j=1}^{n}\frac{x_j^{\lambda}-1}{\lambda}.$$
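A minimal grid-search sketch for the maximizing $\lambda$; SciPy's scipy.stats.boxcox solves the same one-dimensional problem numerically. The data below are an illustrative skewed sample.

```python
import numpy as np

def ell(lam, x):
    # Box-Cox log-likelihood l(lambda) for positive observations x.
    xl = np.log(x) if abs(lam) < 1e-12 else (x**lam - 1) / lam
    n = len(x)
    return -n / 2 * np.log(np.mean((xl - xl.mean()) ** 2)) + (lam - 1) * np.log(x).sum()

x = np.random.default_rng(9).lognormal(size=42)   # skewed, positive stand-in data
grid = np.linspace(-1.0, 1.5, 251)
lam_hat = max(grid, key=lambda lam: ell(lam, x))
print(lam_hat)    # near 0 here, since log-normal data are normal after ln
```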
Example 3.16 (Determining a power transformation for univariate data) We gave readings of microwave radiation emitted through the closed doors of $n = 42$ ovens in Example 3.10. The Q-Q plot of these data in Figure 4.6 indicates that the observations deviate from what would be expected if they were normally distributed. Since all the observations are positive, let us perform a power transformation of the data which, we hope, will produce results that are more nearly normal. We must find the value of $\lambda$ that maximizes the function $\ell(\lambda)$.
Transforming Multivariate Observations

With multivariate observations, a power transformation must be selected for each of the variables. Let $\lambda_1, \lambda_2, \ldots, \lambda_p$ be the power transformations for the $p$ measured characteristics. Each $\lambda_k$ can be selected by maximizing

$$\ell_k(\lambda_k) = -\frac{n}{2}\ln\left[\frac{1}{n}\sum_{j=1}^{n}\bigl(x_{jk}^{(\lambda_k)} - \overline{x_k^{(\lambda_k)}}\bigr)^2\right] + (\lambda_k - 1)\sum_{j=1}^{n}\ln x_{jk}$$

where $x_{1k}, x_{2k}, \ldots, x_{nk}$ are the $n$ observations on the $k$th variable, $k = 1, 2, \ldots, p$. Here

$$\overline{x_k^{(\lambda_k)}} = \frac{1}{n}\sum_{j=1}^{n}\frac{x_{jk}^{\lambda_k}-1}{\lambda_k}$$

Let $\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_p$ be the values that individually maximize the equation above. Then the $j$th transformed multivariate observation is

$$x_j^{(\hat{\lambda})} = \left[\frac{x_{j1}^{\hat{\lambda}_1}-1}{\hat{\lambda}_1},\; \frac{x_{j2}^{\hat{\lambda}_2}-1}{\hat{\lambda}_2},\; \ldots,\; \frac{x_{jp}^{\hat{\lambda}_p}-1}{\hat{\lambda}_p}\right]'$$
The procedure just described is equivalent to making each marginal distribution approximately normal. Although normal marginals are not sufficient to ensure that the joint distribution is normal, in practical applications this may be good enough.

If not, the values $\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_p$ obtained from the preceding transformations can be used as starting points from which to iterate toward the set of values $\lambda' = [\lambda_1, \lambda_2, \ldots, \lambda_p]$ that collectively maximizes

$$\ell(\lambda_1, \lambda_2, \ldots, \lambda_p) = -\frac{n}{2}\ln|S(\lambda)| + (\lambda_1 - 1)\sum_{j=1}^{n}\ln x_{j1} + (\lambda_2 - 1)\sum_{j=1}^{n}\ln x_{j2} + \cdots + (\lambda_p - 1)\sum_{j=1}^{n}\ln x_{jp}$$

where $S(\lambda)$ is the sample covariance matrix computed from

$$x_j^{(\lambda)} = \left[\frac{x_{j1}^{\lambda_1}-1}{\lambda_1},\; \frac{x_{j2}^{\lambda_2}-1}{\lambda_2},\; \ldots,\; \frac{x_{jp}^{\lambda_p}-1}{\lambda_p}\right]', \qquad j = 1, 2, \ldots, n$$
Example 3.17 (Determining power transformations for bivariate data) Radiation measurements were also recorded through the open doors of the $n = 42$ microwave ovens introduced in Example 3.10. The amount of radiation emitted through the open doors of these ovens is listed in Table 4.5. Denote the door-closed data by $x_{11}, x_{21}, \ldots, x_{42,1}$ and the door-open data by $x_{12}, x_{22}, \ldots, x_{42,2}$. Consider the joint distribution of $x_1$ and $x_2$, and choose a power transformation for $(x_1, x_2)$ to make the joint distribution approximately bivariate normal.
If the data include some large negative values and have a single long tail, a more general transformation should be applied:

$$x^{(\lambda)} = \begin{cases} \{(x+1)^{\lambda} - 1\}/\lambda & x \ge 0,\; \lambda \neq 0 \\ \ln(x+1) & x \ge 0,\; \lambda = 0 \\ -\{(-x+1)^{2-\lambda} - 1\}/(2-\lambda) & x < 0,\; \lambda \neq 2 \\ -\ln(-x+1) & x < 0,\; \lambda = 2 \end{cases}$$
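Under the reconstruction above (the extraction dropped the minus signs), this piecewise family matches what is commonly called the Yeo-Johnson transformation, available for instance through scikit-learn's PowerTransformer with method="yeo-johnson". A direct sketch:

```python
import numpy as np

def yeo_johnson(x, lam):
    # Piecewise power transformation for data with negative values.
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos, neg = x >= 0, x < 0
    if abs(lam) > 1e-12:
        out[pos] = ((x[pos] + 1) ** lam - 1) / lam
    else:
        out[pos] = np.log(x[pos] + 1)
    if abs(lam - 2) > 1e-12:
        out[neg] = -(((-x[neg] + 1) ** (2 - lam)) - 1) / (2 - lam)
    else:
        out[neg] = -np.log(-x[neg] + 1)
    return out

print(yeo_johnson([-2.0, -0.5, 0.0, 1.0, 3.0], lam=0.5))
```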
