
Data Science Math Skills

Paul Bendich and Daniel Egger


Duke University

Sigma Notation: Mean and Variance


Video companion

1 Introduction
Important equations for this video:

\[
X = \{x_1, \dots, x_n\}
\]
\[
\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i
\]
\[
\sigma_x^2 = \frac{1}{n}\left[\sum_{i=1}^{n} (x_i - \mu_x)^2\right]
\]

The symbol $\mu_x$ is the mean of $X$, and $\sigma_x^2$ is the variance of $X$. The standard deviation
is denoted $\sigma_x$.

2 Mean
Example:

\[
Z = \{1, 5, 12\}, \qquad |Z| = 3
\]
\[
\mu_z = \frac{1 + 5 + 12}{3} = \frac{18}{3} = 6
\]
The mean $\mu_z$ is also denoted $\mu(z)$ or simply $\mu$.

Symbolic example:

\[
Y = \{y_1, y_2, y_3, y_4\}
\]
\[
\begin{aligned}
\mu_y &= \frac{1}{4}(y_1 + y_2 + y_3 + y_4) \\
&= \frac{1}{4}\left(\sum_{i=1}^{4} y_i\right)
\end{aligned}
\]


In general, suppose you have a set

\[
X = \{x_1, x_2, \dots, x_n\},
\]

then the mean of $X$ is

\[
\mu_x = \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right).
\]

The variable i is a counter. The variable n is a number, which tells you when to stop counting.
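
The roles of the counter i and the stopping value n can be sketched in a short Python loop. This is only an illustration: the function name `mean` and the list `Z` are assumptions chosen for this example, and Python counts from 0 to n - 1 where the notes count from 1 to n.

```python
def mean(values):
    """Arithmetic mean written as an explicit sigma-style loop."""
    n = len(values)        # n tells the loop when to stop counting
    total = 0
    for i in range(n):     # i is the counter (0-based here, 1..n in the notes)
        total += values[i]
    return total / n


Z = [1, 5, 12]
print(mean(Z))  # 6.0
```

In everyday code the loop would usually be written as `sum(values) / len(values)`; the explicit form is used here only to mirror the sigma notation.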

3 Mean centering

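\[
Z = \{1, 5, 12\}, \qquad \mu_z = 6
\]

[Figure: number line R with the points 0, 1, 5, 6, and 12 marked.]

\[
\begin{aligned}
Z' &= \{1 - 6,\; 5 - 6,\; 12 - 6\} \\
&= \{-5, -1, 6\}
\end{aligned}
\]
\[
\mu_{z'} = 0
\]

[Figure: number line R with the points -5, -1, 0, and 6 marked.]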

Mean centering produces a new data set that preserves the relationships between the data
points (their order and the distances between them), but whose mean is zero.
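
As a rough sketch of the same step in Python, the mean can be subtracted from every element. The helper name `mean_center` is hypothetical and not from the video.

```python
def mean_center(values):
    """Subtract the mean from every element of the data set."""
    mu = sum(values) / len(values)
    return [x - mu for x in values]


Z = [1, 5, 12]
Z_centered = mean_center(Z)
print(Z_centered)                         # [-5.0, -1.0, 6.0]
print(sum(Z_centered) / len(Z_centered))  # 0.0

# The gap between the smallest and largest points is unchanged by centering.
print(Z_centered[2] - Z_centered[0] == Z[2] - Z[0])  # True
```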


4 Variance

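\[
Z = \{1, 5, 12\}, \qquad \mu_z = 6
\]
\[
W = \{5, 6, 7\}, \qquad \mu_w = 6
\]

[Figure: number line R with 0, 1, 5, 6, and 12 marked; the points 5, 6, and 7 of W are shown above it.]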

Set Z (blue) is more spread out than set W (olive).

If $X = \{x_1, \dots, x_n\}$, the variance of $X$ is

\[
\sigma_x^2 = \frac{1}{n}\left[\sum_{i=1}^{n} (x_i - \mu_x)^2\right].
\]

The standard deviation is given by

\[
\sigma_x = \sqrt{\sigma_x^2}.
\]
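
The two formulas above translate fairly directly into Python. In this minimal sketch the function names `variance` and `std_dev` are assumptions made for illustration. It divides by n, as in these notes, and not by n - 1 as in the sample variance.

```python
import math


def variance(values):
    """Variance as defined in the notes: the mean squared deviation from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def std_dev(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))


Z = [1, 5, 12]
W = [5, 6, 7]
print(std_dev(Z) > std_dev(W))  # True
```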

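Z and W have the same mean, but Z is more spread out, so $\sigma_z$ should be greater than $\sigma_w$.

\[
\begin{aligned}
\sigma_w^2 &= \frac{1}{3}\left[\sum_{i=1}^{3} (w_i - \mu_w)^2\right] \\
&= \frac{1}{3}\left[(5 - 6)^2 + (6 - 6)^2 + (7 - 6)^2\right] \\
&= \frac{1}{3}\left[(-1)^2 + 0^2 + 1^2\right] \\
&= \frac{2}{3}
\end{aligned}
\]
\[
\sigma_w = \sqrt{\frac{2}{3}}
\]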

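\[
\begin{aligned}
\sigma_z^2 &= \frac{1}{3}\left[\sum_{i=1}^{3} (z_i - \mu_z)^2\right] \\
&= \frac{1}{3}\left[(1 - 6)^2 + (5 - 6)^2 + (12 - 6)^2\right] \\
&= \frac{1}{3}\left[(-5)^2 + (-1)^2 + 6^2\right] \\
&= \frac{62}{3}
\end{aligned}
\]
\[
\sigma_z = \sqrt{\frac{62}{3}}
\]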
$\sigma_z^2 \gg \sigma_w^2$, so Z is much more spread out than W.
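
The hand computations for W and Z can be checked with exact fractions, which avoids floating-point rounding. The name `variance_exact` is a hypothetical helper introduced only for this check.

```python
from fractions import Fraction


def variance_exact(values):
    """Variance (dividing by n) computed with exact rational arithmetic."""
    n = len(values)
    mu = Fraction(sum(values), n)
    return sum((x - mu) ** 2 for x in values) / n


var_w = variance_exact([5, 6, 7])
var_z = variance_exact([1, 5, 12])
print(var_w)          # 2/3
print(var_z)          # 62/3
print(var_z / var_w)  # 31
```

Under this definition the variance of Z is 31 times the variance of W.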
