Statistics
Lecture 5: Linear Regression
Recap
Normal distribution
•  arises when more and more discrete events are added together
–  example: measuring a physical quantity n times
•  has a mean, median and mode, which all coincide
•  symmetric: 50% of the values are higher than the mean and 50% are
lower than the mean
https://www.mathsisfun.com/data/standard-normal-distribution.html

$N(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

1/04/16 2
The Standard Normal distribution
Shifting and rescaling the normal distribution to mean = 0 and standard deviation = 1
•  Standardize normal distribution:
–  subtract the mean
–  divide by the standard deviation
•  Standardize by z:

$z = \frac{x-\mu}{\sigma}$

https://www.mathsisfun.com/data/standard-normal-distribution.html
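
The two standardization steps (subtract the mean, divide by the standard deviation) can be sketched in a few lines of Python. This is only an illustration: the measurement values are made up, and the mean and standard deviation are estimated from the sample itself rather than known in advance.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    return [(x - mu) / sigma for x in values]

# Hypothetical repeated measurements of one physical quantity
measurements = [9.8, 10.1, 10.0, 9.9, 10.2]
z = standardize(measurements)
print([round(v, 2) for v in z])
```

After standardization the values have mean 0 and standard deviation 1, whatever units the measurements were in.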

The Standard Normal distribution
In more detail

The central limit theorem
The CLT http://www.value-at-risk.net/central-limit-theorem/

•  The distribution of an average tends to be Normal, even when the
distribution from which the average is computed is decidedly non-Normal.
•  This is the foundation for many statistical procedures, including
Quality Control Charts: the phenomenon under study does not have to be
Normally distributed, because its average will be.
•  This Normal distribution has the same mean as the parent
distribution and a variance equal to the variance of the parent divided
by the sample size.

$z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$
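
A small simulation can make the two claims about the average (same mean as the parent, variance divided by the sample size) concrete. The sketch below is an assumption-laden illustration, not part of the lecture: the exponential parent distribution, the sample size n = 30 and the number of repetitions are all arbitrary choices.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

n = 30            # sample size (assumed for illustration)
n_samples = 5000  # number of averages to draw (assumed for illustration)

# Parent distribution: exponential with rate 1, so mean = 1 and variance = 1.
# It is strongly skewed, i.e. clearly non-Normal.
averages = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

print("mean of the averages:    ", round(mean(averages), 3))
print("variance of the averages:", round(pvariance(averages), 4))
print("parent variance / n:     ", round(1.0 / n, 4))
```

The mean of the averages should come out close to 1 and their variance close to 1/30, and a histogram of `averages` would look roughly bell-shaped even though the parent distribution is not.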
Linear Regression
Finding linear trends in data
•  plotting/fitting a line to data that are linearly related:
y = ax + b
–  a – slope
–  b – intercept

•  Goal:
–  error reduction
–  predicting/forecasting
–  calibration
(figure source: Wikipedia)

Linear Regression
Finding linear trends in data – how to
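•  plotting/fitting a line to data that are linearly related:
y = ax + b
•  most common method:
–  least squares method: minimizing the sum of the squared
differences between the fitted line and the actual values

$R^2 = \sum_i \left[ y_i - f(x_i, a_1, a_2, \ldots, a_n) \right]^2, \qquad \frac{\partial R^2}{\partial a_i} = 0$

(here R² denotes the sum of squared residuals, not the coefficient of determination)
http://onlinestatbook.com/2/regression/intro.html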
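
For the straight line f(x) = ax + b, the minimization condition can be written out explicitly. The following is a sketch of the standard derivation; the bar notation for the sample means of x and y is an assumption about notation, not taken from the slide.

```latex
\begin{aligned}
R^2 &= \sum_i \left( y_i - a x_i - b \right)^2 \\
\frac{\partial R^2}{\partial b} &= -2 \sum_i \left( y_i - a x_i - b \right) = 0
  \;\Rightarrow\; b = \bar{y} - a\,\bar{x} \\
\frac{\partial R^2}{\partial a} &= -2 \sum_i x_i \left( y_i - a x_i - b \right) = 0
  \;\Rightarrow\; a = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
\end{aligned}
```

The expression for a follows by substituting b from the second line into the third and using the fact that the deviations from the mean sum to zero.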

Linear Regression
Least squares
•  After a bit of math, the equations for least squares straight-line fitting are:

–  slope: $a = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$

–  intercept: $b = \bar{y} - a\bar{x}$
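
The slope and intercept formulas translate almost directly into code. Below is a minimal sketch in plain Python; the function name `fit_line` and the data points are made up for illustration, and the points are chosen to lie exactly on y = 2x + 1 so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Least squares straight line y = a*x + b; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    return a, b

# Made-up points lying exactly on y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(a, b)  # 2.0 1.0
```

Note that the function fails with a division by zero if all x values are identical, since no unique line exists in that case.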

Linear Regression
Least Squares
•  coefficient of determination: R² indicates how good the fit is:

$R^2 = \frac{SS_{\mathrm{regression}}}{SS_{\mathrm{total}}} = \left( \frac{1}{n} \, \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} \right)^2$

•  values range: 0 ≤ R² ≤ 1
•  R² = 0: the dependent variable cannot be predicted by the model
•  R² = 1: the dependent variable can be predicted without error
•  R² between 0 and 1 indicates to what extent the dependent variable
can be predicted
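
Both forms of the R² formula can be evaluated side by side to see that they agree for a straight-line fit. The sketch below assumes population standard deviations (dividing by n) for σx and σy; the function name `r_squared` and the data values are invented for illustration.

```python
from math import sqrt

def r_squared(xs, ys):
    """Return R² computed two ways: SS_regression / SS_total and squared correlation."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)

    # Least squares line y = a*x + b
    a = s_xy / s_xx
    b = y_bar - a * x_bar

    # Form 1: explained sum of squares over total sum of squares
    ss_regression = sum((a * x + b - y_bar) ** 2 for x in xs)
    ss_total = s_yy
    r2_from_ss = ss_regression / ss_total

    # Form 2: squared correlation coefficient, with population standard deviations
    sigma_x = sqrt(s_xx / n)
    sigma_y = sqrt(s_yy / n)
    r = (s_xy / n) / (sigma_x * sigma_y)
    return r2_from_ss, r ** 2

# Made-up, slightly noisy data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2_ss, r2_corr = r_squared(xs, ys)
print(round(r2_ss, 4), round(r2_corr, 4))
```

The two printed numbers should be identical up to rounding, which is why either form can be used to judge the fit.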

Linear Regression
Least Squares – some things to note
•  Excel calculates a trendline, BUT use a scatter plot!

•  What if there is no linear relation between x and y?
–  try to transform the data into a linear relation

•  How good is your fit?
–  R² > 0.8; otherwise you should go back to the drawing board
–  Keep in mind your sample size: don't fit a line through 2 or 3 points!
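
One common way to transform a non-linear relation into a linear one is to take logarithms: if y = A·exp(kx), then ln(y) = ln(A) + kx is a straight line in x. The sketch below assumes such an exponential relation purely as an example; the data are generated from made-up values A = 3 and k = 0.5, and `fit_line` is an illustrative helper.

```python
from math import exp, log

def fit_line(xs, ys):
    """Least squares straight line y = a*x + b; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    a = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return a, y_bar - a * x_bar

# Made-up data following y = 3 * exp(0.5 * x) exactly
xs = [0, 1, 2, 3, 4]
ys = [3.0 * exp(0.5 * x) for x in xs]

# Fit a straight line to (x, ln y) instead of (x, y)
k, ln_A = fit_line(xs, [log(y) for y in ys])
A = exp(ln_A)  # transform the intercept back to the original scale
print(round(k, 6), round(A, 6))
```

The fitted slope recovers k and the back-transformed intercept recovers A. With real, noisy data the fit minimizes errors in ln(y) rather than in y, which is worth keeping in mind when judging R².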

Linear Regression
Example
The sales of a company (in million dollars) for each year are shown in the
table below.

x (year)                 2012   2013   2014   2015   2016
y (sales, million $)       12     19     29     37     45

a) Find the least squares regression line y = ax + b.

b) Use the least squares regression line as a model to estimate the
sales of the company in 2019.
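
One possible way to work through this example is to apply the slope and intercept formulas to the table directly. The sketch below uses only the five data points from the table; the variable names are illustrative.

```python
years = [2012, 2013, 2014, 2015, 2016]
sales = [12, 19, 29, 37, 45]  # million dollars

n = len(years)
x_bar = sum(years) / n  # 2014.0
y_bar = sum(sales) / n  # 28.4

# a) slope and intercept of the least squares line
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))  # about 84
s_xx = sum((x - x_bar) ** 2 for x in years)                          # 10
a = s_xy / s_xx
b = y_bar - a * x_bar

# b) estimate for 2019 from the fitted line
estimate_2019 = a * 2019 + b

print(round(a, 2), round(b, 2), round(estimate_2019, 2))
# a ≈ 8.4, b ≈ -16889.2, estimate for 2019 ≈ 70.4 (million dollars)
```

The large negative intercept is simply a consequence of using calendar years as x. Note also that 2019 lies outside the range of the data, so the estimate is an extrapolation of the linear trend.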
