

Unit #1

S8-1 Carry out investigations of phenomena, using the statistical enquiry cycle:

using existing data sets;

seeking explanations;
using informed contextual knowledge, exploratory data analysis, and statistical inference;
communicating findings and evaluating all stages of the cycle.

S8-2 Make inferences from surveys and experiments:

determining estimates and confidence intervals for differences, recognising the relevance of the central limit theorem;
using methods such as resampling to assess the strength of the evidence.


Sigma Textbook (3rd ed), Chapters 3 - 5

Nulake IAS Workbook: Inference 3.10

Homework: for each hour in class, about another hour is needed to practise and process the work from that lesson, in readiness for
the next one. If you are having difficulty with something related to a lesson, DO SOMETHING about it straight away.

1 - 14
Lesson 7: you will be able to use the [...] to process data sets at school

Practise parts of the Statistics Investigation Cycle for: Making a Formal Statistical Inference
Use the booklet of data sets (Rugby, Kiwi, cars, Marathon and Babies), their statistics and their graphs to complete
the worksheets about:

Writing good questions

Defining variables

Comparing Key Features: Middle 50%, Shift, Spread, Shape, Special Features, Centre (difference in medians)

Using re-sampling (e.g. bootstrapping) to estimate population parameters

Finding and interpreting Confidence Intervals for differences (in medians or means)
You are strongly advised to do this work and check it, to make sure you have understood the requirements of the
Guidance on how to use iNZight and interpret its displays for this topic appears throughout the workbook.

More about Bootstrapping.

Bootstrapping is a method of resampling: new samples are drawn repeatedly, with replacement, from the original sample.
This method is used to find confidence intervals in situations where it is not appropriate or possible to
use the central limit theorem (see page 17 in the workbook):
- When the population is not normal
- When the sample size is small
- When the shape of the sample (skewed and/or bimodal) suggests that the population it came from is
not normal
- When we want to find confidence intervals for other statistics, e.g. medians, quartiles and
interquartile range.
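
The resampling idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how iNZight works internally: the function name `bootstrap_ci_median_diff`, the choice of 1000 resamples, the percentile method for reading off the interval, and the two small data lists are all made up for this example.

```python
import random
import statistics

def bootstrap_ci_median_diff(group_a, group_b, n_resamples=1000, level=0.95, seed=1):
    """Percentile bootstrap interval for median(group_a) - median(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = []
    for _ in range(n_resamples):
        # Resample each group WITH replacement, keeping the original sample sizes.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.median(resample_a) - statistics.median(resample_b))
    diffs.sort()
    # Cut off an equal share of resampled differences in each tail.
    tail = (1 - level) / 2
    lower = diffs[int(tail * n_resamples)]
    upper = diffs[int((1 - tail) * n_resamples) - 1]
    return lower, upper

# Made-up illustrative data, NOT taken from the workbook data sets.
group_a = [98, 102, 105, 110, 95, 101, 108, 112, 99, 104]
group_b = [88, 92, 85, 97, 90, 94, 86, 91, 89, 93]

low, high = bootstrap_ci_median_diff(group_a, group_b)
print(f"Bootstrap 95% interval for the difference in medians: ({low}, {high})")
```

If the whole interval lies above zero, the resamples consistently put the median of group A above the median of group B, which is the kind of evidence the inference worksheets ask you to interpret.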

Workbook References
See pages 3 - 63 in the workbook:
Writing Questions
Centre / spread
Sample variation p 9 - 20
Confidence Intervals p 21 - 52
P53 - 60
Ex: #61 - 68

Using iNZight and
Read P53 - 60
Do Questions 31 - 34 again using
Do Questions 61 - 65

Resampling to find a confidence interval can be done with iNZight.

The data is in CSV files, named by question number, in K-drive/Maths with Stats/Year 13/Y13 Inference.
NB: The file must be in the correct format for iNZight to run.
7 - 14

Practice assessment task

Use the Report template in K-drive
Hand work in for marking

This work is assessed internally in 2014, at the end of week 5 (to be confirmed).

Practice task
Pages 64 - 6

INFERENCE continued
Background - the Central Limit Theorem
If all possible sufficiently large samples of the same size $n$ are taken
from a Normal population with mean $\mu$ and standard deviation $\sigma$,
then the sample means $\bar{x}$ are normally distributed
with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

pages 17-18
Do Questions


+ ksd

Sample proportions and the difference in sample means also follow this pattern.
The formula sheet gives a table of means and standard deviations for different sampling distributions.
Note that the standard deviation of a sampling distribution is often called the standard error.
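
A short simulation can make the standard error concrete. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size of 25, and the 5000 repeated samples are hypothetical choices, not figures from the workbook. It draws many samples, records each sample mean, and compares the spread of those means with $\sigma/\sqrt{n}$.

```python
import random
import statistics

rng = random.Random(42)          # fixed seed so the sketch is repeatable
mu, sigma, n = 50.0, 10.0, 25    # hypothetical population mean, sd, and sample size

sample_means = []
for _ in range(5000):
    # Draw one sample of size n from a Normal population and keep its mean.
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

observed_mean = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)
theoretical_se = sigma / n ** 0.5   # standard error predicted by the theorem

print(f"mean of sample means: {observed_mean:.2f} (theory: {mu})")
print(f"sd of sample means:   {observed_se:.2f} (theory: {theoretical_se:.2f})")
```

The two printed pairs should be close to each other, and the agreement improves as the number of repeated samples grows.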
Confidence Intervals
Because we know sample means (for example) are normally distributed, we can use ONE sample mean to generate
an interval with a known probability of containing the actual population mean.
This interval is called a confidence interval. The general formula for any confidence interval is:
(one sample statistic) ± k standard errors
where k is the z-score needed to give the desired centrally placed area under the normal curve,
e.g. k = 1.96 gives a 95% CI, k = 2.58 gives a 99% CI, and k = 1.65 gives a 90% CI.

A 95% confidence interval for a population mean comes from a method that captures that mean for 95% of samples.
It is found by:
(one sample mean) ± 1.96 standard errors
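
The values of k quoted above can be checked, and the interval formula applied, with Python's standard library. This is a sketch under assumptions: the helper names `z_multiplier` and `mean_confidence_interval` are invented for this example, the sample is simulated rather than taken from the workbook, and the standard error is estimated as s divided by the square root of n, which is reasonable only for a fairly large sample.

```python
import random
from statistics import NormalDist, mean, stdev

def z_multiplier(level):
    """z-score k that leaves a central area of `level` under the standard normal curve."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def mean_confidence_interval(sample, level=0.95):
    """(sample mean) +/- k standard errors, estimating the standard error as s / sqrt(n)."""
    n = len(sample)
    standard_error = stdev(sample) / n ** 0.5
    k = z_multiplier(level)
    centre = mean(sample)
    return centre - k * standard_error, centre + k * standard_error

# Show k to 3 decimal places for the three levels mentioned in the notes.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} interval uses k = {z_multiplier(level):.3f}")

# Hypothetical sample of 40 measurements, simulated only for illustration.
rng = random.Random(0)
sample = [rng.gauss(170, 8) for _ in range(40)]
low, high = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: ({low:.1f}, {high:.1f})")
```

Printed to three decimal places, the multipliers show where the rounded classroom values 1.65, 1.96 and 2.58 come from.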
Notation and terminology: different notation is used to distinguish between sample and population values.
Sample statistics: sample standard deviation $s$, sample variance $s^2$, sample mean $\bar{x}$, sample proportion $p$.
Their equivalent population parameters: standard deviation $\sigma$, variance $\sigma^2$, mean $\mu$, proportion $\pi$.
Two sampling distributions are used at the Achieved level:
1. Means $\bar{X}$ of samples of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$
are Normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
2. Proportions $p$ from samples of size $n$ from a population with proportion $\pi$
are Normally distributed with mean $\pi$ and standard deviation $\sqrt{\pi(1-\pi)/n}$, estimated by $\sqrt{p(1-p)/n}$.
Differences between 2 means, $\bar{X}_1 - \bar{X}_2$, from samples of size $n_1$ and $n_2$
from populations with means $\mu_1$ and $\mu_2$ and standard deviations $\sigma_1$ and $\sigma_2$
are Normally distributed with mean $\mu_1 - \mu_2$
and standard deviation $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$, estimated by $\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
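
The standard errors for a difference in means and for a proportion can be turned into intervals directly from summary statistics. The sketch below assumes the usual large-sample formulas, with sample values standing in for the unknown population values; the function names and all the numbers are hypothetical and chosen only to show the arithmetic.

```python
def diff_means_interval(mean_1, s_1, n_1, mean_2, s_2, n_2, k=1.96):
    """(x1_bar - x2_bar) +/- k * sqrt(s1^2/n1 + s2^2/n2)."""
    difference = mean_1 - mean_2
    standard_error = (s_1 ** 2 / n_1 + s_2 ** 2 / n_2) ** 0.5
    return difference - k * standard_error, difference + k * standard_error

def proportion_interval(p, n, k=1.96):
    """p +/- k * sqrt(p(1 - p)/n), using the sample proportion p as the estimate."""
    standard_error = (p * (1 - p) / n) ** 0.5
    return p - k * standard_error, p + k * standard_error

# Hypothetical summary statistics for two independent samples.
diff_low, diff_high = diff_means_interval(172.0, 8.0, 50, 168.0, 7.0, 60)
print(f"95% CI for the difference in means: ({diff_low:.2f}, {diff_high:.2f})")

# Hypothetical survey: 40 of 100 respondents answer "yes".
prop_low, prop_high = proportion_interval(0.4, 100)
print(f"95% CI for the proportion: ({prop_low:.3f}, {prop_high:.3f})")
```

Passing a different `k` (for example 2.58) gives the interval at another confidence level without changing the rest of the calculation.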

Confidence interval for the mean
Read P21 - 25
Do Questions 15 - 20
[Pages 27 - 31
Do Questions 21 - 24]
Confidence interval for difference in means
Read P33 - 36
Do Questions 25 - 30
Using iNZight
Read P40 - 41
Confidence intervals
Do Questions 31 - 34
[Confidence interval for proportion
Read P45 - 6
Do Questions 35 - 46]