
INTRODUCTION TO S1

Clare Parsons

EDEXCEL
NUMERICAL MEASURES
REPRESENTING DATA
PROBABILITY
DISCRETE RANDOM VARIABLES
UNIFORM DISTRIBUTION
BINOMIAL DISTRIBUTION
GEOMETRIC DISTRIBUTION
NORMAL DISTRIBUTION
ESTIMATION (CLT & CIs)
CORRELATION & REGRESSION
HYPOTHESIS TESTING

AQA

OCR

MEI

Today we will cover


Probability, Variation, Binomial & Normal distributions
Recordings of on-line lessons cover these topics plus
Discrete Random Variables
Correlation & Regression
Estimation & Confidence Intervals
Hypothesis Testing
Plus
Masses of resources on Integral http://integralmaths.org/
A series of on-line live sessions in Jan/Feb and May

Session 1 : Probability
Session Content

conditional probability
independent events
using different representations
the laws of probability

A question about 3 cards


I have 3 cards:

black on
both sides

black on one side,


yellow on the other

yellow on
both sides

I shuffle the cards thoroughly and pick one without looking.


I show you one side: it is yellow.
What is the probability that the card is yellow on both sides?

I shuffle the cards thoroughly and pick one without looking.


I show you one side: it is black.
What is the probability that the card is black on both sides?

Probability is hard
In Oct 2012, 97 MPs were asked the probability of getting 2 heads in a row when
spinning a coin.

Three out of five MPs either got the question wrong or admitted they didn't know.

http://www.bbc.co.uk/news/uk-19801666

Conditional Probability
I have 3 children. One of them is female.
What is the probability that the other two are both
male?
[Tree diagram: each of the three children is female (F) or male (M)]

I have 3 children. My eldest is female.
What is the probability that the younger two are both male?

A - I have 2 sons
B - I have a daughter

A - I have 2 sons
C - My eldest child is female

Independence
We assumed that the event "having a son" is independent of the event "having a daughter":
P(having a son | you have a daughter) = P(having a son)
P(having a daughter | you have a son) = P(having a daughter)
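The two questions about three children can be checked by listing outcomes. The sketch below assumes each child is independently female or male with probability 1/2, and reads "one of them is female" as "at least one is female"; both readings are assumptions, and the helper name `conditional` is invented for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: each child is independently F or M with probability 1/2,
# so the 8 birth orders (eldest first) are equally likely.
families = list(product("FM", repeat=3))

def conditional(event, given):
    """P(event | given) by counting equally likely outcomes."""
    reduced = [f for f in families if given(f)]
    return Fraction(sum(1 for f in reduced if event(f)), len(reduced))

# "One of them is female" read as "at least one is female".
p_two_sons_given_a_daughter = conditional(
    lambda f: f.count("M") == 2, lambda f: "F" in f)

# "My eldest is female": the first position is fixed as F.
p_younger_two_male_given_eldest_female = conditional(
    lambda f: f[1:] == ("M", "M"), lambda f: f[0] == "F")

print(p_two_sons_given_a_daughter)             # 3/7
print(p_younger_two_male_given_eldest_female)  # 1/4
```

The two answers differ because the second condition removes more outcomes: knowing which child is female leaves 4 equally likely families, while knowing only that a daughter exists leaves 7.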

Where do you stand?


Male

Glasses

Different ways of looking at things


Find:
P(being female and wearing glasses)

P(being male given glasses are worn)


Which representation(s) make it easier
to find these probabilities?
If the first pair of branches on the tree
diagram were about glasses, then the next
pair were about male/female, would you get
the same answers?

Some notation
$M$: the event "is male"
$G$: the event "wearing glasses"
$P(M)$: the probability of being male
$P(G)$: the probability of wearing glasses
$M'$: not male (the complement of $M$)
$P(M \cap G)$: the probability of being male and wearing glasses
$P(M \mid G)$: the probability of being male given that glasses are worn

Different Diagrams
[Venn diagram of events A and B; region values shown: 10, 2, 10]

Conditional probability from a Venn diagram
[Venn diagram of events A and B]

What is $P(B \mid A)$?

What connections can you see?

Match the things which mean the same thing

ADDITION LAW OF PROBABILITY


A

Exam question
Edexcel - January 2012 No. 2
(a) State in words the relationship between two events R and S when $P(R \cap S) = 0$.
(1)
Mutually exclusive

The events A and B are independent with $P(A) = \frac{1}{4}$ and $P(A \cup B) = \frac{2}{3}$.

Find
(b) $P(B)$,
(4)
(c) $P(A' \cap B)$,
(2)
(d) $P(B' \mid A)$.
(2)

Markscheme
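2 (a) (R and S are mutually) exclusive. B1 (1)

(b) $\frac{2}{3} = \frac{1}{4} + P(B) - P(A \cap B)$ (use of Addition Rule) M1
$\frac{2}{3} = \frac{1}{4} + P(B) - \frac{1}{4} P(B)$ (use of independence) M1 A1
$\frac{5}{12} = \frac{3}{4} P(B)$
$P(B) = \frac{5}{9}$ A1 (4)

(c) $P(A' \cap B) = \frac{3}{4} \times \frac{5}{9} = \frac{15}{36} = \frac{5}{12}$ M1 A1ft (2)

(d) $P(B' \mid A) = \frac{(1 - \text{(b)}) \times 0.25}{0.25}$ or $P(B')$ or $\frac{1/9}{1/4}$ M1
$= \frac{4}{9}$ A1 (2)
(9 marks)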
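The markscheme arithmetic can be checked with exact fractions. This is a minimal sketch, not part of the markscheme; the variable names are invented, and it assumes the question's values $P(A) = \frac{1}{4}$ and $P(A \cup B) = \frac{2}{3}$.

```python
from fractions import Fraction

p_a = Fraction(1, 4)        # P(A), from the question
p_a_or_b = Fraction(2, 3)   # P(A ∪ B), from the question

# Addition rule with independence: P(A ∪ B) = P(A) + P(B) - P(A)P(B),
# rearranged to P(B) = (P(A ∪ B) - P(A)) / (1 - P(A)).
p_b = (p_a_or_b - p_a) / (1 - p_a)

# If A and B are independent then so are A' and B, and B' and A.
p_not_a_and_b = (1 - p_a) * p_b
p_not_b_given_a = (1 - p_b) * p_a / p_a

print(p_b, p_not_a_and_b, p_not_b_given_a)  # 5/9 5/12 4/9
```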

Finding the probability for the 3 card problem
The key thing is to understand exactly what you know.
You want the probability that it is yellow on both sides given that it is yellow on one side chosen at random.

$P(\text{yellow on both} \mid \text{yellow on one side chosen at random})$

$= \frac{P(\text{yellow on both} \cap \text{yellow on one side chosen at random})}{P(\text{yellow on one side chosen at random})}$

$= \frac{P(\text{yellow on both sides})}{P(\text{yellow on one side})}$
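One way to evaluate this ratio is to list every face that could be shown. The sketch below assumes each card, and each face of the chosen card, is equally likely to be the one shown; the function name `p_same_on_both_sides` is invented for illustration.

```python
from fractions import Fraction

# The three cards from the question, written as (side 1, side 2).
cards = [("black", "black"), ("black", "yellow"), ("yellow", "yellow")]

# Assumption: each card and each face is equally likely to be the one shown,
# giving 6 equally likely (shown colour, hidden colour) outcomes.
outcomes = [(card[i], card[1 - i]) for card in cards for i in (0, 1)]

def p_same_on_both_sides(colour):
    """P(hidden side matches | shown side is `colour`)."""
    shown = [o for o in outcomes if o[0] == colour]
    return Fraction(sum(1 for o in shown if o[1] == colour), len(shown))

p_yellow_shown = Fraction(sum(1 for o in outcomes if o[0] == "yellow"), len(outcomes))
print(p_yellow_shown)                  # 1/2
print(p_same_on_both_sides("yellow"))  # 2/3
print(p_same_on_both_sides("black"))   # 2/3
```

Under these assumptions the formula gives $\frac{1/3}{1/2} = \frac{2}{3}$, not the $\frac{1}{2}$ that intuition often suggests, because the double-yellow card has two yellow faces that could have been shown.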

Now spend a few minutes noting the most significant things that have occurred to you during this session on the reflection sheet.

So much more here

http://integralmaths.org

Session 2 : Variation
Session Content:

Averages
Interpreting data presented graphically
Skewness
Need for measures of spread
Standard deviation
Linear scaling

Average Wage
There are 11 employees in Data Limited. What could their wages be?

"The average employee in Data Limited already makes 132 5 00, so there is no need to give raises this year, or next year" says the company chief executive.

"The average employee in Data Limited makes 40 000. That is a decent salary" says the employment interviewer for Data Limited.

"The average employee in Data Limited already makes 60 000, so there is no need to give raises this year" says one of the company's senior executives.

"The average employee in Data Limited makes only 15 000 a year. That is a disgracefully low wage" says the union leader.

Four averages
Mean
Add all data values and divide by how many values there are.

Median
Put the data values in order and find the middle one.

Mode
The most common value.

Mid-range
Halfway between the highest and lowest values.

A possible set of wages
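As an illustration of how one data set can support very different "averages", here is a hypothetical set of 11 wages (not the set shown on the slide). It was chosen so that the mean, mode and mid-range match three of the quoted figures (40 000, 15 000 and 60 000); the median it produces is simply a consequence of that choice.

```python
from statistics import mean, median, mode

# Hypothetical wages for 11 employees (not the set from the slide).
wages = [15000, 15000, 15000, 15000, 20000, 32500,
         40000, 50000, 60000, 72500, 105000]

def mid_range(values):
    """Halfway between the highest and lowest values."""
    return (min(values) + max(values)) / 2

print("mean     ", mean(wages))
print("median   ", median(wages))
print("mode     ", mode(wages))
print("mid-range", mid_range(wages))
```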

Real life averages


Average weekly incomes 1998/99 to 2012/13

http://integralmaths.org/course/view.php?id=192

BHC: before housing costs
AHC: after housing costs

Income distribution for the total UK population, 2012/13

https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/206778/full_hbai13.pdf

Income distribution for the total UK population, 2011/12 -

https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/206778/full_hbai1.pdf

What could this be about?

Birth weights of all babies born in England and Wales in 2013

Skew?

charactersticsofbirth1final_tcm77-378111

OUTLIERS
Saxon Shum from West Wales (now a healthy 8 month old in Jan 2014) was born 3 months prematurely weighing 1 lb 12 oz (0.79 kg).

http://www.dailymail.co.uk/news/article-2537061/Photographer-documents-premature-sonsfight-life-born-26-weeks-weighing-just-1lb-12oz.html

On February 11th 2013 Baby George Packer was born (naturally but 2 weeks late) in Gloucester Royal Infirmary weighing 15 lb 7 oz (7.0 kg).

http://www.dailymail.co.uk/femail/article-2301337/What-whopper-At-eye-watering-15lb-7oz-George-thoughtbiggest-baby-born-naturally-Britain.html

OUTLIERS
What is too big or too small?
Values which are more than 1.5 x interquartile range
above upper quartile or below lower quartile

For 2013 birth weights:
Lower Quartile: 3.02 kg
Median: 3.37 kg
Upper Quartile: 3.77 kg
IQR = 3.77 - 3.02 = 0.75 kg
1.5 x IQR = 1.125 kg
LQ - 1.125 = 1.895 kg
UQ + 1.125 = 4.895 kg
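The 1.5 x IQR rule is easy to express as a small function. The sketch below uses the 2013 quartiles quoted above (3.02 kg and 3.77 kg) and the two birth weights mentioned earlier in the session; the function name `outlier_fences` is invented for illustration.

```python
def outlier_fences(lower_quartile, upper_quartile, k=1.5):
    """Return (low, high); values outside this interval count as outliers."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - k * iqr, upper_quartile + k * iqr

low, high = outlier_fences(3.02, 3.77)   # quartiles in kg, from the slide
print(round(low, 3), round(high, 3))     # 1.895 4.895

# Birth weights in kg quoted in this session.
for name, weight in [("Saxon Shum", 0.79), ("George Packer", 7.0)]:
    print(name, "is an outlier:", weight < low or weight > high)
```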

Illustrating Outliers
Birth weights of all babies born in England and Wales 2013

Royal Baby George : 3.88kg


An average London baby in
some ways, not others
http://www.bbc.co.uk/news/uk-23403391

Box plot question


Could there be more than one set of data with 22 values
and a mean of 23 that would give this box plot?

2 Data Sets
Data Set A
4 17 17 17 17 17 18 18 18 20 21
21 28 28 29 30 30 30 30 30 30 36
Data Set B
4 5 6 16 17 17 20 20 20 21 21
21 23 28 30 30 30 34 35 36 36 36
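A quick check that the two data sets really do share a mean and a box plot is sketched below. It assumes quartiles are found as the medians of the lower and upper halves of the ordered data, which is only one of several conventions; `five_number_summary` is an invented helper name.

```python
from statistics import mean, median, pstdev

data_a = [4, 17, 17, 17, 17, 17, 18, 18, 18, 20, 21,
          21, 28, 28, 29, 30, 30, 30, 30, 30, 30, 36]
data_b = [4, 5, 6, 16, 17, 17, 20, 20, 20, 21, 21,
          21, 23, 28, 30, 30, 30, 34, 35, 36, 36, 36]

def five_number_summary(values):
    """Min, LQ, median, UQ, max, with quartiles as medians of each half.

    Assumption: this is only one quartile convention; the two halves
    leave out the middle value when the number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return (ordered[0], median(ordered[:half]), median(ordered),
            median(ordered[-half:]), ordered[-1])

for name, data in [("A", data_a), ("B", data_b)]:
    print(name, len(data), mean(data), five_number_summary(data),
          round(pstdev(data), 2))
```

Both sets give the same five-number summary and the same mean, yet their standard deviations differ, so a box plot alone does not pin down the spread of the individual values.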

An article about different methods for quartiles:
http://www.amstat.org/publications/JSE/v14n3/langford.html

What's the difference?

Standard deviation
Standard deviation measures spread by calculating an
average distance of the data values from the mean.

$\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
divisor $n$, also called root mean square deviation OR population standard deviation

$\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
divisor $n - 1$, also called sample standard deviation

A set of 11 data values has a mean of 8 and a standard deviation of 2.
At least one of the data items is the value 8. This single value is removed from the set.
What will happen to the standard deviation of the remaining values?

The standard deviation:
A) increases
B) decreases
C) is unchanged
D) insufficient information is given
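One way to explore this question is with a concrete example. The data set below is hypothetical: it was constructed to have 11 values, mean 8 and standard deviation 2 (using divisor $n$), and it is not taken from the slides.

```python
from math import isclose
from statistics import mean, pstdev, stdev

# Hypothetical data: 11 values with mean 8 and standard deviation 2 (divisor n).
data = [5, 6, 6, 6, 7, 8, 9, 10, 10, 10, 11]
remaining = data.copy()
remaining.remove(8)          # remove the single value equal to the mean

def s_xx(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

print(s_xx(data), s_xx(remaining))        # the sum of squares is unchanged
print(pstdev(data), pstdev(remaining))    # divisor n
print(stdev(data), stdev(remaining))      # divisor n - 1
```

Removing a value equal to the mean leaves both the mean and the sum of squared deviations unchanged, but the divisor gets smaller, so the standard deviation increases with either divisor.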

$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$

variance, $s^2 = \frac{S_{xx}}{n - 1}$
standard deviation, $s = \sqrt{\frac{S_{xx}}{n - 1}}$

mean square deviation, $msd = \frac{S_{xx}}{n}$
root mean square deviation, $rmsd = \sqrt{\frac{S_{xx}}{n}}$

Linear Scaling
What do you see happening?

Linear Scaling

Now spend a few minutes noting the most significant things that have occurred to you during this session on the reflection sheet.

Session 3 : Binomial Distribution


Session Content
classic & contemporary problems
calculating probabilities
working out what the question means
extending into geometric distribution

A gambling problem

In the 17th century, a French nobleman, the Chevalier de Mere, played two different games of chance:
Rolling at least one 6 in four throws of a single die
Rolling at least one double 6 in 24 throws of a pair of dice.

Le Chevalier's Reasoning
On one throw of a die, $P(\text{six}) = \frac{1}{6}$
Average number of 6s in four throws $= 4 \times \frac{1}{6} = \frac{2}{3}$

Throwing two dice, $P(\text{double six}) = \frac{1}{36}$
Average number of double 6s in 24 rolls $= 24 \times \frac{1}{36} = \frac{2}{3}$

Useful friends
De Mere wrote to his friend Pascal

Pascal consulted Fermat

They solved the problem between them

Pepys's wager, 1693
Which of the following three propositions has the greatest chance of success?
A. Six fair dice are tossed independently and at least one 6 appears.
B. Twelve fair dice are tossed independently and at least two 6s appear.
C. Eighteen fair dice are tossed independently and at least three 6s appear.
Asked Newton!

A present day problem


What's the probability that at least one of us in this room has a mobile phone contaminated with E-coli (Escherichia coli)?

A simulation
P(having a phone with E-coli) = P(getting a six on a dice)
Problem:
Given a random sample of 4 phones what is the
probability of none, 1, 2, 3, or 4 being
contaminated?
Simulation:
Toss 4 dice 20 times and record the number of
sixes occurring each time
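The dice simulation described above can also be run on a computer. This is a minimal sketch under the same model (a six stands for a contaminated phone); the function name `simulate` and the seed value are arbitrary choices for illustration.

```python
import random
from collections import Counter

def simulate(n_phones=4, n_repeats=20, seed=None):
    """Roll n_phones dice n_repeats times; return the number of sixes each time.

    A six stands for a contaminated phone, as in the simulation described above.
    """
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) == 6 for _ in range(n_phones))
            for _ in range(n_repeats)]

counts = simulate(seed=1)          # the seed is arbitrary, only for repeatability
tally = Counter(counts)
for k in range(5):
    print(k, "contaminated: relative frequency", tally[k] / len(counts))
```

With only 20 repeats the relative frequencies vary a lot from run to run; increasing `n_repeats` brings them closer to the theoretical probabilities.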

Simulating the dirty phone scenario


Considering 4 randomly chosen phones
Table columns: Number of contaminated phones; Observed Relative Frequency (from simulation); Theoretical Probability
[Tree diagram: phone 1 is contaminated or hygienic, then phone 2 is contaminated or hygienic]

Spotting the pattern


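Table columns: Number of phones examined; None contaminated; One contaminated; Two contaminated; Three contaminated; Four contaminated
First entries filled in:
1 phone examined, none contaminated: $\frac{5}{6}$
2 phones examined, one contaminated: $2 \times \frac{1}{6} \times \frac{5}{6}$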
Spotting the pattern


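Table columns: Number of phones examined; None contaminated; One contaminated; Two contaminated; Three contaminated; Four contaminated

1 phone: $\frac{5}{6}$; $\frac{1}{6}$
2 phones: $\left(\frac{5}{6}\right)^2$; $2 \times \frac{1}{6} \times \frac{5}{6}$; $\left(\frac{1}{6}\right)^2$
3 phones: $\left(\frac{5}{6}\right)^3$; $3 \times \frac{1}{6} \times \left(\frac{5}{6}\right)^2$; $3 \times \left(\frac{1}{6}\right)^2 \times \frac{5}{6}$; $\left(\frac{1}{6}\right)^3$
4 phones: $\left(\frac{5}{6}\right)^4$; $4 \times \frac{1}{6} \times \left(\frac{5}{6}\right)^3$; $6 \times \left(\frac{1}{6}\right)^2 \times \left(\frac{5}{6}\right)^2$; $4 \times \left(\frac{1}{6}\right)^3 \times \frac{5}{6}$; $\left(\frac{1}{6}\right)^4$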

What's the probability that if I test 4 of your mobile phones at random, exactly two of them are contaminated with E-coli?

${}^4C_2 \times \left(\frac{1}{6}\right)^2 \times \left(\frac{5}{6}\right)^2 \approx 0.116$

${}^4C_2$: number of ways of choosing 2 objects from 4

Connection to binomial expansion in core maths

The big picture

Spreadsheet

In general on a tree diagram

In general an algebraic formula


$P(X = r) = {}^nC_r \, p^r (1 - p)^{n - r}$

$r = 0, 1, 2, \ldots, n$

where p is the probability of success

X ~ B (n, p)
indicates that the random variable, X, has a binomial
distribution with n trials and probability, p, of success
each time.
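The general formula translates directly into code. The sketch below is illustrative only: `binomial_pmf` is an invented name, and it reuses the dice model from this session, $X \sim B(4, \frac{1}{6})$, with exact fractions so the results can be compared with hand calculations.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

p = Fraction(1, 6)   # probability a phone is contaminated, as in the dice model
distribution = [binomial_pmf(r, 4, p) for r in range(5)]
for r, prob in enumerate(distribution):
    print(r, prob, round(float(prob), 3))
```

The entry for $r = 2$ is $\frac{25}{216} \approx 0.116$, and the five probabilities sum to 1.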

Return to Le Chevalier's problem
a) Rolling at least one 6 in four throws of a single dice
b) Rolling at least one double 6 in 24 throws of a pair of dice

P(at least one double 6 when a pair of dice is thrown 24 times)
= 1 - P(no double sixes)
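Both games can be evaluated with the same complement idea, 1 minus the probability of no successes. A minimal sketch, with `p_at_least_one` as an invented helper name and the throws assumed independent:

```python
def p_at_least_one(p_success, n_trials):
    """1 - P(no successes) for n independent trials."""
    return 1 - (1 - p_success) ** n_trials

game_a = p_at_least_one(1 / 6, 4)     # at least one 6 in four throws
game_b = p_at_least_one(1 / 36, 24)   # at least one double 6 in 24 throws
print(round(game_a, 4), round(game_b, 4))   # 0.5177 0.4914
```

Although the expected number of successes is $\frac{2}{3}$ in both games, the first is slightly better than even and the second slightly worse.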

Where does the Binomial distribution occur?
The discrete random variable we are interested in is the number of successes
There are n independent trials
There are 2 distinct outcomes
The probability of success (p) is the same each time

Pepys's problem
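The three propositions can be compared using binomial probabilities with $p = \frac{1}{6}$. This is a sketch rather than the solution shown in the session; `p_at_least` is an invented helper that uses the complement $1 - P(X \le k - 1)$.

```python
from math import comb

def p_at_least(k, n, p=1/6):
    """P(X >= k) for X ~ B(n, p), via the complement 1 - P(X <= k - 1)."""
    return 1 - sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k))

results = {label: p_at_least(k, n)
           for label, n, k in [("A", 6, 1), ("B", 12, 2), ("C", 18, 3)]}
for label, prob in results.items():
    print(label, round(prob, 4))
```

Proposition A comes out as the most likely and C as the least likely, even though the expected number of sixes is proportional to the target in each case.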

Using Binomial Probabilities


You play a game where you have 8 goes and the
probability of winning any go is 0.6.
Equivalent expressions:

wins 0 1 2 3 4 5 6 7 8
losses 8 7 6 5 4 3 2 1 0
the number of wins is more than 5
the number of losses is fewer than 3
the number of wins is at least 6
the number of losses is at most 2
the number of losses is 2 or less
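That these descriptions are equivalent can be confirmed numerically. The sketch below assumes the number of wins is $B(8, 0.6)$, so the number of losses is $B(8, 0.4)$; `binomial_pmf` is an invented helper name.

```python
from math import comb, isclose

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p_win = 8, 0.6
# "More than 5 wins" and "at least 6 wins" are the same event: wins = 6, 7, 8.
p_more_than_5_wins = sum(binomial_pmf(w, n, p_win) for w in range(6, n + 1))
# "Fewer than 3 losses", "at most 2 losses", "2 or less": losses = 0, 1, 2.
p_at_most_2_losses = sum(binomial_pmf(l, n, 1 - p_win) for l in range(0, 3))
print(round(p_more_than_5_wins, 4), round(p_at_most_2_losses, 4))
```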

Answers to equivalent expressions

wins 0 1 2 3 4 5 6 7 8
losses 8 7 6 5 4 3 2 1 0
wins 0 1 2 3 4 5 6 7 8
losses 8 7 6 5 4 3 2 1 0

So…
What's the probability that at least one of us in this room has a mobile phone contaminated with E-coli?

The geometric distribution


How many mobile phones would have to be tested before
one contaminated with e-coli would be found?

[Tree diagram: Phone 1, Phone 2, Phone 3, each branching into "contaminated" and "not contaminated"]

The geometric distribution


How many mobile phones would have to be tested
before one contaminated with e-coli would be found?
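$P(X = 1) = \frac{1}{6}$
$P(X = 2) = \frac{5}{6} \times \frac{1}{6}$
$P(X = 3) = \left(\frac{5}{6}\right)^2 \times \frac{1}{6}$
$P(X = 4) = \left(\frac{5}{6}\right)^3 \times \frac{1}{6}$

Multiplying by $\frac{5}{6}$ each time (connection to geometric sequences)

$X \sim \text{Geo}\left(\frac{1}{6}\right)$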
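The pattern in these probabilities can be generated for any number of phones. A minimal sketch for $X \sim \text{Geo}(\frac{1}{6})$, where $X$ is the number of the first contaminated phone; `geometric_pmf` is an invented name.

```python
from fractions import Fraction

def geometric_pmf(r, p):
    """P(X = r): the first success happens on trial r (r = 1, 2, 3, ...)."""
    return (1 - p) ** (r - 1) * p

p = Fraction(1, 6)
probabilities = [geometric_pmf(r, p) for r in range(1, 5)]
print(probabilities)

# Each term is the previous one multiplied by the common ratio 5/6.
ratios = [b / a for a, b in zip(probabilities, probabilities[1:])]
print(ratios)
```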

Now spend a few minutes noting the most significant things that have occurred to you during this session on the reflection sheet.

Session 4 : Normal Distribution


Session Content:
Normal or not ?
Characteristics
Working with tables and probabilities
Solving problems

Normally Distributed
What do we mean by this?
What does it look like?

The normal distribution is sometimes called the Gaussian distribution

The average man
The chest circumferences of 5738 Scottish soldiers. Quetelet, 1846

Quetelet's Data

The normal curve


Symmetrical
Bell shaped
A continuous random variable plotted on horizontal axis
Area under curve represents frequency
Usually we scale total area under curve to be 1 so then area represents probability
Examples?

Is it normal?
Birth weights

Is it normal?
Weights of 5 year olds

http://www.nationalstemcentre.org.uk/elibrary/resource/6710/anthropometric-data

Is it normal?
Weights of 15-18 yr old females

Is it normal?
Heights of 15-20yr females

Exploring the normal distribution

OUTLIERS
What is too big or too small?
Values which are more than 1.5 x interquartile range
above upper quartile or below lower quartile

OR
Values which are more than 2 standard deviations away
from the mean (if distribution is approx symmetric)

Useful rules of thumb for any normally distributed variable

95%
99%
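The rules of thumb can be checked against the standard normal distribution. The sketch below assumes the 95% and 99% labels refer to bands of about two and three standard deviations either side of the mean; that reading is an assumption, and `within` is an invented helper name.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal, mean 0 and standard deviation 1

def within(k):
    """P(-k < Z < k): proportion within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))
```

Because any normal variable can be standardised, the same proportions apply whatever the mean and standard deviation are.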

Useful rules of thumb for any normally distributed variable

Card sort

The Standard Normal Distribution


For the standard normal distribution Z~N(0,1)
mean = 0
standard deviation =1

Converting to standard normal

Using GeoGebra to find normal probabilities

Normal Distribution Tables

Using the symmetry of the standard normal distribution

Exam question
AQA - May 2012 No. 5 (part (a))

Finding the probabilities of normally distributed variables


Write down the variable: X is the weight of the 2.5kg bags
Write down the distribution
Write down the probability wanted: P(X < 2.8)
Standardise: X to Z (subtract the mean and divide by the standard deviation)
Sketch a normal diagram
Use normal tables
Answer in context: The probability the bag weighs less than 2.8 kg is 0.6304
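The standardising step can be sketched in code, with the standard normal cumulative distribution taking the place of the tables. The mean and standard deviation below are hypothetical values chosen only for illustration; they are not taken from the exam question, so the printed probability should not be expected to match the answer above.

```python
from statistics import NormalDist

# Hypothetical parameters: the mean and standard deviation from the exam
# question are not reproduced here, so these values are illustration only.
mean_weight, sd_weight = 2.75, 0.15   # kg

def p_less_than(x, mean, sd):
    """P(X < x) for X ~ N(mean, sd^2), found by standardising to Z."""
    z = (x - mean) / sd
    return NormalDist().cdf(z)

print(round(p_less_than(2.8, mean_weight, sd_weight), 4))
```

Standardising first and then using the $N(0, 1)$ distribution gives the same result as working with $N(\text{mean}, \text{sd}^2)$ directly, which is why one set of tables is enough.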

Tackling problems

Less Usual Questions

Now spend a few minutes noting the most significant things that have occurred to you during this session on the reflection sheet.

Please tell us what you thought!
