Original Title: StatisticallyBasedReportsWebPL.pdf

Uploaded by Tharanga Devinda Jayathunga


STATISTICS Level 3

NCEA | Walkthrough Guide

Level 3 Statistics | Statistically based reports Walkthrough Guide

Introduction

Explanatory, response and confounding variables

Observations vs Experiments

Causation and Correlation

Correlation

Checking causation versus correlation in a problem

Quick Questions

Samples

Random sampling methods

Non-random sampling methods

Quick Questions

Sources of Error

Sampling and non-sampling errors

Confounding variables

Blast through the graphs

Misleading graphs

Comparing Measures

The margin of error and the 95% confidence interval

Using the confidence interval for claims

MoE backwards

Dependent probabilities

Independent probabilities

Quick Questions

Key Terms

Statistically Based Reports | Introduction

INTRODUCTION

Externals have never been easier. What could be simpler than sitting back, reading

an article, and pulling it to pieces? Not much! That’s the whole gist of good old 3.12.

You’ll be given three (very) short statistically based reports to read, and then you’ll

write an evaluation as to whether each one is legit or not.

Okay, that’s not all there is to it. We’ll also be doing some calculations and looking at

things like sampling methods, survey questions, confidence intervals and the margin

of error. Some of this might be familiar to you from Level 2, but some will be brand

new!

Buckle in!

Before you even breathe in an article’s direction, you have to speak their language.

You need to know how they think. It’s all about their sources and sampling methods.

What kind of studies do they use? How do they get their data? And – most importantly – is

their data legit?

Then you have to check if they’ve gone wrong anywhere. Keep an eye out for lurking

variables, misleading graphs, and whether they’re talking about correlation or

causality.

We’ll round off this three-course meal with a healthy plate of comparing measures.

That’s mostly checking to see whether the stuff they claim is true lines up with the

maths. In short: We’ll be analysing articles, and then evaluating (weighing up) whether

they’re legit or not.

Yep, this is the wordiest of all the stats externals, but don’t let that intimidate you! The

trick here isn’t spewing out a hundred words a minute. It’s about the quality of what

you write, not the quantity.

The trick here is memory. The better you remember all the different sampling methods,

and sources of error, the easier it’ll be to tell whether an article’s a fraud or not. If a

survey’s badly sampled, you know it’s a spoof.

And, of course, when it comes to the calculations – take it slow, check your answers,

and show your working! You can still get part marks on your calculations even if you

got the final answer wrong, so why not lay it all out there?

3 Level 3 Statistics | Statistically Based Reports | © Inspiration Education Limited 2018. All rights reserved.

Yes, we know reading articles may be boring, but that doesn’t mean you can skim read

them. Before you start answering questions make sure you read the articles properly

at least once, maybe even twice, to make sure you’ve picked up on all the hidden

information. Highlight the important parts such as survey size, survey group, sampling

methods and results so you have them at the ready to chuck into your answers. This

will save you from reading through the articles again and again to find the

important information you need.

Now, with all of this wording, things can get a little tricky to understand. That’s why,

at StudyTime, we’re pretty much GCs (good citizens), so to help you out, we’ve made

this guide in plain English as much as we can. We’ve also included a glossary for some

of the key terms that you’ll need to master for your exam.

If learning key words first off scares you (or bores you), then focus on understanding

the concepts the first time around, and then memorise the definitions.

In fact, in this guide, we focus on helping you to understand the concepts first. We

use examples and analogies to help you understand statistics in a way that is fun, and

makes sense in the real world.

However, the language we use isn’t always something you can directly write in your

exam! When this is the case, we offer a more scientific definition or explanation (in

a handy blue box) underneath! These boxes are trickier to understand on your first

read through, but contain language you are allowed to write in your exam! Look out

for them to make sure you stay on target!

Statistically Based Reports | Key Terms & Sampling Methods

This standard is all about articles and surveys!

b) You evaluate the articles

Evaluate means you work out whether they’re true and fair, and whether their conclusions are legitimate.

But, before we even look at any actual reports, we need to get some core concepts

sorted: the kinds of words statistically based reports use, the goals of those reports,

and the ways studies and surveys are run.

Types of studies – Experiments vs Observations

Causation vs Correlation

Samples – what they are and what a good one looks like

Sampling methods

Don’t worry, even though this assessment is pretty wordy, we’ll run you through all

the specific terms you need to know – in fact, we’re going to steal some important

words from our regular maths lessons.

In this case, variables are the features your survey or experiment is investigating.

For this external, you’ll have to be able to define what explanatory and response

variables are.

The explanatory variable is the one that’s often controlled and changed.

• It’s usually the cause or the explanation of the other variable. It’s supposed to tell

the other one what to do. For that reason it’s also called the independent variable.

• It’s always put on the x axis.

The response or outcome variable is the main focus. It’s what we want to measure

and compare.

• We measure how much it changes when the explanatory variable changes.

Because it depends on the explanatory variable, it’s also called the dependent

variable.

• It’s always put on the y axis.

For example, an article might test whether people walk faster if they listen to music.

The explanatory variable would be ‘music’ and the response ‘walking speed’ because

music is the thing we think might cause a change in walking speed.

Of course, maybe it’s hit you that there might be problems with some of these

‘variables’. I mean, what kind of music are the people in the study listening to? What if

heavy metal makes people walk faster and jazz makes them walk slower? That would

totally change the results of the article.

Those are the sorts of things you have to ponder on with this assessment. Be critical

– question everything!

When we do a study, there are also confounding variables. These are variables which

can change the response variable and make our results unreliable. We also call them

lurking variables because they just lurk around ruining our study.

So if the people from our music and walking-speed study got to choose the music

they were listening to, that could be a confounding variable – maybe only the people

who chose to listen to up-beat music would walk faster. If the researchers running the

study didn’t get any information on this, they’re not going to be able to make good

conclusions from their data.

Confounding variables are any outside factors that change the experiment results.

Which are the explanatory and response variables in these situations? Can you think

of any confounding variables?

An ice cream store tends to sell more ice cream on weekends than on weekdays.

People who eat more kale, quinoa and chia seeds are healthier.

Men with moustaches earn more money than men who don’t have moustaches.

Observations vs Experiments

If we want to question a report, we need to know where they got their info from in the

first place. How did they test their groups? Was it using:

A statistical experiment?

A randomised statistical experiment?

An observational study?

Hold up! What are those? Let’s look a bit more deeply at the different types of

testing we can do:

In a statistical experiment, people are divided up into groups, the groups are given different

treatments (versions or levels of the explanatory variable) and then their response

variable is measured and compared.

A control group is a group that doesn’t have any treatment applied to it – the

researchers leave this group’s explanatory variable alone.

CONTROL GROUP vs OTHER GROUP

Think of one group being fed cheese before bed, and another getting nothing. Then

you sit back and see whether the group who ate cheese get weird cheese dreams.

CHEESE vs NO CHEESE

In a randomised statistical experiment, instead of the researcher simply dividing people into the different groups and testing

them, the people are randomly shuffled into each group. This is a really important

difference, and we’ll explain why soon!

An observational study is stuff like surveys and polls. We want to know the effect

of something by observing it occurring naturally (by itself) or without researcher

intervention (the researcher doesn’t get their dirty mitts on the explanatory variable).

Maybe we’re looking at whether binge-drinking affects students’ grades. We’re not

going to force one group of students to start binge-drinking just to see, are we? No

way! That’s suuuper unethical. So we can’t change the explanatory variable. What

we’d do is observe students, find out about their drinking habits – the explanatory

variable - and compare their grades – the response variable.

Causal claims

or a randomised experiment?

Well, often the whole reason a researcher bothers doing a study is to see whether

something directly affects something else – something we call a causal claim.

A causal claim involves stating that one event directly affects another event.

In other words, we’re saying the explanatory variable does make the response change.

Like the more pieces of gum you eat, the mintier your breath is.

[Graph: fresh breath against number of pieces of gum]

Thing is, we can’t always justify the researcher’s causal claim – like in an observational

study, they have pretty much no control over the explanatory variable. They just watch

it happening. In cases like those, how can we ever be sure that something else isn’t

affecting the response variable?

So, if we want to make a causal claim, we have to control everything except the response

variable, as much as we can. That means we have to control the independent variable

(so we can’t use an observational study) and we also have to control confounding

variables to make sure nothing else is affecting our results.

We also have to make sure the participants in our experiment are randomly divided

into groups, so that there aren’t big differences between groups based on who they

contain. We can only compare two groups if they’re similar apart from the explanatory

variable, the thing we’re changing on purpose.

For example, you wouldn’t want all the girls in one group and all the boys in another –

those groups would have natural differences, so you couldn’t say that the differences

in the response variable were because of the explanatory variable.
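If you want to see what "randomly divided into groups" could look like in practice, here is a minimal Python sketch. The function name `randomly_assign` and the participant labels are made up for illustration; they are not from any real study.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups.

    The two groups differ in size by at most one person.
    """
    rng = random.Random(seed)       # a seed makes the shuffle repeatable
    shuffled = list(participants)   # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participants: 10 girls and 10 boys.
participants = [f"girl_{i}" for i in range(1, 11)] + [f"boy_{i}" for i in range(1, 11)]
treatment_group, control_group = randomly_assign(participants, seed=1)

print("Treatment group:", treatment_group)
print("Control group:  ", control_group)
```

Because the shuffle ignores who the participants are, each group will usually end up with a mix of girls and boys, rather than all the girls in one group and all the boys in the other. It will not be a perfect split every time, but nobody hand-picked it.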

This sounds like a lot to go through, but it all adds up to one simple rule:

If we went with:

An observational study – We just looked at a bunch of schools, noted whether they had uniform or not, and then looked at their bullying problem.

We did not control the explanatory variable. There would be other differences

between those schools than just uniform. For example, primary schools often

don’t have uniforms, while most high schools do, and there are probably different

levels of bullying at high school versus at primary school (like primary schoolers

are probably less likely to care what clothes you wear than high schoolers).

So we can’t compare the two groups based only on their uniforms, because there

are other differences affecting bullying levels.

Because we can’t control other factors, we can’t make a causal claim based on an

observational study.

A statistical experiment – The two groups – uniform and no uniform – are assigned

by the researcher, but not at random. There might be differences between the

groups depending on how they were chosen – like if more girls’ schools were

chosen to wear uniform, and more boys’ schools were in the non-uniform group.

Maybe girls are more likely to judge each other based on clothes than boys, or the

other way around, which would affect bullying levels. If the groups are different

to start with, you can’t prove that any difference between them at the end was

because of the explanatory variable.

If the groups are similar, then the causal claim can be justified.

A randomised experiment – This shuffles them up so that we’d have a mix of ages, sizes, and genders for the non-uniform

group and the uniform group. If these affect bullying levels, this means they

will hopefully affect both groups evenly. Basically there will always be some difference

between groups, but randomisation minimises it.

This means that randomised experiments are the best study for justifying causal

claims.

To sum up, when a report makes a causal claim, we need to look at two things:

Whether there are any lurking variables that might have an influence (like age),

and how they would affect the results.

Whether the results are from a randomised experiment, not an observational study

Why is it important to randomly divide people between the groups?

Correlation

Okay, so we can’t always say that the explanatory variable causes the response

variable – so what do we do when we can’t? What’s the point of a study then?

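So what’s the difference between correlation and causality, exactly?

Correlation is where one variable tends to change along with the other, but this doesn’t mean they make each other change.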

Basically, correlation means the two variables are related, but we don’t know if one is

causing the other to change, or if something else is causing both of them to change.

This is really important because often reports say their explanatory and response

variables have causality, when really, they only have correlation.

For an example of variables that have correlation but not causation, think of

‘temperature outside’ and ‘number of people wearing sunglasses’. On a warmer day,

you would expect to see more people wearing sunglasses, right? But they’re not

wearing sunglasses because it’s warm – they’re wearing them because it’s sunny, and

sunny days just tend to be warmer.

[Graph: number of people wearing sunglasses against temperature]

The most important thing to remember here is that correlation does not mean

causation. Just because both variables are changing, it doesn’t mean that one is

making the other change! There might be causation, but we would have to do a

randomised study to know that for sure.

We’ll get into seeing what correlation looks like on a graph later, when we go through

all the graph types you’re likely to see. But for now, remember that it’s when two variables change together.

Can you think of any more correlated variables?

Let’s do an example – that’s right, we actually get to look at a question! Here’s a study:

‘Researchers conduct a study on 100 random beachgoers on a sunny day. They ask

the beachgoers how many hours they spent in the sun and then measure the severity

of their sunburn on a scale of 1 to 10 – 10 being ‘fried to a crisp’. The researchers

found that the longer the beachgoers spent in the sun, the more severe their sunburn.

They concluded that staying out in the sun longer causes more severe sunburn.’

‘Explain whether there is statistical evidence to support the claim and identify and

discuss the effect of a possible confounding variable.’

First of all, we’d go through and write down the explanatory and response variables,

the type of study, and then a possible confounding/lurking variable. That’s a good

place to start with any report you get given, so let’s do it now.

Explanatory variable: hours spent in the sun

Response variable: severity of sunburn

Type of study: Observational – the researchers didn’t change how long the people

stayed in the sun

Confounding variable: Whether they slip-slop-slapped n’ wrapped or not.

Time to ‘explain’. You would probably guess that the more time you spend in the sun,

the more burnt you’ll get. However, just because the claim makes sense, it doesn’t

mean the report is valid! It may just happen to have that result.

You’re not meant to be asking whether the report got the right answer, you’re meant

to be asking if they got it in the right way - is there statistical evidence to support the

claim?

Because this is an observational study, the evidence can’t support a causal relationship claim.

Now, you need to go a step further, and explain why it’s an observational study.

We know it is because the number of hours the beachgoers spent in the sun wasn’t

controlled by the researchers. And then there’s the confounding variable. We’ve

identified it, now we need to discuss its effect.

Some people may have worn sunscreen, hats, or sat in the shade while on the

beach, which would have significantly impacted whether they got sunburn or not.

Maybe people who stayed out longer were also less likely to wear sunscreen, so

they would have gotten more burnt. The beachgoers were not asked to identify

these variables to the researchers, so we don’t know what effect they

may have had on the results of the study.

We do all this to dismiss the report’s causal claim. Basically, we can say that the

evidence in the report does not support the researchers’ causal claim; it can only be

used to show a correlation between time spent in the sun and how badly someone

gets sunburnt.

Statistical evidence gives a degree of certainty towards certain results or claims from

a statistical survey or study. This can be in the form of statistical data used to support

a claim about a population and inferences drawn from this data.

Quick Questions:

Explain whether or not there is statistical evidence to support the claims. Identify and

discuss the effect of a possible confounding variable, and whether or not there may

be causation.

Statistically Based Reports | Samples

50 randomly-selected students fill out a reflection sheet each week. The other

control group of 50 randomly-selected students do not fill out a reflection sheet

each week. The researchers found that students who filled out the reflection

sheet participated more in class and got higher grades.

A study looks at whether caffeine affects the speed at which students complete a statistics test. Students were asked

how many cups of coffee, tea, or energy drink they had that morning once

the test was complete. Researchers found that students who had one or more

servings of caffeine that morning completed the test faster.

SAMPLES

Okay, so we’ve had a look at the ways we can get data, and whether we can make causal

claims about it. But before we can run a survey, or an experiment, or whatever, to

collect our data, we have to decide who we’re going to get it from – our sample.

So what are samples, and why do we use them in the first place?

Usually when someone runs a survey, it’s because they want to find information that

applies to everyone in a certain group – for example, all teenagers or all men with

moustaches.

But we can’t get information on absolutely everyone! That would take ages, and it

would be really inconvenient and expensive. In fact, that’s called a census – and it’s

pretty rare to have the chance to do one.

A census is a survey of every member of the population. It’s not usually practical to do a census.

That’s why the people who write these articles or run these surveys choose a group

of people – a sample - from the wider population, then ask them certain questions

or get them to undertake an experiment to gather data. For example, we might

select only a few men with moustaches, and then use the results to make claims

about all men with moustaches.

A sample is a group taken from the overall population, which we use to make estimates

and generalisations about the population.

So, if we want to make generalisations about the population, our sample has to be as

similar to the population as possible. In fact, it has to represent the population. This is

the golden rule of sampling:

To get a representative sample, we have to first make sure we’re sampling enough

people (or things)! For a sample to be big enough to make claims about the population,

it has to have a sample size of at least 30.

If the sample is smaller than that, the results won’t reflect the true population value; they’ll just be a guess from whatever sample

there was – and that means we can say no deal to any claim the report is making

about the population.

The true population value is the statistic we’d get if we could test the whole population.

In general, if the report is going to be legit, whatever sampling method the researcher

used has to be unbiased.

Bias in general means that because of some problem with our sample or our test, we

tend to get a particular type of answer or result that doesn’t fit the actual population.

Bias is where because of some fault in our method, the responses we get tend to be

of a particular kind that does not match the truth.

For a sample, bias is where the sample isn’t representative, like if it has too many

people from a particular group within the population, or completely excludes a group.

This causes us to get results that don’t match with the population in general.

For a good unbiased sample, everyone should have the same chance of getting

selected for our sample – that way every group can be represented!

Sampling bias is where there is a specific preference towards one group over others

being selected for the sample.

An unbiased sample means samples are taken at random, with no preference over

certain groups in the population, and everyone has a fair chance of being chosen for

the study.

Researchers can be biased when sampling a certain group of people to survey from

the population, because they might have a particular outcome they want.

For example, if the researchers wanted to get a result for their report that shows

that New Zealanders are bad at spelling, they could choose mostly six-year-olds for

their sample.

Sure, they’re probably going to get the result they wanted – but their survey will

be biased, because six-year-olds usually are worse at spelling than the general

population, and most people are not six years old. So their result won’t be reliable.
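To see how much a biased sample can drag a result around, here is a rough Python sketch. Everything in it is made up for illustration: the population, the ages, and the spelling scores out of 10 are invented numbers, not real data, and the helper names `make_population` and `mean_score` are just hypothetical.

```python
import random
from statistics import mean


def make_population(seed=0):
    """Build a made-up population of (age, spelling_score) pairs.

    Assumption for this sketch only: six-year-olds score 1 to 5,
    everyone else scores 5 to 10.
    """
    rng = random.Random(seed)
    six_year_olds = [(6, rng.randint(1, 5)) for _ in range(100)]
    everyone_else = [(rng.randint(7, 80), rng.randint(5, 10)) for _ in range(900)]
    return six_year_olds + everyone_else


def mean_score(people):
    """Average spelling score for a list of (age, score) pairs."""
    return mean(score for _, score in people)


population = make_population()
rng = random.Random(1)

# Unbiased: everyone in the population has the same chance of being picked.
random_sample = rng.sample(population, 50)

# Biased: only six-year-olds can be picked.
six_year_olds = [person for person in population if person[0] == 6]
biased_sample = rng.sample(six_year_olds, 50)

print("Whole population:", round(mean_score(population), 2))
print("Random sample:   ", round(mean_score(random_sample), 2))
print("Biased sample:   ", round(mean_score(biased_sample), 2))
```

The random sample should land fairly close to the whole-population average, while the six-year-olds-only sample comes out much lower. Same population, very different "result", purely because of who got picked.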

They could also be biased accidentally, by not realising they’re only choosing

from a certain group – like if they just survey their friends, who probably all have

stuff in common.

So to see where all this bias can come from and how to avoid it, let’s run through

the possible sampling methods. As we meet each one, we’ll tell you how legit it is,

and why.

Why do we need a representative sample?

What is a biased sample? Try to explain it in your own words.

All the good sampling methods involve some randomness – like we said, everyone has

to have a chance of being included in the sample for it to be fair and representative!

To take a random sample, we need some list or way of seeing all the things or

people in the population. Like if I wanted to take a sample of kids at school, for

everyone to have a chance of being in my sample, I’d need the whole school roll,

or some people would get missed out. The list you choose members from for your

sample is called a sample frame, which just means “this is a list of all the people that

I chose from to get a sample”.

Simple random sampling involves numbering off your whole list of the population

and then using a random number generator to find which members of the population

will be included in your sample.

It’s super legit because it’s so unbiased (since it’s completely random) and makes sure

we don’t get any wild cards. Everyone gets a fair chance.

Say we were looking at the conditions of flats in Dunedin and we needed a simple

random sample of flats – we’d number them off and randomly choose them using a

random number generator. If we got the number 11, say, house number 11 would be

included in our sample.
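Here is a minimal Python sketch of that idea, assuming a tiny sample frame of 12 flats numbered 1 to 12. The function name `simple_random_sample` is made up for this example; the random choice itself uses `random.sample` from Python's standard library.

```python
import random


def simple_random_sample(sample_frame, n, seed=None):
    """Pick n different members of the sample frame, all equally likely."""
    rng = random.Random(seed)          # a seed makes the result repeatable
    return rng.sample(sample_frame, n)


# Hypothetical sample frame: flats numbered 1 to 12.
flats = list(range(1, 13))
sample = simple_random_sample(flats, 4, seed=11)

print("Flats in our sample:", sorted(sample))
```

No flat can be picked twice, and every flat on the list has the same chance of ending up in the sample, which is exactly what makes the method unbiased.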

[Figure: houses numbered 01 to 12]

Systematic sampling is where the parts of a population are ordered, and then every

nth one is picked, starting from a random point.

This means you need to order your sample frame somehow. You could order them

alphabetically, by age, by seating order – and then pick every 2nd or 3rd, or whatever,

part – depending on how many you want in your sample and how many there are on

the list. With our Dunedin flats, we could order them according to street numbers and

pick every 4th house.

Wait – if we do that, won’t some houses not ever get picked? Like if we start at house

4, we’ll always get house 8 next, and then house 12. Didn’t we say everyone needs a

chance of being in the sample?

Luckily there’s a smart way around this problem. Basically, if we’re picking every 2nd, every

3rd, every 4th, every kth house – we just use a random number generator to pick a number

between 1 and k. If we’re picking every 4th house, we’d generate a number between 1

and 4, and start there. Say we got the number 3, then we’d pick the 3rd house, the 7th

house, the 11th house, etc… so any house could get picked to be in the sample!
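The random-start trick can be sketched in a few lines of Python. This is only an illustration: `systematic_sample` is a made-up function name, and the sample frame is assumed to be 12 houses already in street-number order.

```python
import random


def systematic_sample(sample_frame, k, start=None, seed=None):
    """Pick every k-th member of an ordered sample frame.

    start is a position between 1 and k. If it is not given,
    it is chosen with a random number generator.
    """
    if start is None:
        start = random.Random(seed).randint(1, k)  # 1 and k are both possible
    return sample_frame[start - 1::k]              # position 1 is list index 0


# Hypothetical sample frame: houses 1 to 12, ordered by street number.
houses = list(range(1, 13))

# Say the random number generator gave us 3, and we pick every 4th house.
print(systematic_sample(houses, k=4, start=3))  # [3, 7, 11]

# Letting the function choose the starting point itself.
print(systematic_sample(houses, k=4, seed=2))
```

Once the starting point is fixed, the rest of the sample is completely determined, so the randomness in this method all comes from that one starting number.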

Although this sounds pretty random, BEWARE of natural patterns. For example, if we

start at house 4, every 4th house will have an even street number. That means they’ll

all be houses from the same side of the street, and that’s biased!

A better way would be to pick every 3rd house, so that we’d get a mix of even and odd

numbers. Systematic sampling can therefore involve a little extra thinking to ensure

the sample is actually random. Because of this, it’s often pretty legit, but not as good

as other random sampling methods!

[Figure: houses numbered 01 to 12]

Stratified sampling is when the researcher tries to make the sample look as much like

the population as possible, by grouping it using some other variable.

For example, say the researcher knew that 1/3 of Dunedin houses were red, 1/4 were

blue, 1/4 were yellow, and 1/6 were green. Then they would want to make 1/3 of their sample

houses red, and so on.

Basically you divide your sample frame up based on some other feature, like house

colour, and then sample within each group based on how big the group is.

This method is super legit, because it deliberately makes sure groups get represented

to avoid bias, and the sample is still random because within each group you take a

random sample (e.g., by simple random sampling).
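As a rough sketch of how that could work, here is some Python that splits a sample frame into groups and takes a simple random sample from each one, sized by the group's share of the population. The function name `stratified_sample` and the 120 made-up houses are assumptions for illustration only.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(sample_frame, stratum_of, n, seed=None):
    """Take a sample of about n members, split across groups in proportion to group size.

    stratum_of is a function that says which group a member belongs to.
    Group sizes are rounded, so the total can be slightly off n
    when the shares do not divide evenly.
    """
    rng = random.Random(seed)

    # Step 1: divide the sample frame into groups (strata).
    strata = defaultdict(list)
    for member in sample_frame:
        strata[stratum_of(member)].append(member)

    # Step 2: simple random sample within each group, sized by its share.
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(sample_frame))
        sample.extend(rng.sample(members, share))
    return sample


# Hypothetical sample frame of 120 houses: 1/3 red, 1/4 blue, 1/4 yellow, 1/6 green.
houses = (
    [("red", i) for i in range(40)]
    + [("blue", i) for i in range(30)]
    + [("yellow", i) for i in range(30)]
    + [("green", i) for i in range(20)]
)

sample = stratified_sample(houses, stratum_of=lambda house: house[0], n=12, seed=0)
print(Counter(colour for colour, _ in sample))
```

With 12 houses in the sample, the colours come out as 4 red, 3 blue, 3 yellow and 2 green, matching the population's proportions. Which particular houses get picked within each colour is still random.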

Cluster sampling

Cluster sampling is where you grab a whole group and use that as your sample.

Instead of selecting members spread out across your population, you just randomly

take one ‘cluster’ to act as a representative – a cluster being a group in the population

that’s close together.

You might randomly choose a few streets in Dunedin and then grab a bunch of

houses all along those streets to be your sample. Not all that credible, is it? Very

biased. It’s actually quite a scandal, in fact, because the group isn’t typical of the

whole population.

In the picture below, a cluster sample has been taken. BUT – gasp – there are no

green houses in the sample! The poor green houses are going to be horribly

underrepresented. No one’s going to even know they exist!

This is the worst of the random sampling methods, but it can still be okay if you take

quite a few clusters and make sure you’re choosing clusters at random.
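For comparison, here is a minimal Python sketch of cluster sampling where the clusters are chosen at random. The street names, house numbers, and the function name `cluster_sample` are all hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and take every member of each one.

    clusters maps a cluster name (e.g. a street) to a list of its members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names at random
    return [member for name in chosen for member in clusters[name]]


# Hypothetical clusters: four streets with three houses each.
streets = {
    "Street A": [1, 2, 3],
    "Street B": [4, 5, 6],
    "Street C": [7, 8, 9],
    "Street D": [10, 11, 12],
}

sample = cluster_sample(streets, n_clusters=2, seed=0)
print("Houses in our sample:", sample)
```

Notice that houses on the streets that were not chosen have no way of getting into the sample. That is why taking more clusters, chosen at random, gives the sample a better shot at looking like the whole population.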

As well as random sampling methods, there are non-random sampling methods. All

of these have problems, so if you see them used in a report, that’s something to

discuss!

Quota sampling

Quota sampling is where the researcher has to have at least a certain number of people from

particular groups in their sample so they’re not underrepresented or ignored.

This method has good intentions, as it looks to ensure the survey represents ALL of the

population. It especially tries to make sure minority groups get represented, because

a random sample could just by chance not contain people from small minorities.

These groups are often based on race, age, or gender. Maybe for one survey, 25% of

their sample needs to be Asian, 25% under 18, and 25% female.

Although this sounds good in theory, there are some tricks pulled by researchers

which can undermine how legit these surveys end up being.

What the sneaky old researcher will do is find people who have as many of those traits

as possible to fill their quota super-fast – like they might try to find people who are Asian

and under 18 and female. If a quarter of the sample are all teenage Asian women, that’s

not fair, is it? Although it may efficiently fill the quota, it doesn’t accurately represent

the population - it doesn’t represent any women who are over 18, for example.

The other problem is that this method isn’t usually random – maybe the researcher is

just trying to spot as many females as possible that walk past, and asking them the

questions. That’s also a source of bias, because only females who walk past can get

picked into the sample!

Basically, if you see the words quota sampling, you should be suspicious of that report!


Person in the Street has a cool name, but it’s not a cool sampling method.

This is where people are accosted by TV crews and such to get their opinion. It’s a

classic news gimmick – ‘We went out on the streets to see what people really think

about the new flag referendum!’ – but there’s so much that can go wrong.

Certain groups of people, like professionals or schoolkids, are usually at one location

at a particular time. Like if the sample is taken on the street during school time, anyone

who’s in class (as they should be) won’t be able to be represented!

The interviewer is also biased and will only ask people who look like they’ll give a

good answer – and come on, that’s not the true population value, is it?

Self-selected sample

A self-selected sample is basically what it sounds like – where people choose whether

they want to be in the sample or not.

This is often the case in web research, text-votes, phone polls, or voluntary surveys.

It sounds okay, but the thing is, the people who actually bother to text in and vote or

fill out surveys are those with strong opinions, lots of free time, or who’re invested in

the survey somehow.

Think of all the email links to surveys you’ve probably been sent at school – how many

of those did you actually answer?

In other words, the people who participate in the survey probably don’t reflect the

population as a whole.

Usually, the report you’re given won’t use the words “self-selected” – you’ll have to

figure it out, like if it says people had to phone in to give their answers, you’ll know it

was probably self-selected.

For example, remember how on American Idol (what a throwback) people used to

vote for their favourite singer to get through? Then they’d say, ‘America has chosen our

next idol!’ and the glitter cannons would burst and everyone would start screaming

and crying and partying and whatever.


[Figure: nested groups: Americans, people who watch American Idol, voters]

Thing is, America as a whole hadn’t chosen the winner – only the Americans who

bothered to watch the show, and of those, only the Americans who bothered to text

in and vote. They had to really care about the show to bother doing that. It would be

more accurate to say ‘The fans of the singers on American Idol have chosen our next

winner!’ than ‘America has chosen!’.

Quick Questions

Identify the (implied) population and the sample method used by the following

surveys:

A school would like to know whether all students would like the option of

movies being played in the auditorium during rainy lunchtimes. They survey

the two Year 11 classes in D Block, period 2 on Friday.

The local Mayor wanted to know the opinion of his electorate on banning 1080.

He rang up all of the people he personally knew from the electorate to ask.

The local council is conducting a survey to see how many people drink and

drive over Queen’s Birthday weekend. The police breath-test every 5th car until

their sample size of 30 is met.

A TV programme asked viewers the question, ‘Do you like Pina Coladas,

OR getting caught in the rain?’ Viewers were asked to text or email in their

response.


SOURCES OF ERROR

Now that we’ve covered the basics of what samples are, how we get them, and the

different types of studies – we can start looking at all the things that can go wrong,

and cause error in a report.

These are the kinds of things you’re gonna be discussing when you evaluate a report –

did it have lots of error, or was it pretty good?

Here, we are going to look deeply into the eyes of the variables, find out their inner

secrets and the type of relationship they share. We’ll also teach you how to read

through the lines in those graphs, figure out whether or not their data is misleading,

and if so, where the fibs are.

A closer look at confounding variables – stuff that’s on the lurk to wreck the sample!

Blast through the graphs

Misleading graphs

You have to know how to spot errors in a survey, because if the data’s screwed up,

it won’t accurately represent the whole population. More importantly, you can’t just

state that there’s something wrong with the data. Instead, you need to be able to state

why it isn’t accurate. That’s the guts of this assessment. No guts, no glory.

Errors are the reasons why the sample isn’t like the true population value. They come

in two forms:

Sampling errors happen because data is collected from a sample rather than the

whole population – and one sample will never perfectly reflect the population.

Non-sampling errors are basically everything else that can make a sample different

from the population – like bias and non-random sampling methods, or bad surveys.

There’s nothing you can do about sampling errors. Sampling errors can’t be avoided,

unless you somehow study the entire population rather than just a sample – do a

census. This does actually happen in some cases – every five years or so, the NZ census

sends out a survey to everyone in the country to get data of the whole population!


Like we said though, a census is usually pretty expensive and inconvenient – so if you

can’t do one, your sample is always going to be a bit different from the population

because of sampling variation.

Sampling variation means that every sample is different, so every sample will give

slightly different results. No sample will exactly match the population.

The best thing you can do to reduce sampling error is make your sample as large as

you can, and opt for one of the more ‘legit’ sampling methods we discussed. This will

at least keep your sampling errors to a minimum, because it’ll make the sample more

representative.
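To see sampling variation in action, here's a small simulation sketch. It assumes a made-up population where the true proportion is 0.65, then takes lots of random samples at two different sizes and looks at how far apart the sample proportions end up. The function names are hypothetical helpers for this illustration.

```python
import random

def sample_proportion(true_p, n, rng):
    """Proportion of 'yes' answers in one random sample of size n."""
    return sum(rng.random() < true_p for _ in range(n)) / n

def spread_of_samples(true_p, n, repeats, rng):
    """Gap between the highest and lowest sample proportion over many samples."""
    props = [sample_proportion(true_p, n, rng) for _ in range(repeats)]
    return max(props) - min(props)

rng = random.Random(2018)  # fixed seed so the run is repeatable

# Assumed true population proportion of 0.65, 200 repeated samples each time.
small_sample_spread = spread_of_samples(0.65, 25, 200, rng)
large_sample_spread = spread_of_samples(0.65, 2500, 200, rng)

print("n = 25:  ", small_sample_spread)
print("n = 2500:", large_sample_spread)
```

No two samples give exactly the same answer, but the big samples bunch up much more tightly around the true value than the small ones do.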

Ahhh, but non-sampling errors, that’s where the real problem lies, because they’re

harder to see. You can never get rid of non-sampling errors, but you can minimise

them. Errors could be:

1. When groups are excluded from the sample, on purpose or by accident – leading to bias

2. When people select themselves – only those with strong opinions or with stakes in

the results will bother to respond – also leading to bias

3. The wording or order of questions asked

4. False answers given due to social pressure.

5. Non-response errors – people just not answering your questions

1. Excluded Groups

Think of the sampling methods we just used. In some cases, groups were excluded by

accident and sometimes on purpose. For example,

Cluster sampling, where none of the green houses were in the sample

Person-on-the-street sampling, where TV interviewers might not bother to interview,

say, little kids or elderly people

Online surveys, where only those who have computers can respond. What about

all the others who don’t have that stuff?


2. Self-Selection

Idol again – only those people who really, really want their favourite singer to win will

text in and vote. Or maybe the owner of a flag-making company really, really wants

the NZ flag to change so he gets more business – so he votes for change at every

opportunity. People who aren’t that invested might not get around to casting their

vote.

This will make a sample unrepresentative because it will only represent the people

who went out of their way to be represented!

3. Iffy Questions

The wording or order of questions may not seem important at first glance, but it has a

lot of influence on how people in the sample respond. If a question isn’t a simple yes or

no, or it takes more than a few seconds to answer – get suspicious. Examples include:

‘do you like Pina Coladas and getting caught in the rain?’

These are hard to answer because people don’t know which part they’re

answering! Do I say yes if I like both? What if I like Pina Coladas but not getting

caught in the rain?

way. People will feel they need to answer in a particular way.

Most people aren’t going to say yes to that, even if they do – especially if the

person asking the question is asking it face-to-face!

Leading questions that make a statement before asking, which can sway people

towards thinking in a particular way.

If you had never had a Pina Colada, you might think ‘well, the question says they’re

delicious, so I probably like them’


Double negative questions. These are tricky because they ask things in a no-no

way.

To say yes, they do like Pina Coladas, the person would have to say no. Weird.

‘How many times did you drink a Pina Colada and get caught in the rain in your whole

life?’

No one keeps track of their whole life that carefully, so they might give a wrong

answer by accident or just have a wild guess.

a) Yes

b) Maybe

c) Possibly

d) No

What’s the difference between ‘Maybe’ and ‘Possibly’? How do I know which one

to pick?

particular way. Even if just a small amount of people are guided by the way questions

are asked instead of the question itself, we have a pretty major problem.

4. Peer/Social Pressure

Social pressure is pretty much what it sounds like. If researchers were interviewing a

group of students on how many hours of homework they do each week, some of the

students might have lied to sound more impressive to the interviewer, especially if

they were being asked face-to-face.

It sounds stupid, but come on, if you were being asked, ‘How many hours of Statistics

homework do you do per week?’ would you really be inclined to say, ‘None at all’ ?

This kind of error creates bias, because it means our answers tend to be more ‘socially

acceptable’ – even if they don’t match the truth!

5. Non-response error

This one’s pretty much what it sounds like – when people don’t answer, either by

missing some questions or just deciding not to do the survey.


If the survey says ‘In your opinion, what was the most important event in the Vietnam

War?’ a lot of people are going to think ‘well, I don’t know’ and leave the question

completely blank. People can also decide not to answer personal questions, like ‘do you

sleep naked?’ – they might not want to give away that information! That means you’re

only going to get answers from people who knew enough and were comfortable

enough to give an answer – leading to bias.

What are five types of non-sampling errors?

Confounding Variables

Remember confounding variables from our first section? Well, they’re back, causing

error in our results!

Confounding variables are any outside factors that change the experiment results.

way. Otherwise, because there’s all these external factors lurking around changing

our results, we get error because we aren’t sure if what we see in the results is really

because of our explanatory variable!

‘A random sample of 100 college students were tested to find the relationship between

the students’ weight and their parents’ weight. Researchers were able to show that

there was a relationship between student and parent weight.’

Let’s say the researchers wanted to claim that change in student weight was caused

by their parents’ genes (that’s right, a causal claim).

What are some lurking variables here? Think about it – what are some factors that

might affect student weight, apart from their parents’ genes?


Some families may have grown up in rural areas, while others may be from the city

… so their kids might have access to different food, which could influence their

weight. Not too hard, yeah? It’s pretty common sense. When looking for confounding

variables, just ask yourself,

‘What are some factors that might affect the response variable, not including the

explanatory variable?’

What should you ask yourself when looking for confounding variables?

Why do they cause error?

Reports aren’t always words – sometimes they have pretty pictures too! On the off

chance you do get some graphs in your exam, you have to know how they’re read.

You would have come across most of these in previous years, but here’s a refresher

just in case.

Box-and-whisker plot

[Figure: box-and-whisker plot on an axis from 0 to 9]

The Upper Quartile, or UQ, is the value the upper 25% of the data lies above.

In the same way, the Lower Quartile, or LQ, is the value the lower 25% of the data

lies below.

The Interquartile Range, or IQR, is the spread of the middle 50% of the data: the

distance between the LQ and UQ (UQ - LQ). This is the width of the rectangular box in the middle!

If you had two box-and-whisker plots, it’d be a simple matter to compare whether one

group had a bigger spread (larger IQR, longer plot) than the other, or which one had

a higher maximum, higher median, and so on.
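If you want to check these numbers yourself, here's a minimal Python sketch. The data set is made up, and `five_number_summary` is a hypothetical helper. Different textbooks and calculators work out quartiles in slightly different ways, so treat the `method="inclusive"` choice as an assumption, not the one true method.

```python
import statistics

def five_number_summary(data):
    """Return min, LQ, median, UQ, max and IQR for a list of numbers."""
    lq, median, uq = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": min(data),
        "LQ": lq,
        "median": median,
        "UQ": uq,
        "max": max(data),
        "IQR": uq - lq,  # width of the box in a box-and-whisker plot
    }

# Made-up data set for illustration.
summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

To compare two groups, run the same function on each and compare the medians and IQRs side by side.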


[Figure: scatter plot on an axis from 0 to 100]

We promised earlier we’d show you how correlation can be measured on a graph,

and here we go!

Scatter plots are used to display the values of a pair of variables measured from the

same source, or bi-variate data.

For example, you might say the more technology before bed, the less sleep. You’d

have measured ‘amount of technology before bed’ and ‘amount of sleep’ for each

person, and plotted them. That’s a negative relationship, and looks like this:

Linear regression is the fancy name for fitting a straight trend line through the dots in

these graphs. Basically, the line shows the relationship between the variables.

correlation) between the two variables. Just as they show negative relationships, they

can also show positive ones, like this lot:

Positive relationships (e.g. more exercise = more sleep) slope upwards towards the

right. Negative relationships (e.g. more technology = less sleep) slope downwards

towards the right.
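Here's a small sketch of how you could check the direction of a relationship without eyeballing the graph: fit a straight trend line and look at the sign of its slope. The numbers are invented for illustration, and `trend_slope` is a hypothetical helper using the usual least-squares slope formula.

```python
def trend_slope(xs, ys):
    """Least-squares slope of the straight line that best fits the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def direction(slope):
    """Describe which way the relationship slopes."""
    if slope > 0:
        return "positive"
    if slope < 0:
        return "negative"
    return "none"

# Made-up data: hours of technology before bed vs hours of sleep.
tech_hours = [0, 1, 2, 3, 4]
sleep_hours = [9, 8.5, 8, 7, 6.5]

slope = trend_slope(tech_hours, sleep_hours)
print(slope, direction(slope))
```

A negative slope means the line heads downwards to the right, which matches the 'more technology = less sleep' story.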


Depending on how spread out the dots are, we say the variables have a strong,

moderately strong, weak or no relationship, like the graphs show.

So if a report gave you a scatter plot and said there was a strong positive relationship

between the two variables, you’d expect to see quite a straight line of dots sloping

upwards – otherwise you could disagree with the report.

Bivariate data is data which includes measurements of two different variables of the

same population. The name comes from the fact that ‘bi-’ means ‘two’ and ‘variate’

refers to ‘variables’.

If a relationship between two variables is positive, which way will its graph slope?

Misleading Graphs

No doubt it’s pretty exciting to see some pictures in your exam. Finally! A graph!

Unless, of course, that graph is misleading. This is when the graphs don’t represent

their data fairly or are purposely biased towards a view. These can specifically be used

to make the results appear more favourable than they actually are. So watch out!

Some examples of misleading graphs (and their effects on the validity of the study!)

include…

Incorrect Proportions

[Figure: pictograph of No. accidents (axis 0 to 50) by Year, 2015 and 2016]

In this graph, the pictures used to represent each year get larger horizontally as well

as vertically. The problem with this is that we’re only supposed to be looking at the

fact that one year’s value is twice the other’s – measuring vertically – but the bigger

picture looks HUGE because it’s also twice as wide, so it covers four times the area. This

leads the viewer to think that the bigger value is much larger than it really is.

[Figure: bar graph of No. accidents by Year, 2015 and 2016, with the y-axis running from 10 to 60]

This isn’t a true representation because the y-axis doesn’t start at zero. Therefore, the

proportions are skewed, and the viewer gets the wrong idea about the relationship

between these groups. This is what the graph should look like – notice how the

difference between the two bars looks smaller?

[Figure: the same bar graph of No. accidents by Year, 2015 and 2016, with the y-axis running from 0 to 50]

Incomplete data

[Figure: graph with an unlabelled y-axis (0 to 16) over six months, x-axis labelled 1990M09, 1990M10, 1990M11, 1990M12, 1990M01, 1990M02]


It looks like global warming’s set permanently on the decline – or not! This is only a

tiny fraction of the overall data, and you don’t dare take such a massive conclusion

from such a small piece of information! It only appears as though global warming’s on

the decline because only 6 months of data is shown! If you looked at a bigger graph,

you’d see a much different overall trend!

As well as that, the y-axis is really confusing – what does it represent? What’s the unit?

We can’t read a graph if we don’t know the units!

Crime rate vs Unemployment rate

[Figure: dual-axis graph, left y-axis from 0 to 40, right y-axis from 0 to 800, x-axis showing the years 1999 to 2010]

The biggest effect of this graph is that it’s extremely confusing to read! One y-axis is

meant to apply to each group – so basically, this is two very different graphs in one. The

huge problem here is they’re not being compared on the same scale! That means what

appears to be a relationship might really only be due to the mixed-up scale. Sneaky!

3D Skewed

[Figure: 3D pie graph with sections for GTAV, Fallout 4, Battlefield 3, Skyrim, Flappy Bird, The Sims 4, Farmville and Super Mario 64]

Most graphs, especially pie graphs, are misleading when they’re presented in a 3D

format. This skews the shape of the graph so that some sections appear larger or

smaller than they really are! Here, the Skyrim section looks huge compared to the

others – but it actually isn’t.

Misread Graphs

Pay attention to the link between the article’s claim and what the graph’s showing.

For example, a study might use a graph on theft rates to prove that crime is on the

decline – but this is a very poor link to make since the graph only represents one kind

of crime. Look out for those sorts of bogus matches!

COMPARING MEASURES

So far you’ve been thrown into the deep end with definitions, but now we can get to

the fun stuff, calculations!

The previous sections have gone through all the definitions you need to interpret and

analyse studies, to prove whether or not the data is reliable and the claims from the

study are justified.

In this section we’ll look at how to calculate statistical measures (basically just numbers)

from the information you’re given. This will help you support your written analysis and

statements, so you can really prove those misleading surveys are fibbing!

MoE backwards

MoE for dependent probabilities

MoE for independent probabilities

In a perfect survey, our sample statistic should match the population statistic.

Because of sampling errors, the value in the sample will always be a little different

to the population value – like we said, samples don’t exactly match the population.


Well, it’ll never be perfect, but how do we tell whether it is ‘close enough’? How do

we see how much sampling error we have?

The margin of error is a measure of how uncertain we are about where the truth is –

basically it’s how we say, yeah, we know we have sampling error, but here’s how close

we think we are. We can say:

‘Well, I’m 95% sure the population proportion is within this distance from the sample

proportion.’

A margin of error is a small allowance which is made to allow for normal variability or

small unavoidable errors. It tells us how far the true value may be from the sample value.

‘We can be pretty sure the true population value lies between ____ and ____’

The 95% confidence interval is the sample’s statistic plus or minus the margin of error.

It gives a range that we’re 95% confident the population statistic will be in,

because if we took lots more samples and built a range like this from each one, about 95% of those ranges would contain the true population value.

So that’s easy, then! If we have the margin of error, we can find the confidence interval.

Take your sample proportion p, then add and subtract the margin of error.

It’s called a 95% confidence interval because we’re confident that if we were to take

hundreds more samples, 95% of them would have their proportions fall within the

margin of error of the population proportion, so we’re also pretty sure the population proportion is in that range.

A confidence interval describes the values we’re 95% confident the true population

value lies between.


When you calculate the margin of error and confidence intervals in this standard,

you’ll be working with proportions or percentages. We’ll show you how to do this with

an example:

Let’s look at the sort of question you might need to find a margin of error for.

Say a sample of high school kids ended up telling us that 0.65 or 65% of students have

iPhones. Does this mean that 65% of the whole population of high school students

have iPhones?

Not necessarily! We’d make a margin of error and say that the real statistic (the

population’s) is 95% likely to be a small amount below or above 0.65 or 65%.

But how do we find that small amount? What’s the range above and below 65% of

students that we are 95% sure the population proportion falls within?

We need to find the margin of error, and we usually do that using what’s called the

rule of thumb.

$$\text{Margin of Error} = \pm\frac{1}{\sqrt{n}}$$

The ± is plus or minus – remember, our confidence interval is (sample proportion

+ margin of error) as well as (sample proportion – margin of error)

Now let’s plug in our numbers and go. There were 100 students surveyed.

$$\text{Margin of Error} = \pm\frac{1}{\sqrt{100}} = \pm 0.1 = \pm 10\%$$

So our two end points for the confidence interval are 65 - 10 = 55%,

and 65 + 10 = 75%.

In other words, we can say that, based on the sample, ‘We are 95% certain that

between 55% and 75% of the people at this school have an iPhone’.
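The iPhone example can be sketched in a few lines of Python. This is just the rule of thumb from above turned into code, and the function names are hypothetical.

```python
import math

def rule_of_thumb_moe(n):
    """Rule-of-thumb margin of error, 1 / sqrt(n), as a proportion."""
    return 1 / math.sqrt(n)

def confidence_interval(p, n):
    """Approximate 95% confidence interval: sample proportion +/- MoE."""
    moe = rule_of_thumb_moe(n)
    return p - moe, p + moe

# The iPhone example: sample proportion 0.65 from 100 students.
low, high = confidence_interval(0.65, 100)
print(f"{low:.0%} to {high:.0%}")
```

Try changing the sample size to 400 and you'll see the interval shrink, because the margin of error halves.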


because of that, it only works well for proportions between 0.3 and 0.7. For probabilities

outside that range, it overestimates the MoE. But reports often use it anyway, so it’s

also called the ‘reported margin of error’.

So, if the question asks you if the reported margin of error was accurate, you need to

check if the sample probability is between 30% and 70% or not!
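To see why the rule of thumb overestimates outside the 30% to 70% range, here's a rough comparison sketch. The `standard_moe` formula, 1.96 × √(p(1 − p)/n), isn't from this guide. It's the usual textbook margin of error for a proportion from a large random sample, included here as an assumption so there's something to compare against.

```python
import math

def rule_of_thumb_moe(n):
    """Rule-of-thumb margin of error, 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def standard_moe(p, n):
    """Usual textbook 95% margin of error for a proportion (an assumption here)."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

n = 100
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(rule_of_thumb_moe(n), 3), round(standard_moe(p, n), 3))
```

Near p = 0.5 the two are almost the same, but near 0.1 or 0.9 the rule of thumb comes out a fair bit bigger than the textbook formula.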

How does a confidence interval relate to the margin of error?

When SHOULDN’T you use the rule of thumb method?

Calculating is easy, but then come the questions. Examiners like to ask you things like,

‘Is the claim that the majority of high school students have an iPhone justified?’

What do you say? Well, first of all – what defines majority? Obviously, the majority is

going to be anything over half – if more than half your friends want to eat at Nando’s,

then you go to Nando’s. It’s common sense.

So if the sample percentage is over 50%, AND the whole confidence interval is over 50% too,

then yes, the claim is justified!

‘The claim that ‘the majority of high school students have an iPhone’ is justified as the

95% confidence interval is between 55% and 75%, which does not include anything

less than 50%.’
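A quick sketch of that check, assuming the rule-of-thumb margin of error. `majority_claim_supported` is a hypothetical helper, and the second call uses a made-up 55% just to show a claim that fails.

```python
import math

def majority_claim_supported(p, n):
    """True only if the whole confidence interval sits above 50%."""
    moe = 1 / math.sqrt(n)
    lower_end = p - moe
    return lower_end > 0.5

# iPhone example: 65% of 100 students, interval 55% to 75%.
iphone_claim = majority_claim_supported(0.65, 100)

# Made-up example: 55% of 100 students, interval 45% to 65%, which dips below 50%.
shaky_claim = majority_claim_supported(0.55, 100)

print(iphone_claim, shaky_claim)
```

Only the lower end of the interval matters for a 'majority' claim, which is why the function ignores the upper end.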


MoE backwards

Sometimes the examiners won’t ask you to find the MoE. Sometimes they’ll give you

the margin of error, and tell you to find the number of people in the study!

‘In an ink blot test, ink blots on cards are held up and the individual is asked to identify

the shape. 35% identified an L&P bottle. The test has a margin of error of 3.0%. How

many people were tested?’

So what do we do?? We have the margin of error, we have some statistics – but no

number of people in the study! Fear not – let’s just go back to our MoE equation!

$$\text{Margin of Error} = \pm\frac{1}{\sqrt{n}}$$

We simply plug our number into this equation, rearrange, and solve to find n!

$$0.03 = \pm\frac{1}{\sqrt{n}}$$

First thing we’ll do is times both sides by √n to get rid of that nasty fraction!

0.03√n = 1

√n = 1/0.03

√n = 33.33

Finally, we square both sides, because squaring is the opposite of square rooting!

n = 33.33² = 1,110.89

Deep sigh of relief – we made it! We know there’s 1,111 people in the ink blot study,

not 1,110.89, because you can’t have 0.89 of a person so we round up. Let’s bullet

point those steps!


Bring the √n up by timesing it on both sides

Divide both sides by the MoE

Square both sides

Throw a party because you’re on the golden road to excellence!
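The steps above can be sketched as one tiny function. `sample_size_from_moe` is a hypothetical name. Note that the code doesn't round √n to 33.33 along the way, so it lands on about 1,111.1 rather than 1,110.89. Either way the answer is about 1,111 people.

```python
def sample_size_from_moe(moe):
    """Rearranged rule of thumb: MoE = 1 / sqrt(n)  =>  n = 1 / MoE**2."""
    return 1 / moe ** 2

# Ink blot example: margin of error of 3.0%, written as a proportion.
n_estimate = sample_size_from_moe(0.03)
people = round(n_estimate)  # nearest whole person
print(n_estimate, people)
```

Remember to turn the percentage into a decimal first (3.0% becomes 0.03), or the answer will be way off.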

Dependent Probabilities

Cool, so we’ve covered how to find margins of error and use them to answer questions

about one statistic.

Dependent probabilities are when we compare two groups that are connected

under one question. Basically, we compare the proportions of two options for the

same question.

• Do men with moustaches earn more money than men without moustaches?

• Did people who used ‘This’ brand of sunscreen get fewer burns than people who

used ‘That’ brand?

Say we conducted a study on sunscreen brands, where our margin of error is something

like ± 2%. In the study, 30% of people used Burn-Be-Gone, and 60% used Sayonara-

Sunburn. All the percentage left over is just no sunscreen, or other brands.

We might want to compare how many people use those two brands. We claim,

‘The percentage of people who use Sayonara-Sunburn is 30% greater than the

percentage of people who use Burn-Be-Gone.’

Now for the margin of error. If it’s ±2%, that gives us a confidence interval of:

28 - 32% for Burn-Be-Gone.

58 - 62% for Sayonara-Sunburn.

THAT means Burn-Be-Gone usage could boost up to 32% (30% + 2%). And if it

does, then that’s likely to result in the usage of Sayonara-Sunburn dropping down to

58% (60% - 2%).


[Figure: Burn-Be-Gone at 30% and at 32%]

They’re dependent on each other. If one goes up, the other might drop down, which

means the difference between them is going to vary.

Since the difference changes we can’t make an exact claim on the difference.

• We can’t claim, ‘The percentage of people who use Sayonara-Sunburn is exactly

30% greater than the percentage of people who use Burn-Be-Gone.’

If there are dependent probabilities, then the margin of error of the difference

between these two answers is around 2 x the margin of error for the entire study.

Example? We know our difference is 30%. We know our total margin of error is ±2%.

So our margin of error of that difference is:

Margin of Error of Difference = 2 × ±2% = ±4%

Aha! So our confidence interval for the difference between the two percentages is:

Confidence interval = (30% - 4%) to (30% + 4%)

Confidence interval = 26 – 34%


This means the confidence interval for the actual difference between the percentages of

people who use Burn-Be-Gone and Sayonara-Sunburn is 26 – 34%!

Makes sense, right? If Burn-Be-Gone went down to 28% then Sayonara-Sunburn

would go up to 62%, so the difference would be 62% - 28% = 34%. And if Burn-Be-Gone

went up to 32% then Sayonara-Sunburn would go down to 58%, so the

difference would be 58% - 32% = 26%.
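Here's the dependent-probabilities rule as a short Python sketch, working in percentage points. The function name is hypothetical. The 2 × MoE rule is the one from this section.

```python
def dependent_difference_interval(p1, p2, moe):
    """Confidence interval for p2 - p1 when both come from the same question.

    Rule from this section: MoE of the difference = 2 x the study's MoE.
    """
    difference = p2 - p1
    moe_of_difference = 2 * moe
    return difference - moe_of_difference, difference + moe_of_difference

# Sunscreen example: Burn-Be-Gone 30%, Sayonara-Sunburn 60%, MoE 2%.
low, high = dependent_difference_interval(30, 60, 2)
print(low, high)
```

If the article claimed the gap was 'at least 25%', this interval would back it up. If it claimed 'exactly 30%', it wouldn't.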

Why do we need to find a confidence interval for the difference between the stats?

Independent Probabilities

If you got two different questions and want to compare their stats – you’ve got

independent probabilities! The two proportions aren’t going to affect each other if

they’re two separate surveys or questions.

Say you conducted a survey way back in 2005 to see if employers are more likely to

hire people with tattoos. 45% of employers showed support, with a MoE of 3.5%. Ten

years later, you conduct another survey looking at the same thing, and this time 48.5%

of employers showed support with a MoE of 2.0%. Is support for tattoos really going up?

For starters, the surveys are independent. They happen ten years apart! So how do

we compare their stats? How do we know if support for tattoos is going up or down?

We can’t use the same rule as for dependent probabilities, because these things don’t

affect each other!

We have a shiny new rule of thumb method for this exact dilemma.

Find the margin of error of the DIFFERENCE between the two surveys, then see if the

confidence interval for that difference is entirely above 0, entirely below 0, or crosses 0!

At the moment, the difference between the two sample statistics is:

48.5 – 45 = 3.5


[Figure: the difference between 45% and 48.5%]

Our aim is now to find the margin of error of this 3.5%. The two surveys don’t have the

same MoE – so what do we do? Use this:

$$\text{MoE of difference} = \pm 1.5 \times \frac{\text{MoE}_1 + \text{MoE}_2}{2}$$

That fraction on the end is finding the average of the two margins of error we’ve been

given. Let’s take this equation for a spin – oh, hold up, what are we plugging into it?

MoE1 = 3.5%, because this is the margin of error for the 2005 study

MoE2 = 2.0%, because this is the margin of error for the 2015 study

$$\text{MoE of difference} = \pm 1.5 \times \frac{3.5 + 2.0}{2} = \pm 4.125\%$$

Now find the confidence range/interval of this difference. If the range is fully positive,

or fully negative, then we’re pretty confident there’s a real difference between the two

statistics. If the range includes zero, we can’t be sure there’s a difference at all –

there might not even be one - and so we wouldn’t be able to confirm that support for

people with tattoos is going up in this case.

3.5 ± 4.125

–0.625% to 7.625%

Look at that range. One end of it is a negative number and on the other end is a

positive number! That means it includes 0 – it includes the possibility that there might

be no difference between the surveys.


All up, this range means that the second statistic might actually have been up to

7.625% higher than the first, but it also might have been up to 0.625% lower, so we

cannot support the claim that support for people with tattoos has risen over ten years!
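
If you want to check this kind of working on a computer, the steps above can be sketched in Python. This is only a minimal sketch of the rule of thumb from this guide, not a formal statistical test, and the function names (`moe_of_difference`, `difference_interval`, `verdict`) are made up for illustration.

```python
def moe_of_difference(moe1, moe2):
    """Rule of thumb: 1.5 times the average of the two margins of error."""
    return 1.5 * (moe1 + moe2) / 2


def difference_interval(stat1, moe1, stat2, moe2):
    """Return (low, high) for the difference stat2 - stat1, in percentage points."""
    diff = stat2 - stat1
    moe = moe_of_difference(moe1, moe2)
    return diff - moe, diff + moe


def verdict(low, high):
    """Say what the interval lets us claim about the difference."""
    if low > 0:
        return "entirely above 0: the second statistic is likely higher"
    if high < 0:
        return "entirely below 0: the second statistic is likely lower"
    return "crosses 0: cannot claim a difference"


# Tattoo example: 45% (MoE 3.5%) in 2005, then 48.5% (MoE 2.0%) in 2015
low, high = difference_interval(45.0, 3.5, 48.5, 2.0)
print(f"{low:.3f}% to {high:.3f}%")
print(verdict(low, high))
```

Swap in the two statistics and margins of error from any article to see which of the three cases its claim falls into.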

You might be asking by now, What’s the point of learning all this? or how am I gonna

use this in my exam? or why am I still reading this walkthrough guide?

When an article compares two statistics, with a claim like ‘silver cars are less likely to

get in a serious crash than white cars’, we’d calculate the margin of error of their difference.

Then we’d use this as evidence to support or not support the claim made!

These two statistics need to be CLOSE – like we just had 45% and 48.5%. If there’s

a huge gap (say, 20% and 70%), there’s obviously a difference there that doesn’t

need a bunch of calculations to back it up.

Always keep the purpose of these things in mind – that’s the reason you’ll be using

all this comparing statistics stuff – you’re looking at whether the people in the articles

compared their statistics correctly! Do your numbers match up with their numbers?

Can any of their claims be justified?

Quick Questions:

How do you find a MoE for the difference between two dependent proportions?

How do you find a MoE for the difference between two independent proportions?

A street survey found that 41% of the 500 people surveyed have both a body

piercing and a tattoo. Find the margin of error and the confidence interval for

this poll.

Out of the 59% who do not have both a body piercing and a tattoo, 45% say

it makes a person less attractive. What is the margin of error and confidence

interval of this 45% statistic?

Is it likely that the majority of people without tattoos and piercings feel a tattoo

makes a person less attractive?


KEY TERMS

Variable:

A feature that’s able to change.

Explanatory Variable:

The thing that’s controlled and changed. It’s usually the cause or the explanation

of the other variable.

Response/outcome Variable:

The focus. We measure how much it changes when the explanatory variable changes.

Confounding variables:

Outside factors that affect the study results and aren’t controlled.

Evaluate:

This means you find whether surveys are true, fair, unbiased and well-represented,

or not.

Causal claim:

Saying that the explanatory variable directly causes the response variable to change.

Sampling:

When we take a small group from the population and use it to represent the entire

population.

Representative:

A sample is representative if it has the same kind of mix of people as the population,

so it can be used to make claims about the population.

Sample frame:

A list of all the members of the population, used to choose a sample.

Bias:

When a sample tends to get a particular kind of answer that is different from the

truth, because we picked a bad sample or asked bad questions.

Sampling bias:

When a sample overrepresents or underrepresents certain groups in the population,

so the sample is not representative.

Population parameter:

The real statistics of the real population. This is what we want to have a guess at

using our sample.


Sampling Errors:

These happen because data is collected from a sample rather than the whole

population – and one sample will never perfectly reflect the population.

Non-sampling errors:

These happen if a sample has bias or doesn’t accurately represent the population.

Dependent Probabilities:

When we compare two groups that are connected under one question.

Independent Probabilities:

When you’ve two different questions and want to compare their stats. They’re not

going to affect each other if they’re two separate surveys or questions – which is

what makes them independent.

Majority:

Anything over half.

Confidence Interval:

The sample’s statistic plus or minus the margin of error. This gives a range that we’re

95% confident the population statistic will be in, because if we took many more samples

and built a range from each one the same way, about 95% of those ranges would contain it.

Margin of Error:

A distance from the sample statistic that we’re 95% sure the population

statistic falls within.

Causality:

When one variable causes another. As one changes, it makes the other change.

Correlation:

When two variables change together. As one variable changes, the other one tends to

change too, but this doesn’t mean one makes the other change.

Sampling Methods

Random sampling:

Where each bit of the population is numbered off and has an equal chance of

being selected.

Systematic sampling:

Where the parts of a population are ordered, and then every nth one is picked from

a random starting point.


Stratified sampling:

When the researcher tries to make the sample look as much like the population as

possible by grouping it using a different variable and then sampling proportionately

within each group.

Cluster sampling:

When you grab whole groups and use that as your sample.

Quota sampling:

Where the researcher has to have a certain number of people from minority groups in

their sample so they’re not underrepresented or ignored.

This is where people are accosted by TV crews and such to get their opinion.

Self-selected sample:

Where people choose to be part of the sample. Think web research, text-votes,

phone polls, or voluntary surveys.
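
To see how the random methods above differ in practice, here is a rough Python sketch. Everything in it is invented for illustration: the population of 100 students, the year-level groups, the classes of 20 and the function names are all assumptions, and it only covers simple random, systematic, stratified and cluster sampling (quota and self-selected samples aren’t random, so they’re left out).

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has an equal chance of being picked."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick every `step`-th member, starting from a random position."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(population, group_of, fraction, rng):
    """Sample the same fraction from each group, so the mix matches the population."""
    strata = {}
    for member in population:
        strata.setdefault(group_of(member), []).append(member)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, k, rng):
    """Pick whole groups at random and keep everyone in them."""
    chosen = rng.sample(clusters, k)
    return [member for cluster in chosen for member in cluster]


rng = random.Random(2018)  # fixed seed so the example is repeatable

# Hypothetical population: 100 students, 60 in Year 12 and 40 in Year 13
students = [(i, "Y12" if i < 60 else "Y13") for i in range(100)]

srs = simple_random_sample(students, 10, rng)
sys_sample = systematic_sample(students, 10, rng)
strat = stratified_sample(students, lambda s: s[1], 0.1, rng)
classes = [students[i:i + 20] for i in range(0, 100, 20)]  # five "classes" of 20
clus = cluster_sample(classes, 2, rng)

print(len(srs), len(sys_sample), len(strat), len(clus))
```

Notice that only the stratified sample is guaranteed to keep the 60/40 split between year levels; the others can drift away from it by chance.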

Equations

The Rule of Thumb for a Margin of Error:

Margin of Error = ±1 / √n
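
As a quick check of the rule of thumb above, here is a short Python sketch showing how the margin of error shrinks as the sample size n grows. The function names and the example poll (40% of 400 people) are made up for illustration, and the statistic is written as a proportion (0.40) rather than a percentage.

```python
import math


def rule_of_thumb_moe(n):
    """Approximate 95% margin of error for a sample of n people, as a proportion."""
    return 1 / math.sqrt(n)


def confidence_interval(stat, n):
    """Sample statistic (as a proportion) plus or minus the rule-of-thumb MoE."""
    moe = rule_of_thumb_moe(n)
    return stat - moe, stat + moe


# Bigger samples give smaller margins of error
for n in (100, 400, 1000, 2500):
    print(f"n = {n:>4}: MoE is about ±{rule_of_thumb_moe(n) * 100:.1f}%")

# Hypothetical poll: 40% of 400 people said yes
low, high = confidence_interval(0.40, 400)
print(f"{low:.0%} to {high:.0%}")
```

Because of the square root, you need four times as many people to halve the margin of error.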

Dependent Probabilities:

Independent Probabilities:

MoE of difference = ±1.5 × (MoE1 + MoE2) / 2


studytime.co.nz

© Inspiration Education Limited 2018. All rights reserved
