Original Title: shifts


Curriculum Levels 5 to 7

(First posted June 2008, updated 4 Feb 2011)

in collaboration with Nicholas Horton, Maxine Pfannkuch & Matt Regan

Looking at the world using data

VARIATION

- REAL variation, in populations (reflecting the differences between individuals in the population): unseen
- INDUCED variation, in sample data (by sampling variation): seen

© Department of Statistics, The University of Auckland, Feb 2011. p 1 of 7 Improvement suggestions to c.wild@auckland.ac.nz

What I see may not quite be the way it really is

Sampling variation in DATA distorts our view of what is happening in populations

At worst we get:

- SMALL samples: BIG distortions
- BIG samples: SMALL distortions

Bigger sample size: more information about what is happening back in the population

Description:

[Figure: boxplots of A-values and B-values]

- Distribution of A-values shifted up scale from that of B-values
- A-values bigger on average than B-values

Assumed student development at this point:

- Can describe what they see in the observed data
- Aware of the effects of sampling variation in visual displays
  - Sampling variation alone can produce shifts
  - These shifts are small in very large samples
  - They can be misleadingly large in small samples

Can I claim there is a similar pattern back in the populations?

Will I claim A-values are also bigger on average back in the populations?

I will if the shifts are bigger than those produced by sampling variation

Otherwise I will not. I cannot tell whether A-values are bigger than

B-values back in the populations. It may even be the other way around.


Observed data: seven pairs of boxplots for groups A and B [plots not reproduced]

Back in the populations: do B values tend to be bigger than A values? My call is ...

- Pairs 1 and 2: "B is bigger" (all sample sizes)
- Pair 3: claim "B is bigger" if both sample sizes > 20
- Pairs 4 and 5: what's my call here?
- Pair 6: call "Cannot tell" unless both samples are huge
- Pair 7: "Cannot tell" (all sample sizes)

Larger random samples have more information about the populations they came from. Thus, with larger random samples, we can make the "B is bigger" call from smaller shifts.

But how do we decide?

- depends on educational level of students
- see next page ...

Warning to teachers: avoid doing this with sample sizes smaller than about 20 in each group. Small samples quite often give rise to unstable and often very strange boxplots.

To echo the previous diagram, we get very large distortions: see the plots for samples of size 10 on page 6.

How to make the call by Curriculum Level

At all levels:

[Figure: boxplots A and B with no overlap of the boxes]

If there is no overlap of the boxes, make the call immediately: "B tends to be bigger than A back in the populations"

Curriculum Level 5: the 3/4-1/2 rule

[Figure: boxplots A and B]

If the median for one of the samples lies outside the box for the other sample (e.g. "more than half of the B group are above three quarters of the A group"), make the claim "B tends to be bigger than A back in the populations"

[Restrict to sample sizes of between 20 and 40 in each group]
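The 3/4-1/2 rule can be sketched in a few lines of Python. This is only an illustration, not part of the original guidelines: the function name `level5_call` is hypothetical, and the quartile convention (`method="inclusive"` in the standard library) is an assumption, since different boxplot software places the box edges slightly differently.

```python
from statistics import median, quantiles


def level5_call(a, b):
    """Apply the 3/4-1/2 rule to two samples given as lists of numbers."""
    # Box edges (lower and upper quartiles) for each sample.
    q1_a, _, q3_a = quantiles(a, n=4, method="inclusive")
    q1_b, _, q3_b = quantiles(b, n=4, method="inclusive")
    med_a, med_b = median(a), median(b)

    # B's median above A's box, or A's median below B's box.
    if med_b > q3_a or med_a < q1_b:
        return "B tends to be bigger than A"
    # The mirror-image case.
    if med_a > q3_b or med_b < q1_a:
        return "A tends to be bigger than B"
    return "cannot tell"


# Made-up data, 30 values per group (within the 20 to 40 range above).
group_a = list(range(1, 31))
group_b = list(range(21, 51))
print(level5_call(group_a, group_b))
```

Students would make this call by eye from the plot; the code only makes the comparison explicit.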

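Curriculum Level 6: distance between medians as proportion of overall visible spread

[Figure: boxplots A and B with the distance between medians marked]

Make the claim "B tends to be bigger than A back in the populations" if the distance between medians is greater than about ...

[Could also use 1/10 of overall visible spread for sample sizes of around 1000]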
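As a rough sketch of the quantity being eyeballed at this level, the function below computes the distance between medians as a fraction of the overall visible spread. Two assumptions are made that the text above does not spell out: "overall visible spread" is taken to run from the lowest box edge to the highest box edge across both plots, and quartiles use the standard library's `inclusive` method. The name `median_shift_ratio` is hypothetical, and the fraction to compare against is left to the caller.

```python
from statistics import median, quantiles


def median_shift_ratio(a, b):
    """Distance between medians divided by the overall visible spread."""
    q1_a, _, q3_a = quantiles(a, n=4, method="inclusive")
    q1_b, _, q3_b = quantiles(b, n=4, method="inclusive")
    # Assumed definition: lowest box edge to highest box edge.
    overall_visible_spread = max(q3_a, q3_b) - min(q1_a, q1_b)
    if overall_visible_spread == 0:
        raise ValueError("both boxes have zero width")
    return abs(median(b) - median(a)) / overall_visible_spread


# Made-up data: compare the ratio with whichever fraction applies.
group_a = list(range(1, 31))
group_b = list(range(21, 51))
ratio = median_shift_ratio(group_a, group_b)
print(round(ratio, 2), ratio > 1 / 10)
```

In class this stays a rough eyeball judgement; the ratio is shown only to make the idea concrete.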

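Curriculum Level 7: based on informal confidence intervals for the population median

For each sample, draw a horizontal line (interval) from

$\text{Med} - 1.5 \times \dfrac{\text{IQR}}{\sqrt{n}}$ to $\text{Med} + 1.5 \times \dfrac{\text{IQR}}{\sqrt{n}}$

where IQR = interquartile range = width of box, and n = sample size.

[Figure: boxplots A and B with the intervals drawn]

Make the claim "B tends to be bigger than A back in the populations" if these horizontal lines (intervals) do not overlap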
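The interval and the non-overlap check can be written out as a short Python sketch. The function names are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption; other conventions shift the box edges slightly.

```python
from math import sqrt
from statistics import median, quantiles


def informal_interval(sample):
    """Return (Med - 1.5*IQR/sqrt(n), Med + 1.5*IQR/sqrt(n))."""
    q1, _, q3 = quantiles(sample, n=4, method="inclusive")
    half_width = 1.5 * (q3 - q1) / sqrt(len(sample))
    med = median(sample)
    return (med - half_width, med + half_width)


def level7_call(a, b):
    """Make the call only when the two intervals do not overlap."""
    lo_a, hi_a = informal_interval(a)
    lo_b, hi_b = informal_interval(b)
    if lo_b > hi_a:
        return "B tends to be bigger than A"
    if lo_a > hi_b:
        return "A tends to be bigger than B"
    return "cannot tell"


# Made-up data for illustration.
group_a = list(range(1, 31))
group_b = list(range(21, 51))
print(informal_interval(group_a))
print(level7_call(group_a, group_b))
```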


Some notes about the guidelines

At all levels:

Emphasize the visual, keep the eyes constantly on the plots

What we are doing here is just one small step in interpreting a comparison

- It is definitely not what the statistics module is all about

While our depictions are in terms of 2 groups, do not hesitate to use more groups

- The stories uncovered in data by comparing several groups are often much more interesting

Curriculum Level 5: the 3/4-1/2 rule

The intuitive idea here is "the majority of the B group is bigger than the great whack of the A group"

Operate as "the visual shift is big enough to make the call if the median for one of the samples lies outside the box for the other sample", regardless of whether this happens on the lower or upper side of the graphs.

Technical aside: sampling variation alone does not often produce shifts large enough to trigger this rule

- about 15 times in 100 for samples of size 20 in each group, 7 times in 100 for samples of 30, 3 times in 100 for samples of 40, and 1 time in 2,500 for samples of size 100.
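The kind of simulation behind this technical aside can be sketched as follows. It is not the authors' simulation: the population here is a standard normal distribution (an assumption), the quartile convention is the standard library's `inclusive` method (also an assumption), and the function names are hypothetical. Both samples are drawn from the same population, so every triggering of the rule is a false call, and the rates it prints need not match the figures quoted above.

```python
import random
from statistics import median, quantiles


def rule_triggers(a, b):
    """True if either median lies outside the other sample's box."""
    q1_a, _, q3_a = quantiles(a, n=4, method="inclusive")
    q1_b, _, q3_b = quantiles(b, n=4, method="inclusive")
    med_a, med_b = median(a), median(b)
    return med_b > q3_a or med_b < q1_a or med_a > q3_b or med_a < q1_b


def false_call_rate(n, reps=2000, seed=1):
    """Share of same-population sample pairs that trigger the rule."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        hits += rule_triggers(a, b)
    return hits / reps


for n in (20, 30, 40):
    print(n, false_call_rate(n))
```

The rate should fall as the sample size grows, which is the pattern the aside describes.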

Curriculum Level 6: distance between medians as proportion of overall visible spread

Students should only be making rough eye-ball judgements

You are getting the students accustomed to using an idea, not the precise implementation of an algorithm

- Do not make this hinge on accuracy of application of the 1/3 and 1/5 rules

Whether the distance is bigger than 1/3 or 1/5 will often be obvious

- Otherwise they should do a freehand subdivision of a line into thirds or fifths and then decide

Technical aside: sampling variation alone seldom produces shifts large enough to trigger these rules

(about 8 times out of 100 for both rules at the listed sample sizes )

Curriculum Level 7: based on informal confidence intervals for the population median

They cover the true population median for approximately 9 out of 10 samples taken (show with simulations)

- So appeal to "the population median for A is probably in here somewhere", and similarly for B

- This leads naturally to the "B bigger than A" claim when the intervals do not overlap

* Technical aside 1: sampling variation hardly ever causes shifts big enough to make us

mistakenly claim that B is bigger than A or vice versa using this method

(only about once per 40 pairs of samples)

* Technical aside 2: when the intervals do not overlap, a confidence interval for the difference in population medians ranges from the smaller distance between the intervals to the larger

[Figure: boxplots A and B with their intervals and the two distances marked]

- smaller dist. = lower confidence limit for difference in population medians
- larger dist. = upper confidence limit for difference in population medians
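The Level 7 notes can be explored with a small simulation. The sketch below rests on assumptions of its own: a standard normal population (true median 0), samples of size 30, the standard library's `inclusive` quartile method, and hypothetical function names. It estimates how often the informal interval covers the true median, how often two samples from the same population give non-overlapping intervals, and computes the two distances from technical aside 2.

```python
import random
from math import sqrt
from statistics import median, quantiles


def informal_interval(sample):
    """Return (Med - 1.5*IQR/sqrt(n), Med + 1.5*IQR/sqrt(n))."""
    q1, _, q3 = quantiles(sample, n=4, method="inclusive")
    half_width = 1.5 * (q3 - q1) / sqrt(len(sample))
    med = median(sample)
    return (med - half_width, med + half_width)


def difference_limits(a, b):
    """Smaller and larger distance between two non-overlapping intervals
    (B's interval assumed to lie entirely above A's)."""
    lo_a, hi_a = informal_interval(a)
    lo_b, hi_b = informal_interval(b)
    return (lo_b - hi_a, hi_b - lo_a)


def simulate(n=30, reps=2000, seed=1):
    """Return (coverage rate, false-call rate) for one population."""
    rng = random.Random(seed)
    covered = 0
    false_calls = 0
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        lo_a, hi_a = informal_interval(a)
        lo_b, hi_b = informal_interval(b)
        covered += lo_a <= 0 <= hi_a                # true median is 0
        false_calls += lo_b > hi_a or lo_a > hi_b   # no overlap
    return covered / reps, false_calls / reps


print(simulate())
print(difference_limits(list(range(1, 31)), list(range(21, 51))))
```

The estimated rates depend on the assumed population and sample size, so they are only indicative.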


Examples of shifts caused purely by sampling variation

The population being sampled is the 12-year-olds in the NZ CensusAtSchool database; the measure used is height.

This single population is being sampled independently over and over again, so any shifts seen are due solely to the sampling.

[Figure: two panels, each showing the population distribution and repeated samples from it: "Samples of size 10" and "Samples of size 30"; horizontal axis: height, 120 to 180]

Samples of size 10 are shown to demonstrate why we should not be working in this way with such small samples.

[Figure: two panels, each showing the population distribution and repeated samples from it: "Samples of size 100" and "Samples of size 500"; horizontal axis: height, 120 to 180]
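A simulation in the same spirit can be sketched in Python. The CensusAtSchool data are not reproduced here, so the sketch substitutes a synthetic population: heights drawn from a normal distribution with mean 150 and standard deviation 8. Those values, and the function name `typical_shift`, are assumptions made purely for illustration. Pairs of samples are drawn from this one population, and the typical gap between their medians is reported for each sample size.

```python
import random
from statistics import median


def typical_shift(n, reps=300, seed=1):
    """Median absolute gap between the medians of two samples of size n
    drawn from the same (synthetic) population."""
    rng = random.Random(seed)
    shifts = []
    for _ in range(reps):
        a = [rng.gauss(150, 8) for _ in range(n)]  # assumed population
        b = [rng.gauss(150, 8) for _ in range(n)]
        shifts.append(abs(median(a) - median(b)))
    return median(shifts)


for n in (10, 30, 100, 500):
    print(n, round(typical_shift(n), 2))
```

Because both samples come from one population, every gap printed is caused purely by sampling variation, and the gaps shrink as the sample size grows.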
