Вы находитесь на странице: 1из 435

Commission on Higher Education

in collaboration with the Philippine Normal University

in collaboration with the Philippine Normal University TEACHING GUIDE FOR SENIOR HIGH SCHOOL Statistics and

TEACHING GUIDE FOR SENIOR HIGH SCHOOL

Statistics and Probability

CORE SUBJECT

This Teaching Guide was collaboratively developed and reviewed by educators from public and private schools, colleges, and universities. We encourage teachers and other education stakeholders to email their feedback, comments, and recommendations to the Commission on Higher Education, K to 12 Transition Program Management Unit - Senior High School Support Team at k12@ched.gov.ph. We value your feedback and recommendations.

Published by the Commission on Higher Education, 2016 Chairperson: Patricia B. Licuanan, Ph.D.

Commission on Higher Education K to 12 Transition Program Management Unit Office Address: 4th Floor, Commission on Higher Education, C.P. Garcia Ave., Diliman, Quezon City Telefax: (02) 441-1143 / E-mail Address: k12@ched.gov.ph

Telefax: (02) 441-1143 / E-mail Address: k12@ched.gov.ph DEVELOPMENT TEAM Team Leader: Jose Ramon G. Albert, Ph.D.
Telefax: (02) 441-1143 / E-mail Address: k12@ched.gov.ph DEVELOPMENT TEAM Team Leader: Jose Ramon G. Albert, Ph.D.

DEVELOPMENT TEAM

Team Leader: Jose Ramon G. Albert, Ph.D.

Writers:Zita VJ Albacea, Ph.D., Mark John V. Ayaay

Isidoro P. David, Ph.D., Imelda E. de Mesa

Technical Editors: Nancy A. Tandang, Ph.D., Roselle V. Collado

Copy Reader: Rea Uy-Epistola

Illustrator: Michael Rey O. Santos

Cover Artists: Paolo Kurtis N. Tan, Renan U. Ortiz

CONSULTANTS

THIS PROJECT WAS DEVELOPED WITH THE PHILIPPINE NORMAL UNIVERSITY.

University President: Ester B. Ogena, Ph.D. VP for Academics: Ma. Antoinette C. Montealegre, Ph.D. VP for University Relations & Advancement: Rosemarievic V. Diaz, Ph.D.

Ma. Cynthia Rose B. Bautista, Ph.D., CHEDBienvenido F. Nebres, S.J., Ph.D., Ateneo de Manila University Carmela C. Oracion, Ph.D., Ateneo de Manila University Minella C. Alarcon, Ph.D., CHEDGareth Price, Sheffield Hallam University Stuart Bevins, Ph.D., Sheffield Hallam University

SENIOR HIGH SCHOOL SUPPORT TEAM

CHED K TO 12 TRANSITION PROGRAM MANAGEMENT UNIT

Program Director: Karol Mark R. Yee

Lead for Senior High School Support: Gerson M. Abesamis

Lead for Policy Advocacy and Communications: Averill M. Pizarro

Course Development Officers: John Carlo P. Fernando, Danie Son D. Gonzalvo

Teacher Training Officers: Ma. Theresa C. Carlos, Mylene E. Dones

Monitoring and Evaluation Officer: Robert Adrian N. Daulat

Administrative Officers: Ma. Leana Paula B. Bato, Kevin Ross D. Nera, Allison A. Danao, Ayhen Loisse B. Dalena

Kevin Ross D. Nera, Allison A. Danao, Ayhen Loisse B. Dalena This Teaching Guide by the

This Teaching Guide by the Commission on Higher Education is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. This means you are free to:

Share — copy and redistribute the material in any medium or format

Adapt — remix, transform, and build upon the material.

The licensor, CHED, cannot revoke these freedoms as long as you follow the license terms. However, under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

NonCommercial — You may not use the material for commercial purposes.

ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

Printed in the Philippines by EC-TEC Commercial, No. 32 St. Louis Compound 7, Baesa, Quezon City, ectec_com@yahoo.com

Preface

Prior to the implementation of K-12, Statistics was taught in public high schools in the Philippines typically in the last quarter of third year. In private schools, Statistics was taught as either an elective, or a required but separate subject outside of regular Math classes. In college, Statistics was taught practically to everyone either as a three unit or six unit course. All college students had to take at least three to six units of a Math course, and would typically “endure” a Statistics course to graduate. Teachers who taught these Statistics classes, whether in high school or in college, would typically be Math teachers, who may not necessarily have had formal training in Statistics. They were selected out of the understanding (or misunderstanding) that Statistics is Math. Statistics does depend on and uses a lot of Math, but so do many disciplines, e.g. engineering, physics, accounting, chemistry, computer science. But Statistics is not Math, not even a branch of Math. Hardly would one think that accounting is a branch of mathematics simply because it does a lot of calculations. An accountant would also not describe himself as a mathematician.

Math largely involves a deterministic way of thinking and the way Math is taught in schools leads learners into a deterministic way of examining the world around them. Statistics, on the other hand, is by and large dealing with uncertainty. Statistics uses inductive thinking (from specifics to generalities), while Math uses deduction (from the general to the specific).

“Statistics has its own tools and ways of thinking, and statisticians are quite insistent that those of us who teach mathematics realize that statistics is not mathematics, nor is it even a branch of mathematics. In fact, statistics is a separate discipline with its own unique ways of thinking and its own tools for approaching problems.” - J. Michael Shaughnessy, “Research on Students’ Understanding of Some Big Concepts in Statistics” (2006)

Statistics deals with data; its importance has been recognized by governments, by the private sector, and across disciplines because of the need for evidence-based decision making. It has become even more important in the past few years, now that more and more data is being collected, stored, analyzed and re-analyzed. From the time when humanity first walked the face of the earth until 2003, we created as much as 5 exabytes of data (1 exabyte being a billion “gigabytes”). Information communications technology (ICT) tools have provided us the means to transmit and exchange data much faster, whether these data are in the form of sound, text, visual images, signals or any other form or any combination of those forms using desktops, laptops, tablets, mobile phones, and other gadgets with the use of the internet, social media (facebook, twitter). With the data deluge arising from using ICT tools, as of 2012, as much as 5 exabytes were being created every two days (the amount of data created from the beginning of history up to 2003); a year later, this same amount of data was now being created every ten minutes.

In order to make sense of data, which is typically having variation and uncertainty, we need the Science of Statistics, to enable us to summarize data for describing or explaining phenomenon; or to make predictions (assuming trends in the data continue). Statistics is the science that studies data, and what we can do with data. Teachers of Statistics and Probability can easily spend much time on the formal methods and computations, losing sight of the real applications, and taking the excitement out of things. The eminent statistician Bradley Efron mentioned how diverse statistical applications are:

“During the 20th Century statistical thinking and methodology has become the scientific framework for literally dozens of fields including education, agriculture, economics, biology, and medicine, and with increasing influence recently on the hard sciences such as astronomy, geology, and physics. In other words, we have grown from a small obscure field into a big obscure field.”

In consequence, the work of a statistician has become even fashionable. Google’s chief economist Hal Varian wrote in 2009 that “the sexy job in the next ten years will be statisticians.” He went on and mentioned that “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. “

This teaching guide, prepared by a team of professional statisticians and educators, aims to assist Senior High School teachers of the Grade 11 second semester course in Statistics and Probability so that they can help Senior High School students discover the fun in describing data, and in exploring the stories behind the data. The K-12 curriculum provides for concepts in Statistics and Probability to be taught from Grade 1 up to Grade 8, and in Grade 10, but the depth at which learners absorb these concepts may need reinforcement. Thus, the first chapter of this guide discusses basic tools (such as summary measures and graphs) for describing data. While Probability may have been discussed prior to Grade 11, it is also discussed in Chapter 2, as a prelude to defining Random Variables and their Distributions. The next chapter discusses Sampling and Sampling Distributions, which bridges Descriptive Statistics and Inferential Statistics. The latter is started in Chapter 4, in Estimation, and further discussed in Chapter 5 (which deals with Tests of Hypothesis). The final chapter discusses Regression and Correlation.

Although Statistics and Probability may be tangential to the primary training of many if not all Senior High School teachers of Statistics and Probability, it will be of benefit for them to see why this course is important to teach. After all, if the teachers themselves do not find meaning in the course, neither will the students. Work developing this set of teaching materials has been supported by the Commission on Higher Education under a Materials Development Sub-project of the K-12 Transition Project. These materials will also be shared with Department of Education.

Writers of this teaching guide recognize that few Senior High School teachers would have formal training or applied experience with statistical concepts. Thus, the guide gives concrete suggestions on classroom activities that can illustrate the wide range of processes behind data collection and data analysis.

It would be ideal to use technology (i.e. computers) as a means to help teachers and students with computations; hence, the guide also provides suggestions in case the class may have access to a computer room (particularly the use of spreadsheet applications like Microsoft Excel). It would be unproductive for teachers and students to spend too much time working on formulas, and checking computation errors at the expense of gaining knowledge and insights about the concepts behind the formulas.

The guide gives a mixture of lectures and activities, (the latter include actual collection and analysis of data). It tries to follow suggestions of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Project of the American Statistical Association to go beyond lecture methods, and instead exercise conceptual learning, use active learning strategies and focus on real data. The guide suggests what material is optional as there is really a lot of material that could be taught, but too little time. Teachers will have to find a way of recognizing that diverse needs of students with variable abilities and interests.

This teaching guide for Statistics and Probability, to be made available both digitally and in print to senior high school teachers, shall provide Senior High School teachers of Statistics and Probability with much-needed support as the country’s basic education system transitions into the K-12 curriculum. It is earnestly hoped that Senior High School teachers of Grade 11 Statistics and Probability can direct students into examining the context of data, identifying the consequences and implications of stories behind Statistics and Probability, thus becoming critical consumers of information. It is further hoped that the competencies gained by students in this course will help them become more statistical literate, and more prepared for whatever employment choices (and higher education specializations) given that employers are recognizing the importance of having their employee know skills on data management and analysis in this very data-centric world.

Core Subject Description: At the end of the course, the students must know how to find the mean and variance of a random variable, to apply sampling techniques and distributions, to estimate population mean and proportion, to perform hypothesis testing on population mean and proportion, and to perform correlation and regression

Page 1 of 7

No. of Hours/Semester: 80 hours/semester

Prerequisite (if needed):

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

Grade: 11/12 Core Subject Title: Statistics and Probability

analyses on real-life problems.

CODE

 

M11/12SP-IIIa-1

M11/12SP-IIIa-2

 

M11/12SP-IIIa-3

M11/12SP-IIIa-4

M11/12SP-IIIa-5

M11/12SP-IIIa-6

 

M11/12SP-IIIb-1

M11/12SP-IIIb-2

M11/12SP-IIIb-3

M11/12SP-IIIb-4

 

M11/12SP-IIIc-1

M11/12SP-IIIc-2

LEARNING COMPETENCIES

The learner …

illustrates a random variable (discrete and continuous).

between a discrete and a continuous

distinguishes

variable.

random

finds the possible values of a random variable.

properties. for a discrete

and its distribution

a probability

illustrates variable

random

of a discrete

mass function histogram.

and its corresponding

the probability

variable

constructs

random

probabilities corresponding to a given

computes variable.

random

illustrates the mean and variance of a discrete random

variable.

the mean and the variance of a discrete

calculates variable.

random

the mean and the variance of a discrete

interprets variable.

random

involving mean and variance of

solves problems

distributions.

probability

The learner …

illustrates a normal random variable and its

characteristics. constructs a normal curve.

1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.

12.

PERFORMANCE

STANDARDS The learner is able to apply an appropriate random variable for a given real-life problem (such as in decision making and games of chance).

 

The learner is able to accurately formulate and solve real-life problems in different disciplines

CONTENT STANDARDS

The learner demonstrates understanding of key concepts of random variables and probability

distributions.

 

The learner demonstrates understanding of key concepts of normal probability distribution.

CONTENT

Random Variables

and Probability

Distributions

Normal

Distribution

Page 2 of 7

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT PERFORMANCE

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

CODE

M11/12SP-IIIc-3

M11/12SP-IIIc-4

M11/12SP-IIIc-d-

1

 

M11/12SP-IIId-2

M11/12SP-IIId-3

M11/12SP-IIId-4

M11/12SP-IIId-5

 

M11/12SP-IIIe-1

 

M11/12SP-IIIe-2

M11/12SP-III-3

M11SP-IIIe-f-1

 

M11/12SP-IIIf-2

M11/12SP-IIIf-3

LEARNING COMPETENCIES

normal values.

normal curve

under the standard

to different

identifies regions

13. corresponding

variable to a standard

vice versa.

and random

a normal

converts variable

normal

table. and percentiles using the

probabilities

computes normal

standard

 

The learner …

illustrates random sampling.

distinguishes between parameter and statistic.

identifies sampling distributions of statistics (sample mean).

and variance of the sampling distribution

mean mean.

sample

the the

finds

of

defines the sampling distribution of the sample mean for

normal population when the variance is:

known

(a)

unknown

(b)

illustrates the Central Limit Theorem.

Theorem. of the sample mean

Limit distribution

sampling

defines the the Central

using

solves problems involving sampling distributions of the

sample mean.

The learner …

illustrates point and interval estimations.

distinguishes between point and interval estimation.

14.

15.

 

1.

2.

3.

4.

5.

6.

7.

8.

1.

2.

 

STANDARDS involving normal distribution.

The learner is able to apply suitable sampling and sampling distributions of the sample mean to solve real-life problems in different disciplines.

 

The learner is able to estimate the population mean and population proportion to make sound

CONTENT STANDARDS

 

The learner demonstrates understanding of key concepts of sampling and sampling distributions of the sample mean.

 

The learner demonstrates understanding of key concepts of estimation of population mean and

CONTENT

 

Sampling and

Sampling

Distributions

Estimation of

Parameters

M11/12SP-IIIg-4

M11/12SP-IIIg-2

M11/12SP-IIIh-2

M11/12SP-IIIg-5

M11/12SP-IIIg-3

M11/12SP-IIIh-3

M11/12SP-IIIg-1

M11/12SP-IIIh-1

M11/12SP-IIIf-4

M11/12SP-IIIf-5

M11/12SP-IIIi-4

M11/12SP-IIIi-2

M11/12SP-IIIi-5

M11/12SP-IIIi-3

M11/12SP-IIIi-1

Page 3 of 7

CODE

mean. identifies the appropriate form of the confidence interval estimator for the population mean when: (a) the population variance is known, (b) the population variance is unknown, and (c) the Central Limit Theorem

under the t-distribution corresponding

computes for the confidence interval estimate based on

identifies point estimator for the population proportion.

the population mean based on

identifies the appropriate form of the confidence interval estimator for the population proportion based

the confidence interval estimate of the

identifies point estimator for the population mean.

computes for the point estimate of the population proportion.

computes for the point estimate of the population

interval

interval

the appropriate form of the estimator for the population mean.

LEARNING COMPETENCIES

proportion.

identifies percentiles using the t-table.

confidence

confidence

mean.

about estimate.

on the Central Limit Theorem.

population

population

is to be used. illustrates the t-distribution.

constructs a t-distribution.

of the involving

of the involving

proportion.

interval

t-values.

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT PERFORMANCE

conclusion

different regions

solves problems

solves problems

confidence

computes for

population

estimation

estimation

identifies

draws

its

to

17.

16.

13.

10.

19.

18.

14.

11.

15.

12.

11.

4.

9.

3.

5.

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

STANDARDS inferences in real-life problems in different disciplines.

CONTENT STANDARDS

population proportion.

CONTENT

Page 4 of 7

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT PERFORMANCE

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

CODE

M11/12SP-IIIi-6

M11/12SP-IIIj-1

M11/12SP-IIIj-2

M11/12SP-IIIj-3

M11/12SP-IIIj-4

 

M11/12SP-IVa-1

 

M11/12SP-IVa-2

M11/12SP-IVa-3

M11/12SP-IVb-1

 

M11/12SP-IVb-2

 

LEARNING COMPETENCIES

estimate proportion

population

interval

about the

its confidence

conclusion

draws on

based

identifies the length of a confidence interval.

computes for the length of the confidence interval.

an appropriate sample size using the

for interval.

of the

computes

length

solves problems involving sample size determination.

The learner …

illustrates:

null hypothesis alternative hypothesis level of significance rejection region; and types of errors in hypothesis testing.

(b)

(d)

(a)

(e)

(c)

calculates the probabilities of committing a Type I and

Type II error.

identifies the parameter to be tested given a real-life

problem.

and alternative

mean.

a population null

on appropriate

formulates the

hypotheses

identifies the appropriate form of the test-statistic

when:

the population variance is assumed to be known the population variance is assumed to be unknown;

(b)

(a)

and

the Central Limit Theorem is to be used.

(c)

20.

21.

22.

23.

24.

1.

2.

3.

4.

5.

STANDARDS

 

The learner is able to perform appropriate tests of hypotheses involving the population mean and population proportion to make inferences in real-life problems in different disciplines.

 

CONTENT STANDARDS

 

The learner demonstrates understanding of key concepts of tests of hypotheses on the population mean and population proportion.

 

CONTENT

 

Tests of

Hypothesis

Page 5 of 7

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT PERFORMANCE

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

CODE

 

M11/12SP-IVc-1

 

M11/12SP-IVd-1

M11/12SP-IVd-2

M11/12SP-IVe-1

M11/12SP-IVe-2

M11/12SP-IVe-3

M11/12SP-IVe-4

M11/12SP-IVf-1

M11/12SP-IVf-2

M11/12SP-IVf-g-

1

LEARNING COMPETENCIES

identifies the appropriate rejection region for a given level of significance when:

the population variance is assumed to be known the population variance is assumed to be unknown;

(b)

(a)

and

the Central Limit Theorem is to be used.

(c)

computes for the test-statistic value (population mean).

based on

region.

rejection mean

population

the the

about and

conclusion value

test-statistic

draws

the

mean. involving test of hypothesis on the

solves problems

population

and alternative

proportion.

a population null

on appropriate

formulates the

hypotheses

form of is the to test-statistic

be used.

Limit Theorem

appropriate

Central

the the

identifies

when

identifies the appropriate rejection region for a given level of significance when the Central Limit Theorem is to be used.

computes for the test-statistic value (population proportion).

draws conclusion about the population proportion based on the test-statistic value and the rejection region.

solves problems involving test of hypothesis on the population proportion.

6.

7.

8.

9.

10.

11.

12.

13.

14.

15.

STANDARDS

 

CONTENT STANDARDS

 

CONTENT

 

M11/12SP-IVh-2

M11/12SP-IVg-4

M11/12SP-IVg-2

M11/12SP-IVh-3

M11/12SP-IVg-3

M11/12SP-IVh-1

M11/12SP-IVj-2

M11/12SP-IVi-4

M11/12SP-IVi-2

M11/12SP-IVj-1

M11/12SP-IVi-3

M11/12SP-IVi-1

Page 6 of 7

CODE

of association between the variables

(direction), and variation

calculates the Pearson’s sample correlation coefficient.

variable given the

line. calculated slope and y-intercept of the

calculates the slope and y-intercept of the regression

identifies the independent and dependent variables.

solves problems involving correlation analysis.

solves problems involving regression analysis.

LEARNING COMPETENCIES

draws the best-fit line on a scatter plot.

illustrates the nature of bivariate data.

of the dependent

trend plot.

variable.

on a scatter

independent

constructs a scatter plot.

based (form),

scatter plot.

on a strength

the value

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT

describes shape

interprets the

of the

regression

(strength)

estimates

predicts

based

value

line.

10.

11.

12.

7.

4.

1.

5.

8.

2.

6.

9.

3.

STANDARDS The learner is able to perform correlation and regression analyses on real-life problems in different disciplines.

PERFORMANCE

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

CONTENT STANDARDS

The learner demonstrates understanding of key concepts of correlation and regression analyses.

Correlation and

CONTENT

ENRICHMENT

Regression

Analyses

Page 7 of 7

M11/12

III

SP

1

a

-

-

illustrates a random variable (discrete and continuous)

Statistics and Probability

Third Quarter

Mathematics

Grade 11/12

Week one

K to 12 BASIC EDUCATION CURRICULUM SENIOR HIGH SCHOOL CORE SUBJECT

Sample: M11/12SP-IIIa-1

Code Book Legend

and Strand/ Subject or

Topic

Domain/Content/

Specialization

Competency

Grade Level

Quarter

K to 12 Senior High School Core Curriculum Statistics and Probability December 2013

Component/

Week

Learning Area

*Put a hyphen (-) in between letters to indicate more than a specific week

quarter

Lowercase Letter/s

Uppercase Letter/s

Numeral

Arabic Number

First Entry

if no specific

*Zero Roman

SAMPLE

LEGEND

Table of Contents

Chapter 1: Exploring Data

Chapter 3: Sampling

Introducing Statistics

1

Coin Tossing revisited from a

Data Collection Activity

7

Statistical Perspective

204

Basic Terms in Statistics

17

The Need for Sampling

221

Levels of Measurement

24

Sampling Distribution of the Sample

Data Presentation

31

Mean

242

Measures of Central Tendency

44

Sampling without Replacement

265

Other Measures of Location

54

Sampling from a Box of Marbles, Nips,

Measures of Variation

60

or Colored Paper Clips and One-Peso

More on Describing Data:

Coins

285

Summary Measures and Graphs

69

Sampling from the Periodic Table

299

Chapter 2: Random Variables and Probability Distributions

Chapter 4: On Estimation of Parameters

Concepts of Point and Interval

 
 

Probability

86

Estimation

316

Geometric Probability

98

Point Estimation of the Population

Random Variables

108

Mean

321

Probability Distributions of

Confidence Interval Estimation of the

 

Discrete Random Variables

117

Population Mean

328

 

Probability Density Functions

130

Point and Confidence Interval

Mean and Variance of Discrete

Estimation of the Population Proportion

344

 

Random Variables

144

More on Point Estimates and

 

More about Means and Variances

155

Confidence Intervals

351

The Normal Distribution and Its

Chapter 5: Tests of Hypothesis

 

Properties

164

Basic Concepts in Hypothesis Testing

362

 

Areas Under a Standard Normal

Steps in Hypothesis Testing

368

 

Distribution

182

Test on Population Mean

374

 

Areas under a Normal

Test on Population Proportion

385

 

Distribution

194

More on Hypothesis Tests Regarding the Population Proportion

390

 

Chapter 6: Correlation and Regression Analysis

Examining Relationships with Correlation

399

Biographical Notes

421

!

CHAPTER 1: EXPLORING DATA

Lesson 1: Introducing Statistics

TIME FRAME: 60 minutes

OVERVIEW OF LESSON

In decision making, we use statistics although some of us may not be aware of it. In this lesson, we make the students realize that to decide logically, they need to use statistics. An inquiry could be answered or a problem could be solved through the use of statistics. In fact, without knowing it we use statistics in our daily activities.

LEARNING COMPETENCIES:

At the end of the lesson, the learner should be able to identify questions that could be answered using a statistical process and describe the activities involved in a statistical process.

LESSON OUTLINE:

1. Motivation

2. Statistics as a Tool in Decision-Making

3. Statistical Process in Solving a Problem

REFERENCES:

Albert, J. R. G. (2008).Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo Patungan, Nelia Marquez), published by Rex Bookstore.

Handbook of Statistics 1 (1 st and 2 nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

Workbooks in Statistics 1 (From 1 st to 13 th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

1 "

!

DEVELOPMENT OF THE LESSON

A. Motivation

You may ask the students, a question that is in their mind at that moment. You

may write their answers on the board. (Note: You may try to group the questions

as you write them on the board into two, one group will be questions that are

answerable by a fact and the other group are those that require more than one

information and needs further thinking).

The following are examples of what you could have written on the board:

Group 1:

How old is our teacher?

Is the vehicle of the Mayor of our city/town/municipality bigger than the vehicle used by the President of the Philippines?

How many days are there in December?

Does the Principal of the school has a post graduate degree?

How much does the Barangay Captain receive as allowance?

What is the weight of my smallest classmate?

Group 2:

How old are the people residing in our town?

Do dogs eat more than cats?

Does it rain more in our country than in Thailand?

Do math teachers earn more than science teachers?

How many books do my classmates usually bring to school?

What is the proportion of Filipino children aged 0 to 5 years who are underweight or overweight for their age?

The first group of questions could be answered by a piece of information which is considered always true. There is a correct answer which is based on a fact and you don’t need the process of inquiry to answer such kind of question. For example, there is one and only one correct answer to the first question in Group 1 and that is your age as of your last birthday or the number of years since your birth year.

On the other hand, in the second group of questions one needs observations or data to be able to respond to the question. In some questions you need to get the observations or responses of all those concerned to be able to answer the question. On the first question in the second group, you need to ask all the people in the locality about their age and among the values you obtained you get a representative value. To answer the second question in the second group,

2 "

!

you need to get the amount of food that all dogs and cats eat to respond to the question. However, we know that is not feasible to do so. Thus what you can do is get a representative group of dogs and another representative group for the cats. Then we measure the amount of food each group of animal eats. From these two sets of values, we could then infer whether dogs do eat more than cats.

So as you can see in the second group of questions you need more information or data to be able to answer the question. Either you need to get observations from all those concerned or you get representative groups from which you gather your data. But in both cases, you need data to be able to respond to the question. Using data to find an answer or a solution to a problem or an inquiry is actually using the statistical process or doing it with statistics.

Now, let us formalize what we discussed and know more about statistics and how we use it in decision-making.

B. Main Lesson

1. Statistics as a Tool in Decision-Making

Statistics is defined as a science that studies data to be able to make a decision. Hence, it is a tool in decision-making process. Mention that Statistics as a science involves the methods of collecting, processing, summarizing and analyzing data in order to provide answers or solutions to an inquiry. One also needs to interpret and communicate the results of the methods identified above to support a decision that one makes when faced with a problem or an inquiry.

Trivia: The word “statistics” actually comes from the word “state”— because governments have been involved in the statistical activities, especially the conduct of censuses either for military or taxation purposes. The need for and conduct of censuses are recorded in the pages of holy texts. In the Christian Bible, particularly the Book of Numbers, God is reported to have instructed Moses to carry out a census. Another census mentioned in the Bible is the census ordered by Caesar Augustus throughout the entire Roman Empire before the birth of Christ.

Inform students that uncovering patterns in data involves not just science but it is also an art, and this is why some people may think “Stat is eeeks!” and may view any statistical procedures and results with much skepticism

Make known to students that Statistics enable us to

characterize persons, objects, situations, and phenomena;

explain relationships among variables;

formulate objective assessments and comparisons; and, more importantly

make evidence-based decisions and predictions.

3 "

!

And to use Statistics in decision-making there is a statistical process to follow which is to be discussed in the next section.

2. Statistical Process in Solving a Problem

You may go back to one of the questions identified in the second group and use

it to discuss the components of a statistical process. For illustration on how to do it, let us discuss how we could answer the question “Do dogs eat more than cats?”

As discussed earlier, this question requires you to gather data to generate statistics which will serve as basis in answering the query. There should be plan or a design on how to collect the data so that the information we get from it is enough or sufficient for us to minimize any bias in responding to the query. In relation to the query, we said earlier that we cannot gather the data from all dogs and cats. Hence, the plan is to get representative group of dogs and another representative group of cats. These representative groups were observed for some characteristics like the animal weight, amount of food in grams eaten per day and breed of the animal. Included in the plan are factors like how many dogs and cats are included in the group, how to select those included in the representative groups and when to observe these animals for their characteristics.

After the data were gathered, we must verify the quality of the data to make a good decision. Data quality check could be done as we process the data to summarize the information extracted from the data. Then using this information, one can then make a decision or provide answers to the problem or question at hand.

To summarize, a statistical process in making a decision or providing solutions to

a problem include the following:

Planning or designing the collection of data to answer statistical questions in a way that maximizes information content and minimizes bias;

Collecting the data as required in the plan;

Verifying the quality of the data after they were collected;

Summarizing the information extracted from the data; and

Examining the summary statistics so that insight and meaningful information can be produced to support decision-making or solutions to the question or problem at hand.

Hence, several activities make up a statistical process which for some the process is simple but for others it might be a little bit complicated to implement. Also, not all questions or problems could be answered by a simple statistical

4 "

!

process. There are indeed problems that need complex statistical process. However, one can be assured that logical decisions or solutions could be formulated using a statistical process.

KEY POINTS

Difference between questions that could be and those that could not

answered using Statistics.

Statistics is a science that studies data.

There are many uses of Statistics but its main use is in decision-making.

Logical decisions or solutions to a problem could be attained through a

statistical process.

ASSESSMENT Note: Answers are provided inside the parentheses and italicized.

1. Identify which of the following questions are answerable using a statistical

process.

a.

What is a typical size of a Filipino family? (answerable through a statistical

process)

b.

How many hours in a day? (not answerable through a statistical process)

c.

How old is the oldest man residing in the Philippines? (answerable through

a statistical process)

d.

Is planet Mars bigger than planet Earth? (not answerable through a

statistical process)

e.

What is the average wage rate in the country? (answerable through a

statistical process)

f.

Would Filipinos prefer eating bananas rather than apple? (answerable

through a statistical process)

g.

How long did you sleep last night? (not answerable through a statistical

process)

h.

How much a newly-hired public school teacher in NCR earns in a month?

(not answerable through a statistical process)

i.

How tall is a typical Filipino? (answerable through a statistical process)

j.

Did you eat your breakfast today? (not answerable through a statistical

process)

2. For each of the identified questions in Number 1 that are answerable using a

statistical process, describe the activities involved in the process.

5 "

!

For a. What is a typical size of a Filipino family? (The process includes getting a representative group of Filipino families and ask the family head as to how many members do they have in their family. From the gathered data which had undergone a quality check a typical value of the number of family members could be obtained. Such typical value represents a possible answer to the question.)

For c. How old is the oldest man residing in the Philippines? (The process includes getting the ages of all residents of the country. From the gathered data which had undergone a quality check the highest value of age could be obtained. Such value is the answer to the question.)

For e. What is the average wage rate in the country? (The process includes getting all prevailing wage rates in the country. From the gathered data which had undergone a quality check a typical value of the wage rate could be obtained. Such value is the answer to the question.)

For f. Would Filipinos prefer eating bananas rather than apple? (The process includes getting a representative group of Filipinos and ask each one of them on what fruit he/she prefers, banana or apple? From the gathered data which had undergone a quality check the proportion of those who prefers banana and proportion of those who prefer apple will be computed and compared. The results of this comparison could provide a possible answer to the question.)

For i. How tall is a typical Filipino? (The process includes getting a representative group of Filipinos and measure the height of each member of the representative group. From the gathered data which had undergone a quality check a typical value of the height of a Filipino could be obtained. Such typical value represents a possible answer to the question.)

Note: Tell the students that getting a representative group and obtaining a typical value are to be learned in subsequent lessons in this subject.

6 "

CHAPTER 1: EXPLORING DATA

Lesson 2: Data Collection Activity

TIME FRAME: 60 minutes

OVERVIEW OF LESSON

As we have learned in the previous lesson, Statistics is a science that studies data. Hence to teach Statistics, real data set is recommend to use. In this lesson,we present an activity where the students will be asked to provide some data that will be submitted for consolidation by the teacher for future lessons. Data on heights and weights, for instance, will be used for calculating Body Mass Index in the integrative lesson. Students will also be given the perspective that the data they provided is part of a bigger group of data as the same data will be asked from much larger groups (the entire class, all Grade 11 students in school, all Grade 11 students in the district). The contextualization of data will also be discussed.

LEARNING COMPETENCIES:

At the end of the lesson, the learner should be able to:

Recognize the importance of providing correct information in a data collection activity;

Understand the issue of confidentiality of information in a data collection activity;

Participate in a data collection activity; and

Contextualize data

LESSON OUTLINE:

1. Preliminaries in a Data Collection Activity

2. Performing a Data Collection Activity

3. Contextualization of Data

REFERENCES

Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo Patungan, Nelia Marquez), published by Rex Bookstore.

Handbook of Statistics 1 (1 st and 2 nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

Workbooks in Statistics 1 (From 1 st to 13 th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

https://www.khanacademy.org/math/probability/statistical-studies/statistical-questions/v/statistical-

questions

https://www.illustrativemathematics.org/content-standards/tasks/703

7 "

DEVELOPMENT OF THE LESSON

A. Preliminaries in a Data Collection Activity

Before the lesson, prepare a sheet of paper listing everyone’s name in class with a “Class Student Number” (see Attachment A for the suggested format). The class student number is a random number chosen in the following fashion:

(a)

Make a box with “tickets” (small pieces of papers of equal sizes) listing the numbers 1 up to the number of students in the class.

(b)

Shake the box, get a ticket, and assign the number in the ticket to the first person in the list.

(c)

Shake the box again, get another ticket, and assign the number of this ticket to the next person in the list.

(d)

Do (c) until you run out of tickets in the box.

At this point all the students have their corresponding class student number written across their names in the prepared class list. Note that the preparation of the class list is done before the class starts.

At the start of the class, inform each student confidentially of his/her class student number. Perhaps, when the attendance is called, each student can be provided a separate piece of paper that lists her/his name and class student number. Tell students to remember their class student number, and to always use this throughout the semester whenever data are requested of them. Explain to students that in data collection activity, specific identities like their names are not required, especially because people have a right to confidentiality, but there should be a way to develop and maintain a database to check quality of data provided, and verify from respondent in a data collection activity the data that they provided (if necessary).

These preliminary steps for generating a class student number and informing students confidentially of their class student number are essential for the data collection activities to be performed in this lesson and other lessons so that students can be uniquely identified, without having to obtain their names. Inform also the students that the class student numbers they were given are meant to identify them without having to know their specific identities in the class recording sheet (which will contain the consolidated records that everyone had provided). This helps protect confidentiality of information.

In statistical activities, facts are collected from respondents for purposes of getting aggregate information, but confidentiality should be protected. Mention that the agencies mandated to collect data is bound by law to protect the confidentiality of information provided by respondents. Even market research organizations in the private sector and individual researchers also guard confidentiality as they merely want to obtain aggregate data. This way, respondents can be truthful in giving

8 "

information, and the researcher can give a commitment to respondents that the data they provide will never be released to anyone in a form that will identify them without their consent.

B. Performing a Data Collection Activity

Explain to the students that the purpose of this data collection activity is to gather data that they could use for their future lessons in Statistics. It is important that they do provide the needed information to the best of their knowledge. Also, before they respond to the questionnaire provided in the Attachment B as Student Information Sheet (SIS), it is recommended that each item in the SIS should be clarified. The following are suggested clarifications to make for each item:

1. CLASS STUDENT NUMBER: This is the number that you provided confidentially

to the student at the start of the class.

2. SEX: This is the student’s biological sex and not their preferred gender. Hence, they have to choose only one of the two choices by placing a check mark () at space provided before the choices.

3. NUMBER OF SIBLINGS: This is the number of brothers and sisters that the student has in their nuclear or immediate family. This number excludes him or her in the count. Thus, if the student is the only child in the family then he/she will report zero as his/her number of siblings.

4. WEIGHT (in kilograms): This refers to the student’s weight based on the student’s knowledge. Note that the weight has to be reported in kilograms. In case the student knows his/her weight in pounds, the value should be converted to kilograms by dividing the weight in pounds by a conversion factor of 2.2 pounds per kilogram.

5. HEIGHT (in centimeters): This refers to the student’s height based on the student’s knowledge. Note that the height has to be reported in centimeters. In case the student knows his/her height in inches, the value should be converted to centimeters by multiplying the height in inches by a conversion factor of 2.54 centimeters per inch.

6. AGE OF MOTHER (as of her last birthday in years): This refers to the age of the student’s mother in years as of her last birthday, thus this number should be reported in whole number. In case, the student’s mother is dead or nowhere to be found, ask the student to provide the age as if the mother is alive or around.You could help the student in determining his/her mother’s age based

9 "

on other information that the student could provide like birth year of the mother or student’s age. Note also that a zero value is not an acceptable value.

7. USUAL DAILY ALLOWANCE IN SCHOOL (in pesos): This refers to the usual amount in pesos that the student is provided for when he/she goes to school in a weekday. Note that the student can give zero as response for this item, in case he/she has no monetary allowance per day.

8. USUAL DAILY FOOD EXPENDITURE IN SCHOOL (in pesos): This refers to the usual amount in pesos that the student spends for food including drinks in school per day. Note that the student can give zero as response for this item, in case he/she does not spend for food in school.

9. USUAL NUMBER OF TEXT MESSAGES SENT IN A DAY:

This refers to the usual

number of text messages that a student send in a day. Note that the student can give zero as response for this item, in case he/she does not have the gadget to use to send a text message or simply he/she does not send text messages.

10. MOST PREFERRED COLOR: The student is to choose a color that could be considered his most preferred among the given choices. Note that the student could only choose one. Hence, they have to place a check mark () at space provided before the color he/she considers as his/her most preferred color among those given.

11. USUAL SLEEPING TIME:

This refers to the usual sleeping time at night during a

typical weekday or school day. Note that the time is to be reported using the military way of reporting the time or the 24-hour clock (0:00 to 23:59 are the possible values to use)

12. HAPPINESS INDEX FOR THE DAY : The student has to response on how he/she

feels at that time using codes from 1 to 10. Code 1 refers to the feeling that the

student is very unhappy while Code 10 refers to a feeling that the student is very

happy on the day when the data are being collected.

After the clarification, the students are provided at most 10 minutes to respond to the questionnaire. Ask the students to submit the completed SIS so that you could consolidate the data gathered using a formatted worksheet file provided to you as Attachment C. Having the data in electronic file makes it easier for you to use it in the future lessons. Be sure that the students provided the information in all items in the SIS.

10 "

Inform the students that you are to compile all their responses and compiling all these records from everyone in the class is an example of a census since data has been gathered from every student in class. Mention that the government, through the Philippine Statistics Authority (PSA), conducts censuses to obtain information about socio-demographic characteristics of the residents of the country. Census data are used by the government to make plans, such as how many schools and hospitals to build. Censuses of population and housing are conducted every 10 years on years ending in zero (e.g., 1990, 2000, 2010) to obtain population counts, and demographic information about all Filipinos. Mid-decade population censuses have also been conducted since 1995. Censuses of Agriculture, and of Philippine Business and Industry, are also conducted by the PSA to obtain information on production and other relevant economic information.

PSA is the government agency mandated to conduct censuses and surveys. Through Republic Act 10625 (also referred to as The Philippine Statistical Act of 2013), PSA was created from four former government statistical agencies, namely:

National Statistics Office (NSO), National Statistical Coordination Board (NSCB), Bureau of Labor and Employment of Statistics (BLES) and Bureau of Agricultural Statistics (BAS). The other agency created through RA 10625 is the Philippine Statistical Research and Training Institute (PSRTI) which is mandated as the research and training arm of the Philippine Statistical System. PSRTI was created from its forerunner the former Statistical Research and Training Center (SRTC).

C. Contextualization of Data

Ask students what comes to their minds when they hear the term “data” (which may be viewed as a collection of facts from experiments, observations, sample surveys and censuses, and administrative reporting systems).

Present to the student the following collection of numbers, figures, symbols, and words, and ask them if they could consider the collection as data.

3, red, F, 156, 4, 65, 50, 25, 1, M, 9, 40, 68, blue, 78, 168, 69, 3, F, 6, 9, 45, 50, 20, 200, white, 2, pink, 160, 5, 60, 100, 15, 9, 8, 41, 65, black, 68, 165, 59, 7, 6, 35, 45,

Although the collection is composed of numbers and symbols that could be classified as numeric or non-numeric, the collection has no meaning or it is not contextualized, hence it cannot be referred to as data.

11 "

Tell the students that data are facts and figures that are presented, collected and analyzed. Data are either numeric or non-numeric and must be contextualized. To contextualize data, we must identify its six W’s or to put meaning on the data, we must know the following W’s of the data:

1. Who? Who provided the data?

2. What? What are the information from the respondents and What is the unit of measurement used for each of the information (if there are any)?

3. When? When was the data collected?

4. Where? Where was the data collected?

5. Why? Why was the data collected?

6. HoW? HoW was the data collected?

Let us take as an illustration the data that you have just collected from the students, and let us put meaning or contextualize it by responding to the questions with the Ws. It is recommended that the students answer theW-questions so that they will learn how to do it.

1. Who? Who provided the data?

The students in this class provided the data.

2. What? What are the information from the respondents and What is the unit of measurement used for each of the information (if there are any)?

The information gathered include Class Student Number, Sex, Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, Usual Number of Text Messages Sent in a Day, Most Preferred Color, Usual Sleeping Time and Happiness Index for the Day.

The units of measurement for the information on Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, and Usual Number of Text Messages Sent in a Day are person, kilogram, centimeter, year, pesos, pesos and message, respectively.

3. When? When was the data collected?

12 "

The data was collected on the first few days of classes for Statistics and Probability.

4. Where? Where was the data collected?

The data was collected inside our classroom.

5. Why? Why was the data collected?

As explained earlier, the data will be used in our future lessons in Statistics and Probability

6. HoW? HoW was the data collected?

The students provided the data by responding to the Student Information Sheet prepared and distributed by the teacher for the data collection activity.

Once the data are contextualized, there is now meaning to the collection of number and symbols which may now look like the following which is just a small part of the data collected in the earlier activity.

Class

Student

Number

Sex

Number

of

siblings

(in

Weight

(in kg)

Height

(in cm)

Age of

mother

(in

years)

Usual

daily

allowance

in school

Usual daily

food

expenditure

in school

Usual number of text messages sent in a day

Most

Preferred

Color

Usual

Sleeping

Time

Happiness

Index for

the Day

person)

(in pesos)

(in pesos)

 

M

1 2

 

60

156

60

200

150

20

RED

23:00

8

 

F

2 5

 

63

160

66

300

200

25

PINK

22:00

9

 

F

3 3

 

65

165

59

250

50

15

BLUE

20:00

7

 

M

4 1

 

55

160

55

200

100

30

BLACK

19:00

6

 

M

5 0

 

65

167

45

350

300

35

BLUE

20:00

8

 

:

: :

 

: :

   

: :

:

 

: :

 

: :

 

:

: :

 

: :

   

: :

:

 

: :

 

: :

KEY POINTS

Providing correct information in a government data collection activity is a

responsibility of every citizen in the country.

Data confidentiality is important in a data collection activity.

Census is collecting data from all possible respondents.

Data to be collected must be clarified before the actual data collection.

Data must be contextualized by answering six W-questions.

13 "

ATTACHMENT A: CLASS LIST

STUDENT NAME

CLASS STUDENT NUMBER

1.

 

2,

 

3.

 

4.

 

5.

 

6.

 

7.

 

8.

 

9.

 

10.

 

11.

 

12.

 

13.

 

14.

 

15.

 

16.

 

17.

 

18.

 

19.

 

20.

 

21.

 

22.

 

23.

 

24.

 

25.

 

26.

 

27.

 

28.

 

29.

 

30.

 

31.

 

32,

 

33.

 

34.

 

35.

 

14 "

STUDENT NAME

CLASS STUDENT NUMBER

36.

 

37.

 

38.

 

39.

 

40.

 

41.

 

42.

 

43.

 

44.

 

45.

 

46.

 

47.

 

48.

 

49.

 

50.

 

51.

 

52.

 

53.

 

54.

 

55.

 

56.

 

57.

 

58.

 

59.

 

60.

 

61.

 

62.

 

63.

 

64.

 

65.

 

66.

 

67.

 

68.

 

69.

 

70.

 

ATTACHMENT B: STUDENT INFORMATION SHEET

Instruction to the Students: Please provide completely the following information. Your teacher is available to respond to your queries regarding the items in this information sheet, if you have any. Rest assured that the information that you will be providing will only be used in our lessons in Statistics and Probability.

1. CLASS STUDENT NUMBER:

2. SEX (Put a check mark, ):

3. NUMBER OF SIBLINGS:

4. WEIGHT (in kilograms):

5. HEIGHT (in centimeters):

6. AGE OF MOTHER (as of her last birthday in years):

Male

Female

(If mother deceased, provide age if she was alive)

7. USUAL DAILY ALLOWANCE IN SCHOOL (in pesos):

8. USUAL DAILY FOOD EXPENDITURE IN SCHOOL (in pesos):

9. USUAL NUMBER OF TEXT MESSAGES SENT IN A DAY:

10. MOST PREFERRED COLOR (Put a check mark, . Choose only one):

WHITE

RED

PINK

ORANGE

YELLOW

GREEN

BLUE

PEACH

BROWN

GRAY

BLACK

PURPLE

11. USUAL SLEEPING TIME (on weekdays):

12. HAPPPINESS INDEX FOR THE DAY:

On a scale from 1 (very unhappy) to 10 (very happy), how do you feel

today?

15 "

ATTACHMENT C: CLASS RECORDING SHEET (for the Teacher’s Use)

Number Usual Usual Daily Age of Class of Daily food Most Usual Happiness Sex Weight
Number
Usual
Usual Daily
Age of
Class
of
Daily
food
Most
Usual
Happiness
Sex
Weight
Height
mother
Student
siblings
allowance
expenditure
Preferred
Sleeping
Index for
(in kg)
(in cm)
(in
Number
(in
in school
in school
Color
Time
the Day
years)
person)
(in pesos)
(in pesos)
Usual
number
of text
messages
sent in a
day

16 "

CHAPTER 1: EXPLORING DATA

Lesson 3: Basic Terms in Statistics

TIME FRAME: 60 minutes

OVERVIEW OF LESSON

As continuation of Lesson 2 (where we contextualize data) in this lesson we define basic terms in statistics as we continue to explore data. These basic terms include the universe, variable, population and sample. In detail we will discuss other concepts in relation to a variable.

LEARNING OUTCOME(S):

At the end of the lesson, the learner is able to

Define universe and differentiate it with population; and

Define and differentiate between qualitative and quantitative variables, and between discrete and continuous variables (that are quantitative);

LESSON OUTLINE:

1. Recall previous lesson on ‘Contextualizing Data’

2. Definition of Basic Terms in Statistics (universe, variable, population and sample)

3. Broad of Classification of Variables(qualitative and quantitative, discrete and continuous)

REFERENCES

Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, WelfredoPatungan, Nelia Marquez), published by Rex Bookstore.

Handbook of Statistics 1 (1 st and 2 nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

Takahashi, S. (2009). The Manga Guide to Statistics. Trend-Pro Co. Ltd.

Workbooks in Statistics 1 (From 1 st to 13 th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

17 #

DEVELOPMENT OF THE LESSON

A. Recall previous lesson on ‘Contextualizing Data’

Begin by recalling with the students the data they provided in the previous lesson and how they contextualized such data. You could show them the compiled data set in a table like this:

Class

Student

Number

Sex

Number

of

siblings

(in

Weight

(in kg)

Height

(in cm)

Age of

mother

(in

years)

Usual

Daily

allowance

in school

Usual Daily

food

expenditure

in school

Usual number of text messages sent in a day

Most

Preferred

Color

Usual

Sleeping

Time

Happiness

Index for

the Day

person)

(in pesos)

(in pesos)

 

M

1 2

 

60

156

60

200

150

20

RED

23:00

8

 

F

2 5

 

63

160

66

300

200

25

PINK

22:00

9

 

F

3 3

 

65

165

59

250

50

15

BLUE

20:00

7

 

M

4 1

 

55

160

55

200

100

30

BLACK

19:00

6

 

M

5 0

 

65

167

45

350

300

35

BLUE

20:00

8

 

:

: :

 

:

:

 

: :

:

:

:

 

: :

 

:

: :

 

:

:

 

: :

:

:

:

 

: :

Recall also their response on the first Ws of the data, that is, on the question “Who provided the data?” We said last time the students of the class provided the data or the data were taken from the students.

Another Ws of the data is What? What are the information from the respondents? and What is the unit of measurement used for each of the information (if there are any)? Our responses are the following:

The information gathered include Class Student Number, Sex, Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, Usual Number of Text Messages Sent in a Day, Most Preferred Color, Usual Sleeping Time and Happiness Index.

The units of measurement for the information on Number of Siblings, Weight, Height, Age of Mother, Usual Daily Allowance in School, Usual Daily Food Expenditure in School, and Usual Number of Text Messages Sent in a Day are person, kilogram, centimeter, year, pesos, pesos and message, respectively.

B. Main Lesson

1. Definition of Basic Terms

The collection of respondents from whom one obtain the data is called the universe of the study. In our illustration, the set of students of this Statistics and Probability class is our universe. But we must precaution the students that a universe is not necessarily composed of people. Since there are studies where the observations were taken from plants or animals or even from non-living things like buildings, vehicles, farms, etc. So formally, we define universe as the collection or set of

18 #

units or entities from whom we got the data. Thus, this set of units answers the first Ws of data contextualization.

On the other hand, the information we asked from the students are referred to as the variables of the study and in the data collection activity, we have 12 variables including Class Student Number. A variable is a characteristic that is observable or measurable in every unit of the universe. From each student of the class, we got the his/her age, number of siblings, weight, height, age of mother, usual daily allowance in school, usual daily food expenditure in school, usual number of text messages sent in a day, most preferred color, usual sleeping time and happiness index for the day. Since these characteristics are observable in each and every student of the class, then these are referred to as variables.

The set of all possible values of a variable is referred to as a population. Thus for each variable we observed, we have a population of values. The number of population in a study will be equal to the number of variables observed. In the data collection activity we had, there are 12 populations corresponding to 12 variables.

A subgroup of a universe or of a population is a sample. There are several ways to take a sample from a universe or a population and the way we draw the sample dictates the kind of analysis we do with our data.

We can further visualize these terms in the following figure:

Unit 1 Unit 2 Unit 3 : : Unit N UNIVERSE
Unit 1
Unit 2
Unit 3
:
:
Unit N
UNIVERSE
VARIABLE 1 Value 1 Value 2 Value 3 : : Value N
VARIABLE 1
Value 1
Value 2
Value 3
:
:
Value N

POPULATION OF VARIABLE 1

VARIABLE 2 Value 1 Value 2 Value 3 : : Value N
VARIABLE 2
Value 1
Value 2
Value 3
:
:
Value N

POPULATION OF VARIABLE 2

VARIABLE 12

Value 1 Value 2 Value 3 … # : : Value N POPULATION OF VARIABLE
Value 1
Value 2
Value 3
#
:
:
Value N
POPULATION OF
VARIABLE 12
Unit 1 Value 1 : : : OR # : Unit n Value n SAMPLE
Unit 1
Value 1
:
:
:
OR #
:
Unit n
Value n
SAMPLE
A SAMPLE OF
A SAMPLE OF UNITS
POPULATION VALUES
Figure 3.1 Visualization of the relationship among universe, variable, population and sample.
19
#

2. Broad Classification of Variables

Following up with the concept of variable, inform the students that usually, a variable takes on several values. But occasionally, a variable can only assume one value, then it is called a constant. For instance, in a class of fifteen-year olds, the age in years of students is constant.

Variables can be broadly classified as either quantitative or qualitative, with the latter further classified into discrete and continuous types (see Figure 3.3 below).

into discrete and continuous types (see Figure 3.3 below). Figure 3.3 Broad Classification of Variables (i)

Figure 3.3

Broad Classification of Variables

(i)

Qualitative variables express a categorical attribute, such as sex (male or female), religion, marital status, region of residence, highest educational attainment. Qualitative variables do not strictly take on numeric values (although we can have numeric codes for them, e.g., for sex variable, 1 and 2 may refer to male, and female, respectively). Qualitative data answer questions “what kind.” Sometimes, there is a sense of ordering in qualitative data, e.g., income data grouped into high, middle and low-income status. Data on sex or religion do not have the sense of ordering, as there is no such thing as a weaker or stronger sex, and a better or worse religion. Qualitative variables are sometimes referred to as categorical variables.

(ii)

Quantitative (otherwise called numerical) data, whose sizes are meaningful, answer questions such as “how much” or “how many”. Quantitative variables have actual units of measure. Examples of quantitative variables include the height, weight, number of registered cars, household size, and total household expenditures/income of survey respondents. Quantitative data may be further classified into:

20 #

a. Discrete data are those data that can be counted, e.g., the number of days for cellphones to fail, the ages of survey respondents measured to the nearest year, and the number of patients in a hospital. These data assume only (a finite or infinitely) countable number of values.

b. Continuous data are those that can be measured, e.g. the exact height of a survey respondent and the exact volume of some liquid substance. The possible values are uncountably infinite.

With this classification, let us then test the understanding of our students by asking them to classify the variables, we had in our last data gathering activity. They should be able to classify these variables as to qualitative or quantitative and further more as to discrete or continuous. If they did it right, you have the following:

 

TYPE OF

TYPE OF QUANTITATIVE VARIABLE

VARIABLE

VARIABLE

Class Student Number

Qualitative

 

Sex

Qualitative

 

Number of Siblings

Quantitative

Discrete

Weight (in kilograms)

Quantitative

Continuous

Height (in centimeters)

Quantitative

Continuous

Age of Mother

Quantitative

Discrete

Usual Daily Allowance in School (in pesos)

Quantitative

Discrete

Usual Daily Food Expenditure in School (in pesos)

Quantitative

Discrete

Usual Number of Text Messages Sent in a Day

Quantitative

Discrete

Usual Sleeping Time

Qualitative

 

Most Preferred Color

Qualitative

 

Happiness Index for the Day

Qualitative

 

Special Note:

For quantitative data, arithmetical operations have some physical interpretation. One can add 301 and 302 if these have quantitative meanings, but if, these numbers refer to room numbers, then adding these numbers does not make any sense. Even though a variable may take numerical values, it does not make the corresponding variable quantitative! The issue is whether performing arithmetical operations on these data would make any sense. It would certainly not make sense to sum two zip codes or multiply two room numbers.

21 #

KEY POINTS

A universe is a collection of units from which the data were gathered.

A variable is a characteristic we observed or measured from every element of the universe.

A population is a set of all possible values of a variable.

A sample is a subgroup of a universe or a population.

In a study there is only one universe but could have several populations.

Variables could be classified as qualitative or quantitative, and the latter could be further classified as discrete or continuous. ASSESSMENT

Note: Answers are provided inside the parentheses and italicized.

1. A market researcher company requested all teachers of a particular school to fill

The following are

up a questionnaire in relation to their product market study.

some of the information supplied by the teachers:

highest educational attainment

predominant hair color

body temperature

civil status

brand of laundry soap being used

total household expenditures last month in pesos

number of children in the household

number of hours standing in queue while waiting to be served by a bank teller

amount spent on rice last week by the household

distance travelled by the teacher in going to school

time (in hours) consumed on Facebook on a particular day

a. If we are to consider the collection of information gathered through the completed questionnaire, what is the universe for this data set? (The universe is the set of all teachers in that school)

b. Which of the variables are qualitative? Which are quantitative? Among the quantitative variables, classify them further as discrete or continuous.

highest educational attainment (qualitative)

predominant hair color (qualitative)

body temperature (quantitative: continuous)

civil status (qualitative)

brand of laundry soap being used (qualitative)

total household expenditures last month in pesos (quantitative: discrete)

number of children in a household (quantitative: discrete)

number of hours standing in queue while waiting to be served by a bank teller (quantitative: discrete)

22 #

amount spent on rice last week by a household (quantitative: discrete)

distance travelled by the teacher in going to school (quantitative:

continuous)

time (in hours) consumed on Facebook on a particular day (quantitative:

continuous)

c. Give at least two populations that could be observed from the variables identified in (b). (Possible answer: The population is the set of all values of the highest educational attainment and another population is {single, married, divorced, separated, widow/widower}) 2. The Engineering Department of a big city did a listing of all buildings in their locality. If you are planning to gather the characteristics of these buildings,

a.

What is the universe of this data collection activity? (Set of all buildings in the big city)

b.

What are the crucial variables to observe? It would also be better if you could classify the variables as to whether it is qualitative or quantitative. Furthermore, classify the quantitative variable as discrete or continuous. (A possible answer is the number of floors in the building, quantitative, discrete)

3. A survey of students in a certain school is conducted. The survey questionnaire details the information on the following variables. For each of these variables, identify whether the variable is qualitative or quantitative, and if the latter, state whether it is discrete or continuous.

a.

number of family members who are working (quantitative: discrete)

b.

ownership of a cell phone among family members (qualitative)

c.

length (in minutes) of longest call made on each cell phone owned per month (quantitative: continuous)

d.

ownership/rental of dwelling (qualitative)

e.

amount spent in pesos on food in one week (quantitative: discrete)

f.

occupation of household head (qualitative)

g.

total family income (quantitative: discrete)

h.

number of years of schooling of each family member (quantitative: discrete)

i.

access of family members to social media (qualitative)

j.

amount of time last week spent by each family member using the internet (quantitative: continuous)

Explanatory Note:

Teachers have the option to just ask this assessment orally to the entire class, or to group students and ask them to identify answers, or to give this as homework, or to use some questions/items here for a chapter examination.

23 #

CHAPTER 1: EXPLORING DATA

Lesson 4: Levels of Measurement

TIME FRAME: 60 minutes

OVERVIEW OF LESSON

In this lesson we discuss the different levels of measurement as we continue to explore data. Knowing such will enable us to plan the data collection process we need to employ in order to gather the appropriate data for analysis.

LEARNING OUTCOME(S):

At the end of the lesson, the learner is able to identify and differentiate the different levels of measurement and methods of data collection

LESSON OUTLINE:

1. Motivational Activity

2. Levels of Measurement

3. Data Collection Methods

REFERENCES

Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo Patungan, Nelia Marquez), published by Rex Bookstore.

Handbook of Statistics 1 (1 st and 2 nd Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

Takahashi, S. (2009). The Manga Guide to Statistics. Trend-Pro Co. Ltd.

Workbooks in Statistics 1 (From 1 st to 13 th Edition), Authored by the Faculty of the Institute of Statistics, UP Los Baños, College Laguna 4031

24! !

DEVELOPMENT OF THE LESSON

A. Motivational Activity

Ask the students first if they believe the following statement:

“Students who eat a healthy breakfast will do best on a quiz, students who eat an unhealthy breakfast will get an average performance, and students who do not eat anything for breakfast will do the worst on a quiz”

You could further ask one or more students who have different answers to defend their answers. Then challenge the students to apply a statistical process to investigate on the validity of this statement. You could enumerate on the board the steps in the process to undertake like the following:

1. Plan or design the collection of data to verify the validity of the statement in a way that maximizes information content and minimizes bias;

2. Collect the data as required in the plan;

3. Verify the quality of the data after it was collected;

4. Summarize the information extracted from the data; and

5. Examine the summary statistics so that insight and meaningful information can be produced to support your decision whether to believe or not the given statement.

Let us discuss in detail the first step. In planning or designing the data collection activity, we could consider the set of all the students in the class as our universe. Then let us identify the variables we need to observe or measure to verify the validity of the statement. You may ask the students to participate in the discussion by asking them to identify a question to get the needed data. The following are some possible suggested queries:

1. Do you usually have a breakfast before going to school? (Note: This is answerable by Yes or No)

2. What do you usually have for breakfast? (Note: Possible responses for this question are rice, bread, banana, oatmeal, cereal, etc)

The responses in Questions Numbers 1 and 2 could lead us to identify whether a student in the class had a healthy breakfast, an unhealthy breakfast or no breakfast at all.

25! !

Furthermore, there is a need to determine the performance of the student in a quiz on that day. The score in the quiz could be used to identify the student’s performance as best, average or worst.

As we describe the data collection process to verify the validity of the statement, there is also a need to include the levels of measurement for the variables of interest.

B. Main Lesson:

1. Levels of Measurement

Inform students that there are four levels of measurement of variables: nominal, ordinal, interval and ratio. These are hierarchical in nature and are described as follows:

Nominal level of measurement arises when we have variables that are categorical and non-numeric or where the numbers have no sense of ordering. As an example, consider the numbers on the uniforms of basketball players. Is the player wearing a number 7 a worse player than the player wearing number 10? Maybe, or maybe not, but the number on the uniform does not have anything to do with their performance. The numbers on the uniform merely help identify the basketball player. Other examples of the variables measured at the nominal level include sex, marital status, religious affiliation. For the study on the validity of the statement regarding effect of breakfast on school performance, students who responded Yes to Question Number 1 can be coded 1 while those who responded No, code 0 can be assigned. The numbers used are simply for numerical codes, and cannot be used for ordering and any mathematical computation.

Ordinal level also deals with categorical variables like the nominal level, but in this level ordering is important, that is the values of the variable could be ranked. For the study on the validity of the statement regarding effect of breakfast on school performance, students who had healthy breakfast can be coded 1, those who had unhealthy breakfast as 2 while those who had no breakfast at all as 3. Using the codes the responses could be ranked. Thus, the students who had a healthy breakfast are ranked first while those who had no breakfast at all are ranked last in terms of having a healthy breakfast. The numerical codes here have a meaningful sense of ordering, unlike basketball player uniforms, the numerical codes suggest that one student is having a healthier breakfast than another student. Other examples of the ordinal scale include socio economic status (A to E, where A is wealthy, E is poor), difficulty of questions in an exam (easy, medium difficult), rank in a contest (first place, second place, etc.), and perceptions in Likert scales.

26! !

Note to Teacher: Let us also emphasize to the students that while there is a sense or ordering, there is no zero point in an ordinal scale. In addition, there is no way to find out how much “distance” there is between one category and another. In a scale from 1 to 10, the difference between 7 and 8 may not be the same difference between 1 and 2.

Interval level tells us that one unit differs by a certain amount of degree from another unit. Knowing how much one unit differs from another is an additional property of the interval level on top of having the properties posses by the ordinal level. When measuring temperature in Celsius, a 10 degree difference has the same meaning anywhere along the scale – the difference between 10 and 20 degree Celsius is the same as between 80 and 90 centigrade. But, we cannot say that 80 degrees Celsius is twice as hot as 40 degrees Celsius since there is no true zero, but only an arbitrary zero point. A measurement of 0 degrees Celsius does not reflect a true "lack of temperature." Thus, Celsius scale is in interval level. Other example of a variable measure at the interval is the Intelligence Quotient (IQ) of a person. We can tell not only which person ranks higher in IQ but also how much higher he or she ranks with another, but zero IQ does not mean no intelligence. The students could also be classified or categorized according to their IQ level. Hence, the IQ as measured in the interval level has also the properties of those measured in the ordinal as well as those in the nominal level.

Special Note: Inform also the students that the interval level allows addition and subtraction operations, but it does not possess an absolute zero. Zero is arbitrary as it does not mean the value does not exist. Zero only represents an additional measurement point.

Ratio level also tells us that one unit has so many times as much of the property as does another unit. The ratio level possesses a meaningful (unique and non-arbitrary) absolute, fixed zero point and allows all arithmetic operations. The existence of the zero point is the only difference between ratio and interval level of measurement. Examples of the ratio scale include mass, heights, weights, energy and electric charge. With mass as an example, the difference between 120 grams and 135 grams is 15 grams, and this is the same difference between 380 grams and 395 grams. The level at any given point is constant, and a measurement of 0 reflects a complete lack of mass. Amount of money is also at the ratio level. We can say that 2000 pesos is twice more than 1,000 pesos. In addition, money has a true zero point: if you have zero money, this implies the absence of money. For the study on the validity of the statement regarding effect of breakfast on school performance, the student’s score in the quiz is measured at the ratio level. A score of zero implies that the student did not get a correct answer at all.

In summary, we have the following levels of measurement:

27! !

Level

Property

Basic Empirical Operation

Nomina

   

l

No order, distance, or origin

Determination of equivalence

Ordinal

Has order but no distance or unique origin

Determination of greater or lesser values

Interval

Both with order and distance but no unique origin

Determination of equality of intervals or difference

Ratio

Has order, distance and unique origin

Determination of equality of ratios or means

The levels of measurement depend mainly on the method of measurement, not on the property measured. The weight of primary school students measured in kilograms has a ratio level, but the students can be categorized into overweight, normal, underweight, and in which case, the weight is then measured in an ordinal level. Also, many levels are only interval because their zero point is arbitrarily chosen.

To assess the students understanding of the lesson, you may go back to the set of variables in the data gathering activity done in Lesson 2. You could ask the students to identify the level of measurement for each of the variable. If they did it right, you have the following:

VARIABLE

LEVEL OF MEASUREMENT

Class Student Number

Nominal

Sex

Nominal

Number of Siblings

Ratio

Weight (in kilograms)

Ratio

Height (in centimeters)

Ratio

Age of Mother

Ratio

Usual Daily Allowance in School (in pesos)

Ratio

Usual Daily Food Expenditure in School (in pesos)

Ratio

Usual Number of Text Messages Sent in a Day

Ratio

Usual Sleeping Time

Nominal

Most Preferred Color

Nominal

Happiness Index for the Day

Ordinal

28! !

2.

Methods of Data Collection

Variables were observed or measured using any of the three methods of data collection, namely: objective, subjective and use of existing records. The objective and subjective methods obtained the data directly from the source. The former uses any or combination of the five senses (sense of sight, touch, hearing, taste and smell) to measure the variable while the latter obtains data by getting responses through a questionnaire. The resulting data from these two methods of data collection is referred to as primary data. The data gathered in Lesson 2 are primary data and were obtained using the subjective method.

On the other hand, secondary data are obtained through the use of existing records or data collected by other entities for certain purposes. For example, when we use data gathered by the Philippine Statistics Authority, we are using secondary data and the method we employ to get the data is the use of existing records. Other data sources include administrative records, news articles, internet, and the like. However, we must emphasize to the students that when we use existing data we must be confident of the quality of the data we are using by knowing how the data were gathered. Also, we must remember to request permission and acknowledge the source of the data when using data gathered by other agency or people.

KEY POINTS

Four levels of measurement: Nominal, Ordinal, Interval and Ratio

Knowing what level the variable was measured or observed will guide us to know the type of analysis to apply.

Three methods of data collection include objective, subjective and use of existing records.

Using the data collection method as basis, data can be classified as either primary or secondary data.

ASSESSMENT Note: Answers are provided inside the parentheses and in bold face.

1. Using the data of the teachers in a particular school gathered by a market

researcher company, identify the level of measurement for each of the following variable.

highest educational attainment (ordinal)

predominant hair color (nominal)