RIZAL TECHNOLOGICAL UNIVERSITY

Cities of Mandaluyong and Pasig

WEEK 6

MODULE 4:
Section 2
Part I. Mathematics as a Tool
Chapter 4. Data Management

A. Gathering, Organizing, Representing, Analyzing and Interpreting Data


B. Measures of Central Tendency
C. Measures of Dispersion
D. Measures of Relative Position
E. Probabilities and Normal Distribution
F. Correlation and Regression

Overview

Data management is the whole process of dealing with data, from the very
beginning of a study through data analysis, which is its last part. It is
divided into three phases. Phase 1 is the preparation for data entry, which
includes the review of questionnaire forms, coding, preparation of master sheets or
spreadsheets, dummy tables and quality control. Phase 2 is data entry, and
phase 3, as mentioned, is data analysis.

Data analysis can be either descriptive or analytic. Descriptive analysis can
be done by three methods: tabular, graphic or numeric. Analytic analysis uses
principles of statistics to test a hypothesis.

In short, data management is all about statistics, both descriptive and
inferential.

According to Florence Nightingale, statistics is “the most important science
in the whole world: for upon it depends the practical application of every other
science and of every art: the one science essential to all political and social
administration, all education, all organization based on experience, for it only
gives results of our experience.”

Prepared by Lhalili

*slide taken from the 2nd generation training on MMW, Mapua Institute of
Technology (2017)

Study Guide

1. Download or read and understand the module for Chapter 4.
2. Watch corresponding video lectures.
3. Attend synchronous classes.
4. Participate in the discussion forum through the class’ private FB group.
5. Answer the assignment.
6. Answer the quiz.
7. Submit group activity.

Learning Outcomes

LO1: Use a variety of statistical tools to process and manage numerical data.

LO2: Identify and define basic concepts of probability.

LO3: Determine the range of probability values and find the probability of an event.

LO4: Calculate areas under the normal curve.

LO5: Obtain specific percentiles of the normal distribution.

LO6: Use the methods of linear regression and correlations to predict the value of a variable
given certain conditions.

LO7: Advocate the use of statistical data in making important decisions.


PART 1 of Module 4

GATHERING, ORGANIZING, REPRESENTING, ANALYZING AND INTERPRETING DATA

Statistics has both a plural and singular sense. In its plural sense, statistics
refers to numerical facts that are systematically collected and analyzed. In its singular
sense, statistics refers to the scientific discipline consisting of theory and methods for
processing numerical information that one can use when making decisions in the face
of uncertainty. The recognition of uncertainty and the importance of statistical
activities are likely to be as old as civilization itself. Even before the art of counting
was perfected, there is evidence to suggest that herdsmen were putting notches on
trees to keep track of their cattle. In its plural and singular sense, the term Statistics
refers to quantities computed from numerical information (Philippine Statistical
Association, 2008).
As such, statisticians are involved with methods of data collection, data
summarization, and data analysis, as well as with communicating the results of
their analyses.

Areas of Statistics
1. Descriptive Statistics are methods concerned with collecting, describing, and
analyzing a set of data without drawing conclusions (or inferences) about a
large group.
2. Inferential Statistics are methods concerned with analyzing a subset of data
(a sample) in order to draw conclusions (or inferences) about the large group
(the population) from which the sample was taken.

Statistical methods have two broad aims: (a) to describe, and (b) to infer. In
the first case, the main task is that of data organization and presentation (without
drawing conclusions or inferences beyond the data). These tools are called
descriptive statistical methods. In the second case, the task is to generalize results
beyond the data collected provided that the data collected is a part (sample) of a large
set of items (population). In this case, the statistical analysis required is inferential
statistical methods.


What are statistics?


1. Statistics include numerical facts and figures.
Examples:
a. Average family income is Php313,000.00 in December 2018.
b. Inflation rate is 2.5% in December 2019.
c. Total fertility rate is 2.7 births per woman in 2017.
d. Employment rate is 95.5% in October 2019.
e. US pharmaceutical company Moderna Inc. claimed that its COVID-19 vaccine
has an efficacy rate of 94.5%.

2. Statistics refers to a range of techniques and procedures for displaying,
analyzing, interpreting and making decisions based on data.
3. Statistics is the language of science and data.
4. Statistics is a science that deals with the collection, presentation, analysis
and interpretation of data.


What statistics are not

1. Statistics is not math. Statistics is a broader way of organizing, interpreting
and communicating information in an objective manner.
2. Statistics is a way of viewing reality as it exists around us in a way that we
otherwise could not.
3. Statistics is how we communicate in science. It serves as a link between a
research idea and usable conclusions.
4. Statistics provides tools that you need in order to react intelligently to
information you hear or read.

Examples

a. 4 out of 5 dentists recommend Colgate.
b. Almost 85% of lung cancers in men and 45% in women are tobacco-related.
c. Condoms are effective 94% of the time.
d. People tend to be more persuasive when they look others in the eye and
speak loudly and quickly.
e. A surprising new study shows that eating egg whites can increase one’s life
span.
f. There is about a 70% chance that, in a room of 30 people, at least two
people will share the same birthday.
g. 79.48% of all statistics are made up on the spot.

Why do we study statistics?


Statistics are often presented in an effort to add credibility to an argument
or to advice. Studying statistics will make you an intelligent consumer of
statistical claims.

Definition of Terms

The universe/physical population is the collection of things or observational units
under consideration.

A variable is a characteristic observed or measured on every unit of the universe.

The statistical population is the set of all possible values of the variable.

Measurement is the process of determining the value or label of the variable based
on what has been observed.

An observation or variate is the realized value of the variable.


Data is the collection of all observations.


Parameters are numerical measures that describe the population or universe of
interest. They are usually denoted by Greek letters: μ (mu), σ (sigma), ρ (rho),
λ (lambda), τ (tau), θ (theta), α (alpha) and β (beta).

Statistics are numerical measures of a sample.

Kinds of Variables and Data

The building blocks of statistical science are data. They come in a diverse
range of formats, and each type gives us a unique kind of information. Data
represent the measured values of variables.

When an observation unit, e.g., a person, a family or a firm, has a characteristic
that may vary from unit to unit, the characteristic is called a variable. Variables are
characteristics or features of the things we are interested in. In Psychology, for
example, people are the subjects of studies, so variables include levels of stress,
anxiety and physical health. We use statistics to understand if and how variables are
related. There is a need to understand the nature of data: what they represent and
where they come from.

Types of Variables

The thing you are trying to explain or predict is most often called the dependent
variable (DV). It depends on others, which are called independent variables (IV). An
independent variable can sometimes be thought of as a cause and a dependent
variable as an effect.


Examples:
1. An experimenter might compare 4 types of antidepressants to determine
the effect of the antidepressant on relief from depression.

IV: antidepressants
DV: relief from depression

2. Can blueberries slow down aging? A study indicates that antioxidants found
in blueberries may slow down the process of aging. In this study, 19-month-old
rats (equivalent to 60-year-old humans) were fed either their standard
diet or a diet supplemented by either blueberry, strawberry or spinach
powder. After 8 weeks, the rats were given memory and motor skills tests.
Although all supplemented rats showed improvement, those supplemented with
blueberry powder showed the most notable improvement.

3. Does beta-carotene protect against cancer? Beta-carotene supplements
have been thought to protect against cancer. However, a study published in
the Journal of the National Cancer Institute suggests this is false. The study
was conducted with 39,000 women aged 45 and up. These women were
randomly assigned to receive a beta-carotene supplement or a placebo, and
their health was studied over their lifetime. Cancer rates for women in the
beta-carotene supplement group did not differ systematically from the
cancer rates of those women taking the placebo.

The number of levels of an independent variable is the number of experimental
conditions.

Variables in General

• Qualitative variables are those that express a qualitative attribute. They
describe the quality or character of something.
• Quantitative variables are those variables that are measured in terms of
numbers. They describe the amount or number of something and can be
classified as either discrete or continuous.

Examples:
Discrete: Number of apples in a box, Number of students who are absent, COVID-19
cases in the Philippines

Continuous: Time to respond to a stimulus, Height of first year college students,
Distance traveled


A. DATA COLLECTION

Data Collection: Population and Samples


Example #1: You were hired by the Commission on Elections to examine how
Filipinos feel about the voting procedures in the Philippines. Who will you ask?

The population of a study is all of the individuals, items or units relevant to the
study. It comprises individuals, groups, organizations, documents, campaigns,
incidents and so on. It is also called the “universe”. Samples are subsets of the
population selected to represent the population.

Inferences from statistics are based on the assumption that the sample is
representative of the population. Otherwise, sampling bias may occur, in which
case the conclusions apply only to the sample and cannot be generalized to
the population.

Example #2: A substitute teacher wants to know how students in the class did on
their last test. The teacher asks the 10 students sitting in the front row to state their
latest test score. He concludes from their report that the class did extremely well.
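
The effect described in Example #2 can be sketched numerically. The scores below are hypothetical values invented only for illustration (they are not data from this module), and the assumption that the front row holds the first 10 scores in the list is also hypothetical. The sketch compares the mean of the front row with the mean of the whole class and with the mean of a simple random sample of the same size.

```python
import random
from statistics import mean

# Hypothetical scores for a class of 30, listed by seat.
# Assumption for illustration: the first 10 scores belong to the front row.
class_scores = [95, 92, 90, 94, 88, 91, 93, 89, 96, 90,
                72, 65, 80, 58, 77, 69, 84, 61, 75, 70,
                55, 62, 48, 73, 66, 59, 81, 52, 68, 64]

front_row = class_scores[:10]                  # convenience sample (front row only)
rng = random.Random(0)                         # fixed seed so the run is repeatable
random_sample = rng.sample(class_scores, 10)   # simple random sample of 10 students

print("Class mean:        ", round(mean(class_scores), 2))
print("Front-row mean:    ", round(mean(front_row), 2))
print("Random-sample mean:", round(mean(random_sample), 2))
```

In this made-up class the front-row mean overstates the class mean, which is the kind of sampling bias the substitute teacher's conclusion suffers from. A random sample is not guaranteed to match the class mean either, but it does not favor one group of seats.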

Levels of Measurement

The measurement process is an integral part of data collection. If the unit of analysis
is an individual person, many characteristics of that person, some visible and others
invisible, can be measured. Visible characteristics include sex, skin color, age,
height, weight, eye color and hair color. Invisible characteristics include intelligence,
prejudice, authoritarianism, alienation, paranoia, love and hate. Measurement is the
assignment of numbers to objects or events according to a predetermined set of
rules. To measure a property means to assign numbers to units as a way of
representing that property.

The kind of analysis that one can perform on the available data critically depends on
its scale of measurement or level of measurement.

1. Nominal (for categorical data) – one simply names or categorizes responses. The
nominal scale is the simplest scale of measurement for variables, where a value or
unit of data is assigned to one of at least two mutually exclusive and exhaustive
categories. Each category is given some name, but no assumptions are made about
relationships among the categories.

Examples of the nominal scale include sex, employment status, race, marital status,
religious affiliation and language spoken at home.

2. Ordinal (for ranked data) – allows comparisons of the degree to which two subjects
possess the dependent variable. It also tells when one unit has more of the property
than does another unit. The ordinal scale specifies the relative position of items with
respect to a given characteristic. The ordinal scale is like the nominal scale in that it
consists of mutually exclusive and exhaustive categories. Ordinal scaling is very
common in the attitude scales used in social research.

Examples of the ordinal scale include employee ratings, salary grade, IQ, score in an
authoritarian personality test, score in a personality/beauty test.

3. Interval (for measurement data) – numerical scales in which intervals have the
same interpretation throughout. It also tells us that one unit differs by a certain
amount of the property from another unit. The interval scale possesses the
properties of the nominal and ordinal scales, with the additional property of equal
intervals between rank-ordered items. Consider the IQ of a person: we can tell not
only which person ranks higher in IQ but also by how many units he or she ranks
higher than another.

Examples of the interval scale include IQ levels, aptitude tests and temperature.
A difference of one unit is the same regardless of its position on the scale.

Note, the interval scale allows addition and subtraction operations, but it does not
possess an absolute zero.

4. Ratio (for measurement data) – an interval scale with the additional property that
its zero position indicates the absence of the quantity being measured. It also tells
us that one unit has so many times as much of the property as does another unit.
The ratio scale possesses an absolute, fixed zero point and allows all arithmetic
operations. The existence of the zero point is the only difference between ratio and
interval measurement.

Examples of the ratio scale include measurements of heights, weights, and ages.

The scale of measurement depends mainly on the method of measurement, not on
the property measured. The weight of a tray of eggs measured in grams has an
interval/ratio scale, but if the trays are labeled small, medium or large, the
weight is then measured on an ordinal scale. Also, many scales are only interval and
cannot be considered ratio because their zero point is arbitrarily chosen.

In summary:

Type of Scale | Characteristics of Scale | Basic Empirical Operation
Nominal | No order, distance, or origin | Determination of equality
Ordinal | Has order but no distance or unique origin | Determination of greater or lesser values
Interval | Both with order and distance but no unique origin | Determination of equality of intervals or difference
Ratio | Has order, distance and unique origin | Determination of equality of ratios


Methods of Collecting Data


1. Interview Method
a. Direct Method
b. Indirect Method
2. Questionnaire Method
3. Observation Method
4. Test Method
5. Registration Method
6. Experimentation Method
7. Mechanical Devices
8. Others
a. Focus Groups
b. Secondary Sources
c. Projective Techniques
d. Field Diaries
e. Narrative Analysis
f. Discourse Analysis
g. Oral History
h. Internet Research
i. Case Studies
j. Content Analysis
k. Visual Methods and Image-based Research
l. Documentary Evidence
m. Semiotics
n. Archival Research

B. DATA PRESENTATION

A large mass of raw data hardly gives information that can help in making decisions
fast. The management of a modern business or of any organization demands fast and
accurate decisions, for the organization can be ruined by competitors who are
knowledgeable in summarizing and interpreting large masses of data.

Definition of Terms
Raw data are data in their original form just as they were collected.
Constants are quantities that do not change under the same condition.
Variables are quantities that change over time and in some location.
Variates are the actual values of the variable.

Methods of Presenting Data


1. Textual
2. Tabular
3. Graphical

Tabular Presentation

Frequency Distribution is an arrangement of data which shows the frequency of
different values or groups of values of a variable.


Components of a Frequency Distribution Table

Class Interval is made up of a lower limit and an upper limit.
Class Frequency (f) or count is the number of observations belonging to a class
interval.
Class Mark (X) is the midpoint of the class interval, obtained by averaging the lower
and upper limits.
Class Boundaries or exact limits are the true limits of the class interval, obtained by
subtracting 0.5 from the lower limit and adding 0.5 to the upper limit (for data
recorded as whole numbers).
Cumulative Frequency (cf) is obtained by adding the class frequencies successively,
either from the lowest class upward (<cf) or from the highest class downward (>cf).
Relative Frequency (rf) is the ratio of the class frequency (f) to the total number of
cases (N).

Array is an arrangement of raw data which may be ascending or descending.

Stem and Leaf Display or a Stem and Leaf Plot is a method of graphically presenting
quantitative data likened to a histogram that helps in the visualization of the shape
of the distribution.
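
As an illustration of the definition above, the following is a minimal Python sketch of a stem and leaf display. It uses the 35 scores from the GE04 test example in this module and assumes two-digit whole-number data, so that the tens digit serves as the stem and the units digit as the leaf. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole-number values by tens digit (stem) and list units digits (leaves)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting first keeps the leaves in ascending order
        stems[v // 10].append(v % 10)
    return dict(stems)

# Scores of students in the GE04 test example of this module.
scores = [10, 20, 50, 23, 21, 12, 13, 15, 24, 25, 26, 23,
          24, 25, 28, 24, 56, 20, 10, 32, 30, 31, 13, 25,
          65, 45, 51, 42, 35, 65, 36, 28, 35, 40, 60]

display = stem_and_leaf(scores)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Each printed row plays the role of one bar of a histogram: the longer the row of leaves, the more scores fall in that range of ten.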

Steps in the Construction of a Frequency Distribution Table

1. Determine the range.
Range = Highest Value – Lowest Value
2. Compute n, the number of groupings.
n = 1 + 3.322 log N
3. Determine the class size, i.
i = R / n
4. Use the lowest score as the lower limit of the lowest class interval and obtain
the upper limit by adding i − 1 to it.
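
The four steps above can be sketched in Python as follows. This is only an illustrative sketch: the function name `class_parameters` is hypothetical, the logarithm is taken as base 10, and n and i are rounded to the nearest whole number, which is an assumption consistent with the worked examples in this module. The sample call uses the lowest score (10), highest score (65) and number of scores (35) of the example that follows.

```python
import math

def class_parameters(lowest, highest, n_obs):
    """Return (range, number of classes, class size, first class interval)."""
    value_range = highest - lowest                      # Step 1: R = highest - lowest
    n_classes = round(1 + 3.322 * math.log10(n_obs))    # Step 2: n = 1 + 3.322 log N
    class_size = round(value_range / n_classes)         # Step 3: i = R / n
    first_interval = (lowest, lowest + class_size - 1)  # Step 4: upper = lower + (i - 1)
    return value_range, n_classes, class_size, first_interval

print(class_parameters(10, 65, 35))   # (55, 6, 9, (10, 18))
```

Because i is rounded, the number of class intervals actually needed to reach the highest value can differ from the computed n, so n is best read as a guide rather than a fixed count.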

Example. Scores of students in a GE04 test

10 20 50 23 21 12 13 15 24 25 26 23
24 25 28 24 56 20 10 32 30 31 13 25
65 45 51 42 35 65 36 28 35 40 60

Solution:

Array of scores
10 10 12 13 13 15 20 20 21 23 23 24
24 24 25 25 25 26 28 28 30 31 32 35
35 36 40 42 45 50 51 56 60 65 65


Stem and Leaf Display

1. Range = 65 – 10 = 55
2. n = 1 + 3.322 log N = 1 + 3.322 log 35 = 6.13 ≈ 6
3. i = R / n = 55 / 6 = 9.17 ≈ 9
4. lower limit of the lowest class interval: 10 (lowest score)
upper limit of the lowest class interval: 10 + (9 − 1) = 18
With a class size of 9, seven class intervals are needed to include the highest
score, 65.

Frequency distribution table of the scores of students in a GE04 test


Class Interval | f | X | Class Boundaries | <cf | >cf | rf | rp (%)
10 – 18 | 6 | 14 | 9.5 – 18.5 | 6 | 35 | 0.1714 | 17.14
19 – 27 | 12 | 23 | 18.5 – 27.5 | 18 | 29 | 0.3429 | 34.29
28 – 36 | 8 | 32 | 27.5 – 36.5 | 26 | 17 | 0.2286 | 22.86
37 – 45 | 3 | 41 | 36.5 – 45.5 | 29 | 9 | 0.0857 | 8.57
46 – 54 | 2 | 50 | 45.5 – 54.5 | 31 | 6 | 0.0571 | 5.71
55 – 63 | 2 | 59 | 54.5 – 63.5 | 33 | 4 | 0.0571 | 5.71
64 – 72 | 2 | 68 | 63.5 – 72.5 | 35 | 2 | 0.0571 | 5.71
N = 35
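
The table above can be reproduced with a short Python sketch. The function name `frequency_table` and the dictionary keys are hypothetical, and the 0.5 adjustment for the class boundaries assumes whole-number scores. The sketch counts the scores in each class interval and derives the class mark, class boundaries, cumulative frequencies and relative frequency from the definitions given earlier.

```python
def frequency_table(data, size):
    """Build one row per class interval, starting from the lowest value."""
    n_obs = len(data)
    rows = []
    lower = min(data)
    less_cf = 0
    while lower <= max(data):
        upper = lower + size - 1
        f = sum(lower <= x <= upper for x in data)    # class frequency
        greater_cf = n_obs - less_cf                  # cases in this class and above
        less_cf += f                                  # cases in this class and below
        rows.append({
            "interval": (lower, upper),
            "f": f,
            "mark": (lower + upper) / 2,              # class mark X
            "boundaries": (lower - 0.5, upper + 0.5), # assumes whole-number data
            "less_cf": less_cf,
            "greater_cf": greater_cf,
            "rf": round(f / n_obs, 4),                # relative frequency
        })
        lower = upper + 1
    return rows

# Scores of students in the GE04 test example.
scores = [10, 20, 50, 23, 21, 12, 13, 15, 24, 25, 26, 23,
          24, 25, 28, 24, 56, 20, 10, 32, 30, 31, 13, 25,
          65, 45, 51, 42, 35, 65, 36, 28, 35, 40, 60]

table = frequency_table(scores, size=9)
for row in table:
    lo, hi = row["interval"]
    print(f"{lo} - {hi}", row["f"], row["mark"], row["boundaries"],
          row["less_cf"], row["greater_cf"], row["rf"])
```

The relative percentage column rp (%) is simply rf multiplied by 100.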

Graphical Presentation
1. Line Graph
2. Bar Graph
3. Circle Graph or Pie Graph
4. Pictograph or Picture Graph
5. Scatter Plots or Scatter gram
6. Statistical maps

Graphical Presentations of Frequency Distribution


1. Frequency Histogram is a special kind of bar graph where the bars are placed
adjacent to each other.
x – axis: lower boundaries
y – axis: f


2. Frequency Polygon is a special kind of line graph obtained by plotting the
frequencies against the class marks (midpoints); it starts and ends on the x – axis.
x – axis: class marks or midpoints
y – axis: f

3. Ogive is the graph of the cumulative frequency.
x – axis: class boundaries (upper boundaries for <cf, lower boundaries for >cf)
y – axis: cf
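
As a sketch of how the points of these graphs can be obtained, the Python code below computes the coordinates of the frequency polygon and of the two ogives from the class frequencies of the GE04 test table (first lower limit 10, class size 9). The function name `graph_points` is hypothetical. Pairing the "less than" ogive with upper class boundaries and the "greater than" ogive with lower class boundaries is a common convention and is an assumption of this sketch, as is the 0.5 adjustment for whole-number data.

```python
from itertools import accumulate

def graph_points(first_lower, size, freqs):
    """Return (polygon, less-than ogive, greater-than ogive) as lists of (x, y) points."""
    k = len(freqs)
    marks = [first_lower + j * size + (size - 1) / 2 for j in range(k)]   # class marks
    bounds = [first_lower - 0.5 + j * size for j in range(k + 1)]         # class boundaries

    # Frequency polygon: add a zero-frequency point one class size before the
    # first class mark and after the last, so the graph starts and ends on the x-axis.
    polygon = [(marks[0] - size, 0)] + list(zip(marks, freqs)) + [(marks[-1] + size, 0)]

    less_cf = [0] + list(accumulate(freqs))        # cases below each boundary
    total = less_cf[-1]
    greater_cf = [total - c for c in less_cf]      # cases above each boundary

    less_ogive = list(zip(bounds, less_cf))
    greater_ogive = list(zip(bounds, greater_cf))
    return polygon, less_ogive, greater_ogive

polygon, less_ogive, greater_ogive = graph_points(10, 9, [6, 12, 8, 3, 2, 2, 2])
print(polygon)
print(less_ogive)
print(greater_ogive)
```

For the histogram, each bar spans two consecutive class boundaries and has a height equal to the class frequency, so the same list of boundaries can be reused.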

Examples


Example

The following scores represent the final examination grade for an elementary
statistics course:

22 60 49 32 57 74 52 70 82 36
80 77 81 95 41 65 92 85 55 76
52 10 64 75 78 25 80 98 81 67
41 71 83 54 64 72 88 62 74 43
60 78 89 76 84 48 84 90 15 79
34 67 17 82 69 74 63 80 85 61

1. Construct a frequency distribution table to include columns for f, X, class
boundaries and cf (< and >).
2. Construct the histogram, frequency polygon and ogive using the data.

Solution:
1. FDT
a. Range = 98 – 10 = 88
b. n = 1 + 3.322 log 60 = 6.9 ≈ 7
c. i = 88 / 7 = 12.57 ≈ 13
d. Upper limit of the lowest class interval: 10 + (13 − 1) = 22

You can make an array of scores using Excel.
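
As an alternative to a spreadsheet, the array and the class frequencies can be obtained with a few lines of Python. This is a sketch only: it uses the 60 grades listed above together with the lowest score (10) and class size (13) from the solution, and the variable names are hypothetical. Each grade is assigned to a class by integer division of its distance from the lowest score by the class size.

```python
from collections import Counter

# Final examination grades from the example.
grades = [22, 60, 49, 32, 57, 74, 52, 70, 82, 36,
          80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
          52, 10, 64, 75, 78, 25, 80, 98, 81, 67,
          41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
          60, 78, 89, 76, 84, 48, 84, 90, 15, 79,
          34, 67, 17, 82, 69, 74, 63, 80, 85, 61]

array = sorted(grades)            # array of scores in ascending order
print(array)

LOWEST, SIZE = 10, 13             # lowest score and class size from the solution
counts = Counter((g - LOWEST) // SIZE for g in grades)   # class index of each grade

n_classes = max(counts) + 1
frequencies = [counts[k] for k in range(n_classes)]
for k, f in enumerate(frequencies):
    lower = LOWEST + k * SIZE
    print(f"{lower} - {lower + SIZE - 1}: {f}")
```

The remaining columns (X, class boundaries and cf) follow from these class intervals and frequencies using the definitions given earlier.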

2. (Figures not reproduced: histogram, frequency polygon and ogive of the final
examination grades.)

Assessment

ASSIGNMENT: The following data give the hours worked last week by employees
of a company.
42 48 42 45 53 34 40 23 21 38
51 40 35 42 31 47 48 34 40 40
16 27 36 39 39 51 41 43 40 36
25 27 57 28 52 45 25 26 39 41
60 52 54 35 27 46 20 22 36 30
25 62 20 52 41 33 65 63 25 26

a) Construct a frequency distribution table. Include in the table f, class marks
or midpoints (X), cf (< & >).
b) Draw the histogram, frequency polygon and ogive using the given data
(you may use graphing paper for the 3 graphs – one graph each).

Date of submission: November 28, 2020 @ 11:59PM via Google Forms, to be sent
as an attached file. All answers should be handwritten and submitted in the form of
image/s.


References
E. M. Adina & R. T. Earnhart. Mathematics in the Modern World Second Generation
Training. Mapua Institute of Technology. 2017

Training Manual on Teaching Basic Statistics. Philippine Statistical Association
(PSA) & Statistical Research & Training Center (SRTC). 2007.

Training on Teaching Basic Statistics for Tertiary Level Teachers Summer 2008
Elementary Statistics: A Handbook of Slide Presentation prepared by Z.V.J. Albacea,
C. E. Reano, R. V. Collado, L. N. Cornia and N. A. Tandang. Institute of Statistics, CAS,
UP Los Baños, Laguna. 2005

Foster, Garett C.; Lane, David; Scott, David; Hebl, Mikki; Guerra, Rudy; Osherson,
Dan; and Zimmer, Heidi, "An Introduction to Psychological Statistics" (2018). Open
Educational Resources Collection.
