STATISTICS
What is Statistics?
Statistics is used almost every day in our lives. It is also very useful in the field of
research.
Statistics is a field of mathematics that deals with the collection, organization,
analysis, and interpretation of quantitative data.
MEAN
The mean ($\bar{x}$), also called the “average”, “arithmetic average”, or “arithmetic
mean”, is the most commonly used measure of central tendency. It is said to be the most
reliable measure of central tendency and to have the least probable error, but it does not supply
information about the homogeneity of the distribution.
UNGROUPED DATA
A. Simple Mean
Getting the simple mean means that we are giving equal weight to each value in the data set.
To compute the mean of ungrouped data, we use the formula:
$\bar{x} = \frac{\sum x}{n}$ or $\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
Where $x$ represents each value/observation, and $n$ represents the total number of observations.
Examples:
1. The ages of five contestants in a Statistics Quiz Bee are the following: 18, 17, 18, 19, and
18. Find their average age.
Solution:
$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
$\bar{x} = \frac{18 + 17 + 18 + 19 + 18}{5}$ Add all the values (ages)
$\bar{x} = \frac{90}{5}$ Then divide the sum by 5
$\bar{x} = 18$ Therefore, the mean age of the contestants is 18.
2. Six employees are working as call center agents. Their salaries are as follows: Php 23500,
Php24300, Php 25800, Php 23900, Php 24100, and Php 24950. What is the average salary of
the employees?
Solution:
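$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
$\bar{x} = \frac{\text{Php } 23500 + \text{Php } 24300 + \text{Php } 25800 + \text{Php } 23900 + \text{Php } 24100 + \text{Php } 24950}{6}$
$\bar{x} = \text{Php } 24425$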
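The two worked examples above can be checked with a few lines of Python. This is only a minimal sketch: the function name `simple_mean` is chosen here for illustration and does not come from the text.

```python
def simple_mean(values):
    """Simple mean: every value gets equal weight (sum of x divided by n)."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Example 1: ages of the five quiz bee contestants
ages = [18, 17, 18, 19, 18]
mean_age = simple_mean(ages)

# Example 2: salaries (in Php) of the six call center agents
salaries = [23500, 24300, 25800, 23900, 24100, 24950]
mean_salary = simple_mean(salaries)

print(mean_age)     # 18.0
print(mean_salary)  # 24425.0
```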
B. Weighted Mean
The weighted mean is a mean calculated by giving some values in a data set more influence
than others, according to some attribute of the data. It is an average in which each quantity to be
averaged has an assigned weight, and these weights determine the relative importance of each
quantity in the average.
The formula for weighted mean is:
$WM = \frac{\sum wx}{\sum w}$
Where w is the weight of each value and x is the matching value.
Examples:
1. You take three 100-point exams in your Mathematics class and score 80, 80 and 95
respectively. The last exam is much easier than the first two, so your professor has given
it less weight. The weights for the three exams are:
Exam 1: 40% of your grade
Exam 2: 40 % of your grade
Exam 3: 20% of your grade
What is your final weighted average for the exam?
Solution:
$WM = \frac{\sum wx}{\sum w}$
$WM = \frac{(0.4)(80) + (0.4)(80) + (0.2)(95)}{0.4 + 0.4 + 0.2}$ Multiply the numbers in your data set by the weights
$WM = \frac{32 + 32 + 19}{0.4 + 0.4 + 0.2}$ Add the numbers
$WM = \frac{83}{1}$ Divide
$WM = 83$
Therefore, your final weighted average for the exam is 83.
2. At the end of the first semester, Mary received her grades as shown on the table below
with their respective weight or units. What is her general weighted average for the first
semester?
SUBJECTS Grades Units
Mathematics 2.0 3.0
English 1.75 3.0
Filipino 2 3.0
Science 2.5 3.0
Religion 1.25 3.0
Solution:
$WM = \frac{(2.0)(3) + (1.75)(3) + (2.0)(3) + (2.5)(3) + (1.25)(3)}{3 + 3 + 3 + 3 + 3}$
$WM = \frac{6 + 5.25 + 6 + 7.5 + 3.75}{3 + 3 + 3 + 3 + 3}$
$WM = \frac{28.5}{15}$
$WM = 1.90$
Therefore, the GWA of Mary is 1.90.
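As a check on both weighted-mean examples, the formula can be sketched in Python as follows. The helper name `weighted_mean` is an assumption made for this illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w*x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Example 1: three exams weighted 40%, 40%, and 20%
exam_average = weighted_mean([80, 80, 95], [0.4, 0.4, 0.2])

# Example 2: Mary's grades, each subject carrying 3 units
gwa = weighted_mean([2.0, 1.75, 2.0, 2.5, 1.25], [3, 3, 3, 3, 3])

print(round(exam_average, 2))  # 83.0
print(round(gwa, 2))           # 1.9
```

Because every subject in the second example carries the same number of units, the weighted mean there equals the simple mean of the five grades.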
MEDIAN
A median (Md) is defined as the middle value or observation in an organized list of
numbers and falls in the middle-most position of the whole data.
UNGROUPED DATA
The median of ungrouped data is determined by first arranging the values
in order from lowest to highest or vice versa. If there is an odd number of
values, the median is the middlemost value, with the same number of
values below and above it. If there is an even number of values in the list, the
middle pair must be identified, added together, and divided by two to find the
median. The median can be used as an approximate average.
Examples:
1. The number of books loaned from the library during each day of the week were 36, 31,
24, 45, 50. What is the median?
Solution:
a. First, arrange the data in ascending or descending order.
24, 31, 36, 45, 50
b. Next divide the data set into two equal parts.
24, 31, 36, 45, 50
Since there is an odd number of values in the data, we take the middle most
number/value which is 36 as the median of the data set.
Therefore, the median of the given data is 36.
2. The typing speeds per minute of ten stenographers are as follows:
Stenographer 1 2 3 4 5 6 7 8 9 10
Speed 121 110 120 119 112 121 118 115 107 115
Determine the median of the speed of the stenographers.
Solution:
a. First, arrange the data in ascending or descending order.
107, 110, 112, 115, 115, 118, 119, 120, 121, 121
b. Next divide the data set into two equal parts.
107, 110, 112, 115, 115, 118, 119, 120, 121, 121
Since there is an even number of values in the data, we take the two middle most
numbers/values which are 115 and 118.
c. Get the average of the two values.
115 + 118 = 233
233/2 = 116.5
Therefore, the median typing speed of the stenographers is 116.5 per minute.
This implies that the stenographers whose speed is above the median are faster, while
those whose speed lies below the median are a little slower than the others.
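The sort-then-pick-the-middle procedure described above can be sketched in Python like this. The function name `median_ungrouped` is hypothetical, and the data are the two examples from the text.

```python
def median_ungrouped(values):
    """Median of raw data: middle value, or the average of the middle pair."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)          # step a: arrange in ascending order
    n = len(ordered)
    mid = n // 2                      # step b: split the list in two
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count: step c, average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


books = [36, 31, 24, 45, 50]
speeds = [121, 110, 120, 119, 112, 121, 118, 115, 107, 115]

print(median_ungrouped(books))   # 36
print(median_ungrouped(speeds))  # 116.5
```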
MODE (Mo)
The mode is the number/value/observation that appears the most times in a data set.
If no number in the list is repeated, then the list has no mode. However, it is also
possible to have more than one mode for the same distribution of data (bi-modal,
tri-modal, or multi-modal).
UNGROUPED DATA
To find the mode of an ungrouped data, find the frequency of each
number/value/observation in the given data set. Then, choose the
number/value/observation having the highest frequency as the mode.
MODE=number/value/observation with the highest frequency
Examples:
1. Find the mode of the given data set: 15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33,
27, 25, 22, 30.
Solution:
First, arrange the data set in ascending or descending order.
15, 22, 22, 22, 25, 25, 27, 28, 30, 33, 34, 39, 43, 43, 44, 48, 49
Next, determine the number that appeared the most number of times.
In the given data set, the number that appeared the most number of times is
22.
2. The typing speeds per minute of ten stenographers are as follows:
Stenographer 1 2 3 4 5 6 7 8 9 10
Speed 121 110 120 119 112 121 118 115 107 115
Both 115 and 121 appear twice, more often than any other value.
Thus, the data set has two modes: 115 and 121. The data set is said to be bi-modal.
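A small Python sketch of the frequency-counting rule is given below. It follows the text's convention that a list with no repeated value has no mode, and the function name `modes` is chosen only for this illustration.

```python
from collections import Counter


def modes(values):
    """Return every value with the highest frequency (empty list if none repeats)."""
    if not values:
        return []
    counts = Counter(values)              # frequency of each value
    highest = max(counts.values())
    if highest == 1:                      # nothing repeats, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


data = [15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33, 27, 25, 22, 30]
speeds = [121, 110, 120, 119, 112, 121, 118, 115, 107, 115]

print(modes(data))       # [22]
print(modes(speeds))     # [115, 121]
print(modes([1, 2, 3]))  # []
```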
GROUPED DATA
17 25 30 33 25 45 19 23
27 35 45 48 20 38 18 39
44 22 46 26 36 29 21 15
50 47 34 26 37 25 49 33
22 33 44 38 46 41 32 37
Compute the mean, median and mode.
MEAN
$\bar{x} = \frac{\sum fX}{n}$
Where:
f = frequency
X = Class mark or midpoint
n = total number of observations
Frequency Distribution Table
a. Compute for the range. Range is the difference of the highest value and the lowest
value.
R=highest value - lowest value
R= 50-15
R=35
b. Class Interval: $ci = \frac{\text{Range}}{k}$
$k = 1 + 3.3\log(n)$
$k = 1 + 3.3\log(40)$
$k = 1 + (3.3)(1.602059991)$
$k = 1 + 5.286797971$
$k = 6.286797971$
$k \approx 6$
$ci = \frac{35}{6}$
$ci = 5.8333 \approx 6$
FREQUENCY DISTRIBUTION TABLE
Class Interval Tally Frequency (f) Midpoint (X) fX
45-50 IIIII-III 8 47.5 380
39-44 IIII 4 41.5 166
33-38 IIIII-IIIII 10 35.5 355
27-32 IIII 4 29.5 118
21-26 IIIII-IIII 9 23.5 211.5
15-20 IIIII 5 17.5 87.5
TOTAL n=40 $\sum fX = 1318$
c. Midpoint $= \frac{45 + 50}{2} = 47.5$
$\bar{x} = \frac{\sum fX}{n}$
$\bar{x} = \frac{1318}{40}$
$\bar{x} = 32.95$
Thus, the mean of the grouped data is 32.95
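The steps above (range, number of classes, class interval, frequency table, grouped mean) can be reproduced with the following Python sketch. It assumes a base-10 logarithm in the formula for k, rounds the class interval up so that the classes cover the whole range (the text simply rounds 5.8333 to 6), and starts the first class at the lowest value. The variable names are illustrative only.

```python
import math

scores = [
    17, 25, 30, 33, 25, 45, 19, 23,
    27, 35, 45, 48, 20, 38, 18, 39,
    44, 22, 46, 26, 36, 29, 21, 15,
    50, 47, 34, 26, 37, 25, 49, 33,
    22, 33, 44, 38, 46, 41, 32, 37,
]

n = len(scores)
data_range = max(scores) - min(scores)   # a. R = 50 - 15 = 35
k = round(1 + 3.3 * math.log10(n))       # b. k = 6.2868, rounded to 6 classes
width = math.ceil(data_range / k)        #    ci = 5.8333, taken as 6

# Build the rows (lower limit, upper limit, f, midpoint X, fX), lowest class first
rows = []
lower = min(scores)
while lower <= max(scores):
    upper = lower + width - 1
    f = sum(lower <= s <= upper for s in scores)
    midpoint = (lower + upper) / 2       # c. midpoint of the class
    rows.append((lower, upper, f, midpoint, f * midpoint))
    lower += width

total_fx = sum(row[4] for row in rows)
grouped_mean = total_fx / n

for lo, hi, f, x, fx in reversed(rows):  # highest class first, as in the table
    print(f"{lo}-{hi}  f={f}  X={x}  fX={fx}")
print(total_fx, grouped_mean)            # 1318.0 32.95
```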
MEDIAN
$Md = LBMC + \left( \frac{\frac{n}{2} - {<}CF_b}{f_m} \right) ci$
Where:
n/2 = value used to locate the median class, where n is the total number of observations
LBMC = lower boundary of the median class
<CFb = cumulative frequency of the class immediately below the median class
fm = frequency of the median class
ci = class interval
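Look for the value of $\frac{n}{2}$ in the <CF column (it must fall within one of the <CF values). The interval
corresponding to this <CF value is the median class.
$\frac{n}{2} = \frac{40}{2} = 20$
$Md = LBMC + \left( \frac{\frac{n}{2} - {<}CF_b}{f_m} \right) ci$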
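A minimal Python sketch of the grouped-median formula follows, applied to the frequency distribution table built earlier. It assumes the usual convention for whole-number data that the lower class boundary is the lower class limit minus 0.5, and that the classes are listed from lowest to highest. The function name is hypothetical, and the printed value is simply what this sketch produces under those assumptions.

```python
# (lower limit, upper limit, frequency), lowest class first
classes = [
    (15, 20, 5),
    (21, 26, 9),
    (27, 32, 4),
    (33, 38, 10),
    (39, 44, 4),
    (45, 50, 8),
]


def grouped_median(classes):
    """Md = LBMC + ((n/2 - <CFb) / fm) * ci for a frequency table."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("the frequency table is empty")
    half = n / 2
    cf_below = 0                           # <CFb: cumulative frequency so far
    for lower, upper, f in classes:
        if cf_below + f >= half:           # first class whose <CF reaches n/2
            lbmc = lower - 0.5             # assumed lower class boundary
            ci = upper - lower + 1         # class interval (width)
            return lbmc + (half - cf_below) / f * ci
        cf_below += f
    raise ValueError("no median class found")


md = grouped_median(classes)
print(round(md, 2))  # 33.7
```

With n/2 = 20, the median class in this sketch is 33-38, since the cumulative frequency first reaches 20 there (<CF = 28, with 18 below it).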
B.) Consider the frequency distribution below. Determine the mean, median, and mode of
the grouped data.
The heights of 40 grade 6 pupils in a certain grade school are presented in a
frequency distribution as shown below:
Height of a class of 40 students
Class Interval Frequency
78-82 2
73-77 6
68-72 6
63-67 8
58-62 7
53-57 7
48-52 4
n = 40
Measures of Variability
Variability refers to the spread of scores in a set of data.
The measures of variability describe numerically the extent to which individual
observations in a data set are scattered about the average.
There are four measures of variability: the range, the mean absolute deviation,
variance, and standard deviation.
X = Midpoint
f = frequency
$\bar{x}$ = mean score of grouped data
n = number of observations
3. Variance is the average of the squared deviation of the set of observations from the
mean. It measures how far a data set is spread out. A variance of zero indicates that all of
the data values are identical. A small variance indicates that the data points tend to be
very close to the mean, and to each other. A high variance indicates that the data points
are very spread out from the mean, and from one another.
Ungrouped Data
Sample Variance: $S^2_{n-1} = \frac{\sum (x - \bar{x})^2}{n - 1}$
Where $S^2_{n-1}$ = sample variance
$x$ = raw score
$\bar{x}$ = sample mean
$n$ = number of observations
Population Variance: $\sigma^2_N = \frac{\sum (x - \mu)^2}{N}$
Where $\sigma^2_N$ = population variance
$x$ = raw score
$\mu$ = population mean
$N$ = number of observations
Grouped Data
Sample Variance: $S^2_{n-1} = \frac{n \sum fX^2 - (\sum fX)^2}{n(n - 1)}$
Population Variance: $\sigma^2_N = \frac{\sum f (X - \mu)^2}{N}$
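To illustrate the two grouped-data formulas, the sketch below applies them to the midpoints and frequencies of the frequency distribution table built earlier for the 40 scores. Reusing that table is a choice made here for illustration; the text does not work this particular computation, so the printed values are only what the sketch yields.

```python
# (midpoint X, frequency f) from the earlier frequency distribution table
table = [(17.5, 5), (23.5, 9), (29.5, 4), (35.5, 10), (41.5, 4), (47.5, 8)]

n = sum(f for _, f in table)                    # total number of observations
sum_fx = sum(f * x for x, f in table)           # sum of fX
sum_fx2 = sum(f * x ** 2 for x, f in table)     # sum of fX^2

# Sample variance: (n * sum(fX^2) - (sum(fX))^2) / (n * (n - 1))
sample_variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

# Population variance: sum(f * (X - mu)^2) / N, treating the 40 scores as a population
mu = sum_fx / n
population_variance = sum(f * (x - mu) ** 2 for x, f in table) / n

print(round(sample_variance, 2))      # 105.02
print(round(population_variance, 2))  # 102.4
```

The sample variance is slightly larger than the population variance because it divides by n - 1 instead of N.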
4. Standard Deviation is a measure of the dispersion of a set of data from its mean. It is
determined by calculating the positive square root of the variance. A large standard
deviation indicates that the data points are far from the mean (heterogeneous), and a
small standard deviation indicates that they are clustered closely around the mean
(homogeneous).
Ungrouped and Grouped Data
Sample Standard Deviation: $S_{n-1} = \sqrt{S^2_{n-1}}$
A group of mountaineers went hiking to Mt. Pulag, Philippines to study the different
species of plants existing in that area. The ages of the mountaineers are 34, 35, 45, 46, 49,
32. Find the MAD, variance and standard deviation of their ages.
Solution:
A) MAD
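Mean Age: $\bar{x} = \frac{34 + 35 + 45 + 46 + 49 + 32}{6} = 40.17$
$x$ | $x - \bar{x}$ | $|x - \bar{x}|$
34 | -6.17 | 6.17
35 | -5.17 | 5.17
45 | 4.83 | 4.83
46 | 5.83 | 5.83
49 | 8.83 | 8.83
32 | -8.17 | 8.17
$\sum |x - \bar{x}| = 39$
$MAD = \frac{\sum |x - \bar{x}|}{n} = \frac{39}{6} = 6.5 \approx 7$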
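B) Variance
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$
34 | -6.17 | 38.0689
35 | -5.17 | 26.7289
45 | 4.83 | 23.3289
46 | 5.83 | 33.9889
49 | 8.83 | 77.9689
32 | -8.17 | 66.7489
n=6, $\sum (x - \bar{x})^2 = 266.8334$
$S^2_{n-1} = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{266.8334}{6 - 1} = \frac{266.8334}{5} = 53.36668 \approx 53.37$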
C) Standard Deviation
$S_{n-1} = \sqrt{S^2_{n-1}} = \sqrt{53.37} \approx 7.31$
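The three results for the mountaineers' ages can be checked with the short Python sketch below. It keeps the mean at full precision instead of rounding it to 40.17, so the sum of squared deviations comes out as 266.8333 instead of 266.8334; the rounded answers agree with the hand computation. The variable names are illustrative.

```python
import math

ages = [34, 35, 45, 46, 49, 32]
n = len(ages)
mean_age = sum(ages) / n                                  # 241 / 6

# A) Mean absolute deviation: average distance from the mean
mad = sum(abs(x - mean_age) for x in ages) / n

# B) Sample variance: squared deviations divided by n - 1
sample_variance = sum((x - mean_age) ** 2 for x in ages) / (n - 1)

# C) Sample standard deviation: positive square root of the variance
sample_sd = math.sqrt(sample_variance)

print(round(mean_age, 2))         # 40.17
print(round(mad, 2))              # 6.5
print(round(sample_variance, 2))  # 53.37
print(round(sample_sd, 2))        # 7.31
```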
$\bar{x} = \frac{\sum fX}{n} = \frac{1212}{40} = 30.3$
$MAD = \frac{\sum f |X - \bar{x}|}{n}$
$MAD = \frac{384}{40} = 9.6$
Therefore, the mean absolute deviation of the scores of the students is 9.6
$S^2_{n-1} = \frac{n \sum fX^2 - (\sum fX)^2}{n(n - 1)}$
Since 2i is less than the standard deviation, it only shows that the data is
heterogeneous.
Exercise:
Find the MAD, sample variance, and standard deviation of the grouped data below
and determine the homogeneity of the group.
Weight of 50 women in a fitness club
Class Interval Frequency
129-136 2
121-128 7
113-120 6
105-112 5
97-104 10
89-96 12
81-88 8
n = 50