BENG 2142 Statistic

Chapter 1: Data Description and Numerical Measures
- Definition of statistics
- Types of variables
- Graphical and numerical methods for describing
qualitative data
- Graphical and numerical methods for describing
quantitative data
- Measures of central tendency: mean, mode and median


1.1 Definition of Statistics

Statistics is the science of collecting, organizing, summarizing
and analyzing information in order to draw conclusions.

There are two types of statistics:
- Descriptive Statistics
- Inferential Statistics


Descriptive Statistics
• consists of organizing and summarizing the
information collected. Descriptive statistics
describes the information collected through
numerical measurements, charts, graphs and tables.

Example:
We are interested in the Differential Equation results of all 200
students (population) from FKE. We may categorize the data according
to grade and analyze the central tendency (mode, median, mean)
and spread (range, quartiles, absolute deviation,
variance and standard deviation).
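
As a minimal sketch of these summaries, the snippet below uses Python's standard `statistics` module on a small list of hypothetical marks (the values are made up for illustration and are not FKE results). Because the list is treated as a whole population, the population forms of variance and standard deviation are used.

```python
import statistics

# Hypothetical marks for five students (illustrative values only).
marks = [55, 60, 60, 70, 85]

# Central tendency
mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)

# Spread (population formulas, since the list is treated as the population)
data_range = max(marks) - min(marks)
variance = statistics.pvariance(marks)
std_dev = statistics.pstdev(marks)

print(mean, median, mode, data_range, variance, round(std_dev, 2))
```

If the marks were a sample rather than the population, `statistics.variance` and `statistics.stdev` would be the matching sample versions.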


Inferential Statistics
• generalizes results obtained from a sample to the
population and measures their reliability.

Example:
We are interested in the Differential Equation results of all
students in Malaysia (population). We may not have access to data for
the whole population, but only to a limited number of observations
that represent it. This limited set of data acts as a sample, which we
use to make generalizations about the population
mean and standard deviation.


1.2 Basic Terms


• Population consists of all items or elements of
interest for a particular decision or investigation.
Example: all married staff over the age of 25 in UTeM.
• Sample is a certain number of elements that have
been chosen from a population. A sample is a
subset of the population.
Example: a list of married staff over the age of
25 in FKE would be a sample from the
population of all married staff over the age of 25
in UTeM.


1.2 Basic Terms (cont.)

• Random sample is a sample drawn in such a
way that each element of the population has a
chance of being selected.
• Simple random sample implies that any
particular sample of a specified sample size
has the same chance of being selected as any
other sample.
• Element / member is a specific subject or
individual about which the information is
collected.
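
A simple random sample can be sketched with Python's standard `random` module. The population here is a hypothetical list of ID numbers 1 to 200, and the sample size of 20 and the seed are arbitrary choices for illustration. `sample` draws without replacement, so every subset of the requested size is equally likely.

```python
import random

# Hypothetical population: ID numbers 1..200 standing in for 200 elements.
population = list(range(1, 201))

rng = random.Random(2142)            # fixed, arbitrary seed so the run is repeatable
sample = rng.sample(population, 20)  # 20 distinct elements, drawn without replacement

print(sorted(sample))
```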


1.2 Basic Terms (cont.)

• Variable is a characteristic of the individual
within the sample or population.

• Observation/Measurement is the value of a
variable for an element.

• Data set is a collection of values of one or
more variables.


Example:
The following table shows the sales of four car
models in year 2000 in Malaysia.

Car model    Number of cars
W            2300
X            1780
Y            1450
Z            1900

- Variables: the column headings
- Elements or members: the car models W, X, Y and Z
- Observations or measurements: the values 2300, 1780, 1450 and 1900


1.2 Basic Terms (cont.)

• Ungrouped data set contains information on each
member of a sample or population.

• Grouped data set is a collection of data which are
grouped in classes.

• Raw data are data recorded in the sequence in which
they are collected, before they are processed or
ranked.


1.2 Basic Terms (cont.)

• Population parameter is a descriptive measure
computed from population data.

• Sample statistic is a descriptive measure
computed from sample data.

• Outliers / Extreme Values are values that are very
small or very large relative to the majority of the
values in a data set.


1.3 Types of Variables


There are two types of variables:

• Qualitative variables allow for classification of
individuals based on some attribute or characteristic.
- Example: the gender of newborn babies, the
marital status of people, types of cars.

• Quantitative variables provide numerical
measures of individuals (measurable).
- Example: the weight of children, the number
of cars owned.


1.3 Types of Variables (cont.)


• Quantitative variables can be further classified
into two groups:

(a) Discrete Variables
- A finite or countable number of possible values.
- Example:
  - The number of heads obtained by flipping a
    coin five times.
  - The number of cars that arrive at
    McDonald’s drive-through between 1.00
    p.m. and 2.00 p.m.


1.3 Types of Variables (cont.)

(b) Continuous Variables
- An infinite number of possible values that are not
countable. They are obtained by measuring and
include fractions and decimals.
- Example:
  - Time spent studying for your first statistics
    exam.
  - The height of volleyball players.


1.4 Graphical and Numerical Methods for Describing Qualitative Data

Frequency Distributions
• A frequency distribution lists the number of
occurrences for each category of data.

Relative Frequency Distributions


• A relative frequency distribution lists the relative
frequency of each category of data.
• To construct a graph (bar graph and pie chart), a data
set can be described by using:
(a) frequency
(b) relative frequency
(c) percentage


1.4 Graphical and Numerical Methods for Describing Qualitative Data (cont.)

(a) Frequency is the number of observations in a
category or in a class.

(b) Relative frequency is the proportion of observations
within a category and is found using the formula:

\[ \text{Relative frequency} = \frac{\text{Frequency}}{\text{Total frequency}} \]

(c) Percentage of measurements in each category:

\[ \text{Percentage} = \text{Relative frequency} \times 100\% \]


Qualitative data can be displayed by using:

Bar graph
• A bar graph is constructed by labelling each category
of data on a horizontal axis and the frequency,
relative frequency or percentages of the category on
the vertical axis. A rectangle of equal width is drawn
for each category. The height of the rectangle is
equal to the category’s frequency, relative frequency
or percentage.

Pie chart
• A pie chart is a circle divided into sectors. Each sector
represents a category of data. The area of each
sector is proportional to the frequency of the
category.


Example:
In a survey concerning public education, 400 school
administrators were asked to rate the quality of education
in Malaysia. Their responses are summarized in the table
below.
Rating Frequency
A 35
B 260
C 93
D 12

(a) Construct a frequency distribution table which
includes values for relative frequency and percentage.
(b) Construct a bar graph and a pie chart.


Solution:

[Figure: bar graph of frequency (vertical axis, 0 to 300) against rating (A, B, C, D)]
[Figure: pie chart of ratings: A 9%, B 65%, C 23%, D 3%]
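
The table asked for in part (a) follows directly from the frequency, relative frequency and percentage definitions. The sketch below computes it from the survey frequencies given above. The pie-chart sector angle (relative frequency times 360 degrees) is an added convenience that the slides do not state explicitly, and the dictionary layout is only one possible way to hold the rows.

```python
# Frequencies from the survey table above.
frequency = {"A": 35, "B": 260, "C": 93, "D": 12}
total = sum(frequency.values())  # 400 administrators

table = {}
for rating, f in frequency.items():
    rel = f / total
    table[rating] = {
        "frequency": f,
        "relative_frequency": rel,
        "percentage": rel * 100,
        "pie_angle_deg": rel * 360,  # sector angle, proportional to frequency
    }

for rating, row in table.items():
    print(rating, row["frequency"], f'{row["relative_frequency"]:.4f}',
          f'{row["percentage"]:.2f}%', f'{row["pie_angle_deg"]:.1f}')
```

Rounded to whole percentages, the results agree with the pie chart labels (9%, 65%, 23%, 3%).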


Example 1.1:
Suppose we ask 50 students in one of the local universities
in Malaysia about their Statistics results for last semester.
The data set is:

C A+ B D C+ C+ F C C C+
A A+ C A A A+ D B+ F C
C+ A D B B A C A+ D D
B+ D C+ B B+ C+ C+ B A B+
F D A+ F C B A C+ B B

(a) Construct a frequency distribution table which includes
values for relative frequency and percentage.
(b) Construct a bar graph and a pie chart.
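
One way to approach part (a) is to tally the raw grades programmatically. The sketch below uses `collections.Counter` from the Python standard library on the 50 grades listed above. The grade ordering in `order` is an assumption made for display purposes.

```python
from collections import Counter

raw = """
C A+ B D C+ C+ F C C C+
A A+ C A A A+ D B+ F C
C+ A D B B A C A+ D D
B+ D C+ B B+ C+ C+ B A B+
F D A+ F C B A C+ B B
"""
grades = raw.split()
counts = Counter(grades)  # frequency of each grade
n = len(grades)           # total frequency

# Assumed display order, from best to worst grade.
order = ["A+", "A", "B+", "B", "C+", "C", "D", "F"]
for g in order:
    f = counts[g]
    print(f"{g:<3}{f:>3}{f / n:>8.2f}{100 * f / n:>8.1f}%")
```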


1.5 Graphical and Numerical Methods for Describing Quantitative Data

• Class is an interval that includes all values that fall
within two numbers, the lower and upper limits.

• The lower limit of the class is the smallest value
within the class.

• The upper limit of the class is the largest value
within the class.

• The class width is the difference between two
consecutive lower class limits.


• Class intervals can be classified into two groups:

(a) Exclusive class interval is a class interval with no gap
between the upper limit of one class and the
lower limit of the next class interval. It takes the
form a ≤ x < b.

(b) Inclusive class interval is a class interval with a gap
between the upper limit of one class and the
lower limit of the next class interval. It takes the
form a ≤ x ≤ b.


• Number of classes can be obtained by dividing the
range by a chosen class width.
• Tally marks are used to count class frequency by
marking a stroke against a class for each data value that
falls in that class.
• The class midpoint is found by adding the lower class
limit and upper class limit of a class and dividing the
result by 2. That is:

\[ \text{Class midpoint} = \frac{\text{lower limit} + \text{upper limit}}{2} \]


• Class boundary is given by the midpoint of the upper
limit of one class and the lower limit of the next class.
• Class width can also be obtained by:
Class width = upper boundary – lower boundary
Example:
Given 121, 137, 140, 122, 139, 128, 134, 138, 125, 133,
127, 124, 135, 137.

Class limits   Class midpoint   Class boundaries   Tally   Frequency
121 - 125      123              120.5 – 125.5      IIII    4
126 - 130      128              125.5 – 130.5      II      2
131 - 135      133              130.5 – 135.5      III     3
136 - 140      138              135.5 – 140.5      IIIII   5
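
The table above can be reproduced with a short sketch that applies the midpoint and boundary definitions to the 14 values. It assumes integer data with inclusive class limits, so each boundary sits 0.5 beyond the corresponding limit. The tuple layout of `rows` is just one convenient choice.

```python
data = [121, 137, 140, 122, 139, 128, 134, 138, 125, 133, 127, 124, 135, 137]
class_limits = [(121, 125), (126, 130), (131, 135), (136, 140)]

rows = []
for lower, upper in class_limits:
    midpoint = (lower + upper) / 2
    # Boundaries lie halfway across the gap of 1 between adjacent classes.
    boundaries = (lower - 0.5, upper + 0.5)
    # Inclusive class: count values with lower <= x <= upper.
    frequency = sum(lower <= x <= upper for x in data)
    rows.append((lower, upper, midpoint, boundaries, frequency))

for row in rows:
    print(row)
```

The class width is 5 either way: the difference between consecutive lower limits (126 - 121) or between the boundaries of one class (125.5 - 120.5).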


• Frequency distribution lists all classes and the
number of values that belong to each of the classes.
• A cumulative frequency distribution gives the total
number of values that fall below the upper boundary
of each class.
Example:

Monthly Earnings (RM)   Number of Employees (f)   Cumulative Frequency
401-600                 9                         9
601-800                 22                        31
801-1000                39                        70
1001-1200               15                        85
1201-1400               9                         94
1401-1600               6                         100
Total                   100
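
The cumulative frequency column is a running total of the class frequencies. A minimal sketch using `itertools.accumulate` from the Python standard library, with the class labels and frequencies copied from the table above:

```python
from itertools import accumulate

classes = ["401-600", "601-800", "801-1000", "1001-1200", "1201-1400", "1401-1600"]
frequencies = [9, 22, 39, 15, 9, 6]

# Running total: each entry is the sum of all frequencies up to that class.
cumulative = list(accumulate(frequencies))

for c, f, cf in zip(classes, frequencies, cumulative):
    print(f"{c:<10}{f:>4}{cf:>5}")
```

The last cumulative frequency always equals the total number of observations, here 100.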


• For graphing grouped data, we may use:

(a) Histogram
A histogram is constructed by drawing rectangles for
each class of data. The height of each rectangle is the
frequency, relative frequency or percentage of the
class. The width of each rectangle should be the same
and the rectangles should touch each other.

(b) Polygon
A polygon is drawn by plotting a point above each
class midpoint on a horizontal axis at a height equal to
the frequency of the class. After the points of each
class are plotted, straight lines are drawn between
consecutive points.


• For graphing a cumulative frequency distribution, we
may use:

Ogive
Ogive is a graph that represents the cumulative
frequency or cumulative relative frequency for the
class. It is constructed by plotting points whose x-
coordinates are the upper class boundary and whose
y-coordinates are the cumulative frequencies or
cumulative relative frequencies. After the points for
each class are plotted, straight lines are drawn
between consecutive points.


Example:
The following data give the monthly expenditures (in
hundred RM) on food for 30 households randomly
selected from the households who incurred such
expenses.

4.57 3.95 6.95 3.80 1.50 3.99 7.84 5.05
8.00 14.75 9.33 1.05 5.08 7.00 9.60 18.99
9.15 11.32 4.75 9.95 3.63 1.99 1.39 13.09
19.31 11.15 7.73 12.00 7.58 16.35


(a) Construct a frequency distribution table, given that the
class width is 3. Use the class intervals 0 − 3,
3 − 6, ... to start.

(b) Calculate the relative frequencies and percentages
for all classes.

(c) Draw a histogram and a polygon for the frequency
distribution.

(d) Draw an ogive for the cumulative frequency
distribution.


Solution:
(a) and (b)

Class   Frequency (f)   Relative Frequency   Percentage (%)
0-3     4               0.1333               13.33
3-6     8               0.2667               26.67
6-9     6               0.2000               20.00
9-12    6               0.2000               20.00
12-15   3               0.1000               10.00
15-18   1               0.0333               3.33
18-21   2               0.0667               6.67
Total   30              1.0000               100.00
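
The frequencies in this table can be checked with a short sketch. It treats the classes as exclusive intervals of the form a ≤ x < b, so a value that falls exactly on a limit (such as 12.00) is counted in the upper class. Using floor division to find the class index is an implementation choice, not something taken from the slides.

```python
expenditures = [
    4.57, 3.95, 6.95, 3.80, 1.50, 3.99, 7.84, 5.05,
    8.00, 14.75, 9.33, 1.05, 5.08, 7.00, 9.60, 18.99,
    9.15, 11.32, 4.75, 9.95, 3.63, 1.99, 1.39, 13.09,
    19.31, 11.15, 7.73, 12.00, 7.58, 16.35,
]
width = 3
n_classes = 7  # 0-3, 3-6, ..., 18-21

frequencies = [0] * n_classes
for x in expenditures:
    # Exclusive classes a <= x < b: the class index is floor(x / width).
    frequencies[int(x // width)] += 1

n = len(expenditures)
for i, f in enumerate(frequencies):
    print(f"{i * width}-{(i + 1) * width}", f, round(f / n, 4), round(100 * f / n, 2))
```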


Solution:
(c) Histogram


Solution:
(c) Polygon

[Figure: frequency polygon; horizontal axis: class midpoint, vertical axis: frequency (0 to 9)]
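
The points plotted in the polygon are the class midpoints paired with the class frequencies. The sketch below computes them for this example. Closing the polygon with zero-frequency points one class width beyond each end is a common convention and is an assumption here, since the slides do not state it.

```python
class_limits = [(0, 3), (3, 6), (6, 9), (9, 12), (12, 15), (15, 18), (18, 21)]
frequencies = [4, 8, 6, 6, 3, 1, 2]
width = 3

# Class midpoint = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
points = list(zip(midpoints, frequencies))

# Assumption: close the polygon with zero-frequency points one class width
# beyond the first and last midpoints.
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]
print(closed)
```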


Solution:
(d)

Class               Cumulative frequency
0 to less than 3    4
0 to less than 6    12
0 to less than 9    18
0 to less than 12   24
0 to less than 15   27
0 to less than 18   28
0 to less than 21   30
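
The ogive points pair each upper class boundary with its cumulative frequency. A minimal sketch for this example follows. Starting the curve at (0, 0), the lower boundary of the first class, is an assumed convention that matches the horizontal axis of the plot.

```python
from itertools import accumulate

upper_boundaries = [3, 6, 9, 12, 15, 18, 21]
frequencies = [4, 8, 6, 6, 3, 1, 2]

cumulative = list(accumulate(frequencies))

# Assumption: start the ogive at the lower boundary of the first class,
# where the cumulative frequency is 0.
ogive_points = [(0, 0)] + list(zip(upper_boundaries, cumulative))
print(ogive_points)
```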


Solution:
(d) Ogive

[Figure: ogive; horizontal axis: upper boundary (0 to 21), vertical axis: cumulative frequency (0 to 35)]


Example 1.2:
The data given below represent the weight in grams of
40 Kit Kat Chocolate bars, in ascending order:

20.5 20.7 20.8 21.0 21.0 21.4 21.5 22.0
22.1 22.5 22.6 22.6 22.7 22.7 22.9 22.9
23.1 23.3 23.4 23.5 23.6 23.6 23.6 23.9
24.1 24.3 24.5 24.5 24.8 24.8 24.9 24.9
25.1 25.1 25.2 25.6 25.8 25.9 26.1 26.7


(a) Construct a frequency distribution table.
Use 20.5 − 21.3, 21.4 − 22.2, ... to start.
(b) Determine the class boundaries and class midpoints.
(c) Find the class width.
(d) Calculate the relative frequencies and percentages
for all classes.
(e) Construct a frequency histogram for the data.
(f) Prepare the cumulative frequency, cumulative
relative frequency and cumulative percentage
distributions.
(g) Construct an ogive for cumulative frequency.
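
As a starting point for part (a), the sketch below counts the class frequencies for the inclusive classes 20.5 − 21.3, 21.4 − 22.2, and so on. It converts the weights to tenths of a gram so that the class arithmetic is done on integers. That conversion is an implementation choice to avoid floating-point edge cases, not part of the exercise.

```python
weights = [
    20.5, 20.7, 20.8, 21.0, 21.0, 21.4, 21.5, 22.0,
    22.1, 22.5, 22.6, 22.6, 22.7, 22.7, 22.9, 22.9,
    23.1, 23.3, 23.4, 23.5, 23.6, 23.6, 23.6, 23.9,
    24.1, 24.3, 24.5, 24.5, 24.8, 24.8, 24.9, 24.9,
    25.1, 25.1, 25.2, 25.6, 25.8, 25.9, 26.1, 26.7,
]

# Work in tenths of a gram (integers) to avoid floating-point edge cases.
tenths = [round(w * 10) for w in weights]
first_lower, width = 205, 9  # 20.5 g, and class width 21.4 - 20.5 = 0.9 g

n_classes = (max(tenths) - first_lower) // width + 1
frequencies = [0] * n_classes
for t in tenths:
    frequencies[(t - first_lower) // width] += 1

for i, f in enumerate(frequencies):
    lower = first_lower + i * width
    upper = lower + width - 1  # inclusive upper limit, in tenths of a gram
    print(f"{lower / 10:.1f}-{upper / 10:.1f}", f)
```

The remaining parts (boundaries, midpoints, relative and cumulative frequencies) follow from these frequencies using the definitions given earlier in the chapter.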
