
# Measures of Central Tendency

The arithmetic mean is the most widely used measure
of location. It requires at least the interval scale. Its
major characteristics are:

All values are used.
It is unique.
The sum of the deviations from the mean is 0.
It is calculated by summing the values and dividing
by the number of values.
It is used for interval-level and ratio-level data.

Measures of central tendency are also known as measures of center or
measures of central location.
Root Mean Square (Quadratic Mean)

$$\mathrm{RMS} = \sqrt{\frac{X_1^2 + X_2^2 + X_3^2 + \cdots + X_N^2}{N}}$$

The root mean square is a statistical measure of the magnitude of a varying quantity. It is especially
useful when the variates take both positive and negative values, e.g., sinusoids. RMS is used in
various fields, including electrical engineering; one of its more prominent
uses is in the field of signal amplifiers.
Harmonic mean
The harmonic mean is one of several kinds of average. Typically, it is appropriate for
situations in which the average of rates is desired.
Geometric mean
The geometric mean is a type of mean or average that indicates the central tendency or
typical value of a set of numbers.

$$GM = \sqrt[N]{X_1 X_2 \cdots X_N}$$
APPLICATIONS
1. The numbers of building permits issued last month to twelve (12) construction
firms in a small city were 4, 7, 0, 7, 11, 4, 1, 15, 3, 5, 8, and 7. Treating the data
as a population, find the mean, median, mode, and midrange.

2. A savings and loan association makes one car loan of Php 765,000 at 10.8%
interest, a second loan of Php 550,000 at 10.5% interest, and a third car loan of
Php 375,000 at 11% interest. What is the average percentage return to the
savings and loan association for these three loans?
3. Given the following data: 2, 3, 6, 7, 7, 8, 9, 9, 9, and 10,
solve for the HM, GM, and QM (RMS).
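
The three applications above can be checked with a short Python sketch. It relies only on the standard `statistics` and `math` modules (`geometric_mean` needs Python 3.8 or later). The variable names are illustrative, and Application 2 is read here as a mean of the interest rates weighted by loan amount, which is an interpretation of the problem rather than something it states outright.

```python
import math
import statistics

# Application 1: building permits, treated as a population
permits = [4, 7, 0, 7, 11, 4, 1, 15, 3, 5, 8, 7]
mean_permits = statistics.mean(permits)
median_permits = statistics.median(permits)
mode_permits = statistics.mode(permits)
midrange_permits = (min(permits) + max(permits)) / 2

# Application 2: interest rates weighted by loan amount (assumed reading)
amounts = [765_000, 550_000, 375_000]
rates = [10.8, 10.5, 11.0]  # percent
weighted_rate = sum(a * r for a, r in zip(amounts, rates)) / sum(amounts)

# Application 3: harmonic, geometric, and quadratic means
data = [2, 3, 6, 7, 7, 8, 9, 9, 9, 10]
hm = statistics.harmonic_mean(data)
gm = statistics.geometric_mean(data)
qm = math.sqrt(sum(x * x for x in data) / len(data))

print("mean, median, mode, midrange:",
      mean_permits, median_permits, mode_permits, midrange_permits)
print(f"weighted average rate: {weighted_rate:.3f}%")
print(f"HM = {hm:.3f}, GM = {gm:.3f}, QM = {qm:.3f}")
```

For any set of positive values the results should satisfy HM ≤ GM ≤ arithmetic mean ≤ QM, which gives a quick sanity check on Application 3.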
MEASURES OF DISPERSION
Measure of how the data is distributed about the mean.
Range is the difference between the largest and
smallest number in the set.

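Mean absolute deviation is the average of the unsigned (absolute) deviations
from the mean:

$$MAD = \frac{\sum |x_i - \bar{x}|}{n}$$

Variance is the average of the squared deviations from the mean:

Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Standard deviation is the square root of the
variance:

Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$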
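
As a rough illustration of these definitions, the sketch below computes each measure by hand and cross-checks the variances against Python's `statistics` module. The building-permit data from the earlier application is reused purely as an example; the variable names are illustrative.

```python
import statistics

data = [4, 7, 0, 7, 11, 4, 1, 15, 3, 5, 8, 7]
n = len(data)
mean = sum(data) / n

# Range: largest value minus smallest value
data_range = max(data) - min(data)

# Mean absolute deviation: average of |x - mean|
mad = sum(abs(x - mean) for x in data) / n

# Variance: population divides by n, sample divides by n - 1
squared_deviations = sum((x - mean) ** 2 for x in data)
population_variance = squared_deviations / n
sample_variance = squared_deviations / (n - 1)

# Standard deviation: square root of the variance
population_sd = population_variance ** 0.5
sample_sd = sample_variance ** 0.5

print("range:", data_range)
print(f"MAD: {mad:.4f}")
print(f"variance (population, sample): {population_variance:.4f}, {sample_variance:.4f}")
print(f"std dev (population, sample): {population_sd:.4f}, {sample_sd:.4f}")
```

The sample variance is always slightly larger than the population variance computed from the same values, because it divides by n - 1 instead of n.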
Coefficient of Variation

$$CV = \frac{\text{standard deviation}}{\text{mean}} \times 100\%$$

Example:
A survey of convenience stores showed that the average number of products
sold is 56, with a standard deviation of 12. The same survey showed that
the average length of time each store is in business is 6 years, with a
standard deviation of 2.5 years. Which is more variable?

The standard deviation measures absolute variability,
not relative variability. A statistic that allows us to compare two
data sets that have different units of measurement is the
coefficient of variation.
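
The convenience-store comparison can be worked through with a few lines of Python. This is a minimal sketch; the helper name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    return std_dev / mean * 100

# Figures taken from the convenience-store survey example
cv_products = coefficient_of_variation(12, 56)   # products sold
cv_years = coefficient_of_variation(2.5, 6)      # years in business

print(f"CV of products sold:     {cv_products:.2f}%")
print(f"CV of years in business: {cv_years:.2f}%")
print("More variable:",
      "years in business" if cv_years > cv_products else "products sold")
```

Although the standard deviation of products sold (12) is numerically larger, the years in business vary more relative to their mean.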
PROBLEM 2

The following data are the measures of the diameters
of 36 rivet heads in 1/100 of an inch.

6.72 6.77 6.82 6.70 6.78 6.70 6.62 6.75
6.66 6.66 6.64 6.76 6.73 6.80 6.72 6.76
6.76 6.68 6.66 6.62 6.72 6.76 6.70 6.78
6.76 6.67 6.70 6.72 6.74 6.81 6.79 6.78
6.66 6.76 6.76 6.72

a. Compute the sample mean, median, and mode.
b. Compute the sample variance.
c. Compute the sample standard deviation.
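
One way to check the answers to Problem 2 is with Python's standard `statistics` module, as sketched below. `statistics.variance` and `statistics.stdev` use the sample (n - 1) divisor, which matches what the problem asks for.

```python
import statistics

# Rivet-head diameters from Problem 2 (36 values)
diameters = [
    6.72, 6.77, 6.82, 6.70, 6.78, 6.70, 6.62, 6.75,
    6.66, 6.66, 6.64, 6.76, 6.73, 6.80, 6.72, 6.76,
    6.76, 6.68, 6.66, 6.62, 6.72, 6.76, 6.70, 6.78,
    6.76, 6.67, 6.70, 6.72, 6.74, 6.81, 6.79, 6.78,
    6.66, 6.76, 6.76, 6.72,
]

# a. Sample mean, median, and mode
sample_mean = statistics.mean(diameters)
sample_median = statistics.median(diameters)
sample_mode = statistics.mode(diameters)

# b. Sample variance (divides by n - 1)
sample_variance = statistics.variance(diameters)

# c. Sample standard deviation
sample_std_dev = statistics.stdev(diameters)

print(f"n = {len(diameters)}")
print(f"mean = {sample_mean:.4f}, median = {sample_median:.4f}, mode = {sample_mode}")
print(f"variance = {sample_variance:.6f}, std dev = {sample_std_dev:.4f}")
```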
Quartile
Divides the data set into 4 groups.

Decile
Divides the data set into 10 groups.

Percentile
Divides the data set into 100 groups.

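Standard Scores (z-scores)
A standard score measures how many standard deviations a value lies from the mean:

$$z = \frac{x - \bar{x}}{s}$$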
PROBLEM 3
In the article "Evaluation of Low-Temperature Properties of HMA
Mixtures" (P. Sebaaly, A. Lake, and J. Epps, Journal of Transportation
Engineering, 2002:578-583), the following values of fracture stress
(in megapascals) were measured for a sample of 24 mixtures of
hot-mixed asphalt (HMA).

30 75 79 80 80 105 126 138 149 179 179 191
223 232 232 236 240 242 245 247 254 274 384 470

Find the first and third quartiles.

Find the 65th percentile of the asphalt data.
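
Several conventions exist for locating quartiles and percentiles, and they can give slightly different answers. The sketch below assumes one common textbook convention: compute the position p(n + 1)/100 in the sorted data; if it is a whole number, take that value, otherwise average the two neighboring values. The helper name `percentile` is illustrative, and the slides may intend a different convention.

```python
def percentile(sorted_values, p):
    """p-th percentile using position p * (n + 1) / 100 (assumed convention).

    If the position is a whole number, return the value at that position;
    otherwise return the average of the two neighboring values.
    """
    n = len(sorted_values)
    position = p * (n + 1) / 100      # 1-based position in the sorted data
    if position < 1:
        return sorted_values[0]
    if position >= n:
        return sorted_values[-1]
    lower = int(position)
    if position == lower:
        return sorted_values[lower - 1]
    return (sorted_values[lower - 1] + sorted_values[lower]) / 2

# Fracture stress data (MPa) from Problem 3, already sorted
stress = [30, 75, 79, 80, 80, 105, 126, 138, 149, 179, 179, 191,
          223, 232, 232, 236, 240, 242, 245, 247, 254, 274, 384, 470]

q1 = percentile(stress, 25)
q3 = percentile(stress, 75)
p65 = percentile(stress, 65)

print("Q1 =", q1)
print("Q3 =", q3)
print("65th percentile =", p65)
```

Library routines such as `statistics.quantiles` interpolate between neighbors instead of averaging them, so their results for the same data may differ somewhat from this sketch.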
PROBLEM 4
Which of the following exam grades has a better relative position?
(a) a grade of 52 on a test with mean = 45 and s = 4
(b) a grade of 78 on a test with mean = 70 and s = 6
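
Problem 4 can be settled by converting each grade to a z-score, as in this small sketch (the function name `z_score` is illustrative):

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / std_dev

z_a = z_score(52, 45, 4)   # grade (a)
z_b = z_score(78, 70, 6)   # grade (b)

print(f"z for (a): {z_a:.2f}")
print(f"z for (b): {z_b:.2f}")
print("Better relative position:", "(a)" if z_a > z_b else "(b)")
```

The grade with the larger z-score sits further above its own class mean, so it has the better relative position.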
Skewness
Skewness is the degree of asymmetry of a
distribution about its mean. It measures how far the
data depart from symmetry. Based on the
computed value of skewness, and allowing a certain degree of
tolerance, the distribution can be interpreted as
symmetric, positively skewed, or negatively skewed.

Kurtosis
Kurtosis is the degree of peakedness exhibited by the
distribution. It is computed from the fourth
moment about the mean.

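Pearsonian Coefficient of Skewness

$$Sk = \frac{3(\bar{x} - \tilde{x})}{s}$$

where $\bar{x}$ is the mean, $\tilde{x}$ is the median, and $s$ is the standard deviation.

Symmetric distribution: Sk = 0

Skewed to the right: Sk > 0

Skewed to the left: Sk < 0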
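
As an illustration, the Pearsonian coefficient can be computed for the ten values used in the earlier HM/GM/QM application. This is a sketch that assumes the sample standard deviation is intended for s; the function name `pearson_skewness` is illustrative.

```python
import statistics

def pearson_skewness(values):
    """Pearsonian coefficient of skewness: 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)   # sample standard deviation (assumption)
    return 3 * (mean - median) / s

data = [2, 3, 6, 7, 7, 8, 9, 9, 9, 10]
sk = pearson_skewness(data)

if sk > 0:
    shape = "skewed to the right"
elif sk < 0:
    shape = "skewed to the left"
else:
    shape = "symmetric"

print(f"Sk = {sk:.4f} ({shape})")
```

Here the mean (7) is below the median (7.5), so the coefficient is negative and the data are skewed to the left.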

leptokurtic: k > 0.263
mesokurtic: k = 0.263
platykurtic: k < 0.263

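Understanding Standard Deviations
(a) Range rule of thumb

Sample: $s \approx \frac{\text{range}}{4}$, with usual values between $\bar{x} - 2s$ and $\bar{x} + 2s$
Population: $\sigma \approx \frac{\text{range}}{4}$, with usual values between $\mu - 2\sigma$ and $\mu + 2\sigma$
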
(b) Chebyshev's Theorem
The proportion or fraction of any set of data lying within k standard
deviations of the mean is always at least

$$1 - \frac{1}{k^2}$$

where k is a positive number greater than 1.
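
To get a feel for the bound, the sketch below evaluates 1 - 1/k² for a few values of k. The function name `chebyshev_lower_bound` is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

Because the theorem holds for any data set, these bounds are conservative: at least 75% of the data lie within 2 standard deviations and at least about 88.9% within 3.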

(c) Empirical rule
- about 68% of all the scores fall within 1 standard deviation of the mean
- about 95% of all the scores fall within 2 standard deviations of the mean
- about 99.7% of all the scores fall within 3 standard deviations of the mean

Applications
1. Given a mean of 100 and a standard deviation of 15, what does the rule say
about the number of scores between 55 and 145? Between 85 and 115?
2. The average of 30 numbers is 60. If two of the largest numbers, 250 and 150,
are removed, what is the average of the remaining numbers?

3. Use Chebyshev's theorem to find what percent of the values will fall
between 10 and 26 for a data set with a mean of 18 and a standard
deviation of 2.

4. Use the Empirical Rule to find the two values between which 99.7% of the data
will fall for a data set with a mean of 0 and a standard deviation
of 12.
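
The four applications above can be worked through as follows. This is a sketch under two assumptions: "the rule" in Application 1 is taken to mean the empirical rule, and the empirical percentages are the usual approximations for bell-shaped data. All names are illustrative.

```python
# Approximate empirical-rule percentages for bell-shaped data
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

def std_devs_from_mean(bound, mean, std_dev):
    """How many standard deviations a bound lies from the mean."""
    return abs(bound - mean) / std_dev

# Application 1: mean 100, standard deviation 15
k_wide = std_devs_from_mean(145, 100, 15)     # interval 55 to 145
k_narrow = std_devs_from_mean(115, 100, 15)   # interval 85 to 115
pct_wide = EMPIRICAL_PERCENT[round(k_wide)]
pct_narrow = EMPIRICAL_PERCENT[round(k_narrow)]

# Application 2: remove 250 and 150 from 30 numbers averaging 60
total = 30 * 60
remaining_mean = (total - 250 - 150) / (30 - 2)

# Application 3: Chebyshev's theorem, interval 10 to 26, mean 18, std dev 2
k = std_devs_from_mean(26, 18, 2)
chebyshev_pct = (1 - 1 / k ** 2) * 100

# Application 4: 99.7% of the data lie within 3 standard deviations
low, high = 0 - 3 * 12, 0 + 3 * 12

print(f"1. about {pct_wide}% between 55 and 145; about {pct_narrow}% between 85 and 115")
print(f"2. new average = {remaining_mean}")
print(f"3. at least {chebyshev_pct}% between 10 and 26")
print(f"4. between {low} and {high}")
```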
BIVARIATE ANALYSIS

Scientists and engineers often collect data in order to determine the
nature of a relationship between two quantities.

For example, a civil engineer may run a tensile strength test several
times in order to study the relationship between the tensile force applied
to a reinforcing bar and the strain induced in it.

Data that consist of ordered pairs are called bivariate data. In many
cases, ordered pairs generated in a scientific experiment tend to cluster
around a straight line when plotted.

The summary statistic most often used to measure the closeness of the
association between two variables is the correlation coefficient.

When two variables are closely related to each other, it is often of
interest to try to predict the value of one of them when given the value of
the other. This is often done with the equation of a line known as the
least-squares line.

It is a mathematical fact that the correlation coefficient is always
between -1 and 1.

Positive values of the correlation coefficient indicate that the least-
squares line has a positive slope, which means that greater values of
one variable are associated with greater values of the other.

Negative values of the correlation coefficient indicate that the least-
squares line has a negative slope, which means that greater values
of one variable are associated with lesser values of the other.

Values of the correlation coefficient close to 1 or to -1 indicate a
strong linear relationship; values close to 0 indicate a weak linear
relationship.
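
These properties can be seen in a small sketch that computes the correlation coefficient from its usual definition, r = Sxy / sqrt(Sxx * Syy). The data points below are hypothetical and chosen only to show the extreme cases; they do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data
xs = [1, 2, 3, 4, 5]
r_rising = pearson_r(xs, [2 * x + 1 for x in xs])    # exact line, positive slope
r_falling = pearson_r(xs, [10 - 3 * x for x in xs])  # exact line, negative slope
r_none = pearson_r(xs, [2, 1, 3, 1, 2])              # no linear trend

print(f"rising line:  r = {r_rising:.3f}")
print(f"falling line: r = {r_falling:.3f}")
print(f"no trend:     r = {r_none:.3f}")
```

Points lying exactly on a rising line give r = 1, points on a falling line give r = -1, and the third set, which has no linear trend, gives r close to 0.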
The Least-Squares Line
When two variables have a linear relationship, the scatterplot tends to be
clustered around a line known as the least-squares line.
PROBLEM 6
In a certain type of metal test specimen, the normal stress on a specimen is
known to be functionally related to the shear resistance. The following is a set
of coded experimental data on the two variables:
Normal Stress: Shear Resistance
26.8 26.5
25.4 27.3
28.9 24.2
23.6 27.1
27.7
23.9
24.7
28.1
26.9
27.4
22.6
25.6

a) Estimate the regression line
b) Estimate the shear resistance for a normal stress of 24.5 kilograms
per square centimeter.
c) Calculate the correlation coefficient r.
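
The steps for Problem 6 can be sketched as below, using the standard least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x). Only the four complete (normal stress, shear resistance) pairs listed above are used, because the remaining rows show a single value each in this copy. The numbers printed are therefore illustrative only and are not the answer for the full data set. The function name `least_squares` is illustrative.

```python
import math

# Only the four complete pairs from the table; the remaining rows are
# incomplete in this copy, so these results are illustrative only.
stress = [26.8, 25.4, 28.9, 23.6]   # normal stress (x)
shear = [26.5, 27.3, 24.2, 27.1]    # shear resistance (y)

def least_squares(xs, ys):
    """Return (slope, intercept, r) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# a) regression line
slope, intercept, r = least_squares(stress, shear)

# b) predicted shear resistance at a normal stress of 24.5
predicted_shear = intercept + slope * 24.5

print(f"a) y = {intercept:.3f} + ({slope:.3f}) x")
print(f"b) predicted shear resistance at 24.5: {predicted_shear:.3f}")
print(f"c) r = {r:.3f}")
```

With the complete data set, the same function can be called unchanged on the full lists of twelve pairs.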