Data Preparation
Univariate Analysis
Bivariate Analysis
Multivariate Analysis
Range
Mean Deviation
Examples
Variance
Standard Deviation
Example
Interpretation
Coefficient of Variation
Frequency Distribution
Distributions
Frequency Distributions
A frequency distribution describes the number of times the various attributes of a
variable are observed in a sample.
It counts the number of responses to a question, or the number of occurrences of a
phenomenon of interest.
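As a rough illustration, a frequency distribution can be tallied in Python with `collections.Counter`. The responses below are made-up values chosen for the example, not data from these slides.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not from the slides)
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

# Count how many times each attribute of the variable is observed
frequency = Counter(responses)
total = sum(frequency.values())

for value, count in frequency.most_common():
    # Report each count together with its share of all responses
    print(f"{value}: n = {count} ({100 * count / total:.1f}%)")
```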
Distribution (cont)
Central Tendency
Measurement of Dispersion
Range
Percentile range
Quartile deviation
Mean deviation
Variance and standard deviation
Coefficient of variation
Coefficient of mean deviation
Coefficient of range
Coefficient of quartile deviation
Range = 13 - 1 = 12
Interquartile Range
Measures the range of the middle 50% of the values only
Is defined as the difference between the upper and lower quartiles
Interquartile range = upper quartile - lower quartile = Q3 - Q1
Mean deviation
The mean of the absolute deviations
Mean deviation = $\frac{\sum |x - \bar{x}|}{n}$
Find $|x - \bar{x}|$ for each x, sum these absolute deviations, and divide by n
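A minimal sketch of the mean deviation in Python, assuming the deviations are taken about the arithmetic mean. The function name `mean_deviation` and the data are illustrative, not from the slides.

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # Absolute deviation |x - mean| for each x, then average them
    return sum(abs(x - mean) for x in values) / n


data = [2, 4, 6, 8, 10]  # illustrative values; their mean is 6
md = mean_deviation(data)
print(md)  # 2.4
```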
Standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
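To calculate the standard deviation:
1. Calculate the mean $\bar{x}$
2. Find $x - \bar{x}$ for each x
3. Square each deviation: $(x - \bar{x})^2$
4. Sum the squared deviations: $\sum (x - \bar{x})^2$
5. Divide the sum by n - 1
6. Take the square root of the quantity in Step 5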
[Table: frequency distribution; f = 85, 192, 123; Total = 400]
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Coefficient of variation
Is a measure of relative variability used to:
measure changes that have occurred in a population over time
compare variability of two populations that are expressed in different units of measurement
Is expressed as a percentage rather than in terms of the units of the particular data
$V = \frac{s}{\bar{x}} \times 100\%$
where s = the standard deviation of the sample and $\bar{x}$ = the mean of the sample
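The sketch below shows one way the coefficient of variation could be used to compare two variables measured in different units. It assumes sample data (so `statistics.stdev`, which divides by n - 1, is used); the heights and weights are invented for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    s = statistics.stdev(values)      # sample standard deviation (n - 1)
    mean = statistics.mean(values)
    return s / mean * 100


# Hypothetical measurements in different units (not from the slides)
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 95]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# Both results are percentages, so they can be compared directly
print(f"Heights: V = {cv_heights:.2f}%")
print(f"Weights: V = {cv_weights:.2f}%")
```

Because both values are unit-free percentages, the larger one indicates the more variable data set relative to its own mean.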
Summary
Measures of central tendency
The range
Simply the difference between the largest and
smallest values in a set of data
Useful for: daily temperature fluctuations or share
price movement
Is considered primitive as it considers only the
extreme values which may not be useful indicators of
the bulk of the population.
The formula is: Range = largest value - smallest value
Standard deviation (population): $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
The square of the population standard deviation is called the variance.
Variance = $\sigma^2$
2002 McGraw-Hill Australia,
PPTs t/a Introductory
Mathematics & Statistics for
Business 4e by John S.
Croucher
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Where: n is the number of observations in the sample
Measures of Dispersion:
Why The Range Can Be Misleading
Ignores the way in which data are distributed
[Figure: two data sets, each running from 7 to 12 but distributed differently; in both cases Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
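The effect of a single outlier on the range can be checked with a few lines of Python. The two lists reproduce the data sets above; the helper name `value_range` is just an illustrative choice.

```python
def value_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


# The two data sets from the slide: identical except for the last value
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(value_range(base))          # 4
print(value_range(with_outlier))  # 119
```

Changing one observation out of 25 moves the range from 4 to 119, even though the bulk of the data is unchanged.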
Quartile Measures
Quartiles split the ranked data into 4 segments with an
equal number of values per segment
25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% of the observations are
smaller and 50% are larger)
Only 25% of the observations are greater than the third
quartile
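As a sketch, quartiles and the interquartile range can be computed with `statistics.quantiles` from the Python standard library. Note that several conventions exist for locating quartiles, so other tools or textbooks may give slightly different values for Q1 and Q3; the data here are illustrative.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]  # illustrative values

# n=4 requests the three cut points that split the ranked data into quarters.
# The default method is "exclusive"; other conventions can give other values.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # range of the middle 50% of the values

print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}, IQR = {iqr}")
```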
Measures of Dispersion:
The Variance
Average (approximately) of squared deviations of values from the mean
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Where
$\bar{X}$ = arithmetic mean
n = sample size
$X_i$ = ith value of the variable X
Measures of Dispersion:
The Standard Deviation
Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Measures of Dispersion:
Sample Standard Deviation:
Calculation Example
Sample Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8
Mean = $\bar{X}$ = 16
$S = \sqrt{\frac{130}{7}} = 4.3095$
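The calculation example can be reproduced step by step in Python. This is a plain sketch of the sample formula (dividing by n - 1), with variable names chosen for illustration.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]  # sample data from the example

n = len(data)                                          # 8
mean = sum(data) / n                                   # 16.0
squared_deviations = [(x - mean) ** 2 for x in data]   # (X_i - mean)^2
sum_of_squares = sum(squared_deviations)               # 130.0
sample_variance = sum_of_squares / (n - 1)             # 130 / 7
s = math.sqrt(sample_variance)

print(round(s, 4))  # 4.3095
```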
Summary of Measures
Range: $X_{largest} - X_{smallest}$ (total spread)
Standard Deviation (Sample): $\sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$ (dispersion about sample mean)
Standard Deviation (Population): $\sqrt{\frac{\sum (X_i - \mu)^2}{N}}$ (dispersion about population mean)
Variance (Sample): $\frac{\sum (X_i - \bar{X})^2}{n - 1}$ (squared dispersion about sample mean)
Data: 2, 4, 6, 8, 10, 12, 14, 20, 30, 60
5 = 25th percentile
25 = 75th percentile
Variance
Variance is defined as the average of the squared deviations:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
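To see how the population formula (divide by N) differs from the sample formula (divide by n - 1), the standard library offers `statistics.pvariance` and `statistics.variance`. The sketch below treats the same eight illustrative values first as a whole population and then as a sample; which reading is appropriate depends on the study, so this is an assumption made only for the comparison.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]  # illustrative values

# Population variance: sum of squared deviations divided by N
population_variance = statistics.pvariance(data)

# Sample variance: sum of squared deviations divided by n - 1
sample_variance = statistics.variance(data)

print(population_variance)          # 16.25
print(round(sample_variance, 4))    # 18.5714
```

The sample variance is always the larger of the two for the same data, because the same sum of squares is divided by a smaller number.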
Subgroup Comparison
Bivariate Analysis
Percentaging a Table
Multivariate Analysis
Conclusion
DESCRIPTIVE STATISTICS
Summarise and organise data.
Measures of central tendency
Mean: the average (sum of scores / number of scores).
Mode: the most common value (the typical value).
Median: the middle value.
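For illustration, the three measures of central tendency can be obtained from Python's `statistics` module. The scores are invented for the example.

```python
import statistics

scores = [3, 5, 5, 6, 8, 9, 13]  # hypothetical scores, already ranked

mean = statistics.mean(scores)      # sum of scores / number of scores
mode = statistics.mode(scores)      # most common value
median = statistics.median(scores)  # middle value of the ranked scores

print(mean, mode, median)  # 7 5 6
```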
Always provide a table of results.
Response    n     %
(label not recoverable)    17    25.0
2           29    42.7
3           22    32.3
(n = number of responses)
INFERENTIAL STATISTICS
Allow you to make inferences from data.
Uses at least 2 variables.
What effect does the independent variable have on the
dependent variable? Causality: is A caused by B?
TYPES OF TEST
1. Parametric tests. These tests use interval or ratio data (see
Chapter 6 for a reminder). Parametric tests assume that the
data are drawn from a normally distributed population (i.e. the
data are not skewed) and that the variables being measured
have the same variance (or spread).
2. Non-parametric tests. These are used with ordinal or
nominal data, and do not make any assumptions about the
characteristics of the sample in terms of its distribution.
TESTS OF ASSOCIATION
CORRELATION
Correlations investigate the relationship between two variables
consisting of interval or ratio data.
A correlation can indicate:
Whether there is a relationship between the two variables.
The direction of the relationship, i.e. whether it is positive or
negative.
The strength, or magnitude of the relationship.
R = 1: strong (perfect) positive correlation
R = -1: strong (perfect) negative correlation
R = 0: no correlation
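A minimal sketch of how a correlation coefficient could be computed, assuming Pearson's product-moment formula (the slides do not give the formula, so this is the standard textbook definition). The function name `pearson_r` and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the two means
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]          # hypothetical interval data
rising = [2, 4, 6, 8, 10]    # increases exactly in step with x
falling = [10, 8, 6, 4, 2]   # decreases exactly in step with x

r_positive = pearson_r(x, rising)
r_negative = pearson_r(x, falling)
print(r_positive, r_negative)
```

The sign of the result gives the direction of the relationship and its absolute value gives the strength.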
TESTING DIFFERENCES
Tests of difference generally assess whether differences between
two samples are likely to have occurred by chance, or whether
they are the result of the effect of a particular variable.
Interval or ratio
Are you looking to identify relationships between two variables?
If so, consider the use of a Pearson's correlation.
If there are three or more variables, then consider multiple
regression.
If you are concerned with differences between scores, then t-tests or ANOVA may be appropriate.
If you want to identify differences within one group, then a
paired samples t-test should be used.
If you are comparing two randomly assigned groups, then use
an independent samples t-test.
If you are looking to compare two non-randomly assigned, or
three or more groups, then use ANOVA.
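The guidance above for interval or ratio data can be written down as a small decision helper. This is only a sketch of the rules as stated on the slide; the function `suggest_test` and its parameters are hypothetical names introduced for illustration, and a real choice of test should also check the assumptions of each test.

```python
def suggest_test(goal, n_variables=2, n_groups=2,
                 randomly_assigned=True, within_one_group=False):
    """Suggest a test for interval or ratio data (hypothetical helper)."""
    if goal == "relationship":
        # Two variables: correlation; three or more: multiple regression
        if n_variables >= 3:
            return "multiple regression"
        return "Pearson's correlation"
    if goal == "difference":
        if within_one_group:
            return "paired samples t-test"
        if n_groups == 2 and randomly_assigned:
            return "independent samples t-test"
        # Two non-randomly assigned groups, or three or more groups
        return "ANOVA"
    raise ValueError("goal must be 'relationship' or 'difference'")


print(suggest_test("relationship"))
print(suggest_test("difference", within_one_group=True))
print(suggest_test("difference", n_groups=3))
```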