
ACC 751

Week 10 – Quantitative Data Analysis

Freda Pitakaka (Ms.)


Date: 05th Nov 2018

12-Nov-18
Presentation Plan
1. Measures of Central Tendency
2. Distributions
3. Measures of Variability
- Standard deviation
- Inter-quartile range
4. How to present these

Descriptive Statistics
• A summary description of characteristics or
measurements (variables) of a group (of people)
• By summarizing information, descriptive statistics can
simplify understanding of characteristics of the groups
• And how they vary by subgroup or over time
Highest level                   Men   Women   All
Illiterate                       42      64   106
Basic literacy                   45      29    74
Primary School Certificate       32      25    57
Secondary School Certificate      8       3    11
Higher level qualification        1       1     2
Total                           128     122   250

Descriptive Statistics – Cont.
Important characteristics of Data
• Descriptive – the frequency of observed data –
Frequency, Proportion, Prevalence & incidence
• Centre – representative or average value that
indicates the middle of the data – Mean, Median
• Distribution – nature or shape of the data – Normal,
skewed
• Variation/Spread – the amount that the data values
vary among themselves – Variance, Standard
Deviation, IQR

Frequency
Frequency of numerical data (figure)
Frequency of categorical data (figure)

Proportion
• Ratio of the number of subjects with a given characteristic
(the numerator) to the total number of subjects in the group (the
denominator)

Proportion (%) = (Number of subjects with a given characteristic / Total number of subjects in the group) x 100

Example
Total number of family business owners = 400
Total number of failed businesses = 50
Proportion of failed businesses = 50/400 x 100 = 12.5%

An example
In a study describing the characteristics of people who owned family
businesses in 2017, researchers collected the following data.

Variable                                     N
Total number of family business owners     250
Breakdown by gender:
  Males                                    100
  Females                                  150
Breakdown by age:
  20 – 30 years                             35
  31 – 40 years                             75
  41 years and older                       140

1) What was the proportion of females owning a family business?
2) What proportion of family business owners were 41 years and over?
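The two questions can be checked with a short calculation. The sketch below is a minimal Python illustration, not part of the original slides; `proportion_pct` is a hypothetical helper name, and the counts assumed are 150 females and 140 owners aged 41 years and older out of 250 owners.

```python
def proportion_pct(count: int, total: int) -> float:
    """Return count as a percentage of total (numerator / denominator x 100)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


total_owners = 250   # total number of family business owners
females = 150        # assumed count of female owners
aged_41_plus = 140   # assumed count of owners aged 41 years and older

female_pct = proportion_pct(females, total_owners)
aged_41_plus_pct = proportion_pct(aged_41_plus, total_owners)

print(f"Females: {female_pct:.1f}%")                   # Females: 60.0%
print(f"41 years and older: {aged_41_plus_pct:.1f}%")  # 41 years and older: 56.0%
```

The same helper applies to any numerator and denominator pair, as long as both are counted in the same group.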

An example
In a study comparing the characteristics of family
business owners of village X and village Y in 2017.

Variable                                   Village X   Village Y
Total number of family business owners         250         600
Breakdown of family business owners by gender:
  Males                                        100         365
  Females                                      150         235
Breakdown of family business owners by age:
  20-30 years                                   75         110
  31-40 years                                   35          60
  41 years and older                           140         430

1) Which village had the higher proportion of females as business owners?
2) Which village had the higher proportion of business owners aged between 31-40 years?
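Comparing villages of different sizes needs proportions rather than raw counts: village Y has more female owners in absolute terms, but fewer relative to its total. The following Python sketch is an illustration only; the dictionary layout and the helper name `pct` are assumptions made for this example.

```python
# Counts taken from the village X / village Y comparison table.
villages = {
    "X": {"total": 250, "females": 150, "aged_31_40": 35},
    "Y": {"total": 600, "females": 235, "aged_31_40": 60},
}


def pct(count: int, total: int) -> float:
    """Percentage of the group that has the characteristic."""
    return 100 * count / total


female_pct = {name: pct(v["females"], v["total"]) for name, v in villages.items()}
aged_31_40_pct = {name: pct(v["aged_31_40"], v["total"]) for name, v in villages.items()}

# Pick the village with the larger percentage for each question.
highest_female = max(female_pct, key=female_pct.get)
highest_31_40 = max(aged_31_40_pct, key=aged_31_40_pct.get)

for name in villages:
    print(f"Village {name}: females {female_pct[name]:.1f}%, "
          f"aged 31-40 {aged_31_40_pct[name]:.1f}%")
print("Higher proportion of females:", highest_female)
print("Higher proportion aged 31-40:", highest_31_40)
```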

Measures of Central Tendency
Mean = the average value

Mean = Sum of all values / Number of values in the data set

E.g. Set of ages: 20, 30, 40, 50 years

Mean age = (20 + 30 + 40 + 50)/4 = 35 years
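As a quick check of the formula, the sketch below computes the mean of the ages 20, 30, 40 and 50 used in the calculation, once by hand and once with Python's standard `statistics` module. It is an illustrative example, not part of the original slides.

```python
from statistics import mean

ages = [20, 30, 40, 50]  # ages in years, as used in the worked calculation

# Mean = sum of all values / number of values in the data set
manual_mean = sum(ages) / len(ages)
library_mean = mean(ages)

print(manual_mean)   # 35.0
print(library_mean)  # 35
```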

Measures of Central Tendency
Median = the value in the middle
Order the numbers from smallest to largest and find the
middle value OR to find the position of the middle value use the formula

Position of median = (Number of values in the data set + 1) / 2

E.g. Number of business years: 2, 3, 5, 8, 10, 12, 14

Position of median = (7 + 1)/2 = 4

The 4th value of 2, 3, 5, 8, 10, 12, 14 is 8

Median number of business years = 8 years
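The position rule can be sketched in Python as follows, using the seven values 2, 3, 5, 8, 10, 12, 14. The even-length branch (averaging the two middle values) is the usual convention and is an addition here, since the slide only shows an odd-length example.

```python
from statistics import median

business_years = [2, 3, 5, 8, 10, 12, 14]

ordered = sorted(business_years)   # order from smallest to largest
n = len(ordered)
position = (n + 1) / 2             # (7 + 1) / 2 = 4.0, i.e. the 4th value

if n % 2 == 1:
    # Odd number of values: the median is the value at that position.
    manual_median = ordered[int(position) - 1]   # lists are 0-indexed
else:
    # Even number of values: average the two middle values.
    manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(position)        # 4.0
print(manual_median)   # 8
print(median(business_years))  # 8
```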


Mean or Median?

Example (asymmetrical or skewed distribution)
• Family income (10 households)

• $2000; $2000; $3000; $4000; $4000; $5000; $6000; $8000; $9000; $100,000

Mean = $14,300
Median = $4,500

• The mean does not provide a fair summary of the bulk of the data since it is
influenced by extreme or outlying observations

• Unlike the mean, the median is not greatly affected by extreme or
outlying observations.
• The median should be calculated when the data distribution is skewed
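The effect of the outlier can be seen by recomputing both measures with and without the $100,000 household. The Python sketch below is an illustration using the ten incomes from the example; dropping the outlier is done only to show its influence, not as a recommended analysis step.

```python
from statistics import mean, median

incomes = [2000, 2000, 3000, 4000, 4000, 5000, 6000, 8000, 9000, 100000]

print(mean(incomes))    # 14300
print(median(incomes))  # 4500.0

# Remove the single extreme value to see how much each measure moves.
without_outlier = [x for x in incomes if x != 100000]
mean_without = mean(without_outlier)
median_without = median(without_outlier)

print(round(mean_without, 2))  # 4777.78
print(median_without)          # 4000
```

The mean falls from $14,300 to about $4,778 when one household is removed, while the median only moves from $4,500 to $4,000.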

Example
• In a study about the characteristics of family business owners,
the following data was collected:

Business owner   Age
1                22
2                16
3                35
4                 9
5                47
6                18
7                38
8                26
9                 5
10               53
11               12
12               59
13               31
14                8
15               51

• Calculate the mean age of the family business owners

Example
• In a study about the length of running a family business the
following data was collected:

Business owner   Length of business (months)
1                 2
2                25
3                 3
4                 7
5                 5
6                 4
7                 3
8                18
9                 6
10                2
11                5

What calculation would you do to summarise the data?

Complete the calculation to summarise the data

Measures of Variation
Choices of measure
• Variance
• Standard deviation

Alternative measures
• Quartiles: divide data into 4 quarters (Q1-Q4) – 25%
in each
• Percentiles: each percentile divides the data into two parts
(e.g. 25% of values lie below the 25th percentile and 75% above it)

Standard deviation
Graph A: A large standard deviation means the data are widely
spread either side of the mean

Graph B: A small standard deviation means the data are bunched up
closely on either side of the mean
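A small numerical example may help make the contrast concrete. The two data sets below are hypothetical (they are not taken from the slides) and were chosen to share the same mean; the sketch uses the sample standard deviation from Python's `statistics` module, which is an assumption since the slides do not say whether the sample or population formula is intended.

```python
from statistics import mean, stdev

# Hypothetical data sets with the same mean but different spread.
widely_spread = [10, 20, 30, 40, 50]    # like Graph A
bunched_up = [28, 29, 30, 31, 32]       # like Graph B

sd_spread = stdev(widely_spread)   # sample standard deviation
sd_bunched = stdev(bunched_up)

print(mean(widely_spread), round(sd_spread, 2))  # 30 15.81
print(mean(bunched_up), round(sd_bunched, 2))    # 30 1.58
```

Both sets have a mean of 30, so the mean alone would not distinguish them; the standard deviation does.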

Interquartile range
Range between the 25th (Q1) and 75th (Q3) percentile – a robust
measure (not greatly affected by extreme values)

Example: Set of ages (years)

5, 8, 12, 27, 32, 65, 68

Q1 = 8, Q2 (median) = 27, Q3 = 65

Median age = 27 years (IQR 8-65 years)
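The quartiles in this example can be reproduced in Python with `statistics.quantiles`. This is a sketch under one assumption: it uses the "exclusive" quartile convention, which happens to give the values shown here. Other conventions (for instance the "inclusive" method) can give different quartiles for the same small data set.

```python
from statistics import quantiles

ages = [5, 8, 12, 27, 32, 65, 68]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(ages, n=4, method="exclusive")

print(q1, q2, q3)   # 8.0 27.0 65.0
print(f"Median age = {q2:.0f} years (IQR {q1:.0f}-{q3:.0f} years)")
```

The interquartile range can be reported as the pair Q1 to Q3, as on the slide, or as the single difference Q3 - Q1.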


Use of mean or median

Report means and standard deviations
T-tests and other "parametric" tests

Report medians and interquartile ranges
Wilcoxon/rank and other "non-parametric" tests

Create 'dummy' tables of the summary data

YOUR CODEBOOK WILL GUIDE YOU ON THE VARIABLES AND RESPONSES

Variable            N (%)
District Code
  APZ
  ENK
  KNU
  PTM
  THR
  Total
Sex
  Female
  Male
  Not recorded
  Total
Smoking
  Yes
  No
  Not recorded
  Total
HIV status
  Yes
  No
  Not recorded
  Total

Using Excel to summarize your data
Pivot tables
Pivot tables allow you to organize and analyse large
amounts of data held in lists and tables

Using Excel to summarize your data
Descriptives
• Frequency
=COUNTIF(Range1:Range2, "Criteria")
=COUNTIF(D2:D129, "1") will give you the number of
times 1 appears in the cells D2:D129

The SORT and FILTER functions can also be used to
determine how many times a value occurs in a variable

Can also find and report MIN and MAX values
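For readers working outside Excel, the same frequency count can be sketched in Python. The coded responses below are hypothetical values invented for this example; `collections.Counter` plays the role of COUNTIF, and the built-in `min` and `max` give the smallest and largest values.

```python
from collections import Counter

# Hypothetical coded responses for one variable (e.g. a column of codes).
responses = [1, 2, 1, 3, 1, 2]

frequencies = Counter(responses)   # how many times each value occurs

print(frequencies[1])    # 3, the number of times 1 appears
print(min(responses))    # 1
print(max(responses))    # 3

# Frequency table, sorted by value.
for value in sorted(frequencies):
    print(value, frequencies[value])
```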

Using Excel to summarize your data – Cont.
• Median
=MEDIAN(Range1:Range2)
=MEDIAN(C1:C20) will give you the median for the values in the
cells from C1 to C20
• Interquartile range
=QUARTILE.EXC(C3:C129,1)
=QUARTILE.EXC(C3:C129,1) will give you the 1st quartile for
values in the cells from C3 to C129 (= 25th percentile)
This formula can be repeated specifying
=QUARTILE.EXC(C3:C129,3) to get the 3rd quartile (= 75th
percentile)

PRESENTING YOUR FINDINGS

Tables and Figures
Why do we use tables and figures?

• Provides a summary of data / results


• Allows better visualization of the
data/results
• Reduces text (word count)

Tables and Figures – Cont.
• Table – presents a list of numbers or text in
columns, each column having a title or label.

• Figure – a visual presentation of results
- Graphs (most common) – show trends or patterns
- Maps
- Diagrams
- Photos

Text
• Not all analysis warrants a Table or Figure

• Some simple results are best stated in a single sentence.

E.g. Businesses in rural areas were more likely to
be successful than businesses in urban areas (P < 0.001).

Anatomy of a Table
• Table legend/title
• Column titles
• Table body (data)
• Lines demarcating the different parts of the table
• Footnotes

Anatomy of a Figure
Figure 1: Trend in registered deaths (200-2007), District A, Country A
• Figure legend/title
• Y axis label
• Y axis
• X axis label
• X axis
• Data symbols
• Key to symbols
• Footnotes

Anatomy of a Table (Qualitative)
Table Legend/Title

Column titles

Grounded Theory Models

A Simple way of presenting qualitative data
in a figure

Descriptive Data
Example 1

Example 2

Example 3
Pie chart

Showing the association between different variables
Example 1 - Association between two continuous
variables

Example 2 - Association between a categorical and
continuous variable

Box plot

Example 3 - Association between two categorical
variables

Bar chart

Example 1 – Showing Trends over time

Example 2

Using Excel to create tables and charts
