
Descriptive Statistics

Presented by Tri Widodo

descriptive statistics 1
Types of variables
• The choice of methods is determined by the type of variables
  – Dependent
  – Independent
• Categorical
• Numeric

Data

Categorical / qualitative
  Nominal (unordered categories), e.g.
    • Gender
    • Smoker / non-smoker
    • Blood type (O, A, B, AB)
    • Marital status
  Ordinal (ordered categories), e.g.
    • Education
    • Disease stages I, II, III, IV
    • Level of knowledge

Numeric / quantitative
  Discrete, e.g.
    • Number of persons in household
    • Number of white blood cells
  Continuous, e.g.
    • Height
    • Age
    • Blood pressure

Are your data normally distributed?
• How can you tell?
• Which measures of location and spread should be used?
  – Normally distributed → mean and SD
  – Not normally distributed → median and percentiles, or min and max
• Does this affect the choice of statistical test?
  – Normally distributed → parametric test
  – Not normally distributed → non-parametric test
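The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `summarise` and the small data set are assumptions, not part of the slides, and the decision whether the data are normal is passed in as a flag rather than tested.

```python
import statistics

def summarise(values, normally_distributed):
    """Pick the location/spread summary that matches the distribution."""
    if normally_distributed:
        # Normal data: report mean and (sample) standard deviation
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    # Non-normal data: report median with minimum and maximum
    return {"median": statistics.median(values), "min": min(values), "max": max(values)}

# Hypothetical, clearly skewed data (not taken from the slides)
skewed = [1, 2, 3, 4, 100]
print(summarise(skewed, normally_distributed=False))
```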

SAMPLING DISTRIBUTION

[Figure: an almost infinite number of samples drawn from one population. The variation of the sample means and SDs follows a sampling distribution; the figure is labelled with the mean and SD of the population.]
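NORMAL CURVE

Symmetric
Mesokurtic
Asymptotic: the tails approach but never touch the horizontal axis; about 99.7% of the area lies within ±3 standard deviations of the mean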

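Normal distribution = Z distribution (1)

[Figure: normal curve showing the area under the curve (AUC)]
x scale:  μ − 1.96 SD,  μ − 1 SD,  μ,  μ + 1 SD,  μ + 1.96 SD
Z value:  −1.96,  −1,  0,  +1,  +1.96

Z = (x − μ) / σ     (μ = population mean, σ = standard deviation)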
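As a small check of the Z transformation and the areas under the curve, the sketch below uses Python's standard `statistics.NormalDist`. The helper `z_score` and the example numbers passed to it are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardise x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Area under the curve between -1.96 and +1.96 (about 95%)
area_196 = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
# Area under the curve between -1 and +1 (about 68%)
area_1 = std_normal.cdf(1) - std_normal.cdf(-1)

print(z_score(110, 100, 10))   # hypothetical x, mean and SD
print(round(area_196, 3), round(area_1, 3))
```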


Zarni amri descriptive statistics 8
Normal distribution: criteria

Check                                                 Normal if
Coefficient of variation, COV = (SD / mean) × 100%    < 20%
Skewness ratio = skewness / SE of skewness            −2 to +2
Kurtosis ratio = kurtosis / SE of kurtosis            −2 to +2
Histogram                                             symmetric
Box plot                                              symmetric
Kolmogorov-Smirnov or Shapiro-Wilk test               p > 0.05
Tests of Normality

        Kolmogorov-Smirnov(a)           Shapiro-Wilk
        Statistic   df   Sig.           Statistic   df   Sig.
age     0.130       30   0.200(*)       0.910       30   0.015

Descriptives: age

                                      Statistic   Std. Error
Mean                                  32.4667     1.15045
95% Confidence Interval for Mean
  Lower Bound                         30.1137
  Upper Bound                         34.8196
5% Trimmed Mean                       31.9815
Median                                31.5000
Variance                              39.706
Std. Deviation                        6.30125
Minimum                               24.00
Maximum                               52.00
Range                                 28.00
Interquartile Range                   10.0000
Skewness                              1.100       0.427
Kurtosis                              1.628       0.833
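The numeric criteria (COV below 20%, skewness and kurtosis ratios within ±2) can be applied to the age output above. The sketch below is a plain-Python illustration with made-up helper names; note that the criteria need not agree with each other, just as the two normality tests above give different answers.

```python
def coefficient_of_variation(mean, sd):
    """COV in percent: (SD / mean) x 100."""
    return sd / mean * 100.0

def ratio(statistic, standard_error):
    """Skewness or kurtosis divided by its standard error."""
    return statistic / standard_error

def within_pm2(value):
    """True when the ratio lies between -2 and +2."""
    return -2.0 <= value <= 2.0

# Values read from the age output above
cov = coefficient_of_variation(32.4667, 6.30125)
skew_ratio = ratio(1.100, 0.427)
kurt_ratio = ratio(1.628, 0.833)

print(f"COV = {cov:.1f}%  (criterion met: {cov < 20})")
print(f"skewness ratio = {skew_ratio:.2f}  (criterion met: {within_pm2(skew_ratio)})")
print(f"kurtosis ratio = {kurt_ratio:.2f}  (criterion met: {within_pm2(kurt_ratio)})")
```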
age Stem-and-Leaf Plot

Frequency   Stem & Leaf
 1.00       2 . 4
10.00       2 . 6666677789
 8.00       3 . 00112223
 8.00       3 . 55777888
 2.00       4 . 13
 1.00       Extremes (>=52)

Stem width: 10.00
Each leaf:  1 case(s)

[Figure: box plot of age, N = 30; vertical axis 20 to 60; case 25 is labelled on the plot]

Data: menopause study, women aged 40 years and over
Umur (age in years)

           Frequency   Per cent   Valid Per cent   Cumulative Per cent
Valid 40 31 15.5 15.5 15.5
41 9 4.5 4.5 20.0
42 16 8.0 8.0 28.0
43 8 4.0 4.0 32.0
44 5 2.5 2.5 34.5
45 12 6.0 6.0 40.5
46 6 3.0 3.0 43.5
47 7 3.5 3.5 47.0
48 7 3.5 3.5 50.5
49 5 2.5 2.5 53.0
50 14 7.0 7.0 60.0
51 5 2.5 2.5 62.5
52 5 2.5 2.5 65.0
53 4 2.0 2.0 67.0
54 2 1.0 1.0 68.0
55 5 2.5 2.5 70.5
56 5 2.5 2.5 73.0
57 5 2.5 2.5 75.5
58 6 3.0 3.0 78.5
60 10 5.0 5.0 83.5
61 3 1.5 1.5 85.0
62 1 .5 .5 85.5
63 4 2.0 2.0 87.5
64 2 1.0 1.0 88.5
65 9 4.5 4.5 93.0
66 2 1.0 1.0 94.0
67 2 1.0 1.0 95.0
68 2 1.0 1.0 96.0
70 5 2.5 2.5 98.5
73 1 .5 .5 99.0
79 1 .5 .5 99.5
85 1 .5 .5 100.0
Total 200 100.0 100.0
Descriptives: Umur

                                      Statistic   Std. Error
Mean                                  50.47       0.67
95% Confidence Interval for Mean
  Lower Bound                         49.14
  Upper Bound                         51.79
5% Trimmed Mean                       49.83
Median                                48.00
Variance                              89.617
Std. Deviation                        9.47
Minimum                               40
Maximum                               85
Range                                 45
Interquartile Range                   15.00
Skewness                              0.836       0.172
Kurtosis                              0.076       0.342
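Several of the descriptives above can be recomputed directly from the frequency table of Umur. The sketch below is an illustration in plain Python; the percentile rule used (the smallest value whose cumulative count reaches p% of n) is an assumption chosen for simplicity and is not necessarily the rule SPSS applies, although it gives the same median and quartiles for these data.

```python
# Frequency table of Umur (age -> frequency), copied from the table above
freq = {40: 31, 41: 9, 42: 16, 43: 8, 44: 5, 45: 12, 46: 6, 47: 7, 48: 7,
        49: 5, 50: 14, 51: 5, 52: 5, 53: 4, 54: 2, 55: 5, 56: 5, 57: 5,
        58: 6, 60: 10, 61: 3, 62: 1, 63: 4, 64: 2, 65: 9, 66: 2, 67: 2,
        68: 2, 70: 5, 73: 1, 79: 1, 85: 1}

def mean_from_freq(table):
    """Weighted mean: sum(value x frequency) / n."""
    n = sum(table.values())
    return sum(value * count for value, count in table.items()) / n

def percentile_from_freq(table, p):
    """Smallest value whose cumulative count reaches p% of n (assumed rule)."""
    target = p / 100 * sum(table.values())
    running = 0
    for value in sorted(table):
        running += table[value]
        if running >= target:
            return value

n = sum(freq.values())
mean = mean_from_freq(freq)
q1, median, q3 = (percentile_from_freq(freq, p) for p in (25, 50, 75))
print(n, round(mean, 3), median, q1, q3, q3 - q1)
```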

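Normal if the skewness and kurtosis ratios (statistic / SE) both lie between −2 and +2.
Here the skewness ratio is 0.836 / 0.172 ≈ 4.9, which is outside that range; the kurtosis ratio is 0.076 / 0.342 ≈ 0.2.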
Histogram
[Figure: histogram of Umur, horizontal axis 40.0 to 85.0; Mean = 50.5, Std. Dev = 9.47, N = 200]
Box Plot percentile
90

81

100
80

70

60
75

50
50

25
40 0

30
N= 200

Umur
descriptive statistics 15
Tests of Normality

        Kolmogorov-Smirnov(a)
        Statistic   df    Sig.
Umur    0.134       200   0.000

a. Lilliefors Significance Correction

Not normally distributed → report the median and min-max
95 % Confidence Interval (1)
Example: n = 18 patients with stable angina pectoris
X = total cholesterol
Sample mean = 5.81
SD = 1.20
SE(X) = SD / √n = 1.20 / √18 = 0.28

95 % CI = sample mean ± 1.96 × SE
        = 5.81 ± 1.96 × 0.28
        = 5.81 ± 0.55 = (5.26, 6.36)
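The calculation above can be reproduced with a short Python sketch. The function name `ci95` is an assumption made for illustration; it uses the same normal multiplier 1.96 as the slide.

```python
import math

def ci95(mean, sd, n, z=1.96):
    """95% confidence interval for a mean: mean +/- z * SD / sqrt(n)."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = z * se
    return mean - margin, mean + margin

# Values from the angina pectoris example
low, high = ci95(5.81, 1.20, 18)
print(round(low, 2), round(high, 2))   # 5.26 6.36
```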

95 % Confidence Interval (2)

a) A sample has mean = 6.
   Does this sample belong to the population with μ = 5.81?
   Yes, because 6 lies within the 95% CI of μ (5.26, 6.36),
   or because the 95% CI of the sample mean, 6 ± 0.55 = (5.45, 6.55), includes μ = 5.81.
b) A sample has mean = 7.
   Does this sample belong to the population with μ = 5.81?
   No, because 7 lies outside the 95% CI of μ (5.26, 6.36),
   or because the 95% CI of the sample mean, 7 ± 0.55 = (6.45, 7.55), does not include μ = 5.81.
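Both ways of reasoning in a) and b) amount to checking whether a value falls inside an interval. A minimal sketch, with the helper name `contains` chosen only for illustration:

```python
def contains(interval, value):
    """True when value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high

ci_of_mu = (5.26, 6.36)   # 95% CI from the cholesterol example
mu = 5.81
margin = 0.55

for sample_mean in (6, 7):
    inside_ci = contains(ci_of_mu, sample_mean)
    ci_of_sample = (sample_mean - margin, sample_mean + margin)
    covers_mu = contains(ci_of_sample, mu)
    print(sample_mean, inside_ci, covers_mu)
```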

Data Presentation

• Textual
• Tabular
• Graphical

• Should be adapted to:
  – the target audience
  – the message
  – no duplication
TEXTUAL
• Limited substance
• Suitable for presenting qualitative descriptions
• Used to complement other presentation methods
• Presents the data that form the basis of the study
• Supports statistical calculation
• Proper method for academic purposes
GRAPHICAL

• Visualization of tabular data
• Good for presenting progress or development
• Proper method for public audiences

CRITERIA FOR A GOOD TABLE

• Simple
• Self-explanatory
  * Clear title
  * Note
  * Clear classification
  * Row & column totals
• Source citation
TYPES OF TABLES

• Master table (reference)
• Derived table (analysis)
  * Frequency / distribution table
  * Cross-table

Frequency tables

grup tensi N n Ht (blood pressure group: normotensive vs hypertensive)

                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Normotensive    196         84.1      84.1            84.1
       Hypertensive     37         15.9      15.9            100.0
       Total           233         100.0     100.0

klasif tensi JNC 7 (blood pressure classification, JNC 7)

                               Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Normal                  111         47.6      47.6            47.6
       Prehypertension          85         36.5      36.5            84.1
       Stage 1 hypertension     29         12.4      12.4            96.6
       Stage 2 hypertension      8          3.4       3.4            100.0
       Total                   233         100.0     100.0
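The percent and cumulative percent columns follow directly from the counts. The sketch below rebuilds them for the JNC 7 table in plain Python; the function name `frequency_table` is an illustrative assumption.

```python
def frequency_table(counts):
    """counts: list of (label, frequency).
    Returns rows of (label, frequency, percent, cumulative percent)."""
    total = sum(freq for _, freq in counts)
    rows, running = [], 0
    for label, freq in counts:
        running += freq
        rows.append((label, freq,
                     round(100 * freq / total, 1),
                     round(100 * running / total, 1)))
    return rows

# Counts from the JNC 7 classification table above
jnc7 = [("normal", 111), ("prehypertension", 85),
        ("stage 1 hypertension", 29), ("stage 2 hypertension", 8)]

rows = frequency_table(jnc7)
for row in rows:
    print(row)
```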

Frequency table

Kategori umur (age category)

                      Frequency   Percent   Valid Percent   Cumulative Percent
Valid  40-55 years    141         70.5      70.5            70.5
       56-70 years     56         28.0      28.0            98.5
       71-85 years      3          1.5       1.5            100.0
       Total          200         100.0     100.0

Age (yrs)   Frequency   Percent
40-55       141         70.5
56-70        56         28.0
71-85         3          1.5
Total       200         100
Cross-Table

pelaksanaan shift bergilir * klasif tensi JNC 7 Crosstabulation
(rotating shift work by JNC 7 blood pressure classification)

                                   Normal   Prehypertension   Stage 1 hypertension   Stage 2 hypertension   Total
Shift work: no    Count            34       28                13                     4                      79
                  % within JNC 7   30.6%    32.9%             44.8%                  50.0%                  33.9%
Shift work: yes   Count            77       57                16                     4                      154
                  % within JNC 7   69.4%    67.1%             55.2%                  50.0%                  66.1%
Total             Count            111      85                29                     8                      233
                  % within JNC 7   100.0%   100.0%            100.0%                 100.0%                 100.0%

koding index masa tubuh * grup tensi N n Ht Crosstabulation
(body mass index category by blood pressure group)

Count
Body mass index   Normotensive   Hypertensive   Total
< 18.5            13             0              13
18.5 - 24.99      91             10             101
25 - 27           42             7              49
> 27              41             17             58
Total             187            34             221
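The "% within" rows of the shift-work table are column percentages: each count divided by its column total. A minimal plain-Python sketch, with the function name and row labels chosen only for illustration:

```python
def column_percentages(table):
    """table: dict of row label -> list of counts, one per column.
    Returns the same layout with each count as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table.values())]
    return {row: [round(100 * count / total, 1)
                  for count, total in zip(counts, col_totals)]
            for row, counts in table.items()}

# Columns: normal, prehypertension, stage 1, stage 2 (counts from the table above)
shift_by_jnc7 = {"shift work: no": [34, 28, 13, 4],
                 "shift work: yes": [77, 57, 16, 4]}

percentages = column_percentages(shift_by_jnc7)
for row, values in percentages.items():
    print(row, values)
```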
Note!

• In an epidemiological study, these cross-tables can be constructed and tested for significance (does hypertension differ by shift work, or by nutritional status?) using only a sample from a survey / cross-sectional study.
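The slides do not name the significance test. One common choice for a cross-table is Pearson's chi-square test, sketched below in plain Python for the shift-work by JNC 7 table. This is an assumption about the method, not the author's stated procedure, and one expected count in this table is below 5, so the chi-square approximation is rough here.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Rows: shift work no / yes; columns: normal, prehypertension, stage 1, stage 2
observed = [[34, 28, 13, 4],
            [77, 57, 16, 4]]

stat, df = chi_square(observed)
CRITICAL_005_DF3 = 7.815   # chi-square critical value for alpha = 0.05, df = 3
significant = stat > CRITICAL_005_DF3
print(round(stat, 2), df, significant)
```

With these counts the statistic is roughly 3.0 on 3 degrees of freedom, below the critical value, so under this assumed test the difference would not be called significant at the 5% level.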

Cross-table

Kategori umur * Apakah anda sudah menopause? Crosstabulation
(age category by "Have you reached menopause?")

Count
Age category    Yes (sudah)   Not yet (belum)   Total
40-55 years     44            97                141
56-70 years     55            1                 56
71-85 years     3             0                 3
Total           102           98                200

GRAPHICAL
• Line graph
  * Arithmetic scale
  * Semi-logarithmic scale
• Histogram / frequency polygon
• Scatter diagram
• Bar diagram
• Pie diagram

Pie diagram
[Figure: pie chart of klasif tensi JNC 7 (blood pressure classification) with slices for normal, prehypertension, stage 1 hypertension and stage 2 hypertension]

BAR DIAGRAM

[Figure: grouped bar chart comparing Rural and Urban for Q-1 to Q-4; vertical axis 0 to 90]

INVERTED BAR DIAGRAM

[Figure: horizontal bar chart comparing Urban and Rural for Q-1 to Q-4; horizontal axis 0 to 100]

Box plot & histogram

[Figure: box plot (N = 233, several outlying cases labelled) and histogram of td diastolik rata2 (mean diastolic blood pressure); Mean = 76.7, Std. Dev = 10.81, N = 233]

LINE GRAPH

[Figure: line graph of Rural and Urban over Q-1 to Q-4; vertical axis 0 to 100]

Example: Midterm scores of STAT 101

The following data set contains the midterm exam scores of STAT 101.

74 76 78 88 87 87 53 95 82 79 79 78
62 80 77 70 60 60 84 95 85 93 79 84
71 85 100 77 72 95 79 83 97 87 73 84
74 83 85 95 62 50 86 83 86 36
Example: Midterm scores of STAT 101

A stem-and-leaf display is as follows:

 3 : 6
 4 :
 5 : 03
 6 : 0022
 7 : 012344677889999
 8 : 02333444555667778
 9 : 355557
10 : 0

Leaf: last digit
Stem: remaining digit(s)
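The display above can be generated from the raw scores with a few lines of Python. This is a minimal sketch for two- and three-digit whole numbers; the function name `stem_and_leaf` is an illustrative assumption.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: leaves as a string}, including empty stems in between."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # leaf = last digit, stem = the rest
        leaves[stem].append(str(leaf))
    lowest, highest = min(leaves), max(leaves)
    return {stem: "".join(leaves.get(stem, []))
            for stem in range(lowest, highest + 1)}

# Midterm scores of STAT 101, as listed above
scores = [74, 76, 78, 88, 87, 87, 53, 95, 82, 79, 79, 78,
          62, 80, 77, 70, 60, 60, 84, 95, 85, 93, 79, 84,
          71, 85, 100, 77, 72, 95, 79, 83, 97, 87, 73, 84,
          74, 83, 85, 95, 62, 50, 86, 83, 86, 36]

display = stem_and_leaf(scores)
for stem, leaf_string in display.items():
    print(f"{stem:>2} : {leaf_string}")
```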
Box plot
[Figure: box plot of Pendapatan (income), N = 200; vertical axis 0 to 3,000,000; several high-value cases are labelled individually]

Scatter diagram
A plot of paired (x,y) data with a horizontal x-axis and a
vertical y-axis
MEASUREMENT SCALES (SKALA PENGUKURAN)

             Characteristic
Scale        Classification   Order   Distance   Absolute zero
Nominal      yes
Ordinal      yes              yes
Interval     yes              yes     yes
Ratio        yes              yes     yes        yes
OPERATIONALIZATION OF VARIABLES (OPERASIONALISASI VARIABEL)

Method               Non-parametric statistics   Parametric statistics
Measurement scale    Nominal, Ordinal            Interval, Ratio
Measurement result   Discrete variable           Continuous variable
How measured         Count data                  Measured data
STATISTICAL DESIGN (DISAIN STATISTIK)

                          Variable manipulation
Subject manipulation      No                    Yes
No                        Correlation study     Intervention study
Yes                       Comparison study      Intervention study
RELATIONSHIP BETWEEN DESIGN AND VARIABLES (HUB. DISAIN DAN VARIABEL)

                Statistical method
                                             Non-parametric
Study design    Parametric                   Large sample    Small sample
Correlation     Pearson                      Spearman        Kendall
Comparison      Independent t-test           Median test     Mann-Whitney
Intervention    Dependent (paired) t-test    McNemar         Wilcoxon
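The table above works as a lookup from study design and method to a test. A small sketch of that lookup in Python; the key names and the function `choose_test` are hypothetical, and the entries simply repeat the table.

```python
# Keys: (study design, method); values: test named in the table above
TESTS = {
    ("correlation", "parametric"): "Pearson",
    ("correlation", "nonparametric_large"): "Spearman",
    ("correlation", "nonparametric_small"): "Kendall",
    ("comparison", "parametric"): "Independent t-test",
    ("comparison", "nonparametric_large"): "Median test",
    ("comparison", "nonparametric_small"): "Mann-Whitney",
    ("intervention", "parametric"): "Dependent t-test",
    ("intervention", "nonparametric_large"): "McNemar",
    ("intervention", "nonparametric_small"): "Wilcoxon",
}

def choose_test(design, method):
    """Look up the test for a study design and statistical method."""
    return TESTS[(design, method)]

print(choose_test("comparison", "nonparametric_small"))   # Mann-Whitney
```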