Dr. Moataza Mahmoud Abdel Wahab Lecturer of Biostatistics High Institute of Public Health University of Alexandria

moatazamahmoud@yahoo.com

Important statistical terms
Population:
a set that includes all measurements of interest to the researcher
(the collection of all responses, measurements, or counts that are of interest)
Sample:
a subset of the population
Why sampling?
Get information about large populations
  • Lower cost
  • Less field time
  • More accuracy, i.e. can do a better job of data collection
  • When it is impossible to study the whole population

Target Population:

The population to be studied, i.e. the population to which the investigator wants to generalize the results

Sampling Unit:

The smallest unit from which a sample can be selected

Sampling frame

List of all the sampling units from which the sample is drawn

Sampling scheme

Method of selecting sampling units from the sampling frame

Types of sampling
  • Non-probability samples

  • Probability samples
Non-probability samples
Convenience samples (ease of access)

The sample is selected from elements of a population that are easily accessible

Snowball sampling (friend of a friend, etc.)

Purposive sampling (judgemental)

You choose who you think should be in the study

Quota sample
Non-probability samples

Probability of being chosen is unknown

Cheaper, but results cannot be generalised and there is potential for bias

Probability samples
  • Random sampling

  • Each subject has a known probability of being selected

  • Allows application of statistical sampling theory to results to:

    • Generalise

    • Test hypotheses

Conclusions
  • Probability samples are the best

  • Ensure

    • Representativeness

    • Precision


Methods used in probability samples
  • Simple random sampling
  • Systematic sampling
  • Stratified sampling
  • Multi-stage sampling
  • Cluster sampling

Simple random sampling

Table of random numbers
6 8 4 2 5 7 9 5 4 1 2 5 6 3 2 1 4 0
5 8 2 0 3 2 1 5 4 7 8 5 9 6 2 0 2 4
3 6 2 3 3 3 2 5 4 7 8 9 1 2 0 3 2 5
9 8 5 2 6 3 0 1 7 4 2 4 5 0 3 6 8 6
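
A pseudo-random generator can stand in for the table of random numbers. The sketch below is a minimal illustration, assuming a hypothetical sampling frame of units numbered 1 to 100; the function name and the frame are not from the slides.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each unit equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: units numbered 1..100
frame = list(range(1, 101))
chosen = simple_random_sample(frame, 10, seed=1)
print(sorted(chosen))
```

Sampling is without replacement, so no unit can be selected twice.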

Systematic sampling

Sampling fraction
Ratio between sample size and population size
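
The slides define only the sampling fraction. The sketch below assumes the usual systematic procedure: take the inverse of the sampling fraction as the interval k, pick a random start within the first interval, then select every k-th unit. The frame and function name are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start, where k = N // n."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the frame size")
    k = N // n  # sampling interval, the inverse of the sampling fraction n/N
    start = random.Random(seed).randrange(k)  # random start in the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame of 100 units and a sampling fraction of 10/100
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

With this frame the selected units are always exactly 10 positions apart; only the starting unit is random.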
Cluster sampling

Cluster: a group of sampling units close to each other, i.e. crowding together in the same area or neighborhood

[Figure: an area divided into five clusters labelled Section 1 to Section 5]
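
As a rough sketch of the idea, whole clusters are drawn at random and every unit inside a chosen cluster is included. The five sections mirror the figure; the household labels and the choice of two clusters are assumptions made only for illustration.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Randomly pick m clusters and return them with all of their units."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), m)  # choose clusters, not individuals
    units = [unit for name in picked for unit in clusters[name]]
    return picked, units


# Hypothetical data: five sections with four households each
clusters = {
    f"Section {i}": [f"S{i}-H{j}" for j in range(1, 5)] for i in range(1, 6)
}
picked, units = cluster_sample(clusters, 2, seed=0)
print(picked)
print(units)
```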
Stratified sampling
Multi-stage sampling

Errors in sample

  • Systematic error (or bias)

    • Inaccurate response (information bias)

    • Selection bias

  • Sampling error (random error)


Type 1 error
The probability of finding a difference between our sample and the population when there really isn’t one
Known as α (or “type 1 error”)
Usually set at 5% (or 0.05)
Type 2 error
The probability of not finding a difference that actually exists between our sample and the population
Known as β (or “type 2 error”)
Power is (1 - β) and is usually set at 80%
Sample size
Quantitative
$$ n = \frac{Z^2 \sigma^2}{D^2} $$
$$ n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2} $$
Qualitative
$$ n = \frac{Z^2 \pi (1 - \pi)}{D^2} $$
$$ n = \frac{2 P (1 - P) F}{D^2} $$
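
The slides use the factor F without defining it. One common reading, assumed here and not stated in the source, is F = (z for two-sided α + z for power)². Under that assumption the sketch below gives about 10.5 for 95% confidence with 90% power and about 7.85 for 95% confidence with 80% power, close to the 10.5 and 7.8 used in the worked answers.

```python
from statistics import NormalDist


def power_factor(alpha=0.05, power=0.80):
    """Assumed definition: F = (z_{1 - alpha/2} + z_{power}) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # z for power = 1 - beta
    return (z_alpha + z_beta) ** 2


print(round(power_factor(0.05, 0.90), 1))  # 10.5
print(round(power_factor(0.05, 0.80), 2))  # 7.85
```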
Problem 1
A study is to be performed to determine a certain parameter in a community. From a previous study a standard deviation (SD) of 46 was obtained. If a sampling error of up to 4 is to be accepted, how many subjects should be included in this study at the 99% level of confidence?
Answer
$$ n = \frac{Z^2 \sigma^2}{D^2} = \frac{2.58^2 \times 46^2}{4^2} = 880.3 \approx 881 $$
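
The arithmetic for Problem 1 can be checked with a short sketch. The function name is hypothetical, and Z = 2.58 is the rounded 99% value used in the slides.

```python
import math


def n_for_mean(z, sd, d):
    """n = Z^2 * sd^2 / D^2; returns the raw value and the value rounded up."""
    raw = (z ** 2) * (sd ** 2) / (d ** 2)
    return raw, math.ceil(raw)  # round up: a fraction of a subject is not possible


raw, n = n_for_mean(z=2.58, sd=46, d=4)
print(round(raw, 1), n)  # 880.3 881
```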
Problem 2
A study is to be done to determine the effect of 2 drugs (A and B) on blood glucose level (BGL). From previous studies using those drugs, SDs of BGL of 8 and 12 g/dl were obtained, respectively. A confidence level of 95% and a power of 90% are required to detect a mean difference between the two groups of 3 g/dl. How many subjects should be included in each group?
Answer
$$ n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2} = \frac{(8^2 + 12^2) \times 10.5}{3^2} = 242.7 \approx 243 \text{ in each group} $$
Problem 3
It was desired to estimate the proportion of anaemic children in a certain preparatory school. In a similar study at another school a proportion of 30% was detected. Compute the minimal sample size required at a confidence level of 95%, accepting a difference of up to 4% from the true population proportion.
Answer
$$ n = \frac{Z^2 \pi (1 - \pi)}{D^2} = \frac{1.96^2 \times 0.3 (1 - 0.3)}{(0.04)^2} = 504.21 \approx 505 $$
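
The Problem 3 result can be reproduced with the sketch below, using Z = 1.96 for 95% confidence as in the slides; the function name is hypothetical.

```python
import math


def n_for_proportion(z, p, d):
    """n = Z^2 * p * (1 - p) / D^2 for estimating a single proportion."""
    raw = (z ** 2) * p * (1 - p) / (d ** 2)
    return raw, math.ceil(raw)


raw, n = n_for_proportion(z=1.96, p=0.30, d=0.04)
print(round(raw, 2), n)  # 504.21 505
```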

Problem 4
In previous studies, the percentage of hypertensives among diabetics was 70% and among non-diabetics was 40% in a certain community. A researcher wants to perform a comparative study of hypertension among diabetics and non-diabetics at a confidence level of 95% and a power of 80%. What is the minimal sample to be taken from each group, with an accepted difference of 4% from the true value?
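Answer
P is the mean of the two proportions: P = (0.70 + 0.40) / 2 = 0.55
$$ n = \frac{2 P (1 - P) F}{D^2} = \frac{2 \times 0.55 (1 - 0.55) \times 7.8}{0.04^2} = 2413.1 \approx 2414 \text{ in each group} $$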
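
A minimal sketch of the Problem 4 calculation, assuming P is the average of the two reported proportions (0.70 and 0.40) and taking F = 7.8 as in the slides; the function name is hypothetical.

```python
import math


def n_per_group_proportions(p1, p2, f, d):
    """n = 2 * P * (1 - P) * F / D^2, with P the mean of p1 and p2 (assumed)."""
    p = (p1 + p2) / 2
    raw = 2 * p * (1 - p) * f / (d ** 2)
    return raw, math.ceil(raw)


raw, n = n_per_group_proportions(p1=0.70, p2=0.40, f=7.8, d=0.04)
print(round(raw, 1), n)  # 2413.1 2414
```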

Cost

Precision