
Sampling and Sampling Distributions
Suppose a researcher intends to study the
employment status of graduates of the Bachelor in
Secondary Education, Major in Mathematics program
in the Philippines (2007-2015). He wants to know the
average time it takes, after graduation, for these
teachers to secure a permanent position in the
Department of Education.
Questions:
1. Would it be feasible for the researcher to
survey 100% of the group?
2. What obstacles might the researcher face
in the course of the research?
3. Is it valid to select representatives from the
group to answer the survey?
SAMPLING
• means selecting a particular group or
sample to represent the entire
population.
SAMPLING
• is a process used in statistical analysis
in which a predetermined number of
observations (sample) are taken from
a larger population.
[Figure: panels A and B, each showing a sample drawn from a population]
Sampling Techniques
• Random Sampling
- is a sampling technique in which the subjects of the population get
an equal opportunity to be selected as a representative sample.
a. Simple Random Sampling
b. Stratified Random Sampling
c. Cluster Random Sampling
d. Multistage Random Sampling
• Non-Random Sampling
a. Simple Random Sampling
• Each individual in the sample is randomly chosen
• Each individual has an equal chance of selection
b. Stratified Random Sampling
• The population is divided into homogeneous groups or strata
• A proportionate number of units from each stratum is randomly selected
c. Cluster Random Sampling
• The population is divided according to pre-existing groups or clusters
• The sample is taken from the randomly selected clusters
d. Multistage Random Sampling
• The sample is constructed by taking a series of simple random samples in stages
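The first three techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 300 graduates, the region labels, and the helper names (`simple_random_sample`, `stratified_sample`, `cluster_sample`) are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Every unit has the same chance of being selected."""
    return rng.sample(population, k)

def stratified_sample(population, key, k, rng):
    """Draw from each stratum in proportion to its share of the population."""
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # Rounding can make the total differ slightly from k in general.
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(population, key, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = sorted({key(unit) for unit in population})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in population if key(unit) in chosen]

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: (graduate id, region) pairs.
regions = ["North"] * 150 + ["South"] * 100 + ["East"] * 50
population = list(enumerate(regions))

srs = simple_random_sample(population, 30, rng)
strat = stratified_sample(population, lambda u: u[1], 30, rng)
clus = cluster_sample(population, lambda u: u[1], 1, rng)

print(len(srs), len(strat), len(clus))
```

Multistage sampling could be built from the same pieces, for example by calling `cluster_sample` first and then `simple_random_sample` on the result.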
Parameter vs. Statistic
Parameter (Population Data)
A parameter is a number that
describes the population. It is a fixed
number, but in practice we do not
know its value.
(e.g., population mean, population
variance, population standard
deviation, population proportion)
Statistic (Sample Data)
A statistic is a function of the sample
data, i.e., it is a quantity whose value
can be calculated from the sample
data.
Statistics are used to make
inference about unknown
population parameters.
Symbols Commonly Used
                      Population    Sample
Mean                  μ             x̄
Variance              σ²            s²
Standard Deviation    σ             s
Proportion            p             p̂
Sampling Distribution of the Sample Mean
Sampling Distribution
Suppose that we draw all possible samples of size n from
a given population, compute the mean of each sample, and then
take the average of all the sample means.

A sampling distribution is a
probability distribution of a statistic
obtained through a large number of
samples drawn from a specific
population.
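To make the idea of "all possible samples" concrete, here is a minimal sketch that enumerates every sample of size 2 from a tiny population and tabulates the sample means. The five population values are hypothetical, and sampling is assumed to be without replacement.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]  # hypothetical population values
n = 2                          # sample size

# Every possible sample of size n (without replacement) and its mean.
sample_means = [mean(sample) for sample in combinations(population, n)]

# The sampling distribution: each distinct sample mean with its probability.
counts = Counter(sample_means)
for value in sorted(counts):
    print(value, counts[value] / len(sample_means))

mu = mean(population)
mean_of_means = mean(sample_means)
print(mu, mean_of_means)
```

In this sketch the average of all the sample means comes out equal to the population mean, which is the property the following slides rely on.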
Suppose a researcher intends to study the
employment status of graduates of the Bachelor in
Secondary Education, Major in Mathematics program
in the Philippines (2007-2015). It is known that the
population standard deviation is 15. What is the
probability that the average duration for a sample of
40 graduates is less than 3 years?
10 Samples of Sample Size 40
Sample #    Sample Mean
1           5.11
2           4.17
3           3.02
4           2.89
5           4.17
6           5.09
7           8.4
8           6.43
9           2.61
10          3.49
 

Find:
a) Expected Value of the Mean
b) Standard Error of the Mean
c) The Probability that $\bar{x} < 3$ (using z-values)
Expected Value of the Mean
Sampling Distribution

The mean of the distribution of sample means is called the Expected
Value of $\bar{x}$ and is always equal to the population mean μ.
$$E(\bar{x}) = \mu \quad \text{or} \quad \mu_{\bar{x}} = \mu$$
• Where:
• $\bar{x}$ is the mean of the samples
• $\mu$ is the population mean


a) Find the Expected Value of the Mean
$$\mu_{\bar{x}} = \frac{\sum x_i}{n} = \frac{45.38}{10} = 4.538 \approx 4.54$$
Here the $x_i$ are the 10 sample means listed in the table above, and $n = 10$ is the number of samples.
Standard Error of the Mean
The standard deviation of the distribution of sample means is
called the Standard Error of the Mean.
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Where:
$\sigma$ is the population standard deviation
$n$ is the sample size
b) Find the Standard Error of the Mean
Given: $\sigma = 15$, $n = 40$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{40}} = 2.37$$
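The same computation can be checked in Python. The function name `standard_error` is only an illustrative choice; the inputs are the values given in the problem.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

se = standard_error(15, 40)  # population SD 15, sample size 40
print(round(se, 2))  # 2.37
```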
Z-value for the Sampling Distribution of the Sample Mean
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
Where:
$\sigma_{\bar{x}}$ is the SD of the sampling distribution


c) z-values for the distribution
Given: $\bar{x} = 3$, $\mu = 4.54$, $\sigma_{\bar{x}} = 2.37$
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{3 - 4.54}{2.37} = -0.64979 \text{ or } -0.65$$
Area = 0.2578
$$P(\bar{x} < 3) = 25.78\%$$
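As a cross-check on the table lookup, the area under the standard normal curve can be computed with `statistics.NormalDist` from the Python standard library. The variable names are illustrative; the inputs 4.54 and 2.37 are the rounded results of parts (a) and (b).

```python
from statistics import NormalDist

mu_xbar = 4.54  # expected value of the mean, from part (a)
se = 2.37       # standard error of the mean, from part (b)

z = (3 - mu_xbar) / se
p_below_3 = NormalDist().cdf(z)  # area to the left of z

print(round(z, 2))
print(round(p_below_3, 4))
```

Because the slide rounds z to two decimals before reading the table, the software value can differ from 0.2578 in the fourth decimal place.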
Example # 2
Assume that a school district has 8,000
incoming SHS students. In this district, the
average grade of a 10th grader is 80, with a
standard deviation of 20. Suppose you draw a
random sample of 50 students. What is the
probability that (a) the average grade of the
sampled students will be greater than 83? (b) the
average grade will be between 75 and 81?
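One way to approach this example is to model the sampling distribution of the mean as a normal distribution with mean 80 and standard error 20/√50. The sketch below assumes that model applies and ignores any finite-population correction for the 8,000 students; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 20, 50

# Sampling distribution of the sample mean (assumed normal).
sampling_dist = NormalDist(mu, sigma / math.sqrt(n))

# (a) P(sample mean > 83)
p_a = 1 - sampling_dist.cdf(83)

# (b) P(75 < sample mean < 81)
p_b = sampling_dist.cdf(81) - sampling_dist.cdf(75)

print(round(p_a, 4), round(p_b, 4))
```

Answers read from a z-table may differ slightly because the z-values are rounded to two decimals before the lookup.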
Central Limit Theorem
- states that “given a sufficiently large
sample size from a population with a
finite level of variance, the mean of all
samples from the same population will
be approximately equal to the mean of
the population.”
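A small simulation can illustrate the statement. The sketch below assumes a skewed population (exponential with mean 1), a sample size of 50, and 2,000 repeated samples; all of these choices are arbitrary and are not taken from the slides.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

def draw_sample_mean(n):
    """Mean of one random sample of size n from an exponential population."""
    return mean(rng.expovariate(1.0) for _ in range(n))

n = 50
sample_means = [draw_sample_mean(n) for _ in range(2000)]

# The population mean and SD are both 1 for this exponential distribution,
# so the sample means should center near 1 with spread near 1 / sqrt(50).
print(round(mean(sample_means), 3))
print(round(stdev(sample_means), 3), round(1 / n ** 0.5, 3))
```

Even though the population is far from normal, the sample means cluster around the population mean, and their spread is close to the standard error.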
1. The average time it takes a group of college
students to complete a certain examination is 46.2
minutes. The standard deviation is 8 minutes.
Assume that the variable is normally distributed.
What is the probability that a randomly selected
college student will complete the examination in
less than 43 minutes?
2. The average number of milligrams (mg) of
cholesterol in a cup of a certain brand of ice
cream is 660 mg, and the standard deviation
is 35 mg. Assume the variable is normally
distributed. If a cup of ice cream is selected,
what is the probability that the cholesterol
content will be more than 670 mg?
Estimation of Population Parameters
Solve for the Average
Sample values          Average
8   13  14  6   7      9.6
10  13  11  9   14     11.4
9   17  7   10  15     11.6
15  12  8   14  7      11.2
Point Estimate
- A single sample mean used as a plausible value for the population mean.

Ex. A recent study found that watching TV is good for emotional well-being. The
population mean of the hours spent by “happy” people watching TV shows annually
is 75 hrs (per month). Five samples of size 50 have the following sample means:

Sample 1: x̅ = 74.8 hrs
Sample 2: x̅ = 75.6 hrs
Sample 3: x̅ = 76.21 hrs
Sample 4: x̅ = 74.67 hrs
Sample 5: x̅ = 74.33 hrs


Properties of Estimation
• Unbiased (the expected value of the estimator is equal to the
parameter being estimated)
• Consistent (as the sample size increases, the value of the
estimator approaches the value of the parameter being estimated)
• Relatively Efficient (the estimator has a small variance
compared with other estimators)
Interval Estimator
- An interval of reasonable values, based on a
sample, that represents the population mean

Confidence Interval
- An interval estimator for the population
mean
- Associated with a confidence level
Getting the average using Excel
Confidence Level
- Level of assurance that the resulting confidence
interval contains the population mean
Ex. Carlo works as a manager in an authentic bag factory.
He wants to know the average production of all the
workers in the different branches. He randomly selects a
sample of 30 workers and checks to see the average
number of bags being made in a month. He reports that
the population mean is between 40 and 47 bags with
95% level of confidence.
Margin of Error
[Figure: normal curve of sample means with α = 0.05; 95% of the sample means fall in the central region, with 2.5% in each tail]

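Margin of Error at Different Confidence Levels
At 90% confidence level: $\pm 1.645\,\sigma_{\bar{x}}$
At 95% confidence level: $\pm 1.960\,\sigma_{\bar{x}}$
At 99% confidence level: $\pm 2.576\,\sigma_{\bar{x}}$
[Figure: normal curve marked at $-1.96\,\sigma_{\bar{x}}$ and $+1.96\,\sigma_{\bar{x}}$ on either side of the center]

Confidence Interval ($\sigma$ is known)
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$$
Where:
$z_{\alpha/2}$ is the z value of the confidence level
$\sigma_{\bar{x}}$ is the standard error of the mean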
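The multipliers 1.645, 1.960, and 2.576 are the z values that leave α/2 in each tail of the standard normal curve. A short sketch using the standard library's `NormalDist.inv_cdf` recovers them; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value that leaves alpha/2 in each tail for the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```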
Example # 1
To estimate the mean amount spent by one
customer in a canteen, data were collected
from 65 customers. Assume that $\sigma$ is P30.

1) At the 95% level of confidence, what is the margin of error?
2) If the sample mean is P75, what is the 95% confidence interval?
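1) At the 95% level of confidence, what is the margin of error?
Given: $n = 65$, $\sigma = 30$, $z_{\alpha/2} = 1.960$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{30}{\sqrt{65}} = 3.721042$$
$$z_{\alpha/2}\,\sigma_{\bar{x}} = 1.960\,(3.721042) = 7.29$$
The margin of error is 7.29, so the interval is $\bar{x} \pm 7.29$.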
2) If the sample mean is P75, what is the 95% confidence
interval?
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = 75 \pm 7.29$$
$$75 + 7.29 = 82.29$$
$$75 - 7.29 = 67.71$$
$$67.71 < \mu < 82.29$$
Example # 2

A certain car store is having a difficult time with show-rooming:
employees spend time with “false customers”. The
manager determined that if less than 15% of a salesperson’s
8-hour day (that is, less than 4,320 seconds) is spent with false
customers, then show-rooming is not a major problem. One hundred
twenty-five salespersons were selected to measure their
service time to false customers, and the mean was found to be
3,750 seconds. The population standard deviation is
1,958 seconds.
a) Create the 90% confidence interval.
b) Is the current state of show-rooming a major problem?
Answer the following:

1. A sample of size n = 100 produced the sample
mean of x̄ = 16. Assuming the population standard
deviation is 3, compute a 95% confidence interval
for the population mean.

2. The equatorial radius of the planet Jupiter is measured
40 times independently by a process that is practically
free of bias. These measurements average x̄ = 71492
kilometers with a standard deviation of σ = 28 kilometers.
Find a 90% confidence interval for the equatorial radius of
Jupiter.
 
3. The operations manager of a large production plant
would like to estimate the mean amount of time a
worker takes to assemble a new electronic
component. Assume that the standard deviation σ of this
assembly time is 3.6 minutes. After observing 120 workers
assembling similar devices, the manager noticed that their
average time was 16.2 minutes. Construct a 98%
confidence interval for the mean assembly time.
Confidence Interval ($\sigma$ is unknown)
$$\bar{x} \pm t_c\, s_{\bar{x}}$$
Where:
$t_c$ is the critical value for the given degrees of freedom
$s_{\bar{x}} = s / \sqrt{n}$ is the estimated standard error, with $s$ the sample standard deviation
Example # 1

Suppose that a sample of 40 employees at a large company
were surveyed and asked how many hours a week they thought
the company wasted on unnecessary meetings. The mean
number of hours these employees stated was 12.4 with a sample
standard deviation of 5.1. Calculate a 99% confidence interval
to estimate the mean amount of time all employees at this
company believe is wasted on unnecessary meetings each week.
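A possible way to set up this example in Python is sketched below. The standard library has no t distribution, so the critical value is passed in by hand; the value 2.708 for 39 degrees of freedom at 99% confidence is an assumption read from a standard t-table, and the function name `t_interval` is hypothetical.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit must be looked up for n - 1 degrees of freedom.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Assumed t-table value: 99% confidence, df = 40 - 1 = 39.
T_CRIT_99_DF39 = 2.708

low, high = t_interval(12.4, 5.1, 40, T_CRIT_99_DF39)
print(round(low, 2), round(high, 2))
```

With a different table or more decimal places in the critical value, the limits can shift slightly in the second decimal.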
Example # 2

A sample of Alzheimer's patients is tested to assess the amount of
time in stage IV sleep. It has been hypothesized that individuals
suffering from Alzheimer's Disease may spend less time per night in the
deeper stages of sleep. The number of minutes spent in stage IV sleep is
recorded for sixty-one patients. The sample produced a mean of 48
minutes (s = 14 minutes) of stage IV sleep over a 24-hour period.
Compute a 95 percent confidence interval for these data. What does this
information tell you about a particular individual's (an Alzheimer's
patient's) stage IV sleep?
Solve for each of the following:

1. With a sample size of 29 and a
standard deviation of 4.3, what is
the 90% confidence interval if the
sample mean is 4.5?
2. A random sample of size 26 is collected
and the following is determined:
mean = 521, s = 28. Determine a 95%
confidence interval.
3. A Canadian record label wants to learn how
internet downloads of music in Canada are
affecting CD sales. They randomly choose 43
families in various parts of the country and
count the number of individual songs that are
downloaded in an hour. The sample mean was
3947 with a sample standard deviation of 104.
Determine a 90% confidence interval for these
data.
Estimation of Population Proportion
Population Proportion

A population proportion,
generally denoted by $p$ and in some
textbooks by $\pi$, is a parameter that
describes a percentage value
associated with a population.
Sample Proportion

• Denoted by $\hat{p}$ (p-hat)
• A point estimate used for the
population proportion
Example:

A survey of 500 students in a
certain high school found that 80
of them were left-handed.
$$\hat{p} = \frac{80}{500} = 0.16$$
Standard Deviation of Sample
Proportion
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
Where:
$\sigma_{\hat{p}}$ is the standard error of the proportion
$p$ is the population proportion
$n$ is the sample size
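Standard Deviation of Sample
Proportion
Using $\hat{p} = 0.16$ from the example in place of $p$:
$$\sigma_{\hat{p}} = \sqrt{\frac{0.16\,(1 - 0.16)}{500}} = \sqrt{0.0002688} = 0.016395$$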
Distribution of the Sampling Distribution
• If the sample size is sufficiently large, the sampling
distribution of $\hat{p}$ is approximately normal
Confidence Interval for Population
Proportion
$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}}$$

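Obtain a 95% confidence interval for the previous example.
$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = 0.16 \pm 1.96\,(0.016395) = 0.16 \pm 0.0321342$$
$$0.16 + 0.0321342 \approx 0.19$$
$$0.16 - 0.0321342 \approx 0.13$$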
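The left-handed example can be reproduced with a short function. As in the worked solution, this sketch assumes the sample proportion is used in place of the unknown population proportion when computing the standard error; the name `proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Assumption: p_hat stands in for the unknown p in the standard error.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * se, p_hat + z * se

low, high = proportion_interval(80, 500, 0.95)
print(round(low, 2), round(high, 2))
```

The exercises that follow could be attempted with the same function by changing the counts and the confidence level.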
 
Solve the following:

1. A sample of 800 teacher education graduates
applied for vacant positions in a certain
province; 296 of them were male. Compute
a 90% confidence interval for the
population proportion of males.

2. In the Philippines, a survey of 1850 people
was held to find out whether they reject or favor
the death penalty; 1174 were in favor. Obtain a 95%
confidence interval for the proportion of people
who reject the said punishment.

3. Two thousand five hundred car owners were
asked which brand of car they prefer among
A, B, and C; 38% of the sample liked
brand A and 27% liked brand B. Make a 90%
confidence interval for the population
proportion of car owners who prefer brand C.
(Only one choice is made; all 2500 owners made
their choice.)
Thank you very much! God bless!

-Hadassah Grace G. Obdin


