I.
1. Systematic Sampling: Select some starting point, then select every kth member of the group.
Definitions:
1. Simple Random Sample: A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen.
2. Convenience Sampling: Using results that are readily available or that are easy to get.
3. Stratified Sampling: We first subdivide the population into at least two different subgroups (or strata), and draw samples from each group.
2. Random Sample: Every individual member has an equal chance of being selected from the population.
**Consider the selection of three students from a class of six students. If you use a coin toss to select a row, where heads = female and tails = male, then randomness is used and each student has the same chance of being selected, but the result is not a simple random sample.
With the coin toss, some groups of students do not have an equal chance of being selected. Look at our sample space for the selection of three students:
{bbb, bbg, bgb, gbb, bgg, gbg, ggb, ggg}
So, as we recall from our previous lesson, the group bbb has only a 1/8 probability of being selected, but two boys and a girl has a 3/8 probability of being selected. Thus, although our sample is a random sample (each individual has the same chance of being selected), it is not a simple random sample, because not all groups have the same chance of being selected.
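The 1/8 and 3/8 figures can be checked by listing the sample space directly. This is a minimal Python sketch, assuming each of the three selections behaves like an independent fair coin toss (b or g); the variable names are illustrative and not part of the lesson.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of three fair coin tosses: b = boy, g = girl.
sample_space = ["".join(outcome) for outcome in product("bg", repeat=3)]

# Each of the 8 outcomes is assumed equally likely.
p_each = Fraction(1, len(sample_space))

# Probability of the single outcome bbb.
p_bbb = p_each * sample_space.count("bbb")

# Probability of any outcome with exactly two boys and one girl.
p_two_boys = p_each * sum(1 for s in sample_space if s.count("b") == 2)

print(sample_space)
print("P(bbb) =", p_bbb)
print("P(two boys, one girl) =", p_two_boys)
```

Three of the eight outcomes (bbg, bgb, gbb) contain two boys and one girl, which is why that group is three times as likely as bbb.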
Sampling Methods
4. Cluster Sampling: Divide the population into sections (or clusters), then randomly select some of those clusters, and then sample all members of the selected clusters.
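The sampling methods defined in these notes can be compared side by side in code. The following is a rough Python sketch, not a prescribed procedure: the population of 20 numbered members, the two strata, the three clusters, and all function names are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible group of n members is equally likely.
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    # Begin at index `start`, then take every kth member.
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a simple random sample from every subgroup (stratum).
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly pick whole clusters, then keep all members of each one.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)                   # fixed seed so runs are repeatable
population = list(range(1, 21))          # hypothetical members 1..20
strata = {"low": population[:10], "high": population[10:]}
clusters = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}

print(simple_random_sample(population, 3, rng))
print(systematic_sample(population, k=5, start=2))   # [3, 8, 13, 18]
print(stratified_sample(strata, 2, rng))
print(cluster_sample(clusters, 2, rng))
```

Note the difference between the last two: stratified sampling takes a few members from every subgroup, while cluster sampling takes every member from a few subgroups.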
Definitions
1. Frequency Distribution: Shows how data are partitioned among several categories (classes) by listing the classes along with the number (frequency) of data values in each of them.
Example: IQ Score of Low Lead Test Group

IQ Score    Frequency
50-69       2
70-89       33
90-109      35
110-129     7
130-149     1
2. Cumulative Frequency: The sum of the frequency for a class and the frequencies of all previous classes.
IQ Score    Frequency   Cuml. Freq.
50-69       2           2
70-89       33          35
90-109      35          70
110-129     7           77
130-149     1           78
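The cumulative column is simply a running total of the frequency column. A short Python sketch using the IQ frequencies from the table (the variable names are illustrative):

```python
from itertools import accumulate

classes = ["50-69", "70-89", "90-109", "110-129", "130-149"]
frequencies = [2, 33, 35, 7, 1]

# Running total: each entry adds the current class to all previous classes.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(f"{label:<10}{freq:>4}{cum:>6}")
```

The last cumulative value, 78, is the total number of data values.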
III. Normal Distribution
1. When graphed as a histogram or frequency distribution table, a normal distribution has a bell shape.
   1. The frequencies increase to a maximum and then decrease, and
   2. The graph has symmetry, with the left half being close to a mirror image of the right half.
3. Relative Frequency: The proportion or percent of the observations that fall within each class.
*This is obtained by dividing the frequency by the last number in the cumulative frequency column, which is the total number of data values.
IQ Score    Frequency   Cuml. Freq.   Probability   Relative Freq.
50-69       2           2             2/78          3%
70-89       33          35            33/78         42%
90-109      35          70            35/78         45%
110-129     7           77            7/78          9%
130-149     1           78            1/78          1%
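The relative frequencies in the table can be reproduced by dividing each frequency by the total of 78 and rounding to the nearest percent. A minimal Python sketch (the names are illustrative):

```python
frequencies = {"50-69": 2, "70-89": 33, "90-109": 35, "110-129": 7, "130-149": 1}

# Total number of data values; same as the last cumulative frequency.
total = sum(frequencies.values())

# Relative frequency of each class as a whole-number percent.
relative_percent = {label: round(100 * f / total)
                    for label, f in frequencies.items()}

for label, pct in relative_percent.items():
    print(f"{label:<10}{frequencies[label]}/{total} = {pct}%")
```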
4. Class Width: The difference between two consecutive lower class limits.
*Example: The class width in the above sample is 70-50=20, 90-70=20, 110-90=20 and 130-110=20. Thus the class width is 20.
5. Class Boundary: The numbers used to separate the classes, but without the gaps.
*Example: Notice that 69.5, 89.5, and 109.75 don't fall in any of our classes, but the class boundary includes all numbers that are less than 70, i.e. 69.999...
6. Class Midpoint: The value in the middle of a class. It is computed by adding two consecutive lower class boundaries and dividing by 2.
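Class width, class boundaries, and class midpoints can all be derived from the class limits. The Python sketch below uses the IQ classes from the table and assumes the common convention that each boundary sits halfway across the gap between classes (69.5, 89.5, and so on); that convention and the variable names are assumptions, not something the notes spell out.

```python
lower_limits = [50, 70, 90, 110, 130]
upper_limits = [69, 89, 109, 129, 149]

# Class width: difference between consecutive lower class limits.
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]

# Class boundaries: split each gap (such as 69 to 70) down the middle,
# and extend half a gap below the first class and above the last class.
gap = lower_limits[1] - upper_limits[0]
boundaries = [lower_limits[0] - gap / 2] + [u + gap / 2 for u in upper_limits]

# Class midpoints: average of each pair of consecutive boundaries.
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]

print("widths:", widths)
print("boundaries:", boundaries)
print("midpoints:", midpoints)
```

Under that convention the midpoint of the 50-69 class is (49.5 + 69.5) / 2 = 59.5, which is also the average of its two class limits.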
IV. Definitions
1. Parameter: A numerical measurement describing some characteristic of a population.
2. Statistic: A numerical measurement describing some characteristic of a sample.
Example: The population size of 241,472,385 is a parameter, because it is based on the entire population of the United States. The sample size of 2320 surveyed adults in the United States is a statistic, because it is based on a sample. The value of 5% would also be a statistic, because it is also based on the sample.
3. Quantitative (numerical) Data: Consists of numbers representing counts or measurements.
4. Categorical Data: Consists of names or labels that are not numbers representing counts or measurements.
**Example: The ages (in years) of survey respondents are quantitative data. Party affiliations like Republican or Democratic are categorical data. The numbers on the backs of the Arkansas Razorbacks football team's jerseys are categorical data, since the numbers do not provide any tangible measurement.
5. Discrete Data: Results when the data are quantitative and the number of possible values is finite or countable, such as whole-number counts.
6. Continuous Data: Results when there are infinitely many possible quantitative values.
B. Pareto Charts: This is a bar graph for categorical data, with the added stipulation that the bars are arranged in descending order according to frequency. The vertical scale of a Pareto chart represents frequencies or relative frequencies, and the horizontal scale represents the different categories of qualitative data.
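The only thing that separates a Pareto chart from an ordinary bar graph is the ordering of the bars. The Python sketch below shows that ordering step with a text-only chart; the category counts are invented purely for illustration and do not come from these notes.

```python
# Hypothetical category counts, invented purely for illustration.
counts = {"Democratic": 40, "Republican": 45, "Independent": 12, "Other": 3}

# Pareto ordering: categories sorted by frequency, largest first.
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Crude text chart: one '#' per 3 observations (rounded down).
for category, freq in pareto_order:
    print(f"{category:<12}{'#' * (freq // 3)} {freq}")
```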
**Example: The number of eggs that hens lay in one week is discrete data, because eggs are laid in whole numbers, i.e. a hen can't lay half an egg. The number of gallons of milk a cow produces in a year is continuous data, because a cow may produce any amount between 0 and a maximum.
A. Bar Graphs: Use bars of equal width to show frequencies of categories of categorical data. The vertical scale represents the frequencies or relative frequencies, and the horizontal scale identifies the different categories.
1. Pie Chart: A graph that depicts categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category.
5. Probability Histogram: A graph consisting of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data and the vertical scale represents the probability.
***Note that pie charts are not as good at displaying information as Pareto charts because they are visually less informative.
B. Requirements for a Probability Distribution
Definitions:
1. There is a numerical random variable x and its values are associated with corresponding probabilities.
1. Random Variable: A variable, typically represented by x, that has a single numerical value, determined by chance, for each outcome of a procedure.
2. Discrete Random Variable: A variable that has a collection of values that is finite or countable (for example, whole numbers).
3. Continuous Random Variable: Has infinitely many values, and the collection of values is not countable (the values are not restricted to whole numbers).
4. Probability Distribution: A description that gives the probability for each value of the random variable. It is often expressed in the format of a table, formula, or graph.
3. 0 ≤ P(x) ≤ 1 for every individual value of the random variable x. This is just like before, with the probability being between 0 and 1 inclusive.
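A table of probabilities can be checked against these requirements mechanically. The Python sketch below tests the range condition 0 ≤ P(x) ≤ 1 and also the standard condition that the probabilities sum to 1; the second condition is a well-known requirement that is assumed here because its wording does not appear in the surviving text. The function name and the example tables are illustrative.

```python
from fractions import Fraction

def is_probability_distribution(dist):
    """dist maps each value x of the random variable to P(x)."""
    # Each probability must lie between 0 and 1 inclusive.
    each_in_range = all(0 <= p <= 1 for p in dist.values())
    # Assumed standard requirement: the probabilities add up to exactly 1.
    sums_to_one = sum(dist.values()) == 1
    return each_in_range and sums_to_one

# x = number of boys among three children (equally likely b/g outcomes).
boys = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
# A table whose probabilities add up to more than 1.
broken = {0: Fraction(1, 2), 1: Fraction(3, 4)}

print(is_probability_distribution(boys))
print(is_probability_distribution(broken))
```

Exact fractions are used instead of floats so that the sum-to-1 comparison is not disturbed by rounding.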
C. Identifying Unusual Results with the Range Rule of Thumb
1. A crude but simple tool for understanding and interpreting standard deviation. The center and spread of a distribution can be used to calculate a maximum usual and a minimum usual value. If a value falls outside the min/max usual value, then it is deemed unusual.
**This is a guesstimate only and not a rigid rule.
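The notes do not state how the minimum and maximum usual values are computed. A common convention, assumed here, is to go 2 standard deviations on either side of the mean. The Python sketch below follows that assumption; the mean of 100 and standard deviation of 15 are hypothetical numbers used only to show the arithmetic.

```python
def usual_range(mean, std_dev, k=2):
    # Assumed convention: usual values lie within k = 2 standard
    # deviations of the mean.
    return mean - k * std_dev, mean + k * std_dev

def is_unusual(value, mean, std_dev):
    # A value is deemed unusual if it falls outside the usual range.
    low, high = usual_range(mean, std_dev)
    return value < low or value > high

low, high = usual_range(mean=100, std_dev=15)   # hypothetical mean and spread
print("minimum usual value:", low)    # 70
print("maximum usual value:", high)   # 130
print(is_unusual(135, 100, 15))       # True
print(is_unusual(120, 100, 15))       # False
```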
D. Using Probabilities to Determine When Results Are Unusual
1. Rare Event Rule: If, under a given assumption, the probability of an observed event is very small, then we conclude that the assumption is probably not correct.
B. Density Curves
1. A density curve is the graph of a continuous probability distribution that satisfies the following two requirements:
   1. The total area under the curve equals 1, and
   2. Every point on the curve must have a vertical height that is zero or greater.
C.
**Example: Given the uniform distribution illustrated, find the probability that a randomly selected voltage level is greater than 124.5 volts.
a) x successes among n trials is an unusually high number of successes if the probability of x or more successes is 0.05 or less:
P(x or more) ≤ 0.05
b) x successes among n trials is an unusually low number of successes if the probability of x or fewer successes is 0.05 or less:
P(x or fewer) ≤ 0.05
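The two 0.05 criteria can be applied once the probability of "x or more" or "x or fewer" successes is known. The Python sketch below assumes the trials are independent with a fixed success probability p, so that the number of successes follows a binomial distribution; that assumption, the function names, and the example of 13 successes in 14 trials with p = 0.5 are illustrative and not taken from the notes.

```python
from math import comb

def binomial_pmf(k, n, p):
    # Probability of exactly k successes in n independent trials.
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_x_or_more(x, n, p):
    return sum(binomial_pmf(k, n, p) for k in range(x, n + 1))

def prob_x_or_fewer(x, n, p):
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))

def unusually_high(x, n, p, cutoff=0.05):
    return prob_x_or_more(x, n, p) <= cutoff

def unusually_low(x, n, p, cutoff=0.05):
    return prob_x_or_fewer(x, n, p) <= cutoff

# Hypothetical case: 14 trials, each with success probability 0.5.
print(prob_x_or_more(13, 14, 0.5))   # 15/16384, roughly 0.0009
print(unusually_high(13, 14, 0.5))   # True
print(unusually_high(8, 14, 0.5))    # False
print(unusually_low(1, 14, 0.5))     # True
```

The criterion uses the tail probability (x or more, x or fewer) rather than the probability of exactly x, because with many trials any single exact count has a small probability even when it is typical.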
Uniform Distributions
1. A continuous random variable has a uniform distribution if its values are spread evenly over the range of possibilities. The graph of a uniform distribution is rectangular in shape.
Since the area is a rectangle, and the area of a rectangle is base × height, to find the probability we multiply the distance between the values on the x axis, 125.0 − 124.5 = 0.5, by the height, P(x) = 0.5. Thus the problem becomes:
(125.0 − 124.5) × 0.5 = 0.5 × 0.5 = 0.25
So the probability of randomly selecting a voltage level greater than 124.5 volts is 0.25.
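The base × height calculation generalizes to any cutoff on a uniform distribution. The Python sketch below assumes the voltage levels are spread evenly between 123.0 and 125.0 volts; the lower end of 123.0 is an assumption inferred from the rectangle's height of 0.5 (a total area of 1 requires a base of 2), and the function name is illustrative.

```python
def uniform_prob_greater(x, low, high):
    """P(X > x) for X uniform on [low, high], as the area of a rectangle."""
    height = 1 / (high - low)        # total area under the curve must be 1
    base = high - max(x, low)        # width of the region to the right of x
    return max(base, 0.0) * height   # no area if x is beyond the upper end

# Assumed range of 123.0 to 125.0 volts (lower end inferred, see above).
print(uniform_prob_greater(124.5, low=123.0, high=125.0))   # 0.25
```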
**Example: For New York City weekday late-afternoon subway travel from Times Square to the Mets Stadium, you can take the #7 train that leaves Times Square every 5 minutes. Given the subway departure schedule and the arrival of a passenger, the waiting time is x between 0 and 5 minutes, as described by the uniform distribution depicted below. Note that waiting times can have any value between 0 and 5 minutes, so it is possible to have a waiting time of 2.33457 minutes, and that all waiting times are equally likely. Given the uniform distribution, find the probability that a randomly selected passenger has a waiting time of more than 2 minutes.