Methods (829N1)
Lecture 2: Samples & Data
Maria Savona
SPRU (Science Policy Research Unit)
Basics on sampling
Sampling
What is sampling?
In statistics and survey methodology, sampling is concerned with the
selection of a subset of individuals from within a statistical population to
estimate characteristics of the whole population.
[Figure: Population, Sample, Inference]
The population is the set of all possible cases of interest (e.g. firms, people,
students, countries, patents, etc.).
Sampling frame
Target population
The sampling method is the way in which the sample units are to be
selected.
Probability sampling: the probability of extraction of a population unit is known
a) Simple random sampling
b) Systematic sampling
c) Stratified sampling
Non-probability sampling
d) Quota sampling
e) Convenience sampling
f) Snowball sampling
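As a minimal sketch of the simplest probability method, simple random sampling, the snippet below draws units without replacement so that every unit in the frame has the same chance of selection. The frame `firms`, the helper name `simple_random_sample`, and the sample size are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units; every unit is equally likely to be selected."""
    rng = random.Random(seed)  # seeded generator so the draw can be repeated
    return rng.sample(frame, n)


# Hypothetical sampling frame of 100 firms (not taken from the lecture).
firms = [f"firm_{i}" for i in range(1, 101)]
srs = simple_random_sample(firms, 10, seed=1)
print(srs)
```

With a frame of 100 units and a sample of 10, each unit has a known selection probability of 10/100, which is what makes this a probability method.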
b) Systematic sampling
[Figure: list of population units A to G, with selected units forming the SAMPLE]
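Systematic sampling is commonly implemented by choosing a random starting point and then taking every k-th unit from the list. The sketch below assumes that convention; the function name `systematic_sample` and the 20-unit frame are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Take every k-th unit of the frame after a random start."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first interval
    return frame[start::k][:n]


# Hypothetical ordered list of 20 units (not taken from the lecture).
units = [f"unit_{i}" for i in range(20)]
sys_sample = systematic_sample(units, 5, seed=3)
print(sys_sample)
```

Because the selection depends on the order of the list, any periodic pattern in that order can carry over into the sample.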
c) Stratified sampling
LIST OF ALL SOCIAL SCIENCE POSTGRADUATES SORTED BY DEPARTMENT
Strata: SPRU, Sociology, Geography, History, etc.
Total: N = 1000
How many from each stratum?
$n_h = n \cdot \frac{N_h}{\sum_h N_h}$
Example: $n_{SPRU} = \frac{80}{1000} \cdot 50 = 4$
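The question of how many units to take from each stratum can be sketched with proportional allocation, where each stratum's share of the sample equals its share of the population. Reading the slide's figures as a SPRU stratum of 80, an overall sample of 50, and a population of 1000 is an assumption; the sizes of the other strata below are hypothetical and only chosen to sum to 1000.

```python
def proportional_allocation(stratum_sizes, n):
    """Return n_h = n * N_h / N for each stratum h, rounded to whole units."""
    total = sum(stratum_sizes.values())  # N, the population size
    # Rounding can make the allocations sum to slightly more or less than n.
    return {h: round(n * size / total) for h, size in stratum_sizes.items()}


# SPRU = 80 and N = 1000 follow the slide; the other sizes are hypothetical.
strata = {"SPRU": 80, "Sociology": 120, "Other departments": 800}
allocation = proportional_allocation(strata, n=50)
print(allocation)
```

Once the allocation is fixed, a simple random sample of the allocated size is typically drawn within each stratum.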
d) Quota sampling
f) Snowball Sampling
Non-sampling error: errors arising from all aspects of the procedure other than the sampling itself
a) Poorly designed sampling frame
b) Measurement errors during fieldwork
c) Systematic non-response
d) Systematic attrition
Data...!
Cross-sectional data
Longitudinal data
Time series
Panel data
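To make the distinction between these data types concrete, the sketch below uses the usual definitions, which the slide itself does not spell out: a cross-section observes many units at one point in time, a time series follows one unit over time, and a panel follows many units over time. All firm names and patent counts are hypothetical.

```python
# Hypothetical observations: (firm, year, number of patents).
records = [
    ("A", 2019, 3), ("A", 2020, 5),
    ("B", 2019, 0), ("B", 2020, 1),
]

# Cross-sectional data: several firms observed in a single year.
cross_section = [(firm, patents) for firm, year, patents in records if year == 2020]

# Time series: a single firm observed over several years.
time_series = [(year, patents) for firm, year, patents in records if firm == "A"]

# Panel (longitudinal) data: the same firms observed over several years.
panel = {}
for firm, year, patents in records:
    panel.setdefault(firm, {})[year] = patents

print(cross_section, time_series, panel)
```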
Variables
A variable is a condition or a quality that can differ from
one case to another
Conceptual definition: literal or general definition of the
variable
Operational definition: specifying the criteria for taking
a measurement of that variable
For example:
We want to measure firms' innovativeness
We define innovativeness as the capacity to produce new
inventions
We measure innovativeness by taking the number of patents
of the firm
Scales of measurement
4 levels of measurement
Nominal measures use numbers simply as labels for different values (e.g.,
Female=1, Male=2; or Bus=1, Train=2, Car=3).
Ordinal measures are like nominal ones in that they too use numbers simply as
labels, but in this case a higher number does indicate more and a lower number less
(e.g., How often do you smoke? Never=1, Sometimes=2, Frequently=3, Very
often=4).
Interval/ratio scales are those that allow us to say by how much a case is better or
stronger than another.
Interval scales measure the order of data points and the size of the intervals in
between data points.
Ratio scales are interval scales with a true zero point.
Interval scales can have an arbitrary zero reference point, while ratio scales
have a true zero point. For example, 0 age and 0 income mean no age and no
income (i.e. the zero reference point is non-arbitrary), while 0 °C only indicates
the point at which water freezes; it does not mean no heat at all! However, this
distinction is not relevant for the kind of analysis we will carry out.
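One practical consequence of the level of measurement is which summary statistic is meaningful. The sketch below follows a common convention that is not stated in the lecture: the mode for nominal data, the median for ordinal data, and the mean for interval/ratio data. All values are hypothetical.

```python
import statistics

# Nominal: categories are labels only, so only counting makes sense.
transport = ["Bus", "Train", "Car", "Bus"]
most_common_transport = statistics.mode(transport)

# Ordinal: codes are ordered (Never=1 ... Very often=4), so a median is meaningful.
smoking = [1, 2, 2, 4, 3]
median_smoking = statistics.median(smoking)

# Ratio: differences and ratios are meaningful, so a mean can be computed.
ages = [23, 31, 27, 45]
mean_age = statistics.mean(ages)

print(most_common_transport, median_smoking, mean_age)
```

Averaging the nominal codes (e.g. Bus=1, Train=2, Car=3) would run without error but would not describe anything, since the numbers are arbitrary labels.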
[Table: Variable name and Type]
Variable names: Marital status, Age, Firms' sales, Duration of unemployment
Type entries shown: None, Ratio scale
http://www.sussex.ac.uk/its/services/software/owncomputer