
Research Methodology

Lecture 6: Methods of Data Collection


Data collection
✓ After a researcher defines the things, phenomena, or
variables to be studied, a problem and hypothesis are
formulated.

✓ The next step is for the researcher to determine how the
variables or things being studied must be measured,
observed, or recorded.

✓ Appropriate data collection is essential to the validity of a
study.
Data collection: definition

✓ Data collection is the process of preparing and collecting data.
✓ The purpose of data collection is to obtain information to
keep on record, to make decisions about important issues,
and to pass information on to others.
✓ Primarily, data are collected to provide information
regarding a specific topic.
Data collection plan

✓ Pre-collection activity: agree on goals, target
data, definitions, and methods

✓ Collection: data collection

✓ Presenting findings: usually involves some form
of sorting, analysis, and/or presentation
Methods of Data collection
✓ Qualitative
– typically involves qualitative data, i.e., data obtained through
methods such as interviews, on-site observations, and focus
groups that are in narrative rather than numerical form

✓ Quantitative
– uses numerical and statistical processes to answer specific
questions. Statistics are used in a variety of ways to support
inquiry or program assessment/evaluation.
Methods of Data collection
✓ Qualitative data collection
– they tend to be open-ended and have less structured
protocols (i.e., researchers may change the data collection
strategy by adding, refining, or dropping techniques or
informants)
– they rely more heavily on interactive interviews;
respondents may be interviewed several times to follow
up on a particular issue, clarify concepts or check the
reliability of data
Methods of Data collection
✓ Qualitative data collection
– findings are not generalizable to any specific population
– Data collection in a qualitative study takes a great deal of
time.
– The researcher needs to record any potentially useful data
– The qualitative methods most commonly used in evaluation
fall into three categories:
• in-depth interview
• observation methods
• document review
Methods of Data collection
✓ Quantitative data collection
– They produce results that are easy to summarize, compare,
and generalize.
– Participants may be randomly assigned to different
treatments.
– Collect data on participant and situational characteristics in
order to statistically control for their influence on the
dependent, or outcome, variable.
Methods of Data collection
✓ Quantitative data collection
– To generalize from the research participants to a larger
population, the researcher will employ probability sampling
to select participants.
– Typical quantitative data gathering strategies include:
• Experiments and trials.
• Observing and recording well-defined events (e.g., counting the
number of patients waiting in emergency at specified times of the
day).
• Obtaining relevant data from management information systems.
• Questionnaires
• Administering surveys with closed-ended questions (e.g., face-to-face
and telephone interviews, questionnaires, etc.)
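The probability sampling mentioned above can be illustrated with a simple random sample, in which every member of the sampling frame has the same chance of selection. This is a minimal sketch, not a prescribed procedure: the function name `simple_random_sample`, the frame of 500 patient IDs, and the seed are all assumptions made for illustration.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, sample_size)


# Hypothetical sampling frame: patient IDs 1..500 (illustrative only)
sampling_frame = list(range(1, 501))
sample = simple_random_sample(sampling_frame, sample_size=50, seed=42)

print(len(sample), "participants selected")
```

Other probability designs (stratified, cluster, systematic) follow the same principle of known selection probabilities but need more bookkeeping.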
Methods of Data collection:
type of the study
✓ Census. Data from every member of a population. In most studies,
a census is not practical, because of the cost and/or time required.
✓ Sample survey. Data from a subset of a population, in order to
estimate population attributes.
✓ Experiment. A controlled study in which the researcher attempts to
understand cause-and-effect relationships. The study is "controlled"
in the sense that the researcher controls (1) how subjects are
assigned to groups and (2) which treatments each group receives.
✓ Observational study. Like experiments, observational studies
attempt to understand cause-and-effect relationships. However,
unlike experiments, the researcher is not able to control (1) how
subjects are assigned to groups and/or (2) which treatments each
group receives.
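What "controlling how subjects are assigned to groups" means in an experiment can be sketched as random assignment. The code below is only an illustration: the helper `randomly_assign`, the subject labels, and the two group names are hypothetical, and real trials often use more elaborate schemes such as blocked or stratified randomization.

```python
import random


def randomly_assign(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(subject)
    return assignment


# Hypothetical subject labels S01..S20
subjects = [f"S{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(subjects, seed=1)

for group, members in assignment.items():
    print(group, len(members))
```

In an observational study this step is absent: group membership is whatever it happens to be, which is why confounding is harder to rule out.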
Methods of Data collection
Pros and cons
✓ Resources. When the population is large, a sample survey has a big resource
advantage over a census. A well-designed sample survey can provide very precise
estimates of population parameters - quicker, cheaper, and with less manpower than
a census.

✓ Generalizability. Generalizability refers to the appropriateness of applying findings
from a study to a larger population. Generalizability requires random selection. If
participants in a study are randomly selected from a larger population, it is
appropriate to generalize study results to the larger population; if not, it is not
appropriate to generalize.
Observational studies typically do not feature random selection, so it is usually not
appropriate to generalize from the results of an observational study to a larger population.

✓ Causal inference. Cause-and-effect relationships can be teased out when subjects are
randomly assigned to groups. Therefore, experiments, which allow the researcher to
control assignment of subjects to treatment groups, are the best method for
investigating causal relationships.
Where do data come from?

✓ Take a step back – if we’re starting from
baseline, how do we collect / find data?

– Secondary data
• data someone else has collected
– Primary data
• data you collect
Secondary Data: Sources
✓ County transportation departments
✓ Vital Statistics – birth, death certificates
✓ Private and foundation databases
✓ City and county governments
✓ Surveillance data from state government programs
✓ Federal agency statistics - Census, etc.
Secondary Data: Limitations
✓When was it collected? For how long?
– May be out of date for what you want to analyze.
– May not have been collected long enough for detecting
trends.
✓Is the data set complete?
– There may be missing information on some observations
– Unless such missing information is caught and corrected
for, analysis will be biased.
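A first completeness check on a secondary data set is to count the missing values in each field before any analysis. The sketch below assumes records arrive as dictionaries and that `None` or an empty string marks a missing value; the function `missing_report` and the three sample records are invented for illustration.

```python
def missing_report(records, fields):
    """Count missing (None or empty-string) values for each field."""
    counts = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            if record.get(field) in (None, ""):
                counts[field] += 1
    return counts


# Made-up records; a real data set would be loaded from its source file
records = [
    {"age": 34, "county": "A", "outcome": 1},
    {"age": None, "county": "B", "outcome": 0},
    {"age": 51, "county": "", "outcome": None},
]
report = missing_report(records, ["age", "county", "outcome"])
print(report)
```

Counting is only the first step. Whether the gaps bias the analysis depends on why the values are missing.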
Secondary Data: Limitations
✓ Are there confounding problems?
– Sample selection bias?
– Source choice bias?
– In time series, did some observations drop out over time?
✓ Are the data consistent/reliable?
– Did variables drop out over time?
– Did variables change in definition over time?
✓ Is the information exactly what you need?
– In some cases, may have to use “proxy variables” – variables that may
approximate something you really wanted to measure.
– Are they reliable?
– Is there correlation to what you actually want to measure?
Secondary Data – Advantages
✓ No need to reinvent the wheel.
– If someone has found the data, take advantage of it.
✓ It will save you money.
– Even if you have to pay for access, it is often cheaper than
collecting your own data.
✓ It will save you time.
– Primary data collection is very time consuming.
✓ It may be very accurate.
– Especially when a government agency has collected the data.
✓ It has great exploratory value
– Exploring research questions and formulating hypotheses to test.
Primary Data - Examples
✓ Surveys
✓ Focus groups
✓ Questionnaires
✓ Diaries
✓ Personal interviews
✓ Biophysiologic Measures (in vivo/in vitro)
✓ Experiments and observational studies
Questionnaires
Advantages:
✓Can be posted, e-mailed or faxed with a wide
geographic coverage
✓Can cover a large number of people or
organizations.
✓Relatively cheap.
✓Avoids embarrassment on the part of the
respondent.
✓Possible anonymity of respondent.
✓No interviewer bias.
Questionnaires
Disadvantages:
✓ Design problems.
✓ Questions have to be relatively simple.
✓ Historically low response rate (although inducements
may help).
✓ Time delay whilst waiting for responses to be returned.
✓ Require a return deadline.
✓ Several reminders may be required.
✓ International validity
✓ Not possible to give assistance if required.
✓ Problems with incomplete questionnaires.
✓ Respondent can read all questions beforehand and
then decide whether to complete or not.
Personal Interviews
(structured; semistructured; unstructured)
Advantages:
✓ Serious approach by respondent resulting in accurate
information.
✓ Good response rate.
✓ Complete and immediate.
✓ Interviewer in control and can give help if there is a
problem.
✓ Can investigate motives and feelings.
✓ Can use recording equipment.
✓ If one interviewer used, uniformity of approach.
✓ Used to pilot other methods.
Personal Interviews
(structured; semistructured; unstructured)
Disadvantages:
✓Need to set up interviews, time consuming
and geographic limitations.
✓Can be expensive.
✓Normally need a set of questions.
✓Respondent bias – tendency to please or
impress, create false personal image, or end
interview quickly.
✓Embarrassment possible if personal questions.
✓If many interviewers, training required.
Phone interviews
Advantages:
✓Relatively cheap and quick.
✓Can cover reasonably large numbers of
people or organisations.
✓Wide geographic coverage.
✓High response rate.
✓Help can be given to the respondent.
✓Can tape answers.
Phone interviews
Disadvantages:
✓Questionnaire required.
✓Not everyone has a telephone.
✓Repeat calls are inevitable – average 2.5 calls
to get someone.
✓Time is wasted.
✓Respondent has little time to think.
✓Cannot use visual aids.
✓Can cause irritation.
✓Good telephone manner is required.
Primary Data - Limitations
✓ Do you have the time and money for:
– Designing your collection instrument?
– Selecting your population or sample?
– Administration of the instrument?
– Entry/collection of data?
✓ Uniqueness
– May not be able to compare to other populations
✓ Researcher error
– Sample bias
– Other confounding factors
Data collection:
Take home message
✓ Data collection is essential for study validity
✓ Experimental research is mostly based on quantitative methods of
data collection
✓ Will the data answer my research question?
✓ If the data exist in secondary form, then use them to the extent
you can, keeping in mind limitations.
✓ But if they do not, and you are able to fund primary collection,
then it is the method of choice.
Quantitative Methods
• Experiment: Research situation with at least
one independent variable, which is
manipulated by the researcher
Dependent and Independent Variable
• Independent Variable: The variable manipulated or
selected by the researcher. The presumed cause of the
outcome of the study.
• Dependent Variable: The variable being
affected by the independent variable. The
presumed effect measured in the study.
y = f(x)
Which is which here?
Key Factors for High Quality
Experimental Design
• Data should not be contaminated by poor
measurement or errors in procedure.
• Eliminate confounding variables from study or
minimize effects on variables.
• Representativeness: Does your sample
represent the population you are studying?
Must use random sampling techniques.
What Makes a Good Quantitative
Research Design?
4 Key Elements

• Freedom from Bias
• Freedom from Confounding
• Control of Extraneous Variables
• Statistical Precision to Test Hypothesis
Bias, Confounding and Extraneous
Variables
• Bias: When observations favour some individuals
in the population over others.
• Confounding: When the effects of two or more
variables cannot be separated.
• Extraneous Variables: Any variable other than the independent
variable that has an effect on the dependent variable.
• Need to identify and minimize these variables.
e.g., Erosion potential as a function of clay
content: rainfall intensity, vegetation & duration
would be considered extraneous variables.
Precision versus accuracy
• "Precise" means sharply
defined or measured.

• "Accurate" means truthful or
correct.

[Figure: four panels labelled "Both accurate and precise", "Accurate, not precise", "Not accurate but precise", and "Neither accurate nor precise"]
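One common way to put numbers on the two ideas is to treat accuracy as the distance between the average measurement and the true value (bias), and precision as the spread of repeated measurements. The sketch below follows that reading; the function name, the true value of 10.0, and both measurement lists are made up for illustration.

```python
import statistics


def accuracy_and_precision(measurements, true_value):
    """Return (bias, spread): bias near 0 is accurate, small spread is precise."""
    bias = statistics.mean(measurements) - true_value
    spread = statistics.stdev(measurements)  # sample standard deviation
    return bias, spread


true_value = 10.0  # assumed known reference value
precise_not_accurate = [12.1, 12.0, 11.9, 12.0]  # tight cluster, off target
accurate_not_precise = [8.0, 12.0, 9.0, 11.0]    # centred on target, scattered

bias_a, spread_a = accuracy_and_precision(precise_not_accurate, true_value)
bias_b, spread_b = accuracy_and_precision(accurate_not_precise, true_value)
print(f"precise, not accurate: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"accurate, not precise: bias={bias_b:.2f}, spread={spread_b:.2f}")
```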
Interpreting Results of Experiments
• Goal of research is to draw conclusions. What
did the study mean?
• What, if any, is the cause and effect of the
outcome?
Introduction to Sampling
• Sampling is the problem of
accurately acquiring the necessary
data in order to form a
representative view of the problem.
• This is much more difficult to do than
is generally realized.
Overall Methodology:
• State the objectives of the survey
• Define the target population
• Define the data to be collected
• Define the variables to be determined
• Define the required precision & accuracy
• Define the measurement 'instrument'
• Define the sample size & sampling method,
then select the sample
Sampling
Distributions:
• When you form a sample, you often show it as
a plotted distribution known as a histogram.
• A histogram is the distribution of frequency of
occurrence of a certain variable within a
specified range.
• A histogram is not a bar graph, although the two look very
similar: its bars represent adjacent numeric intervals rather
than separate categories.
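The frequency counts behind a histogram can be sketched by grouping values into equal-width intervals. This is an illustrative example only: the helper `histogram_counts`, the bin width of 5, and the small data set are assumptions, and plotting libraries choose bins in their own ways.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall in each interval [lower, lower + bin_width)."""
    counts = {}
    for value in values:
        bin_index = int((value - start) // bin_width)
        lower = start + bin_index * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Small illustrative data set
values = [1, 1, 2, 2, 3, 6, 7, 11, 11, 13, 14, 16, 19]
counts = histogram_counts(values, bin_width=5, start=0)

# Text rendering: one row per interval, one '#' per observation
for lower, count in counts.items():
    print(f"{lower:>2}-{lower + 5:<2} {'#' * count}")
```

Because the intervals are adjacent ranges of one numeric variable, changing `bin_width` changes the shape of the picture, which is not true of a bar graph of categories.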
Interpreting quantitative findings
• Descriptive Statistics : Mean, median, mode,
frequencies

• Error analyses
Mean
• In science the term mean is really the
arithmetic mean
• Given by the equation

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Or, more simply put, the sum of the values divided by the number of values
summed
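The arithmetic mean described above, the sum of the values divided by how many there are, can be written directly. The function name and the four sample values are illustrative assumptions.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


sample_values = [2, 4, 6, 8]  # made-up data
mean_value = arithmetic_mean(sample_values)
print(mean_value)  # 5.0
```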
Median
• Consider the set
• 1, 1, 2, 2, 3, 6, 7, 11, 11, 13, 14, 16, 19
– In this case there are 13 values, so the median is the middle
value, found at position (n+1) / 2
– (13+1) / 2 = 7, so the median is the 7th value, which is 7
• Consider the set
• 1, 1, 2, 2, 3, 6, 7, 11, 11, 13, 14, 16
– In the second case n is even, so position (n+1) / 2 falls between
the two middle values and the median is their mean
– (12 + 1) / 2 = 6.5, so the median is the mean of the 6th and 7th
values: (6 + 7) / 2 = 6.5
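The two median cases on this slide can be checked with a short function: sort the data, take the middle value when the count is odd, and average the two middle values when it is even. The function name `median` is a local illustration (Python's standard library offers `statistics.median` for real use).

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_set = [1, 1, 2, 2, 3, 6, 7, 11, 11, 13, 14, 16, 19]
even_set = [1, 1, 2, 2, 3, 6, 7, 11, 11, 13, 14, 16]

print(median(odd_set))   # 7
print(median(even_set))  # 6.5
```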
Mode
The most frequent value in a data set
• Consider the set
• 1, 1, 1, 1, 2, 2, 3, 6, 11, 11, 11, 13, 14, 16, 19
– In this case the mode is 1 because it is the most common value

• There may be cases where there is more than one
mode, as in this case

• Consider the set
• 1, 1, 1, 1, 2, 2, 3, 6, 11, 11, 11, 11, 13, 14, 16, 19
– In this case there are two modes (bimodal): 1 and 11, because
both occur 4 times in the data set.
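Both mode examples can be reproduced by counting how often each value occurs and keeping every value that reaches the highest count. The helper name `modes` is an assumption; it returns a list so that the bimodal case is handled naturally.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


one_mode = [1, 1, 1, 1, 2, 2, 3, 6, 11, 11, 11, 13, 14, 16, 19]
two_modes = [1, 1, 1, 1, 2, 2, 3, 6, 11, 11, 11, 11, 13, 14, 16, 19]

print(modes(one_mode))   # [1]
print(modes(two_modes))  # [1, 11]
```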
USES AND MISUSES OF
STATISTICS
Uses of Statistics
• Describe data
• Compare two or more data sets
• Determine if a relationship exists between
variables
• Test hypotheses (educated guesses)
• Make estimates about population characteristics
• Predict past or future behavior of data

Use of statistics can be impressive to employers.


Sources of Misuse
• There are two main sources of misuse of
statistics:
– Evil intent on part of a dishonest researcher
– Unintentional errors (stupidity) on part of a
researcher who does not know any better
Misuses of Statistics
• Samples
– Voluntary-response sample (or self-selected sample)
• One in which the subjects themselves decide whether to be
included---creates built-in bias
– Telephone call-in polls (radio)
– Mail-in polls
– Internet polls
– Small Samples
• Too few subjects used
– Convenience
• Not representative, since subjects are chosen because they can be easily accessed
Misuses of Statistics
• Graphs
– Can be drawn inappropriately, leading to false conclusions
• Watch the “scales”
• Omission of labels or units on the axes
• Exaggeration of a one-dimensional increase by using a
two-dimensional graph
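The last point can be made concrete with a little arithmetic. If a quantity doubles and a pictogram is drawn twice as wide and twice as tall, its area grows fourfold, so the eye sees a much larger change than the data support. The sketch assumes every drawn dimension is scaled by the same ratio; the function name and the values 100 and 200 are illustrative.

```python
def perceived_ratio(old_value, new_value, dimensions):
    """Visual size ratio when each of `dimensions` axes is scaled by new/old."""
    return (new_value / old_value) ** dimensions


true_ratio = perceived_ratio(100, 200, dimensions=1)  # bar height only
area_ratio = perceived_ratio(100, 200, dimensions=2)  # width and height scaled

print(true_ratio)  # 2.0
print(area_ratio)  # 4.0
```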
Misuses of Statistics
• Survey Questions
– Loaded Questions---intentional wording to elicit
a desired response
– Order of Questions
– Nonresponse (Refusal)---subject refuses to answer
questions
– Self-Interest---Sponsor of the survey could enjoy
monetary gains from the results
Misuses of Statistics
• Missing Data (Partial Pictures)
– Detached Statistics ---no comparison is made
– Percentages --
• Precise Numbers
– People believe this implies accuracy
• Implied Connections
– Correlation and Causality – when we find a statistical
association between two variables, we cannot
conclude that one of the variables is the cause of (or
directly affects) the other variable
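A small constructed example shows how two variables can be strongly correlated without either causing the other. In the sketch below, both series are computed from a third variable (temperature), so their correlation is perfect by construction. All numbers, the variable names, and the helper `pearson_r` are invented for illustration and are not real data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical confounder: daily temperature
temperature = [10, 15, 20, 25, 30]
# Two made-up outcomes that each depend only on temperature
ice_cream_sales = [2 * t + 1 for t in temperature]
sunburn_cases = [3 * t - 5 for t in temperature]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(round(r, 3))
```

The correlation between the two outcomes is essentially 1, yet neither influences the other; both follow the confounder.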
