
Research Design in Counseling Psychology
Fall, 2014
Tuesdays, 1:00 to 3:50; 142 HEDCO Building

Instructor: Elizabeth A. Skowron, Ph.D.


257 HEDCO Building
541-346-0913
eskowron@uoregon.edu
office hrs: after class & by appt.

Course Overview

Scientific Methods
Ethical research practice
Sampling, measurement, and methods of data collection
Research designs
o Experimental
o Quasi-experimental
o Correlational
o Longitudinal
Types of validity & plausible threats
Culturally competent research
Randomized clinical trials
Implementation (taking effective interventions to scale)

Course Overview

Scheduled date | Weight (in %) | Activity
All term       | 15            | Class Preparation & Participation
Week 3         |               | CITI Certification
Week 5         | 25            | Exam I
Week 8         | 30            | Exam II
All term       | 25            | In-class/Homework Activities (5 @ 10 pts each)

Introductions
Three Programs
Counseling Psychology
Couples & Family Therapy
Prevention Science
Name __________________
Program ____________________
Interests ____________________
Rate your:
Research self-efficacy (1 2 3 4) (1 = a little, 4 = a lot)
Research skill/experience (1 2 3 4) (1 = a little, 4 = a lot)

Ways of Knowing
Method of tenacity
o The beliefs I firmly adhere to are true

Method of authority
o If noted authorities (i.e., my father, the president, my therapist, my pastor) say it is so,
then it is truth

A priori method
o What makes sense is true

Scientific method
o What is discovered through empiricism is true

o Empiricism = the approach of collecting data and using it to develop,
support, or challenge a theory

Science
A dynamic view regards science as an activity
Make discoveries
Learn facts
Advance knowledge
o Establish general laws & connect knowledge of separately known
events, make reliable predictions of events yet unknown
Improve quality of life

The basic aim of science is discovery that leads to theory


Theory
1. a set of interrelated constructs, definitions, and propositions
2. presents a systematic view of phenomena by specifying relations among variables
3. its purpose is to explain and predict phenomena

The Research Process
(theory building & testing ~ inextricably related)

Theory Building & Theory Testing in Research

Fundamentals of Scientific Exploration


1. Describe
o What is happening? How does it occur?
o Identify and understand phenomena (special, meaningful events whose cause is in
question) in order to reveal their underlying regularities
o Enables us to build models and construct theory to account for those regularities

2. Explain
o Why is it happening? How are things interrelated?
o Involves revealing the nature and structure of phenomena and their operation in
specific conditions
Empirical pattern identification
Theory testing

3. Predict
o Speculate or test what will happen in the future, based on our (theoretical/empirical)
models for what happens and why it happens

4. (Influence)

The Scientific Method


Make observations
Ask questions about the observations
(i.e., frequency, association, causal)
Form a hypothesis
Design research study appropriate to test the hypothesis
Collect data
Analyze data
Accept or reject the hypothesis
o Accept ~ confirm theory
o Reject ~ reject or revise theory


Three Kinds of Conclusions To Draw from Research
Frequency Claims
o Describe a particular rate or level of something
o Typically focus on 1 variable
o Variable is measured, not manipulated
More than 2 million U.S. teens depressed
Half of Americans struggle to stay happy
Almost 1 million children were abused or neglected last year

Associational Claims
o Argue that 2+ variables are related (+/-)
o Involve at least 2 variables
o Variables are measured, not manipulated
Belly fat linked to dementia
Laptop computer use linked to poor sperm quality
Poor nutrition associated with school failure

Causal Claims
o Argue that one variable causes another
o Study must meet 3 criteria: covariation, temporal precedence, and internal validity
Music lessons enhance IQ
PCIT reduces child maltreatment recidivism
Debt stress causes health problems

Three Rules for Causation
In order to make the claim that one variable CAUSES
another variable, the following 3 conditions must apply:
1. Covariation
o Two variables are associated. As A changes, B changes
(e.g., A = public service ad on parenting, B = child abuse)

2. Temporal Precedence
o Cause precedes effect. A appears and then B follows; changes in A precede changes in B
(e.g., public service ads appear on TV, then child abuse rates drop)

3. Internal Validity
o Plausible alternative explanations for the results (i.e., 3rd variable threats) are ruled out
o There are no likely alternative explanations for the change in B; A is the only thing that changed

Instructions for Registering and Completing CITI RCR training

Go to the CITI website: https://www.citiprogram.org/

New Users: Click on the "New Users Register Here" link.
o From the "Participating Institutions" drop-down menu, select University of Oregon as your institution.
o Create your username, password and security question and answer.
o Enter your contact information.

To complete the CITI course, you must complete all required modules
and quizzes, achieving a minimum passing score of 80%. A quiz can be
taken more than once to achieve this minimum score. You are not
required to complete the course in one sitting. Your progress will be
saved if you choose to stop the course and return at a later time.

When you complete all required modules successfully, please print or
download your completion report. A copy will be sent automatically to
Research Compliance Services. Send a copy of your completion report
to Dr. Skowron at eskowron@uoregon.edu with the subject line
"CITI training completed". You can return to the CITI site at any time to
obtain a copy of your completion report.

Next Week
Research Continuum
Variables & methods of measurement
Developing a research study
_________________
Ethical conduct in research
CITI training review


Research Design in Counseling Psychology

Class 2


Fundamentals of Scientific Exploration


aka the course of scientific progress
Description
o What is happening? How does it occur?
o Identify and understand phenomena (special, meaningful events whose cause is in
question) in order to reveal their underlying regularities
o Enables us to build models and construct theory to account for those regularities

Explanation
o Why is it happening? How are things interrelated?
o Involves revealing the nature and structure of phenomena and their operation in
specific conditions
Empirical pattern identification
Theory testing

Prediction
o Speculate or test what will happen in the future, based on our (theoretical/empirical)
models for what happens and why it happens

(Influencing)

Theory-Data Feedback Loop


The basic aim of science is to understand/explain natural
phenomena
These explanations are called Theories
Instead of trying to explain each and every separate behavior of children, we seek
general explanations that encompass and link together many kinds of (similar)
behavior

We formulate hypotheses based on our theory


We collect data to test the hypotheses
Data informs accuracy of our theory, and leads to revisions &
modifications to theory
We formulate new hypotheses
We collect data to test the hypotheses
AND SO ON

The Contact-Comfort Theory


(Another example of Theory-Data Cycle)

Example: Theory-Data Loop
Repairing ruptures in the therapeutic alliance in psychotherapy
(Safran et al., 2011)
Roughly 50% of psychotherapy cases experience an alliance rupture

Theory of how to repair alliance ruptures was constructed
Data collected via studies of psychotherapy process during
sessions: what rupture/repair processes lead to positive outcomes?
Alliance ruptures & repairs defined & measured from client,
therapist, & observer perspectives
Rupture = disagreements about the tasks of therapy, goals of treatment, or strains in the
client-therapist bond

Pattern of rupture repairs linked with good outcomes, refined, retested
Common rupture-repair interventions:
o Therapist acknowledges rupture
o Explores it with client
o Clarifies misunderstandings
o Takes responsibility for his/her contribution
o Explores relational themes (in client's life) associated with rupture
o Links therapy rupture to common patterns in client's life
o Facilitates new experience

Research Continuum
Basic → Translational → Applied
Basic: Pure research that advances fundamental knowledge
about the human world. Focuses on refuting or supporting
theories. The source of most new scientific ideas and ways of
thinking about the world. It can be descriptive or explanatory.
Translational: Research that applies findings from basic
science to practical applications that enhance human health
and well-being. Applying knowledge from basic research is a
major stumbling block in science, partially due to
compartmentalization of work based on expertise.

Applied: Form of research involving the practical application of science.

Developing a Research Project


Identify a topic or area of interest
Formulate research problem
Specify in terms of question re: relationship between 2+
variables
Translate question into a testable hypothesis
o Is it falsifiable?

Design study to test your hypothesis

Identifying Research Topics


Personal interests/experience
Read journals
Study theory
____________________________________

Science must operate at the level of observation, and
gather data to test hypotheses
o Requires us to move from the construct level to the observational level
e.g., early deprivation and learning problems
o We have to define our constructs clearly enough so that observations are
possible

Operationalizing Research Topics
Constructs: concepts that cannot be directly observed
Variable: a symbol to which numbers or values are
assigned; can take on any set of values; can range from dichotomous
to continuous
o When operationally defined, they are observable

Operational definitions: assign meaning to a construct/variable
by spelling out what the investigator must do to measure it
o (1) measured: describes how the variable will be measured
o (2) experimental: spells out the details of the investigator's manipulation of a variable
(e.g., reinforcement schedule; intervention type & dosage)
o No operational definition can ever reflect all of a variable

Types of Variables
1. Independent and Dependent variables
o We are trying to explain the DV or predict the DV
o In correlational and/or experimental studies, we look for variation in the IV to predict the DV
o In experiments we manipulate the IV and look for effects on the DV
o Causal claims: IV is presumed cause of DV; IV is antecedent & DV is consequent
o Association claims: Variables may be called predictor (IV) and criterion (DV)

2. Active and Attribute variables
o Active variables are manipulated (e.g., dose of prevention; experimentally-induced stressor)
o Attribute variables cannot be manipulated, can only be measured
(e.g., most human characteristics: ethnicity, age, sex)
o Some attribute variables may also be active, depending on your design
(e.g., anxiety)

3. Continuous and Categorical variables
o Continuous variables take on an ordered set of values (rank, interval, ratio scale)
o Categorical variables belong to a nominal scale of measurement: two or more subsets
of the set being measured (e.g., political party membership, sex, college alma mater,
religion, etc.)

Methods of Measurement
1. Self-report
o Participant makes an observation or report on self
o + : easy to administer, economical, accesses private thoughts, feelings, behavior not
accessible to investigators
o -- : vulnerable to distortion, presumes client insight/understanding about construct being
measured

2. Other-report (parents, therapist, teacher, etc.)
o Respondents rate the participant on some dimension(s)
o + : easy to administer, economical
o -- : potential systematic bias (e.g., cultural competence of rater in a cross-cultural child
development study)

3. Behavioral observations
o Measures of overt behavior by trained observers using coding system
o + : direct and objective
o -- : presumption that observed behavior is representative; costly; feasibility?

4. Neurobiological indices

5. Interviews
o + : flexible, high completion rate
o -- : costly; feasibility?

6. Unobtrusive measures
o Assessment conducted without participant's awareness
o + : eliminates reactivity to measurement
o -- : expensive?; some types are unethical

Writing Research Problems and Hypotheses


1. Work with your table group to brainstorm a list of interesting
research topics (tables).
2. Work with a partner to identify two topics of interest to you from
the list of topics. State each of these topics as a question about
the relationship between 2 variables (2 person groups).
3. Write a definition for each of your variables.
   a. Identify IVs and DVs; active vs. attribute variables
   b. What methods can be used to measure each of your variables?
      (e.g., self-report, observation, performance, other-report by teacher/parent/spouse, other)

4. Discuss in class

Goal of Ethical Research
to create new knowledge (beneficence)
while preserving the dignity and welfare of
participants (non-maleficence & autonomy)

Ethical scholarship
As researchers, we have responsibility to seek and share
accurate information in our scholarly endeavors, in:
1. Executing a research study
o Respect Ss' rights, conduct study carefully, minimize bias in methods & measures, &
ensure both data & analyses are error-free
o Maintain raw data for 5+ years post-publication

2. Reporting our results
o Accurately, honestly, note limitations & guard against misuse of results
o "The facts are always friendly" (Carl Rogers)

3. In presentations & publications
o Avoid duplicate/piecemeal publication
o Clearly identify multiple publications from same data set

4. Giving accurate publication credit
o Major contributions = authorship
(formulate research question/hypotheses, design study, conduct analyses, write manuscript)
o Minor contributions = footnote (e.g., editing, collecting data, coding data, clerical work)

Publication Credit
In case of student thesis or dissertation, APA guidelines
state that except under exceptional circumstances, a
student is listed as principal author on any multiple-authored
article that is substantially based on the student's
doctoral dissertation

Plagiarism
1. Omitting necessary citations
2. Failing to cite relevant work
3. Verbatim copying of another's writing
FIX: Give credit where/when it is due

History of unethical treatment of research participants
Nazi prison camp experiments
Nuremberg Code
o Basis for first guidelines regarding ethical treatment of research participants

Tuskegee Syphilis Study (1932-1972)
o Whistle-blower ends study

1974: Code of Federal Regulations implemented Public Law 93-348 (rev. 1983)
o establishing Institutional Review Boards (IRBs) to protect human participants in
biomedical & behavioral research

Ethical Violations
Tuskegee Syphilis Study
In 1932, the U.S. Public Health Service, in
cooperation with the Tuskegee Institute,
began a 40-year study of 600 Black
men to understand effects of syphilis
on health over time
o 400 already infected
o 200 were not

Researchers lied
o Told men they were being treated, but none were
o Conducted painful spinal taps to track disease progression, but told men it was a "special free treatment"

Withheld information
o Men who contracted the disease were not informed
o By 1947, penicillin had become the standard cure, but this fact was not shared with participants

Actively interfered with men's efforts to get treatment
o Acts prevented men from serving in armed forces and benefiting from GI bill and benefits

1969: PHS employee blows whistle, no action; 1972: breaks story to Associated Press
1972: Study ends

The Belmont Report: Each Principle Has an Application
Respect for persons
Informed consent
Protection of vulnerable populations

Beneficence
Cost-benefit analysis for participants
Cost-benefit for society

Justice
How are participants selected? Do they
represent the people who will benefit from the
study?

Beneficence: Cost-Benefit Balance

Benefit to society | Low risk to participants | High risk to participants
High benefit       | Do the study             | Do the study?
Low benefit        | Do the study?            | Don't do the study

APA Guidelines for ethical research practice
Guiding principles
1. Non-maleficence
o First, do no harm

2. Beneficence
o Do good & give back to the community

Respect for persons

3. Justice
o Fairness, including rewards for one's labor

4. Autonomy
o Right to voluntarily participate or decline to
o Underpins informed consent

5. Fidelity
o Faithfulness, loyalty, keeping promises to maintain confidentiality, etc.

IRB Guidelines for Ethical Treatment of Research Participants
1. Risks and Benefits
o ID risks & work to eliminate or minimize these; protect SS from harm
o ID potential benefits to SS; clarify benefits for whom?
o Weigh the balance of risks-benefits
o Pilot all new procedures, measures

2. Informed Consent
o Give SS a fair, clear, explicit summary including risks & benefits, then seek consent to
participate
o Obtain assent from children
o Consider ability to provide consent: mental competence, etc.
o Voluntariness: consent must be free of any coercion (e.g., students, institutionalized
persons, client status, etc.)
o Document

3. Deception & Debriefing
o Involves deliberate withholding of info or providing misinformation to SS (e.g., Cole et
al.'s Disappointment task)
o Additional responsibilities & safeguards are required with use of deception

IRB Guidelines for Ethical Treatment of Research Participants
4. Confidentiality & Privacy
o Protect any information that a SS shares during the study
o Concern for well-being may necessitate exceptions to confidentiality
Any exceptions are clearly stated (e.g., harm to self/others)
o Anonymity = no identifiers can link you to your data

5. Treatment issues
o (withholding effective treatment, deception)
o Great concern when withholding a treatment known to be effective
Strategies: wait-list & delayed treatment groups; contrast with treatment as usual

Instructions for Registering and Completing CITI RCR training

Go to the CITI website: https://www.citiprogram.org/

New Users: Click on the "Register" link under "Create an Account".


o Start typing University of Oregon as your organization and click the option when it appears.
o Enter your contact information
o Create your username, password and security question and answer.
o The next step involves optional collection of demographic information. Answer as you prefer and continue to the next step.
o Answer No regarding professional continuing education requirements (Not applicable to RCR users)
o Complete required questions in the next step, regarding institutional e-mail address, gender, etc.
o Skip the Human Subjects Research question and move on to the Responsible Conduct of Research (RCR) training question
o Select the RCR course most appropriate to your research discipline (i.e., social and behavioral sciences) and your status at the University
(undergraduate student, graduate student, or postdoctoral researcher). If you have any questions regarding which course you should take,
please contact me.
o The remaining courses do not apply to the RCR training. Click the Complete Registration button at the end.

To complete the CITI course, you must complete all required modules and quizzes, achieving a minimum passing
score of 80%. A quiz can be taken more than once to achieve this minimum score. You are not required to
complete the course in one sitting. Your progress will be saved if you choose to stop the course and return at a
later time.

When you complete all required modules successfully, please print or download your completion report. A copy will
be sent automatically to Research Compliance Services. Send a copy of your completion report to Dr. Skowron at
eskowron@uoregon.edu with the subject line "CITI training completed". You can return to the CITI site at any
time to obtain a copy of your completion report.

In-Class Activity 1
Ethical Concerns in Human Subjects Research
1. A prevention science researcher applies to an IRB, proposing to
observe children ages 2 to 10 eating their meals and playing in the
local McDonald's play area. Because the area is public, the
researcher does not plan to ask for informed consent from the
children's parents.

What ethical concerns exist for this study?

What questions might an IRB ask?

2. A psychologist plans to hand out surveys in her 300-level
undergraduate class. The survey asks about student study habits and
substance use. The psychologist does not ask the students to put
their names on the survey; instead, students will put completed
surveys into a large box at the back of the room. Because of the low
risk involved in participation and the anonymous nature of the survey,
the researcher requests to be exempted from formal informed consent
procedures.

What ethical concerns exist for this study?

What questions might an IRB ask?

3. Discuss in class & submit for grading

Three Kinds of Conclusions To Draw from Research
Frequency Claims
o Describe a particular rate or level of something
o Typically focus on 1 variable
o Variable is measured, not manipulated
More than 2 million U.S. teens depressed
Half of Americans struggle to stay happy
Almost 1 million children were abused or neglected last year

Associational Claims
o Argue that 2+ variables are related (+/-)
o Involve at least 2 variables
o Variables are measured, not manipulated
Belly fat linked to dementia
Laptop computer use linked to poor sperm quality
Poor nutrition associated with school failure

Causal Claims
o Argue that one variable causes another
o Study must meet 3 criteria: covariation, temporal precedence, and internal validity
Music lessons enhance IQ
PCIT reduces child maltreatment recidivism
Debt stress causes health problems

Technical function of good research design = To control variance
(attend to the 4 validities)
MAXMINCON (Kerlinger, 1973, 1986)
Maximize systematic variance
Maximize variance of the variables in your substantive research hypothesis
Experimental variable: make conditions as different as possible
Associational variable: seek as wide a range of scores/levels as possible

Minimize error variance
Reduce the errors in measurement of your constructs and increase the reliability
of your measures

Control extraneous variance
Control variance of extraneous or unwanted variables that may affect or relate to
your variables of interest
3 ways to control these

MAX
Maximize systematic variance
[Figure: Violence exposure; dependent variable: Emotion dysregulation]

MIN
Minimize error variance
Give the systematic variance (the stuff you're interested in)
a chance to show itself
1. Sources of error variance (errors in measurement):
Guessing, fatigue over time, momentary inattention, variation in responses from
trial to trial
Solutions:

2. (Un)reliability of measures:
Consistency in measurement across items, raters, time, etc.


MIN
Reliability of your measures will constrain the strength of
association you can observe between the variables of
interest (e.g., Ghiselli et al., 1981)
Correction for attenuation:

$$ r_{tx,ty} = \frac{r_{ox,oy}}{\sqrt{r_{xx}\, r_{yy}}} $$

r_{xx} = reliability of the x scores
r_{yy} = reliability of the y scores
r_{ox,oy} = observed correlation between x and y
r_{tx,ty} = true correlation between x and y
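The correction for attenuation can be sketched in a few lines. This is a minimal illustration of the standard formula (true correlation = observed correlation divided by the square root of the product of the two reliabilities); the function names `attenuate` and `disattenuate` and the numeric values are assumptions chosen for illustration, not taken from the slides.

```python
import math


def attenuate(r_true: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation expected when both measures are unreliable."""
    return r_true * math.sqrt(rel_x * rel_y)


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Correction for attenuation: estimate the true-score correlation."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical values: a true correlation of .50 measured with
# reliabilities of .70 (x) and .60 (y).
r_obs = attenuate(0.50, 0.70, 0.60)
r_corrected = disattenuate(r_obs, 0.70, 0.60)
print(f"observed r = {r_obs:.3f}, corrected r = {r_corrected:.3f}")
```

The observed value is always smaller in magnitude than the true value unless both reliabilities equal 1, which is why unreliable measures limit the association a study can detect.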

MIN
[Figure: MacCoun, 2006]

MIN
If our dependent variable measure is unreliable, the observed correlation will
drastically underestimate the true x-y relationship.

CON
Control Extraneous Variance
Identify plausible 3rd variables and control their influence
on your study variables of interest in 1 of 3 ways
Principle 1: To eliminate the effect of a possible influential 3rd variable
on a dependent variable, choose participants so that they are as
homogeneous as possible on that 3rd variable

Principle 2: Whenever possible, randomly assign participants to
experimental groups and conditions

Principle 3: Control the effects of a 3rd variable by building it into the
research design as an attribute variable that is measured and then
statistically controlled

Principle 3a: Match participants across conditions or groups by
splitting a variable into 2 or more parts, then randomize within
each level
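Principles 2 and 3a can be illustrated with a short sketch of randomization within levels of a 3rd variable. Everything here is hypothetical: the function name `blocked_random_assignment`, the `age_group` blocking variable, and the participant records are assumptions for illustration only.

```python
import random
from collections import defaultdict


def blocked_random_assignment(participants, block_key, conditions, seed=None):
    """Randomly assign participants to conditions within each block level."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for person in participants:
        blocks[block_key(person)].append(person)

    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # random order within this level
        for i, person in enumerate(members):
            # Cycle through conditions so each level is split evenly.
            assignment[person["id"]] = conditions[i % len(conditions)]
    return assignment


# Hypothetical sample: 20 children, split on a 3rd variable (age group).
participants = [
    {"id": i, "age_group": "younger" if i < 10 else "older"} for i in range(20)
]
assignment = blocked_random_assignment(
    participants,
    block_key=lambda p: p["age_group"],
    conditions=["treatment", "control"],
    seed=42,
)
```

Because assignment is randomized separately inside each level, the 3rd variable is balanced across conditions by design rather than left to chance.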

CON
Extraneous 3rd variables to control for?
[Figure: Violence exposure; dependent variable: Emotion dysregulation; possible 3rd variables: Child Age, Caregiving, SES]

Next week
(Morling Ch. 3, 14; CITI)
____________________________

3 Claims
Research Designs
4 Validities

Research Design in Counseling Psychology

Class 3


Week 3
________________

3 Claims
4 Validities
________________________________

Next Week: Research Designs



Three Kinds of Conclusions To Draw from Research
Frequency Claims
o Describe a particular rate or level of something
o Typically focus on 1 variable
o Variable is measured, not manipulated
More than 2 million U.S. teens depressed
Half of Americans struggle to stay happy
Almost 1 million children were abused or neglected last year

Associational Claims
o Argue that 2+ variables are related (+/-)
o Involve at least 2 variables
o Variables are measured, not manipulated
Belly fat linked to dementia
Laptop computer use linked to poor sperm quality
Poor nutrition associated with school failure

Causal Claims
o Argue that one variable causes another
o Study must meet 3 criteria: covariation, temporal precedence, and internal validity
Music lessons enhance IQ
PCIT reduces child maltreatment recidivism
Debt stress causes health problems

Validity Issues in Research Design


To draw valid conclusions about research
questions, we must design studies to
minimize the potential for alternative
explanations of the results

Three Claims
Frequency claims
Association claims
(types of associations)

Causal claims

Practice Identifying Claims
a. Worry may make women's brains work overtime.
b. High normal blood sugar may still harm brain.
c. Want a higher GPA? Go to a private college.
d. Those with ADHD do one month's less work a year.
e. When moms criticize, dads back off baby care.
f. Report: 16% of teens have considered suicide.
g. MMR shot does not cause autism, large study says.
h. Breastfeeding may boost children's IQ.
i. Breastfeeding rates hit new high in United States.
j. Smiling may lower your heart rate.
k. OMG! Texting and IM-ing doesn't affect spelling!
l. Facebook users get worse grades in college.
m. Mother's heartburn means a hairy newborn.

Practice Identifying Claims
a. Indicate if the claim is frequency, association, or
cause.
b. For each claim, identify the variable(s).
c. For each variable, is it manipulated or measured?
d. State each variable at the conceptual level.
e. State each variable in terms of its operational
definition: How might it have been operationalized?

Interrogating the Three Claims Using the Four Big Validities

Four (Big) Validities


Statistical Conclusion Validity:
Are the variables actually statistically related?
Is the statistical test able to detect small associations/small differences (i.e., small
effects)?

Internal Validity
(most relevant in studies that test for causal relations)
The extent to which observed changes in a DV are attributable to/caused by an IV
What 3 conditions need to be met to establish a causal relationship? (REVIEW)

Construct Validity
Do the measured variables reflect the actual constructs of interest?
Are all important aspects of the constructs represented in the study variables?

External Validity
Are the study results applicable (i.e., generalizable) to other groups, settings, timeframes?

Threats to Validity


Threats to Statistical Conclusion Validity

Low statistical power


Violated assumptions of your statistical tests
Fishing and error rate problems
Unreliability of measures/treatment implementation (MIN)
Restriction of range (MAX)
Extraneous variance (3rd variable threats) (CON)


Threats to Statistical Conclusion Validity


Low statistical power
o Statistical power = probability of finding a relationship or effect when it really
exists (i.e., power to find a true effect)
o Type II Error = risk of failing to find a relationship (significant effect) that really
exists

Steps to increase statistical power
1. Use a larger sample size
2. Increase the effect size (MAX your systematic variance)
3. Decrease noise (MIN your error variance)
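How sample size and effect size drive power can be sketched with a rough normal approximation for a two-sided comparison of two equal-sized groups. This is an approximation, not an exact t-test power calculation, and the names `approx_power`, `effect_size_d` (a standardized mean difference), and `n_per_group` are assumptions introduced for illustration.

```python
import math
from statistics import NormalDist


def approx_power(effect_size_d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group mean comparison.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size_d * math.sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


# Hypothetical scenarios: a larger sample or a larger effect raises power.
baseline = approx_power(0.5, 64)
larger_sample = approx_power(0.5, 100)
larger_effect = approx_power(0.8, 64)
print(f"baseline={baseline:.2f}, larger n={larger_sample:.2f}, larger d={larger_effect:.2f}")
```

Reducing error variance (step 3) works through the same route: less noise in the measures means a larger standardized effect for the same raw difference.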

Threats to Statistical Conclusion Validity


Fishing and error rate problems
Conducting lots of analyses on a data set and treating each as independent

In stats analyses, we use the p < .05 level of significance
A significant result is expected to occur by chance alone in only 5 out of
every 100 times we run the analysis

Odds are 5 out of 100 that we will see a relationship (i.e.,
significant effect) even if none exists
o Query: What are the chances of finding a significant effect if you:
conduct 10 separate tests with your data? p = ______
20 separate tests with your data? p = ______

Solution:
o Adjust the error rate (i.e., p-value, significance level) to reflect the number of analyses
you plan to conduct
o $p_{\text{experiment-wise}} = \frac{.05}{N_{\text{tests}}}$
N tests = 4, experiment-wise p = .0125
N tests = 6, experiment-wise p = .008
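The arithmetic behind the fishing problem can be sketched as follows. The sketch assumes the tests are independent, which real analyses on one data set often are not; the function names `familywise_error_rate` and `adjusted_alpha` are hypothetical labels for the two calculations.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def adjusted_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level: alpha divided by the number of tests."""
    return alpha / n_tests


for k in (1, 4, 6, 10, 20):
    print(
        f"{k:>2} tests: chance of a spurious finding = "
        f"{familywise_error_rate(k):.3f}, adjusted p = {adjusted_alpha(k):.4f}"
    )
```

Under the independence assumption, running 10 tests at p < .05 gives roughly a 40% chance of at least one spurious significant result, and 20 tests roughly 64%.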

Threats to Statistical Conclusion Validity


Unreliability of measures/treatment implementation (MIN)

If measurement of a variable is unreliable, the observed relationship will
drastically underestimate the true x-y relationship.

MIN
[Figure: MacCoun, 2006]

Threats to Statistical Conclusion Validity


Restriction of range (MAX)

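Restriction of range can be demonstrated with a small simulation. This is a sketch on simulated data only: the true correlation of .60, the cutoff of 0.5, the sample size, and the helper names `pearson_r` and `simulate_scores` are all assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_scores(n=5000, true_r=0.6, seed=1):
    """Draw standardized x and y scores with a chosen population correlation."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


xs, ys = simulate_scores()
full_r = pearson_r(xs, ys)

# Keep only cases scoring above a cutoff on x (a restricted range).
restricted = [(x, y) for x, y in zip(xs, ys) if x > 0.5]
restricted_r = pearson_r([x for x, _ in restricted], [y for _, y in restricted])
print(f"full range r = {full_r:.2f}, restricted range r = {restricted_r:.2f}")
```

Sampling only a narrow slice of one variable shrinks the observed correlation even though the underlying relationship is unchanged, which is why MAX calls for a wide range of scores.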

Threats to Statistical Conclusion Validity


Extraneous variance
3rd variable threats must be identified (CON)
Control their influence on your study via
Principle 1: To eliminate the effect of a possible influential 3rd variable
on a dependent variable, choose participants so that they are as
homogeneous as possible on that 3rd variable
Principle 2: Whenever possible, randomly assign participants to
experimental groups and conditions
Principle 3: control the effects of a 3rd variable by building it into the
research design as an attribute variable that is measured and then
statistically controlled
Principle 3a: Match participants across conditions or groups by
splitting a variable into 2 or more parts, then randomize within each
level

Threats to Internal Validity


Compromise our confidence in assertions that a relationship/effect
exists between the independent and dependent variables.

History
Maturation
Statistical regression (law of initial values)
Selection
(Differential) Attrition
Testing
Instrumentation
Compensatory equalization of treatments
Resentful demoralization
Treatment diffusion

Threats to Internal Validity


History: Did some unanticipated event occur while
the experiment was in progress and did these
events affect the dependent variable?
A threat for the one-group design, but not for two-group designs
In the one-group pre-test post-test design, the effect of the treatment
is inferred from the difference between the pre- and post-test scores. This difference
may be due to the treatment or to history.

Threats to Internal Validity


History:
Not a threat for two-group designs (i.e., treatment/experimental group vs.
comparison/control group).
If the history threat occurs for both groups, the difference between the
two groups will not be due to the history event.


Threats to Internal Validity


Maturation: were changes in the dependent
variable due to normal developmental processes
operating within the participant as a function of
time?
Is a threat for the one-group design.
Is not a threat for the two-group design, assuming that participants in
both groups change (mature) at the same rate.

24

Examples: Threats to Internal Validity


History: In a short intervention designed to
investigate the effect of computer-based self-control
instruction, participants missed some instruction
because of a power failure at the school.
Maturation: the performance of 1st graders in a
learning experiment begins decreasing after 45
minutes due to fatigue

25

Threats to Internal Validity


Statistical regression: An effect that is the result
of a tendency for participants selected on the
basis of extreme scores to regress towards the
mean on subsequent tests.
When measurement of the dependent variable is not perfectly reliable,
there is a tendency for extreme scores to regress or move toward the
mean over time.
The amount of regression to the mean is inversely related to the
reliability of the test.
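
The inverse relation between reliability and regression to the mean can be illustrated with a short sketch. It assumes the classical test theory result that, for parallel tests, the expected retest score equals the mean plus the reliability times the observed deviation from the mean. The function name and the numbers are hypothetical.

```python
def expected_retest_score(observed, group_mean, reliability):
    """Expected score on retest under classical test theory (an assumption here).

    The observed deviation from the mean is shrunk by the reliability
    coefficient, so lower reliability means more regression to the mean.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical anxiety scale with mean 50; a child selected for a score of 80.
high_reliability = expected_retest_score(80, 50, 0.9)  # stays close to 80
low_reliability = expected_retest_score(80, 50, 0.5)   # falls much closer to 50
```

With no treatment at all, the same extreme scorer is expected to look "improved" at retest, and more so when the measure is less reliable.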

26

Examples: Threats to Internal Validity


Statistical regression:
In a study of family therapy, participating children grouped
because of high anxiety scores show considerably greater
reductions in anxiety than do the groups who scored average
and low on anxiety at the pre-test.

27

Threats to Internal Validity


Selection: Refers to selecting participants for the
various groups in the study. Are the groups
equivalent at the beginning of the study?
This is not a threat in studies that employ random assignment.
All participants have an equal chance of being in the
treatment or comparison groups, so the groups are equivalent on average.
Were participants self-selecting into experimental and comparison
groups? This could compromise the internal validity of the study.

Selection is not a threat for the one-group design but is a threat for the
two-group design.

28

Threats to Internal Validity


Differential Attrition: Differential loss of
participants across groups.
Did some participants drop out? Did this affect the results?
Did about the same number of participants make it through the entire
study in both experimental and comparison groups?
This is a threat for any design with more than one group.

29

Threats to Internal Validity


Testing: Did the pre-test affect scores on the post-test?
o A pre-test may sensitize participants in unanticipated ways, and their
performance on the post-test may be due to the pre-test, not to the
treatment, or more likely, an interaction of the pre-test and treatment.
o This is a threat for one-group designs.
o Not a threat for two-group designs. Both groups are exposed to the pre-test,
and so the difference between groups will not be due to testing.

30

Examples: Threats to Internal Validity


Selection: The experimental group in a study of self-control
consisted of a high-ability class, while the
comparison group was an average-ability class.
(Differential) Attrition: In a health-promotion
intervention designed to test the effect of various
exercises, those participants who disliked exercise most
stopped participating.
Testing: In an experiment with logical reasoning
performance as the dependent variable, a pre-test
familiarizes the participants with the post-test and how
to perform well
31

Threats to Internal Validity


Instrumentation: Did any change occur during the
study in the way that the dependent variable was
measured?
o Is a threat for one-group designs, not for the two-group designs.
o Why? _________________________________

Treatment diffusion: Did the comparison group
know or find out about the experimental/
intervention group and what transpired?
o A threat for two-group designs.

32

Examples: Threats to Internal Validity


Instrumentation: Two research assistants for a
self-control experiment with preschoolers
administered the post-test with different
instructions and procedures.
Treatment diffusion: In an intervention study to
enhance college student adjustment, students in
the treatment and placebo control groups compare
notes about what they are learning in sessions.

33

Threats to Internal/Construct Validity


Compensatory equalization or rivalry: These
simply weaken or strengthen the effect sizes
associated with the intervention.
Resentful demoralization: If participants learn
that their group receives less desirable goods or
services, they may feel resentful, demoralized and
perform particularly low on the dependent variable.
o What effect would this have on treatment vs. control group differences
_____________________________?
May increase magnitude of group differences, leading to an
overestimate of the effect

34

Threats to Construct Validity

Inadequate explication of the constructs


Construct confounding
Mono-operation bias
Mono-method bias

35

Threats to Construct Validity


Are all important aspects of the constructs represented in the
independent and dependent variables?
o If yes, good
o If not, the constructs are underrepresented

Do the independent and dependent variables also represent
constructs that are not of interest in the study?
o If yes, there are surplus construct irrelevancies
o If no, good

Inadequate explication of the constructs


Construct confounding
36

Threats to Construct Validity


Campbell and Fiske (1959) proposed two kinds of construct-validation
evidence:
1. evidence of convergent validity
o evidenced by achieving similar results (convergence) across different measures of the
same construct or different manipulations of the same construct. In other words, your
measure of binge drinking would be expected to correlate with other existing measures of
binge drinking and similar constructs, such as ____________ and __________________.

2. evidence of discriminant validity
o evidenced by observing no associations between your measure of ___________ and
measures of other, unrelated constructs, such as ___________________ and
____________________. For example, we would expect that your measure of binge drinking
would not be correlated with unrelated constructs, such as __________ or _____________.

Mono-operation bias
Mono-method bias
http://www.youtube.com/watch?v=1Y3v5dgWlWM
37

External Validity
External validity refers to the degree to which the results of
an empirical investigation can be generalized to and across
individuals, settings, and times
External validity can be divided into
Population validity
Ecological validity

38

External Validity
Population Validity:

How representative is the sample of the population?


The more representative, the more confident we can
be in generalizing from the sample to the population.
How widely does the finding apply? Generalizing
across populations occurs when a particular research
finding works across many different kinds of people,
even those not represented in the sample.
39

External Validity
Ecological Validity is present to the degree that a result
generalizes across settings. Types include:
Interaction effect of testing
Interaction effects of selection biases and
experimental treatment
Reactive effects of experimental arrangements
Multiple-treatment interference
Experimenter effects.
40

Threats to External Validity


Interaction of selection and treatment
o A characteristic of the treated group that interacts with the treatment
o Randomization would correct

Example:
o An experimental evaluation of a new teaching method is conducted in a sample
of low achieving students.
o Results will not generalize to a sample of students with heterogeneous
abilities/achievement levels

41

Summary: External Validity


It's a population, not the population

External validity comes from how, not how many.

Just because a sample comes from a population doesn't
mean it generalizes to that population.

To Be Important, Must a Study Have External Validity?
Generalizing to other participants: Does a study have to be
generalizable to many people?
Generalizing to other settings: Does a study have to take
place in a real-world setting?

Does a Study Need to Be Generalizable to Many People?
Generalization mode
Frequency claims
Goal is to make a claim
about a population
Real-world matters
External validity is essential!

Theory-testing mode
Association and causal
claims
Goal is to test a theory
rigorously, isolate variables
Prioritize internal validity
Artificial situations may be
required
Real world comes later
External validity is not the priority!

Does a Study Have to Take Place in a Real-World Setting?
Theory-testing mode often requires artificial settings.
Even laboratory settings can feel emotionally real.
Experimental realism

Prioritizing Validities
That study's just not valid!

Which validity is appropriate to interrogate for every study?
Which validities are not always relevant for a study?
Why can't researchers achieve all four validities in a single study?
Which two validities are most often in trade-off?
Which validity is most under the researcher's control?

In-Class Activity 2
Return to 2 of your research topics of interest.
a. For 1 of your topics of interest, construct a research question that is
framed as:
A frequency claim
An associational claim
A causal claim

b. For each research question (3 total) prepare an operational
definition of your constructs (i.e., measurable variables) and
specify the method you will use to measure the construct
Identify your IVs and DVs, and note whether your variables are Active or
Attribute variables, and Continuous or Categorical variables

c. Restate each of your research questions as directional
relationships between your measured constructs

d. Identify at least 3 possible threats to validity that may be relevant to
the studies you design to test the associational and causal claim
questions. (Select at least 2 threats to internal validity.) How could
interpretation of your findings be impacted by each of these
potential threats?

47

To Be Important, a Study Must Be Replicable


Replication Studies
Direct replication
Conceptual replication
Replication-plus-extension
Meta-analysis


Replication Studies
Direct replication
Same variables, same
operationalizations

Conceptual replication
Same variables, different
operationalizations

Replication-plus-extension
Same variables, plus some
new variables

How Meaningful Is That Effect Size?

Say this:
How's the construct validity?
Is external validity relevant here?
Can the study support a causal claim?

Not that:
The question is, is the study valid?
That is not a valid study.

Week 4
Research Designs
________________________________
o Pre-experimental
o Experimental
o Quasi-experimental

56

Research Design in Counseling Psychology

Class 4

Instructor:

Elizabeth A. Skowron, Ph.D.,


257 HEDCO Building
541-346-0913
eskowron@uoregon.edu
office hrs: Tues 12-1 pm
1

Review In class activity #2

In-Class Activity 2
Return to 2 of your research topics of interest.
a. For 1 of your topics of interest, construct a research question that is
framed as:
A frequency claim
An associational claim
A causal claim

b. For each research question (3 total) prepare an operational
definition of your constructs (i.e., measurable variables) and
specify the method you will use to measure the construct
Identify your IVs and DVs, and note whether your variables are Active or
Attribute variables, and Continuous or Categorical variables

c. Restate each of your research questions as directional
relationships between your measured constructs

d. Identify at least 3 possible threats to (internal) validity that may be
relevant to the studies you design to test the associational and
causal claim questions. Why might these potential threats be an
issue with tests of your research question?
3

Research Designs
Pre-Experimental
Experimental

Quasi-Experimental
Conducted in the lab vs. in the field
Making associative or causal claims
4

Three Criteria for Causation
Covariance
Temporal precedence
Internal validity

Research Design
Random assignment used?
o yes: IV manipulated?
  yes: Experiment
o no: One or more IVs are manipulated?
  yes: Quasi-experimental
  no: Pre-experimental
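
The decision chart above can be written as a small function. This is only a sketch of the chart as shown on the slide: the function name classify_design is hypothetical, and the branch the chart does not cover (random assignment with no manipulated IV) is treated as an error here by assumption.

```python
def classify_design(random_assignment, iv_manipulated):
    """Classify a study using the two questions in the decision chart.

    random_assignment: True if participants were randomly assigned
    iv_manipulated:    True if one or more IVs were manipulated
    """
    if random_assignment and iv_manipulated:
        return "experimental"
    if not random_assignment and iv_manipulated:
        return "quasi-experimental"
    if not random_assignment and not iv_manipulated:
        return "pre-experimental"
    # The chart has no branch for random assignment without a manipulated IV.
    raise ValueError("combination not covered by the decision chart")


# Example: intact classrooms (no randomization) receive a manipulated intervention.
design = classify_design(random_assignment=False, iv_manipulated=True)
```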

In-Class Activity 3
List your research questions that frame
An associational claim
A causal claim

a. Use an experimental design to test your causal research question
Identify one IV and one DV
Select and describe your design choice
o Which threats to internal validity does it control and why?
o List 1 threat to external validity that exists and why.
o Explain how each threat would impact interpretation of your findings.

b. Use a quasi-experimental design to test your research question
Identify one IV and one DV
Select and describe your design choice
o Which threats to internal validity does it control and why?
o Which 2-3 threats to internal validity does it NOT control and why?
o Explain how each threat would impact interpretation of your findings.
7

Pre-Experimental Designs

Heppner, Kivlighan, & Wampold (2008) refer to these three
designs as uninterpretable

Multiple threats to internal validity of these studies

One-shot case study: No way to infer that any change has taken
place; maturation & history can't be ruled
out because no control group was used.

One-group pretest-posttest study: Better than one-shot case study, because
we can determine if change occurred.
Cause of change remains ambiguous.

Static group comparison study: Great difficulty attributing results to the
intervention. Groups could differ in many
different ways beyond treatment effects.
Can't discern those possible differences.

Pre-Experimental Designs
One shot case study

One group pretest-posttest study

Static group comparison study

10

11


12

Experimental Designs
Key
R = randomization
O1 = pretest
O2 = posttest
X = intervention

Pretest-Posttest Control Group Design
R  O1  X  O2
R  O1     O2
Randomize participants to 2+ groups (1 treatment & 1 no-tx, i.e., control). Both
groups get a pre- and post-test. Enables test of X on O2, reflected in the differences
observed across groups.
Pretest: helps clarify source of differential attrition, strengthens stat test by
controlling for pre-tx differences in the DV; assists in testing moderation effects

Posttest-Only Control Group Design
R  X  O2a
R     O2b
Randomize participants to 2+ groups (1 treatment & 1 no-tx, i.e., control). Both
groups get a post-test. Enables test of X on O2a. Less time, expense, & avoids
repeated testing.

Solomon Four-Group Design
R  O1  X  O2
R  O1     O2
R      X  O2
R         O2
Used when there are concerns about the effect of a pretest on participants.
Added value is ability to examine effects of pretest.
Controls for most threats to internal validity. Is costly in time & resources.
13

Experimental Designs
Pretest-posttest control group design

Posttest only control group design

Solomon four-group design

14

15


16

Quasi-Experimental Designs
No randomization
One or more IVs are experimentally manipulated
4 reasons to select these over a true experimental design
1. Cost
2. Sample selection
3. Ethical considerations
4. (Un)availability of suitable control groups

17

Quasi-Experimental Designs
Three good non-equivalent
groups designs
Nonrandom assignment to groups. Pretest
enables us to assess for similarity of
participants on the DV (though groups
won't be similar on other 3rd variables).
Selection may still be a threat. Less time,
expense, & avoid repeated testing.

Enables us to clarify and control for
maturation effects. Must deal with the
autocorrelations in data when they are
analyzed.

Strengthens 1st design by adding another
pretest. Clarify whether maturation is
different across groups.

18

Quasi-Experimental Designs
Pretest-posttest nonequivalent groups
Time series designs
Nonequivalent before-after design

19

20

Technical function of good research design = To control variance
(attend to the 4 validities)
MAXMINCON (Kerlinger, 1973, 1986)
Maximize systematic variance
Maximize variance of the variables in your substantive research hypothesis
Experimental variable: make conditions as different as possible
Associational variable: seek as wide a range of scores/levels as possible

Minimize error variance
Reduce the errors in measurement of your constructs and increase the reliability
of your measures

Control extraneous variance
Control variance of extraneous or unwanted variables that may affect or relate to
your variables of interest
3 ways to control these

21


Exam 1
Due Tuesday, October 28th, 2014 by 4:30 PM
o Located in Blackboard Research Design, in Assignments, named Exam 1
o Exam goes live Wed 8:00 am and closes following Tues 4:30 PM
Multiple choice
Short answer / essay questions (prepare in paragraph format using APA
style)
No backtracking option enabled

23

Review In class activity #3

24

Research Design in Counseling Psychology
Class 6
Instructor:

Elizabeth A. Skowron, Ph.D.,


257 HEDCO Building
541-346-0913
eskowron@uoregon.edu
1

Measurement
Data Collection
Sampling

Operationalizing
Study Variables
3

Measurement
Constructs: concepts that cannot be directly observed
Variable: a symbol to which numbers or values are assigned;
can take on any set of values; can range from dichotomous to
continuous
When operationally defined, they are observable

Operational definitions: assign meaning to a construct/
variable by spelling out what the investigator must do to
measure it
(1) measured: describes how the variable will be measured
(2) experimental: spells out the details of the investigator's
manipulation of a variable
Reinforcement schedule
Intervention type & dosage
4

Construct definition & operationalization


Each construct has only one Conceptual definition
(i.e., researcher's definition of the variable at an
abstract level)

Each construct may have multiple Operational
definitions (i.e., representing the researcher's
specific decision about how to measure or
manipulate the variable)
5

Conceptualizing Race, Culture, & Ethnicity


Race
The presumed classification of all human groups on the basis of
visible physical traits or phenotype and behavioral differences (Robert
Carter, 1995)
Not a biological reality
A social construct used to categorize people
Referenced to perpetuate power differences and social inequalities

Ethnicity
One's national origin, religious affiliation, or other type of socially or
geographically-defined group (Carter, 1995)

Culture
The values, beliefs, language, rituals, traditions, and other behaviors
that are passed down from one generation to another within any
social group (Helms & Cook, 1999).

Methods of Measurement
1. Self-report
Participant makes an observation or report on self
+ : easy to administer, economical, accesses private thoughts, feelings, behavior not accessible to
investigators
-- : vulnerable to distortion, presumes client insight/understanding about construct being measured

2. Other-report (parents, therapist, teacher, etc.)
Respondents rate the participant on some dimension(s)
+ : easy to administer, economical
-- : potential systematic bias (e.g., cultural competence of rater in a cross-cultural child development study)

3. Behavioral observations
Measures of overt behavior by trained observers using coding system
+ : direct and objective
-- : presumption that observed behavior is representative; costly; feasibility?

4. Neurobiological indices

5. Interviews
+ : flexible, high completion rate
-- : costly; feasibility?

6. Unobtrusive measures
Assessment conducted without participant's awareness
+ : eliminates reactivity to measurement
-- : expensive?; some types are unethical

Operationalizing the Independent Variable
Those you can manipulate (i.e., active IVs)
1. Determining conditions of the IV
Referred to as levels of the IV, groups, categories, and
treatments (interchangeable terms)
These are often categorical variables (but don't have to
be)
Conditions of the IV are determined by YOU, the
researcher, because they are manipulated

2. Adequately reflecting the constructs of interest
Your IV must be well-defined and operationalized
See psychometrics section below
8

Operationalizing the Independent Variable
3. Limiting differences between conditions
Try to make sure that the different conditions of the IV
differ only on the dimension of interest (e.g., math
problem difficulty groups: easy, moderate, hard) and not
other dimensions (how much tutoring was
available, etc.)

4. Establishing the salience of differences in conditions
______________________
Manipulation checks
Used to verify that the conditions of the IV
differed as intended
Didn't differ on other dimensions
And that treatments were implemented in the intended fashion

Operationalizing the Independent Variable
Those you cannot manipulate, i.e., attribute IVs, aka status
variables in HWK
Statistical tests with these variables are used to detect
associations
FYI: Stats used in tests of associational and causal claims
are basically the same, but it is more difficult to draw causal
inferences with status variables because they are not
manipulated
IT IS THE RESEARCH (STUDY) DESIGN, NOT THE STATS
ANALYSIS USED, THAT DETERMINES THE INFERENCE STATUS
OF THE STUDY
i.e., associational claims vs. causal claims, etc.
10

Operationalizing the Dependent Variable
Have a rationale for why you selected the DVs of choice, and not
others, and why you operationalized the DV in the manner you did
e.g., Webster-Stratton (1988): parent-report of child behavior is a function of
parent psychopathology
Orlinsky et al. (1994): psychotherapy outcome ratings differ per therapist,
client, and observer ratings

1. Ensure the measure used to operationalize the DV is psychometrically strong (i.e., good reliability & validity)
2. Consider role of reactivity in DV assessment
3. Consider other procedural issues with DV assessment
Administration time
Order of presentation
Reading level

11

Scales of Measurement
Categorical scales
Nominal (i.e., categorical)
A scale with numerical values that represent categories of an attribute or "name" the attribute
uniquely
e.g., sex or ethnicity
NOTE: You cannot subject nominal scale measures to the same statistical tests that the other three
can be subjected to

Quantitative scales
Ordinal
Measurement of attributes that can be rank-ordered
e.g., years of schooling completed

Interval
Measurement that is rank-ordered AND the distances between locations on the scale have
meaning
e.g., measurement of temperature in Fahrenheit or Celsius (the gap between 20 and 40 degrees
equals the gap between 40 and 60 degrees, but 40 degrees is NOT twice as hot as 20
degrees, because zero is arbitrary)

Ratio
Measurement that is rank-ordered AND the distances between locations on the scale have
meaning and there is an absolute zero that is meaningful
e.g., number of study participants who re-abused their children following treatment
12

Measurement Activities
1. Classify each operational variable below as categorical or
quantitative. If the variable is quantitative, further classify it
as ordinal, interval, or ratio.
a) Number of books a person owns
b) A book's sales rank on amazon.com
c) Location of a person's hometown (urban, rural, or
suburban)
d) Nationality of the participants in a cross-cultural study
of Canadian, Ghanaian, and French students
e) A student's grade in school

13

Psychometrics
Reliability of measures
Validity of measures
Relationship between R & V

14

Reliability
Internal consistency: the extent to which items within a test are similar or
"hang together"
Use a single instrument administered to a group of people on one occasion
Compute Cronbach's Alpha: an index of intercorrelations between all items on test
Reliability estimates of .70 or higher are conventionally considered acceptable

Inter-rater: degree to which different raters/coders give consistent
ratings/scores of the same phenomenon
Two or more raters code same phenomenon
Categorical measures:
Calculate the percent of agreement between the raters
Adjust this for chance agreement using kappa coefficient

Continuous measures:
Calculate the correlation between the ratings of the two observers

Test-retest: consistency of a measure from one time to another
Use a single instrument administered to a group of people on two+ occasions
Calculate a correlation
Is this best used for measures of constructs that are State or Trait-like? Stable or shifting
over time? Why?
The shorter the time gap, the higher the correlation; the longer the time gap, the lower the
correlation
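
The two indices named above can be computed by hand. The sketch below uses the standard formulas for Cronbach's alpha and Cohen's kappa for two raters; the function names and the tiny data sets are hypothetical and serve only to show the arithmetic.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))               # one tuple of scores per item
    item_vars = [variance(col) for col in columns]  # variance of each item
    total_var = variance([sum(row) for row in item_scores])  # variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: percent agreement between two raters, adjusted for chance.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportions, summed.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Hypothetical data: 3 respondents answering 3 items identically.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

# Hypothetical data: two coders classify 4 behaviors as "y" or "n".
kappa = cohen_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])
```

In the kappa example the raters agree on 3 of 4 codes (75%), but half of that agreement is expected by chance, which is why kappa is lower than raw percent agreement.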
15

2. For each measure below, indicate which kinds of reliability would be
appropriate to evaluate.
a) Researchers place unobtrusive video recording devices in the living rooms of
20 children. Later, coders view tapes of the living areas and code how many
minutes each child spends playing video games.
b) Clinical psychologists have developed a seven-item self-report measure to
quickly identify people who are at risk for post-traumatic stress disorder.
c) Psychologists measure how long it takes a mouse to learn an eye-blink
response. For 60 trials, they present a mouse with a distinctive blue light
followed immediately by a puff of air. The 5th, 10th, and 15th trials are test
trials, in which they present the blue light alone (without the air puff). The
mouse is said to have learned the eye blink response if observers record that it
blinked its eyes in response to a blue light test trial. The earlier in the 60 trials
the mouse shows the eye-blink response, the faster it has learned the response.
d) A restaurant owner uses a response card with four items in order to evaluate
how satisfied customers are with the food, service, ambience, and overall
experience. Each item is scaled from one to four stars.
e) Educational psychologists use teacher ratings of classroom shyness (on a nine-point
scale, where 1 = not at all shy in class and 9 = very shy in class) to
measure children's temperament.
16

Validity
(of measures)
Physical science is fortunate to have standard measurements
e.g., Platinum-iridium bar kept at U.S. NIST: international standard for length
of 1 meter
I can compare my 1 meter ruler to this standard and know if it measures what
it's supposed to measure
No such luck in the social sciences: our constructs are typically not directly
observable (e.g., anxiety, happiness, self-regulation)
No way to directly measure these constructs
We work with estimations (via self report, observed behavior, neurobiological
measures, others' reports, etc.)

Construct validity = to what extent is our measure of X really tapping into it?
Definition: to what extent does this test/measure (i.e., an operationalization)
accurately reflect the construct it's intended to measure?

17

Measurement (construct) Validity
Does it measure what you intend to measure?

Two subjective ways to assess validity
FACE VALIDITY: It looks like what you want to measure
CONTENT VALIDITY: The measure contains all parts that your theory says it should contain

Four empirical ways to assess validity
Predictive validity: Your measure is correlated with a relevant outcome in the future
Concurrent validity: Your measure is correlated with a relevant outcome now, in the present
Convergent validity: Measure is more strongly associated with measures of similar constructs
Discriminant validity: Measure is less strongly associated with measures of dissimilar constructs

Reliability
Do you get consistent scores every time?
TEST-RETEST RELIABILITY: People get consistent scores every time they take the test
INTERNAL CONSISTENCY RELIABILITY: People give consistent scores on every item on a questionnaire
INTERRATER RELIABILITY: Two coders' ratings of a behavior are consistent with each other

Morling, 2012

Relationship between Reliability & Validity

Reliability is a necessary but not sufficient condition for validity


19


Concurrent & Predictive Validity


Both evaluate whether scores on your measure are related to
scores on other concrete outcomes that they should be related to
e.g., measure of clinical skills/aptitude or graduate school aptitude
Concurrent Validity
Does your measure correlate with a relevant outcome right now, in
the present
e.g., correlate scores on your measure of clinical skill with outcome
(client ratings of therapeutic alliance; ______________)

Predictive validity
Does your measure correlate with a relevant outcome measured in
the future
e.g., correlate scores on your measure of clinical skill assessed now,
with an outcome measured in the future (client improvement in
therapy; ____________)

Can calculate via a correlation coefficient, r
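
As a rough sketch of that calculation, the Pearson correlation can be computed directly from paired scores. The function name pearson_r and the data are made up for illustration; they are not from any study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for 5 trainees: clinical-skill measure (now) and
# client-rated therapeutic alliance (now, for concurrent validity).
clinical_skill = [10, 12, 15, 18, 20]
alliance_now = [3, 4, 4, 6, 7]
r_concurrent = pearson_r(clinical_skill, alliance_now)
```

For predictive validity the same function applies; only the timing of the outcome measurement changes.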


21

Convergent & Discriminant Validity

Does the test show a meaningful pattern of associations with other measures
Your measure should:

Correlate more strongly with other measures of similar constructs, and


Correlate less strongly with measures of other, different constructs

Convergent Validity
Your measure correlates more strongly with other measures of similar constructs

e.g., Differentiation of self scores should correlate with: __________________


______________________________________________________________

Discriminant Validity
Your measure correlates less strongly with measures of other, different constructs

e.g., Differentiation of self scores should NOT correlate with: ______________


______________________________________________________________

Can also calculate via a correlation coefficient, r
No absolute level of correlation indicates convergent or discriminant validity
evidence; look to the pattern of findings across the nomological net

22

Cultural Validity
is concerned with the construct, concurrent, and predictive validity of theories and models across cultures, i.e., culturally different individuals (Leong & Brown,
1995, p. 144)

Planning your study


Use MC theories to conceptualize the research; consult with cultural communities
Translate demographics into salient psychological characteristics (e.g., ethnic identity
development, experience of micro-aggressions)

Selection of measures

Use multiple measures to represent each construct


Pilot test measures with your target population
Use culturally congruent measures in your study
Create or adapt ethnocentric measures

Recruiting participants
Representative of your target population
Use procedures congruent for this cultural group
Recruit to represent underlying psychological
characteristics of interest

Analyzing your data


Evaluate cultural hypotheses & rival, competing hypotheses
Examine moderator effects of cultural variables

Interpreting results

Design your study to benefit participants directly


Represent participants' voices authentically when interpreting data
Integrate service into community as way of giving back
Engage participants in interpretation of data and share findings

23

Using Factorial Designs to Study External


Validity
Factorial designs are comprised of at least two independent
variables, and each IV has 2+ levels
IV-1: intervention (treatment, control group: 2 levels)
IV-2: status variable (i.e., demographic or individual difference
variable) (e.g., gender: male, female: 2 levels)

Independent Variable: 1. Treatment, 2. Control
Gender: 1. male, 2. female

Enables us to learn whether the treatment works or works better for one level of the status variable than another (via interaction effects)
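A minimal sketch of what an interaction looks like in a 2 x 2 factorial design. The cell means are made up for illustration (lower scores = fewer symptoms), and the names `cell_means` and `treatment_effect` are hypothetical. The sketch only describes the pattern; a formal test of the interaction would need a factorial ANOVA or a regression with a product term.

```python
# Hypothetical cell means for a 2 (intervention) x 2 (gender) design.
cell_means = {
    ("treatment", "male"): 12.0,
    ("treatment", "female"): 8.0,
    ("control", "male"): 14.0,
    ("control", "female"): 14.0,
}

def treatment_effect(means, level):
    """Treatment minus control, within one level of the status variable."""
    return means[("treatment", level)] - means[("control", level)]

effect_male = treatment_effect(cell_means, "male")
effect_female = treatment_effect(cell_means, "female")

# The interaction contrast is the difference between the two simple effects.
# Zero would mean the treatment works equally well for both groups.
interaction_contrast = effect_female - effect_male

print("Effect for males:", effect_male)
print("Effect for females:", effect_female)
print("Interaction contrast:", interaction_contrast)
```

In this made-up pattern the treatment helps both groups but helps females more, which is the kind of result that speaks to external validity across levels of the status variable.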
24

Recommendations for conducting culturally-valid quantitative research
Identify demographic variables that serve as proxy variables & measure those
social-psychological variables directly
Ethnic & racial group status as a proxy for socio-economic status
Racial group status as proxy for stage of racial identity development
Evaluate external validity of studies, not solely based on demographic
characteristics of a sample, but on salient psychological characteristics & a strong
theoretical rationale
e.g., potential generalization of research on racial identity development from African American samples to other stigmatized ethno-cultural populations

Benefits
Conceptual generalization promotes better theory building, and
Use of social-psychological characteristics (rather than simple demographics)
may limit use of inappropriate generalizations to an entire population
would enable focus on psychological antecedents for psychological
outcomes
could divert efforts away from token sampling of ethno-cultural groups that
include only highly acculturated members who fail to represent the important
psychological characteristics of the larger population
25

Recommendations continued
Improve construct validity in measurement via evidence of cultural
equivalence of tests/measures

Linguistic equivalence: do translated items carry the same meaning in the target language as they do in their source language?
Functional equivalence: does the phenomenon have similar functions across cultures? (e.g., assertiveness as adaptive)
Conceptual equivalence: does the concept have an equivalent in other cultures? (e.g., defining IQ)
Psychometric equivalence: are the ways in which the concept is quantified equivalent across cultural groups? (e.g., timed components of IQ test)

Involve indigenous experts in formulating theory, study hypotheses, research procedures, & interpretation of results
Strengthen cultural validity of your research study

26

Face & Content Validity (most subjective)
Face Validity
Weakest way to try to demonstrate construct validity
To what extent does this measure appear "on its face" to
be a good translation of the construct
Is essentially a subjective judgment call

Content validity
Involves a subjective check of the operationalization against
the relevant content domain for the construct.
Often involves surveying experts in the content domain to
evaluate content capture of your measure
27

Concluding Notes re: Measurement of Constructs
1. A single operationalization (i.e., single scale or instrument) will
almost always poorly represent a construct
2. The correlation between two constructs is attenuated (i.e.,
weakened) by unreliability in measurement
3. Unreliability always makes it more difficult to detect true effects
(should any be present) because of reduced statistical power.
4. The correlation between two measures using the same method is
inflated by (shared) method variance.
5. If possible, multiple measures using multiple methods should be
used to operationalize a construct.
6. Typically, interpretations of relationship should be made at the
construct level, for seldom are we interested in the measures per
se. Awareness of the effects of unreliability and method variance
is critical for drawing proper conclusions.
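Point 2 in the notes can be made concrete with the attenuation relation from classical test theory: the observed correlation equals the construct-level correlation times the square root of the product of the two reliabilities. The sketch below is illustrative only; the function names and the example reliabilities (.70 and .80) are assumptions, not values from the course.

```python
import math

def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation, given the construct-level correlation
    and the reliabilities of the two measures (classical test theory)."""
    return true_r * math.sqrt(rel_x * rel_y)

def disattenuated_r(observed_r, rel_x, rel_y):
    """Estimate of the construct-level correlation from an observed one."""
    return observed_r / math.sqrt(rel_x * rel_y)

# Hypothetical case: constructs correlate .50, measures have
# reliabilities of .70 and .80.
observed = attenuated_r(0.50, rel_x=0.70, rel_y=0.80)
recovered = disattenuated_r(observed, rel_x=0.70, rel_y=0.80)

print("Observed r:", round(observed, 3))
print("Recovered construct-level r:", round(recovered, 3))
```

With perfectly reliable measures (both reliabilities 1.0) the observed and construct-level correlations coincide; any unreliability shrinks the observed value.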

28

SAMPLING

29

Sampling
When we consider external validity, we ask whether results of a particular
study can be generalized to other people in the population, or to the kinds of
settings we're interested in.
To interrogate the external validity of a frequency or causal claim, we ask
for example:
Do clients who rated this therapist's warmth adequately represent all of the
therapist's former clients?
Can we predict the results of the presidential election from the results of this
poll taken from these 1,500 people?

Sample: portion of the population, e.g., one potato chip
Population: all, e.g., the whole bag of chips
You don't need to study the whole population. You just need to ensure
that the sample you study adequately reflects the population

30

Sampling
Define your population of interest
Now you can assess how well your sample represents it

Bias
Samples are biased when they are unrepresentative of the population
Biased samples lead you to draw the wrong conclusions about the
population
e.g., your 1 potato chip is burnt (biased sample)
This would lead you to conclude something wrong about the whole bag of chips

e.g., Presidential election poll
Biased sample would include too many of the most unusual (not typical) people
e.g., Therapist ratings
Clients who rate their therapist on a website may tend to be ones who are angry
or disgruntled, and not represent the rest of the therapist's clients very well

31

Sources of Biased Samples

Sampling only those people who are easy to contact

Sampling only those who you can contact

Sampling only those who self-select (i.e., invite themselves)
32

Getting a Representative Sample


Probability sampling
Draw the sample at random from that population
Every member of the population has an equal chance of being in the sample

1. Simple random sampling
   Most basic form of prob. sampling, but difficult and time-consuming
   1. Assign a number to every person in the pop.
   2. Use a table of random numbers to select a sample from the pop.

2. Cluster sampling
   1. Start with a list of clusters and take a random sample of clusters from that list and include every person from each of those selected clusters
   2. E.g., want to randomly sample school districts in OR; start with list of districts (clusters) in the area, and randomly select 4 of those districts (clusters) and include every child from each cluster in your sample

3. Multistage sampling
   1. Similar to #2, but you select two random samples
   2. Start with a list of clusters and take a random sample of clusters from that list, but then take a random sample of children from each selected cluster

4. Stratified random sampling
   1. Select particular demographic characteristics on purpose and then randomly select individuals within each of the categories
   2. e.g., in a study of self-regulation development, stratify on child age to obtain at least X number of kids from age 3, age 4, and age 5 into the study
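The four probability sampling methods can be sketched in a few lines. This is an illustration under stated assumptions, not a sampling protocol: the population is invented (10 districts of 50 children, and 40 children at each of ages 3, 4, and 5), the function names are hypothetical, and Python's random number generator stands in for the table of random numbers.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of selection."""
    return rng.sample(population, n)

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick clusters, then keep everyone in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly sample within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

def stratified_sample(strata, n_per_stratum, rng):
    """Sample randomly within every stratum chosen on purpose."""
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

rng = random.Random(2014)  # fixed seed so the sketch is repeatable

# Invented population: 10 districts with 50 children each.
districts = {
    f"district_{d}": [f"d{d}_child{c}" for c in range(50)] for d in range(10)
}
everyone = [child for members in districts.values() for child in members]

# Invented strata: 40 children at each of three ages.
ages = {age: [f"age{age}_child{c}" for c in range(40)] for age in (3, 4, 5)}

srs = simple_random_sample(everyone, 20, rng)
clu = cluster_sample(districts, 4, rng)
multi = multistage_sample(districts, 4, 10, rng)
strat = stratified_sample(ages, 15, rng)

print(len(srs), len(clu), len(multi), len(strat))
```

Oversampling would be the stratified version with a larger `n_per_stratum` for the group of interest, which this simple sketch does not implement.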

33

Getting a Representative Sample


Probability sampling
Draw the sample at random from that population
Every member of the population has an equal chance
of being in the sample

4. Stratified Random Sampling cont.


Oversampling

Is a variant of stratified random sampling


Use stratified random sampling and deliberately include
more of one group, usually when that group is difficult to
engage in research or in low numbers in your population.

e.g., oversampling for physically-abused children helps to ensure there are adequate numbers of participants in the sample
34

Random Sampling vs. Random Assignment
Random sampling (i.e., probability sampling)
Get a sample using some random method so that each member of the
population of interest has equal chance of being in the sample
Enhances ___________ validity

Random assignment (used only in experimental designs)


Assign members of the sample at random to the groups or conditions
of the IV, for example, by flipping a coin
Enhances ____________ validity
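The two steps are easy to confuse, so here is a minimal sketch that keeps them apart. The population of 1,000 people and the sample and group sizes are invented; shuffling and splitting is one common way to assign at random with equal group sizes, used here in place of the coin flip.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Invented population of interest.
population = [f"person_{i}" for i in range(1000)]

# Random sampling: decides WHO gets into the study.
sample = rng.sample(population, 40)

# Random assignment: decides WHICH CONDITION each sampled person receives.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:20], shuffled[20:]

print(len(sample), len(treatment), len(control))
```

A study can use either step without the other: a convenience sample can still be randomly assigned, and a random sample can be studied without any assignment at all.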
35

Non-Representative Sampling Methods


Convenience sampling
Samples chosen on the basis of who is easy to access

Purposive sampling
Choosing a sample of only certain kinds of people you want to
study

Snowball sampling
A variation of purposive sampling used to find rare individuals
for a research study, or when the sample is otherwise hard to obtain
Each participant in the study is asked to recommend a few
acquaintances to the study
36

Research Design in Counseling Psychology

Class 7


Three conditions for determining causality
1. Co-variation (i.e., correlation)
2. Temporal precedence
3. Ruling out alternative explanations (due to extraneous
3rd variable threats, i.e., internal validity)

Tools for Testing Associational Hypotheses
Kinds of studies that lead to associational claims
Correlational research (i.e., ex post facto)
o 2+ measured variables (regardless of the stats used) make a study correlational
o Prioritize construct validity & statistical conclusion validity, & external validity
o Avoid temptation to make causal inferences from these kinds of studies

Kinds of graphs & statistics used to describe associations


Bivariate correlations
o Positive, negative, zero, & curvilinear
o Graph association between scores on 2 variables using Scatterplot and a
correlation coefficient, r

Designing and evaluating studies that make an associational claim (via the 4 big validities)
3

Testing Associational Hypotheses


Bivariate correlations
o Positive, negative, zero, & curvilinear
o Graph association between scores on 2 variables using scatterplot
o Calculate strength of correlation coefficient, r

Testing Associational Hypotheses

Bivariate correlations

Correlation coefficients (r)

Associations:
between 2 continuous variables: correlation coefficient, r
when 1 variable is categorical: t test (or a point-biserial correlation)
when both variables are categories: phi coefficient
Ascertain strength of associations: Cohen's conventions

Effect Size Type    Small    Medium    Large
r                   .10      .30       .50
d/g                 .20      .50       .80
ratio               1.50     2.50      4.25
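A short sketch of computing r and reading it against Cohen's conventions for r (.10, .30, .50). The GPA and absence scores are invented, the function names are hypothetical, and treating the conventions as hard cutoffs is a simplification made for the example; they are rough benchmarks, not rules.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cohen_label_for_r(r):
    """Label the size of r using Cohen's benchmarks as simple cutoffs."""
    size = abs(r)  # strength ignores the sign
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"

# Invented scores for eight students.
absences = [0, 1, 2, 3, 4, 5, 6, 7]
gpa = [3.9, 3.6, 3.7, 3.2, 3.3, 2.9, 3.0, 2.5]

r = pearson_r(absences, gpa)
print("r =", round(r, 2), "->", cohen_label_for_r(r))
```

The sign gives the direction of the association (here negative) and the absolute value gives its strength.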

Testing Associational Hypotheses


Designing and evaluating studies that make an
associational claim (via the 4 big validities)

Statistical conclusion validity
1. Effect size?
2. Correlation statistically significant?
3. Are there subgroups?

Testing Associational Hypotheses


Statistical conclusion validity
o Are there subgroups?

Interpret the scatterplot, then consider subgroups (class standing)

[Figure: scatterplot of GPA against # Absences]

Testing Associational Hypotheses


Statistical conclusion validity
o Are there subgroups?

Now consider subgroups (class standing) and interpret scatterplot

[Figure: scatterplot of GPA against # Absences, with points grouped as Freshmen, Juniors, and Seniors]
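To see why subgroups matter, here is a sketch with invented data (not the data behind the slide's scatterplot). The pooled correlation between absences and GPA is strongly negative, yet within each class standing there is essentially no association, because the pooled pattern comes entirely from differences between the groups. All names and numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented (absences, GPA) scores for each class standing.
groups = {
    "Freshmen": ([6, 7, 8, 9], [2.5, 2.6, 2.6, 2.5]),
    "Juniors": ([3, 4, 5, 6], [3.1, 3.2, 3.2, 3.1]),
    "Seniors": ([0, 1, 2, 3], [3.7, 3.8, 3.8, 3.7]),
}

# Pool everyone together, ignoring class standing.
all_absences = [a for absences, _ in groups.values() for a in absences]
all_gpa = [g for _, gpas in groups.values() for g in gpas]
pooled_r = pearson_r(all_absences, all_gpa)

# Then look inside each subgroup separately.
within_r = {name: pearson_r(a, g) for name, (a, g) in groups.items()}

print("Pooled r:", round(pooled_r, 2))
for name, value in within_r.items():
    print(name, "r:", round(value, 2))
```

The general lesson is only that a pooled correlation can misdescribe what happens inside subgroups; real data could show weaker, stronger, or even reversed within-group associations.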

Testing Associational Hypotheses


Statistical conclusion validity
o Could outliers (extreme scores) be affecting the relationship between variables?
o More likely with smaller samples

[Figure: scatterplot of GPA against # Absences]

10

Testing Associational Hypotheses


Construct validity
o How well were our variables measured?

Good Reliability?

Does each measure what it's intended to measure (Validity)?

11

Testing Associational Hypotheses


External validity
o To whom can we generalize?
o To whom do we wish to generalize?
o Which population(s) did we sample from?
o What methods did we employ to sample?

Moderating variables
In what subgroups does the association exist?
Goal: to learn whether the association is different within different levels of the moderator (e.g., at low SES, moderate SES, or high SES)

12

Three conditions for determining causality
1. Co-variation (i.e., correlation)
2. Temporal precedence
3. Ruling out alternative explanations (due to extraneous
3rd variable threats, i.e., internal validity)

13

Establishing Temporal Precedence


Longitudinal designs: enable us to examine evidence for
temporal precedence in the relation between our 2
variables of interest
o Useful for other reasons as well
o There are many variables that we cannot manipulate, or it would be unethical to
do so (e.g., exposure to violent TV shows; smoking)
o Thus useful when experiments are not practical

How to:
o Measure same variables in same people over two+ different time points

14

Longitudinal designs

Testing temporal associations between watching violent TV shows and aggression

[Figure: cross-lagged panel diagram with four measures: TV Violence (3rd grade), TV Violence (13th grade), Aggression (3rd grade), Aggression (13th grade)]

1. Cross-sectional correlations
2. Autocorrelations
3. Cross-lagged correlations
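The three kinds of correlations in a two-wave panel can be laid out as follows. The scores are invented for eight hypothetical participants and the variable names are assumptions; the point is only to show which pairs of columns each kind of correlation uses.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented scores: the same eight people measured at two time points.
tv_t1 = [2, 5, 3, 8, 6, 1, 7, 4]
agg_t1 = [3, 4, 2, 6, 5, 2, 5, 4]
tv_t2 = [3, 5, 2, 7, 6, 2, 8, 4]
agg_t2 = [2, 6, 3, 9, 6, 1, 8, 4]

correlations = {
    # 1. Cross-sectional: two variables at the same time point.
    "cross-sectional, time 1": pearson_r(tv_t1, agg_t1),
    "cross-sectional, time 2": pearson_r(tv_t2, agg_t2),
    # 2. Autocorrelations: the same variable with itself over time.
    "autocorrelation, TV violence": pearson_r(tv_t1, tv_t2),
    "autocorrelation, aggression": pearson_r(agg_t1, agg_t2),
    # 3. Cross-lagged: the earlier score on one variable with the
    #    later score on the other. These speak to temporal precedence.
    "cross-lag, TV t1 -> aggression t2": pearson_r(tv_t1, agg_t2),
    "cross-lag, aggression t1 -> TV t2": pearson_r(agg_t1, tv_t2),
}

for label, value in correlations.items():
    print(f"{label}: {value:.2f}")
```

Comparing the two cross-lagged values is what bears on which variable came first.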

15

Longitudinal designs
(intensive repeated measures)

Temporal associations between maternal physiology & harsh parenting

[Figure: cross-lagged panel diagram with four measures: Hostile control, Physiological arousal, Hostile control (30 later), Physiological arousal (30 later)]

1. Cross-sectional correlations
2. Autocorrelations
3. Cross-lagged correlations

16

Longitudinal Designs
Interrupted Time Series

17

Longitudinal Designs
Stable Baseline Designs
o Assess baseline via multiple assessments over time in an extended fashion to
establish consistent scores, then introduce the intervention/experimental
condition and continue with over time assessments post-intervention

Multiple Baseline Designs
o Introduction of intervention components is staggered across time, contexts, or
situations (e.g., 3 problem behaviors in classroom identified; introduce
intervention for each one in staggered fashion; continue to assess all behaviors)

Reversal Designs
Best used in situations when the intervention would not cause lasting
change (i.e., to test a therapy or educational intervention)
Some ethical concerns with withdrawing a treatment

18

Longitudinal Designs
Stable Baseline Designs
o Assess baseline via multiple assessments over time in an extended fashion to
establish consistent scores, then introduce the intervention/experimental
condition and continue with over time assessments post-intervention

[Figure: graph of on-task behavior across a Baseline phase and a Post-intervention phase, with the point of Intervention marked between them]

19

Longitudinal Designs
Multiple Baseline Designs
o Introduction of intervention components is staggered across time, contexts, or
situations (e.g., 3 problem behaviors in classroom identified; introduce
intervention for each one in staggered fashion; continue to assess all behaviors)

[Figure: multiple-baseline graph across SESSIONS, with BASELINE and INTERVENTION phases staggered for three behaviors: Poking neighbor, Grabbing objects, Not raising hand]

20

Three conditions for determining causality
1. Co-variation (i.e., correlation)
2. Temporal precedence
3. Ruling out alternative explanations (due to
extraneous 3rd variable threats, i.e., internal validity)

21

Bivariate correlations show covariance.
But not temporal precedence: not sure which variable came first
Solution: cross-lag panel designs (longitudinal designs)

And not internal validity: no control for third variables
Solution: multiple regression

Ruling Out Third Variables with Multiple-Regression Designs
Measuring more than two variables
Regression results indicate if a third variable affects the
relationship
Adding more predictors to a regression
Regression does not establish causation
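A minimal sketch of how regression handles a third variable, using simulated data in which a third variable `z` causes both `x` and `y` and there is no direct link between them. For two predictors, the standardized regression weights can be computed directly from the three bivariate correlations. Everything here is an assumption made for illustration: the variable names, the simulated effects, and the helper functions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardized_betas(y, x1, x2):
    """Standardized weights for y regressed on x1 and x2 together."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 2000

# Simulated data: z drives both x and y; x has no effect on y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [value + rng.gauss(0, 1) for value in z]
y = [value + rng.gauss(0, 1) for value in z]

r_xy = pearson_r(x, y)
beta_x, beta_z = standardized_betas(y, x, z)

print("Bivariate r(x, y):", round(r_xy, 2))
print("Beta for x, controlling for z:", round(beta_x, 2))
print("Beta for z, controlling for x:", round(beta_z, 2))
```

The bivariate correlation is sizable, but the weight for `x` drops to about zero once `z` is in the model. This only works because `z` was measured; an unmeasured third variable cannot be controlled this way.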

The Third Variable Problem

Multiple Regression Helps with the Third Variable Problem

Adding More Predictors

Multiple Regression and the Third Variable Problem
Review: Are multiple regression studies able to show causation?

Temporal precedence? (maybe not)
Internal validity? (You can only control for variables that you thought to measure.)
Good experiments are still the best.

Multiple Regression Helps with the Third Variable Problem

Regression Does Not (Definitively) Establish Causation

Getting at Causality

Mediation
Start with an association between two variables:
(IV) RECESS and (DV) BEHAVIOR PROBLEMS (link C).
Mediation hypotheses propose a mechanism for a bivariate relationship. Why are these
two variables correlated? (i.e., Recess affects Physical Activity which then impacts
Behavior Problems)

Mediation hypotheses are causal statements.


Mediators specify a time sequence for the three variables (temporal precedence).
Mediators also specify the mechanism (IV affects DV through the mediator).

Steps in Testing Mediation


1. Test path c
2. Test path a
3. Test path b
4. Regression (test path c):
   DV is behavior problems
   IVs are physical activity and recess
   Does the recess → behavior problems link (path c) get smaller when physical activity is controlled/accounted for?
   If YES, then physical activity is a mediator.
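The four steps can be sketched with simulated data in which recess raises physical activity and physical activity lowers behavior problems. All numbers are invented, the paths are estimated as standardized weights from the bivariate correlations, and the name `c_controlled` (path c with the mediator in the model) is a label chosen for this sketch. The sketch shows the logic of the steps only; it does not include significance tests.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardized_betas(y, x1, x2):
    """Standardized weights for y regressed on x1 and x2 together."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom

rng = random.Random(9)  # fixed seed so the sketch is repeatable
n = 2000

# Simulated chain: recess -> physical activity -> behavior problems.
recess = [rng.gauss(0, 1) for _ in range(n)]
activity = [0.7 * value + rng.gauss(0, 1) for value in recess]
problems = [-0.6 * value + rng.gauss(0, 1) for value in activity]

# Step 1, path c: recess -> behavior problems (no mediator in the model).
c = pearson_r(recess, problems)
# Step 2, path a: recess -> physical activity.
a = pearson_r(recess, activity)
# Steps 3 and 4: regress behavior problems on both predictors at once.
# b is the weight for physical activity; c_controlled is the weight for
# recess once physical activity is accounted for.
b, c_controlled = standardized_betas(problems, activity, recess)

print("path c:", round(c, 2))
print("path a:", round(a, 2))
print("path b:", round(b, 2))
print("path c with mediator controlled:", round(c_controlled, 2))
```

In this simulation the recess link shrinks toward zero once physical activity is controlled, which is the pattern step 4 looks for.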

Mediators Versus Third Variables

[Figure: three diagrams labeled Mediation Model, 3rd Variable Problem, and Moderator Effect; recoverable variable labels: Extroversion, Gender, Group conversations]

In Class Activity #4
Indicate whether each statement below is describing a mediation hypothesis, a third variable argument, or a
moderator result. First, identify the key bivariate relationship. Next decide whether the extra variable comes
between the two key variables or is causing the two key variables simultaneously. Then draw a sketch of
each explanation, following the examples in Figure 9.13 in the text.
1. Having a cognitively demanding job is associated with cognitive benefits in later years, because
people who are highly educated take cognitively demanding jobs, and people who are highly
educated have better cognitive skills.
2. Having a cognitively demanding job is associated with cognitive benefits in later years, but only
among men, not among women.
3. Having a cognitively demanding job is associated with cognitive benefits in later years, because
cognitive challenges build lasting connections in the brain.
4. Viewing violent television is associated with aggressive behavior because children model what
they see on TV.
5. Viewing violent television is associated with aggressive behavior because people who watch more
violent TV have more lenient parents, and these lenient parents also do not care if their children
are violent.
6. Viewing violent television is associated with aggressive behavior very strongly among teenagers,
but less strongly among young adults.

36

Research Design in Counseling Psychology

Class 8

Analyses

Design selection
2

Three conditions for determining causality
1. Co-variation (i.e., correlation)
2. Temporal precedence
3. Ruling out alternative explanations (due to extraneous
3rd variable threats, i.e., internal validity)

Testing Causal Hypotheses


Review basic components of Experiments
o Independent variables
Manipulated
o Dependent variables
Measured
Three conditions of causality
1. Establishing covariation
2. Establishing temporal precedence
3. Establishing internal validity

Two kinds of designs


4

Testing Causal Hypotheses


Two kinds of designs that support causal claims

1. Independent-groups designs
o (i.e., between-groups or between-persons or BP designs)
o Different groups of participants are assigned to different levels of the
independent variable

2. Within-groups designs
o (i.e., within-persons or WP designs)
o One group of participants is assigned to (or presented with) all levels of the
independent variable
Enables researcher to treat each participant as his/her own control
5

Testing Causal Hypotheses


1. Independent-groups designs
o (i.e., between-groups or between-persons or BP designs)

Two basic forms of this design

1. Posttest only designs: random assignment and 1 posttest

   R (Randomly Assign)    IV: group 1    O2a: Measure of DV
   R (Randomly Assign)    IV: group 2    O2b: Measure of DV

   Test for covariation by detecting differences in the dependent variable;
   establish temporal precedence because IV precedes changes in DV; if study is
   conducted well (no design confounds, no selection effects), internal validity is
   established.

2. Pretest-posttest designs: random assignment & key DVs are measured twice, once before and once
   after exposure to the IV

   R (Randomly Assign)    O1: Measure of DV    IV: group 1    O2: Measure of DV
   R (Randomly Assign)    O1: Measure of DV    IV: group 2    O2: Measure of DV

   All above applies plus:
   Use pre-posttest design to evaluate whether random assignment made groups equal
   (relevant with small n studies); can better track change over time in each group
6

EXAMPLE
Study testing the effects of two kinds of praise on children's problem-solving effort:
1. Process praise: "you must have worked hard at these problems"
2. Person praise: "you must be smart at these problems"

Independent-groups designs
o (i.e., between-groups or between-persons or BP designs)

Two basic forms of this design

1. Posttest only designs: random assignment and 1 posttest

   R (Randomly Assign)    IV: process praise    O2a: # problems solved
   R (Randomly Assign)    IV: person praise     O2b: # problems solved

   [Figure: chart of # problems solved (axis 0 to 7) for the Process and Person groups]

2. Pretest-posttest designs: random assignment & key DVs are measured twice, once before and once
   after exposure to the IV

   R (Randomly Assign)    O1: # problems solved    IV: process praise    O2: # problems solved
   R (Randomly Assign)    O1: # problems solved    IV: person praise     O2: # problems solved

   [Figure: chart of # problems solved (axis 4.0 to 7.0) at Trial 1 (pre) and Trial 2 (post) for the Process and Person groups]

Posttest only designs vs. Pretest-posttest designs
o Which Design is Better?
o It depends...
o Posttest only design
combines random assignment with a manipulated IV, enabling
powerful causal conclusions
o Pretest-posttest design
Adds a pre-testing step; helps if you want to be sure that IV
levels are equivalent at pretesting (as long as the pretest doesn't
change behavior), and helps to more clearly map patterns of
change
8

Testing Causal Hypotheses


2. Within-groups designs
o (i.e., within-persons or WP designs)
o Concurrent-measures design
Participants are exposed to all levels of an IV at roughly the same time, and a
single DV measure is taken
o e.g., Harlow's study of attachment in baby monkeys
Two mothers are presented
IV: (mother type) A wire mother w/milk vs. A cloth mother w/no milk
DV: preference as measured by time spent clinging to either
o e.g., Coke v. Pepsi taste test

[Figure: One group is presented with Wire mom w/milk and Cloth mom; clinging behavior is measured]

Testing Causal Hypotheses


2. Within-groups designs
o (i.e., within-persons or WP designs)
o Repeated-measures design
Participants are measured on a DV more than once, after exposure to each
level of the IV
o e.g., Bick & Dozier's (2008) study of social bonding in new mothers
Two toddlers are presented and mothers instructed to interact
closely with them
IV: (toddler type) own toddler vs. different toddler
DV: Oxytocin levels in bloodstream (social neuropeptide central to human bonding)

[Figure: One Group: Interact w/own toddler, Measure oxytocin; Interact w/different toddler, Measure oxytocin]

10

Testing Causal Hypotheses


2. Within-groups designs
o (i.e., within-persons or WP designs)
o Advantages of Within-groups designs
Ensures participants in (or exposed to) all levels of the IV are equivalent.
Why ____________________________?
Gives the research study more (statistical) power to see differences across
conditions if they exist. Why ___________________? As per MAXMINCON,
when extraneous differences in demographic and other personality
variables, etc. are held constant across all levels, we can more easily detect
an effect of the IV manipulation if there is one.
These designs require fewer participants overall

11

Within-groups designs
(i.e., within-persons or WP designs)
Do within-group designs allow you to make causal
claims?

Covariation_____?
Temporal precedence____________?
Threats to internal validity_______________?

Potential threat to internal validity for WP designs = if being exposed to one condition changes how someone reacts to the other condition(s)
o Called: order effects or practice effects or carryover effects

Solution?
o Counter-balancing controls for order effects

[Figure: counterbalanced design. Participants are randomly assigned to one of two orders:
Order 1: Interact w/own toddler, Measure oxytocin, Interact w/different toddler, Measure oxytocin
Order 2: Interact w/different toddler, Measure oxytocin, Interact w/own toddler, Measure oxytocin]
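Counterbalancing can be sketched as randomly assigning participants to condition orders so that each order is used equally often. The participant labels, the sample size of 20, and the function name `counterbalance` are invented for illustration; the two orders follow the toddler example.

```python
import random
from collections import Counter

def counterbalance(participants, orders, rng):
    """Shuffle participants, then deal them out across the orders in turn,
    so each order gets the same number of people (when divisible)."""
    shuffled = participants[:]
    rng.shuffle(shuffled)
    return {
        person: orders[index % len(orders)]
        for index, person in enumerate(shuffled)
    }

orders = [
    ("own toddler", "different toddler"),
    ("different toddler", "own toddler"),
]
mothers = [f"mother_{i}" for i in range(1, 21)]

assignment = counterbalance(mothers, orders, random.Random(8))
counts = Counter(assignment.values())

for order, count in counts.items():
    print(" then ".join(order), ":", count, "participants")
```

Because half the sample meets each condition first, any order effect is spread evenly across conditions instead of being confounded with the IV.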

12

Testing Causal Hypotheses


Designing and evaluating studies that make a causal claim
(via the 4 big validities)

Construct validity
o How well were the variables measured and manipulated?

External validity
o To whom or to what can you generalize the causal claim?
To other people?
To other situations?

Statistical conclusion validity


o How well do your data support your causal conclusion?
1. Is the difference statistically significant?
2. How large is the effect?

Internal validity
o Are there (plausible) alternative explanations for the outcome?

13

Testing Causal Hypotheses


Designing and evaluating studies that make a causal
claim (via the 4 big validities)
Internal validity
o Are there (plausible) alternative explanations for the outcome?
o Three fundamental questions worth asking
1. Did the design of the experiment ensure there were no design confounds?
Or did some other variable accidentally covary along with the intended independent variable?

2. If the experimenters used an independent-groups design, did they control
for selection effects by using random assignment or matching?
3. If the experimenters used a within-groups design, did they control for
order effects by counterbalancing?

14

Threats to Internal Validity that can apply to an experiment

Many threats to validity of studies can be corrected for simply by adding a comparison group.

A few threats may apply to any intervention study/experiment

1. Observer bias
Possible in any study with behavioral/observed DVs
o Occurs when researchers' expectations influence their ratings/scores/interpretation of the results
Threatens internal validity (an alternative explanation now exists) and construct
validity (ratings/scores don't represent true scores)
Solution: ensure staff who measure the DV are blind to study hypotheses
2. Demand characteristics
A problem when participants guess what the study is supposed to be about &
change their behavior in the expected direction
Solution: conduct a double-blind study, where neither staff nor participants know
which condition they are in; at minimum, ensure staff are blind to condition
3. Placebo effects
Occur when participants improve after treatment, but only because they believe
they received an effective intervention
15

In-class activity #5: Article review
Prinz et al., 2009

Research question
o

Specific hypotheses

o
o

IV = __________________; # levels of the IV = _______________


Levels of the IV are:

DVs: # of DVs = ______; Specific DVs are:_______________, ___________, and ____________________

Design: ___________________________

Diagram the design

Describe the random assignment process. Who/what was randomized?

Who were the participants?

Describe the Triple P intervention condition

Were the hypotheses supported?

Did they acknowledge plausible threats to validity? What are some examples?
16

Beth Stormshak, Ph.D.


Professor, College of Education
University of Oregon

Implementation Science
An intervention is one thing
Implementation is something else altogether

According to NIH (2008):


The use of strategies to adopt and integrate evidence-based
health interventions and change practice patterns within and
across specific systems
Action Oriented
Within Settings or Systems
AND collects data
Chambers DA. Advancing the science of implementation: A workshop summary.
Administration and Policy in Mental Health and Mental Health Services Research.
2008;35(1-2):3-10.
3

1. We know a lot about what works
10K reviewed studies in What Works Clearinghouse
2. We are short on implementation action strategies to put what works into practice
3. It takes too long for research to affect practice

T1 (Type 1): The application of basic research findings to the development of interventions
T2 (Type 2): Investigates the process and mechanism through which tested and proven interventions are integrated into practice and policy
T1 research is more common, T2 research is more limited

The use of effective interventions without implementation strategies is like serum without a syringe; the cure is available, but the delivery system is not
Fixsen, Blase, Duda, Naoom, & Van Dyke, 2010.
Only a small percentage of interventions implemented by
community based delivery systems are evidence based.

Longitudinal Studies of a Variety of Comprehensive School Reforms

Effective Interventions              Actual Supports, Years 1-3                               Outcomes, Years 4-5
Every Teacher Trained                Fewer than 50% of the teachers received some training    Fewer than 10% of the schools used the CSR as intended
Every Teacher Continually Supported  Fewer than 25% of those teachers received support        Vast majority of students did not benefit

Aladjem & Borman, 2006; Vernez, Karam, Mariano, & DeMartini, 2006

17 Year Gap in Health Care

Is the gap between research and practice similar in education to that existing in health?
Types of Gaps?
As long as?
As important to shorten? Which way?
As resistant to change?

[Figure: Traditional Translation Pipeline, plotted against Real World Relevance. Labels: Preintervention; Efficacy Studies (Could a Program Work?); Effectiveness Studies (Does a Program Intervention Work?); Making a Program Work; Exploration; Adoption / Preparation; Implementation; Sustainment; Generalized knowledge; Local knowledge]
IOM 2009; Landsverk, Brown et al. 2012; Aarons et al., 2011
10

Intervention: Program, Practice, Policy, Principles


Practice Setting: Delivery Support System
Ecological System: Population and Community/Cultural Context

11

Preadoption
How do preferences for EBI impact consumer choices?
What are the key channels for stakeholders to obtain EBI
information?
Adoption
What are key market, organizational, and other factors influencing
adoption decisions?
What evidence is used by decision makers in the adoption phase?

Implementation
What are the most effective delivery systems for different
settings?
What influences consumer participation?
What are the factors that impact implementation quality?
Sustainability
What funding models are needed to sustain the program?
What are the effective leadership strategies for long-term
implementation?

The Baltimore City Public School System (BCPSS) has collaborated in 3 generations of education and prevention field trials.
Trials were directed at helping children master obeying rules of behaving, attending, academic learning, socializing appropriately in 1st grade classroom.
Interventions were tested separately in 1st generation (our focus today), then together in later trials.

15

Levels of Prevention and Treatment

[Figure: levels shown as Universal, Selective, Indicated, and Rx (Med, MH, Soc Welfare)]
16

Early Risk in Prevention Research


Over the last four decades much has been learned about early
risk factors and paths leading to drug abuse, and other
behavioral, mental health, and school problems.
Aggressive, disruptive behavior as early as 1st grade has been
repeatedly found to be a risk factor for later drug and alcohol abuse
and disorders, delinquency, violence, tobacco use, high risk sex,
school failure and other high risk behaviors.
Parenting interventions are among the most effective for
reducing aggressive behavior over time.

17

You have decided to implement your intervention in schools
What are the barriers?
What are the strengths?
How will you go about doing this?
Do you think you will be successful?

[Figure: intervention development cycle with the labels: Developmental & Measurement Models; Intervention Design & Experiment; Test and Tailor for Real World Conditions; Revise for Public Health Service Settings; Improved: Effectiveness, Efficiency, Expense, Ethics]

An Overview of the Family Check-Up and Follow-Up Services

[Figure: The Family Check-Up (FCU): Initial Interview; Assess Child & Family; Parent Feedback & Planning. Follow-up services: Brief, tailored PMT; PMT Treatment; Child CBT; Community Treatment Resources. Content areas: Mindful Parenting (proactive, Monitoring); Positive Behavior Support; Setting Healthy Limits; Family Relationship Building]

Project Alliance 1                            Portland Public Schools, 1995-present
Project Alliance 2                            Portland Public Schools, 2005-2010
Early Steps                                   Children involved in WIC, ages 2-10
Shadow Project                                American Indian families in PNW
Community Mental Health (CDC)                 CMH agencies in Portland, 120 families
Positive Family Support                       44 Oregon Middle Schools
Positive Family Support: Elementary school    5 Oregon Elementary schools

Service Systems Affecting Mental Health of Children and Adolescents

[Figure: service settings by Developmental Stage (Early Childhood, Childhood, Early Adolescence, Adolescence). Settings shown: WIC, Preschools; Public School Setting; Community Programs: Treatment and Rehabilitation]

Effects of the Early Childhood Family Check-up:
Average 2 Annual Sessions, 70% Engagement

Outcome Domain | Intervention Effects | Period of Development | Authors
Behavioral | Problem behavior | Age 2 to 4 | Shaw et al., 2006
Behavioral | Problem behavior | Age 2 to 7.5 | Dishion et al., 2013
Affective | Co-morbid depression | Age 2 to 4 | Connell et al., 2009
Affective | Maternal depression | Age 2 to 4 | Shaw et al., 2009
Parenting | Observed PBS | Ages 2 to 3 | Dishion et al., 2008
Parenting | Reduced coercion | Ages 2 to 4 | Smith et al., 2013
Cognitive/Educational | Improved effortful control and language | Ages 2 to 7 | Chang et al., in press
Cognitive/Educational | School readiness | Ages 2 to 7 | Brennan et al., 2013

Effects of the School-based Family Check-up:
Average 6 Sessions over 2 years and 25-50% Engagement

Outcome Domain | Intervention Effects | Period of Development | Authors
Behavioral | Antisocial behavior | Age 11 to 19 | Van Ryzin et al., 2012
Behavioral | Early drug use | Age 11 to 14 | Dishion et al., 2002
Behavioral | Drug (ab)use | Age 11 to 23 | Veronneau et al., in press
Behavioral | Problem behavior | Age 11 to 14 | Stormshak et al., 2010
Behavioral | High-risk sex | Age 11 to 22 | Caruthers et al., 2013
Affective | Depression | Age 11 to 15 | Connell et al., 2006
Affective | Depression | Age 11 to 14 | Fosco et al., in press
Parenting | Observed monitoring | Ages 11 to 14 | Dishion et al., 2003
Parenting | Reduced conflict | Ages 11 to 16 | Van Ryzin et al., 2012
Cognitive/Educational | Improved grades and attendance | Ages 11 to 17 | Stormshak et al., 2010

Phase 1
Exploration and Readiness:
1) Information/brochure, cost structure
2) Assessment process and review
3) Plan and scope

Phase 2
Installation:
1) Role definition
2) Priority and staging
3) Work site training
4) Technology transfer
5) Supervision training

Phase 3
Implementation consultation:
1) Ongoing COACH supervision
2) Feedback monitoring
3) Clinical outcome monitoring

Phase 4
Sustainability:
1) Certification of therapists
2) Certification of supervisors
3) Certification of agency
4) Plan for fidelity monitoring

Funding for this research supported by the Department of Education IES, grant R324A090111

Awarded to John Seeley, Ph.D., Tom Dishion, Ph.D., Beth Stormshak, Ph.D., & Keith Smolkowski, Ph.D.

Increased problem behavior

Increased peer group influence

Decreased attendance

Decreased parent involvement

Decreased academic performance

Robust evidence linking parenting practices and family engagement in school to positive outcomes for adolescents and young adults
Biglan et al., 2004; Dishion et al., 1996, 2002; Fosco et al., 2013; Henderson & Berla, 1994; Henderson & Mapp, 2002

According to a public health perspective:

Effective interventions should reach large numbers of people
Biglan, 1995; Biglan, Sprague, & Moore, 2006

Interventions should be designed to fit in or alter existing service-delivery systems
Hoagwood & Koretz, 1996

Schools are the largest, and often only, providers of child behavioral health services for many communities
Burns et al., 1995; Hoagwood et al., 2001, 2003

A school-based system to form effective partnerships with parents to support student success

What it is:

Strengths-based program
Integrated into PBIS tiers
Focused on family-school partnerships
Proactive
Inform, Invite, Involve parents in response to student needs
Foundation in empirically-supported strategies

Tier | School supports | Family supports
Indicated | Individualized Supports; Functional Behavioral Assessments | Family Check-Up; Parenting Support Sessions; Parent Management Training; Community Referrals
Selected | Specialized Supports; Check-In/Check-Out | Parent Integration CICO; Attendance & Homework Support; Home-School Beh Change Plans; Email and Text messages
Universal | School Rules & Expectations; Positive Reinforcement; Student Needs Screening | Family Resource Center; Parenting Materials (Brochures/Videos/Handouts); Positive Family Outreach; Proactive Parent Screening

Assist middle school staff as they implement Positive Family Support within their existing Positive Behavioral Interventions and Supports infrastructure.

Brochures, TV/DVD, Supplies

Meeting Table, Computer, Coffee/Danishes on counter

Use Home Incentives Plan
Check-In/Check-Out
Invite Parents to Join CI/CO

For teachers & family resource specialists
For parents and students (with teacher & family resource specialist help)
For teachers and parents

Tier I Family Support: Parent Student Readiness Screener
Parent Readiness Screener (school entry)
Teacher & Staff Readiness Screener (fall-spring)
School-Parent PBS plan
Family Check Up
Tailored Student & Family Support

Tier I Family Support: Parent Student Readiness Screener

A unidimensional, psychometrically sound parent screener

Linked with proximal attributes of student functioning (e.g., completes homework and assignments on time, shows up on time to school)
Moore et al. (2014)

An Overview of the Family Check-Up and Follow-Up Services

Tier III Family Support: The Family Check-up
The Family Check-Up: Initial Interview; Assess Child & Family; Parent Feedback & Planning
Follow-Up Services: Brief, tailored PMT; PMT Treatment; Child CBT; Community Treatment Resources

Dishion & Stormshak (2007); Dishion, Stormshak, & Kavanagh (2012)

Recruitment
All middle schools in Oregon implementing PBIS
invited to participate
Strict adherence to PBIS later revised due to
recruitment difficulties
Interested schools provided with personal visit to
explain project and implementation process
Schools randomly assigned to intervention or wait-list
control (N=41)
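
The slides do not describe how the random assignment was carried out. As a minimal sketch only, assuming simple randomization of whole schools, the step could look like the following. The school identifiers, the seed, and the `assign_schools` helper are hypothetical; the 21/20 split mirrors the intervention and control sample sizes reported later in the deck.

```python
import random


def assign_schools(school_ids, n_intervention, seed):
    """Randomly split schools into intervention and wait-list control groups.

    Hypothetical helper: the study's actual procedure is not given in the slides.
    """
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = list(school_ids)    # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return {
        "intervention": sorted(shuffled[:n_intervention]),
        "wait_list_control": sorted(shuffled[n_intervention:]),
    }


# Placeholder identifiers for the 41 participating schools (not real names).
schools = [f"school_{i:02d}" for i in range(1, 42)]
groups = assign_schools(schools, n_intervention=21, seed=2014)
print(len(groups["intervention"]), len(groups["wait_list_control"]))  # 21 20
```

Randomizing schools rather than students keeps every student in a building in the same condition, which matches a school-wide intervention.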

Workshops
Spring before implementation:
All staff introduction to PFS to increase school-wide
awareness and buy-in

Summer before implementation:
2-day training for core PFS staff to familiarize with goals
and develop learning community
Had to be revised due to drastic budget cuts throughout
implementation

Fall of implementation:
All staff training to increase positive communication with
parents

Consultation
Intervention schools provided two years of
consultation
Planned visits and requested assistance

Consisted of:
Modeling positive family interventions
Problem solving regarding when and how to involve families in
intervention
Integration of family involvement into existing school
interventions
Setting up family resource center
Provision of parenting resources (brochures, videos, books,
etc)
Increasing positive and proactive family outreach

Family-School Wide Evaluation Tool (FamSET)

Multi-method, multi-source assessment completed by trained assessor with appropriate middle school staff member
Maintains alignment with the School-Wide Evaluation Tool (SET; Horner et al., 2004)
Example items
Are parents contacted before a child's behavior gets out of hand? (1 = never, 4 = always)
At this school, do you offer family-based services or educational material? (1 = never, 4 = always)

Provided questionnaire to assess parents' perspectives
on student strengths and risk factors (U)
Number of resources available to families at school (U)
Parents contacted before a child's behavior got out of
hand (U)
Defined system for regular, positive contact with
families (U)
Provided assessment-based feedback about parenting
related to academics (S)
Worked directly with parents to support family
involvement in academic issues (S)
Offered family-based services or educational material
(U)
Parents had input into school-wide policies regarding
student discipline practices (U)
Asked parents to participate in positive reward
systems for targeted school behaviors (S)
Worked directly with parents to support positive
parenting practices (I)
Followed-up with parents about previously discussed
concerns (I)
School budget contained an allocated amount of
money for school-wide behavioral support (U)

[Figure: chart of the items above, axis from 0.0 to 3.0. Adapted from Brown et al., 2013]

80% of schools with the highest FamSET scores were in the intervention condition

60% of schools with the lowest FamSET scores were in the control condition

Implementation level | Poor Implementation | Adequate Implementation | Strong Implementation
Universal Level | 4.8% | 28.6% | 66.7%
Selected Level | 4.8% | 42.9% | 52.4%
Indicated Level | 23.8% | 71.4% | 4.8%
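
The percentages reported for the three implementation levels appear consistent with whole-school counts out of the 21 intervention schools. That mapping is an inference, not something the slides state, so the sketch below is only a quick arithmetic check; `percent_of_schools` is a hypothetical helper.

```python
N_INTERVENTION_SCHOOLS = 21  # intervention sample size reported in the deck


def percent_of_schools(count, total=N_INTERVENTION_SCHOOLS):
    """Percent of schools, rounded to one decimal as on the slides."""
    return round(100 * count / total, 1)


# Candidate school counts that would reproduce the reported percentages.
for count in (1, 5, 6, 9, 11, 14, 15):
    print(count, percent_of_schools(count))
```

Under this assumption, 4.8% corresponds to a single school and 66.7% to 14 schools, which is a reminder that each category rests on a small number of schools.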

Columns: Intervention School, Universal, Selected, Indicated, Overall (only Overall values shown)

Intervention School | Overall
Mad. | 7.67
CP | 7.67
Bro. | 7.33
Cof. | 7.33
HD | 7.33
Ro. | 7.33
BC | 7.00
RR | 7.00
Dam. | 6.33
AS | 6.00
Aza. | 6.00
CR | 6.00
WM | 6.00
Sha. | 5.67
WM | 5.67
Ast. | 5.33
Cre. | 5.00
Tal. | 5.00
DC | 4.33
Lin. | 3.33
Pio. | 2.00

Conditions During Implementation: National

School Year | Operating Expenditure per Student | Capital Expenditure per Student
2008-2009 | $9392 | $1364
2009-2010 | $9275 | $990
2010-2011 | $9363 | $777
2011-2012 | $9366 | $763
2012-2013 | $9364 | $556

From Oregon Department of Education, 2008-2013
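
To make the size of the budget change concrete, the per-student figures above can be turned into percent changes between the first and last school years. This is an illustrative calculation only; `percent_change` is a hypothetical helper, and the dollar values are copied from the slide.

```python
def percent_change(start, end):
    """Percent change from start to end, rounded to one decimal place."""
    return round(100 * (end - start) / start, 1)


# Per-student expenditures from the slide (2008-2009 and 2012-2013).
operating = {"2008-2009": 9392, "2012-2013": 9364}
capital = {"2008-2009": 1364, "2012-2013": 556}

operating_change = percent_change(operating["2008-2009"], operating["2012-2013"])
capital_change = percent_change(capital["2008-2009"], capital["2012-2013"])

print(operating_change)  # -0.3
print(capital_change)    # -59.2
```

On these figures, operating expenditure per student was essentially flat while capital expenditure per student fell by more than half over the period.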

Staff turnover | Principal | SST | SPED | Counselor
Highest FamSET Scores | 20% | 31.6% | 26.7% | 8.3%
Lowest FamSET Scores | 60% | 66.7% | 73.8% | 66.7%

Note. Percent turnover from year 1 to year 2, n=10

Table 1
FAM SET Implementation Findings
Control Schools
(n = 20)
Implementation Tier and Sample Items

Time 2/3a

Time 1
Mean
XX%b

Universal Implementation (range = 0-22)

10.65

Does your school have a room dedicated to parent or family services?

30%

45%

Did your school offer parent topic nights?

35%

50%

Selected and Indicated Implementation (range = 0-22)

16.15

Offer family-based assessments for students struggling academically or behaviorally?

45%

40%

Is there consistent follow-through on family support services discussed in team meetings?

90%

95%

Number of Resources Available to Families (range = 0-11)

1.30

Is there a family support person identified at the school?

25%

a Third

Intervention Schools
(n = 21)

SD

Mean
XX%b

4.95

14.25

3.76

1.95

18.90

3.67

Time 1

SD

Mean
XX%b

3.58

10.86

1.67

4.56

35%

assessment for Wave A and B schools; Second assessment for Wave C schools
b Item level data indicate the percent of schools implementing each intervention component

Time 2/3a

SD

Mean
XX%b

SD

4.36

18.86

2.35 1.58

23.8%

85.7%

1.85

38.1%

85.7%

1.22

19.71

1.77 0.47

33%

76.2%

1.51

71.4%

95.2%

0.40

15.38

1.48
19%

3.83

2.99

7.48
71.4%

3.69 0.96
1.28

School readiness assessment important component of the pre-implementation process

Implementation models rarely address the increased response cost to school staff of changing routines and expectations
More attention needed regarding how to reinforce school staff for implementation efforts

Interventions more likely to be sustained when implemented at the state or district level and supported with internal funds
High staff turnover often prohibits embedding interventions at the individual school level

Intervention implementation most effective when scaffolded and supported over a number of years
Funding for maintenance of implementation critical
