
Evaluation Approaches, Framework, and Designs

HS 490
Chapter 14
This chapter focuses on evaluation approaches, an evaluation framework, and
evaluation designs. House's (1980) taxonomy of eight evaluation approaches was
presented. No one approach is useful in all situations; therefore, evaluators should select
an approach, or parts of approaches, to structure the evaluation based on the needs of the
stakeholders involved with each program. The Framework for Program Evaluation in
Public Health presents a process that is adaptable to all health promotion programs, yet is
not prescriptive in nature.

The steps for selecting an evaluation design were also presented, with a discussion
of quantitative and qualitative methods. Evaluation design should be considered early
in the planning process. Planners/evaluators need to identify what measurements will be
taken, as well as when and how. In doing so, a design should be selected that controls for
threats to both internal and external validity.

Evaluation Framework
The evaluation framework can be thought of as the skeleton of a plan that can
be used to conduct an evaluation. It puts in order the steps to be followed.

Evaluation Design
An evaluation design is used to organize the evaluation and to provide for
planned, systematic data collection, analysis, and reporting. A well-planned evaluation
design helps ensure that the conclusions drawn about the program will be as accurate as
possible.

Evaluation Approaches
Categorizing the different evaluation approaches. A brief description of each is
presented here:

Systems analysis uses output measures, such as test scores, to determine if the
program has demonstrated the desired change. It also determines whether funds have
been efficiently used, as in cost analysis.

Behavioral objectives, or goal-based evaluation, uses the program goals and collects
evidence to determine whether the goals have been reached.

Decision making focuses on the decision to be made and presents evidence about
the effectiveness of the program to the decision maker (manager or administrator).

Goal-free evaluation does not base the evaluation on program goals; instead, the
evaluator searches for all outcomes, often finding unintended side effects.

Art criticism uses the judgment of an expert in the area to increase awareness and
appreciation of the program in order to lead to improved standards and better
performance.

Professional (accreditation) review uses professionals to judge the work of other
professionals; the source of standards and criteria is the professionals conducting the
review.

Quasi-legal evaluation uses a panel to hear evidence and consider the arguments
for and against the program; a quasi-legal procedure is used for both evaluating and
policy making.

Case study uses techniques such as interviews and observations to examine how
people view the program.

Systems Analysis Approach

A systems analysis approach to evaluation is based on efficiency: determining
which programs are the most effective. It focuses on the organization, determining
whether appropriate resources are devoted to goal activities (and to nongoal activities,
such as staff training or maintenance of the system).

Economic Evaluations

Economic evaluations are typical strategies used in systems analysis approaches.
They have been defined as the comparison of alternative courses of action in terms of
both costs and outcomes. The need to control rising health care costs has forced many
administrators and planners to be concerned about the cost of health promotion
programs.

Cost-identification Analysis (Cost-Feasibility)

Cost-identification analysis is used to compare the different interventions available
for a program, often to determine which intervention would be the least expensive. With
this type of analysis, planners identify the different items (e.g., personnel, facilities,
curriculum) associated with a given intervention, determine a cost for each item,
total the costs for that intervention, and then compare the total costs associated with each
of several interventions.
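The arithmetic behind such a comparison is simple enough to sketch. The Python fragment below is only an illustration; the interventions, cost categories, and dollar figures are hypothetical, not drawn from the chapter.

```python
# Illustrative sketch of a cost-identification (cost-feasibility) analysis.
# Interventions, cost items, and dollar amounts are hypothetical examples.

interventions = {
    "group classes": {"personnel": 12000, "facilities": 3000, "curriculum": 1500},
    "online modules": {"personnel": 4000, "facilities": 0, "curriculum": 6000},
    "one-on-one coaching": {"personnel": 20000, "facilities": 2000, "curriculum": 1000},
}

# Total the cost items for each intervention and rank from least to most expensive.
totals = {name: sum(items.values()) for name, items in interventions.items()}
for name, total in sorted(totals.items(), key=lambda pair: pair[1]):
    print(f"{name}: ${total:,}")
```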

Cost-benefit Analysis (CBA)

Cost-benefit analysis looks at how resources can best be used. It will yield the
dollar benefit received from the dollars invested in the program.
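As a rough illustration of the idea, the following sketch computes a net benefit and a benefit-cost ratio from hypothetical (made-up) figures; placing a dollar value on program outcomes is, of course, the hard part in practice.

```python
# Illustrative cost-benefit calculation with hypothetical figures.
program_cost = 50_000      # total dollars invested in the program
program_benefit = 120_000  # dollar value placed on the outcomes (e.g., averted medical costs)

net_benefit = program_benefit - program_cost
benefit_cost_ratio = program_benefit / program_cost

print(f"Net benefit: ${net_benefit:,}")
print(f"Benefit-cost ratio: {benefit_cost_ratio:.2f} dollars returned per dollar invested")
```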

Cost-effectiveness Analysis (CEA)

Cost-effectiveness analysis is used to quantify the effects of a program in
nonmonetary terms. It is more appropriate for health promotion programs than cost-benefit
analysis, because a dollar value does not have to be placed on the outcomes of the
program. Instead, a cost-effectiveness analysis indicates how much it costs to
produce a certain effect. For example, based on the cost of a program, the cost per year
of life saved, per smoker who stops smoking, or per unit change in morbidity or mortality
rates can be determined.
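A minimal sketch of that calculation, using hypothetical figures, might look like this; the outcomes stay in their natural units rather than dollars.

```python
# Illustrative cost-effectiveness calculation with hypothetical figures.
# Outcomes stay in natural units (quitters, years of life saved), not dollars.
program_cost = 80_000
smokers_who_quit = 64
years_of_life_saved = 150

print(f"Cost per smoker who quit: ${program_cost / smokers_who_quit:,.2f}")
print(f"Cost per year of life saved: ${program_cost / years_of_life_saved:,.2f}")
```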

Cost-utility Analysis (CUA)

This approach is different from the others in that the values of the outcomes of a
program are determined by their subjective value to the stakeholders rather than their
monetary cost. For example, an administrator may select a more expensive intervention
for a program simply because of the good public relations (i.e., the subjective value in the
administrator's eyes) it creates for the organization.

Behavioral Objectives, Goal-Attainment Approach, and Goal-Based Approach

Behavioral Objectives Approach

The most common type of evaluation model.

Focuses on the stated goals of the program.

Approaches using this type of goal-directed focus are also known as goal
attainment and goal based.

In the behavioral objective approach, the program goals serve as the standards for
evaluation. This type of evaluation was first used in education, to assess student
behaviors. Competency testing is an example of goal-attainment evaluation, determining
whether a student is able to pass an exam or advance to the next grade.
Objective-Oriented Approaches

Objective-oriented approaches are probably the most commonly used approaches
to health promotion program evaluation. They specify program goals and objectives and
determine whether they have been reached. Success or failure is measured by the
relationship between the outcome of the program and the stated goals and objectives.
This type of approach is based on action, and the dependent variable is defined in terms
of outcomes the program participant should be able to demonstrate at the end of the
intervention.

Goal Attainment/Goal Based

Emphasis is placed on how objectives are to be measured. This approach is also
found in business, where organizations use management by objectives to determine
how well they are meeting their objectives.

Five Steps in Measuring Goal Attainment:

1. Specification of the goal to be measured.
2. Specification of the sequential set of performances that, if observed, would
indicate that the goal has been achieved.
3. Identification of the performances that are critical to the achievement of the goal.
4. Description of the indicator behavior of each performance episode.
5. Collective testing to find whether the indicator behaviors are associated with each
other.

Decision-Making Approach
There are three steps to the evaluation process: delineating (focusing of
information), obtaining (collecting, organizing, and analyzing information), and
providing (synthesizing information so it will be useful).

The decision maker, usually a manager or administrator, wants and needs
information to help answer relevant questions regarding a program.

The four types of evaluation in this approach are 1) context, 2) input, 3) process,
and 4) product (CIPP), with each providing information to the decision maker.

Context evaluation describes the conditions in the environment, identifies unmet
needs and unused opportunities, and determines why these occur.

Input evaluation is used to determine how to use resources to meet program goals.

Process evaluation provides feedback to those responsible for program
implementation.

Product evaluation is used to measure and interpret attainments during and after the
program.

Goal-Free Approach

Suggests that evaluation should not be based on goals in order to enable the
evaluator to remain unbiased. The evaluator must search for all outcomes, including
unintended positive or negative side effects. Thus, the evaluator does not base the
evaluation on reaching goals and remains unaware of the program goals.

The goal-free approach is not often used in evaluation. It is difficult for
evaluators to determine what to evaluate when program objectives are not to be used.
One concern is that evaluators will substitute their own goals, since there is a lack of
clear methodology as to how to proceed.

Management-Oriented Approaches
Management-oriented approaches focus on identifying and meeting the
informational needs of managerial decision makers. That is, good decision making
depends on good evaluative information. In this approach, the evaluators and managers
work closely together to identify the decisions that must be made and the information
needed to make them. The evaluators then collect the necessary data about the
advantages and disadvantages of each decision alternative to allow for fair judgment
based on specified criteria. The success of the evaluation rests on the quality of the
teamwork between evaluators and decision makers.

CIPP
The acronym CIPP stands for the four types of decisions facing managers: context,
input, process, and product. Context evaluation describes the conditions in the
environment, identifies unmet needs and unused opportunities, and determines why these
occur. The purpose of input evaluation is to determine how to use resources to meet
program goals. Process evaluation provides feedback to those responsible for program
implementation. The purpose of product evaluation is to measure and interpret
attainments during and after the program. It is the decision maker, not the evaluator, who
uses this information to determine the worth of the program.

Consumer-Oriented Approaches
Consumer-oriented approaches focus on developing evaluative information on
products, broadly defined, and accountability, for use by consumers in choosing among
competing products. This approach gets its "consumer-oriented" label, in part, from
the fact that it is an evaluation approach that helps protect the consumer by evaluating
products used by the consumer. The consumer-oriented approach, which is summative
in nature, primarily uses checklists and criteria to allow the evaluator to collect data that
can be used to rate the product. This is the approach used by Consumer Reports when
evaluating various consumer products, by principals when evaluating their teachers, and
by instructors when evaluating their students' skill in performing cardiopulmonary
resuscitation (CPR). It is an approach that has been used extensively in evaluating
educational materials and personnel.

The highest level of checklist in the hierarchy is a COMlist. A COMlist is a
checklist composed of the criteria that essentially define the merit of the product. For
example, what are the criteria that define an excellent health promotion program, an
outstanding program facilitator, or exemplary instructional materials for a program? The
criteria of merit (COM) are identified by being able to answer the question: "What
properties are parts of the concept (the meaning) of a good X?" Thus, if we were to
apply this to program planning, we would ask the questions "What are the criteria that
define an excellent health promotion program?" or "What are the qualities of an
outstanding facilitator?" or "What must be included for an instructional material to be
considered exemplary?"
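To make the idea concrete, here is a minimal sketch of scoring a program against a COMlist. The criteria and the ratings shown are hypothetical examples, not the criteria of merit the text has in mind.

```python
# Minimal sketch of rating a program against a checklist of criteria of merit (a COMlist).
# The criteria and the 0-2 ratings are hypothetical examples.

com_checklist = [
    "Based on an identified need of the priority population",
    "Guided by measurable goals and objectives",
    "Delivered by qualified facilitators",
    "Uses instructional materials appropriate to the audience",
    "Includes a plan for evaluation",
]

ratings = [2, 2, 1, 2, 0]  # 0 = absent, 1 = partially met, 2 = fully met

score = sum(ratings)
maximum = 2 * len(com_checklist)
print(f"Merit score: {score} out of {maximum}")
for criterion, rating in zip(com_checklist, ratings):
    if rating < 2:
        print(f"  Needs attention: {criterion}")
```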

Table 14.1 Comparison of the Goal-Attainment and Goal-Free Approaches

Goal-Attainment Approach
Have the objectives been reached?
Has the program met the needs of the target population?
How can the objectives be reached?
Are the needs of the program administrators and funding source being met?
Goal-Free Approach
What is the outcome of the program?
Who has been reached by the program?
How is the program operating?
What has been provided?

Expertise-Oriented Approaches
Expertise-oriented approaches, which are probably the oldest of the approaches to
evaluation, rely primarily on the direct application of professional expertise to judge the
quality of whatever endeavor is evaluated. Most of these approaches can be placed in
one of three categories: formal professional review systems, informal professional
reviews, and individual reviews. Formal professional reviews are characterized by
having: (1) a structure or organization established to conduct a periodic review; (2)
published standards (and possibly instruments) for use in such reviews; (3) a
prespecified schedule (for example, every five years) for when reviews will be
conducted; (4) opinions of several experts combining to reach the overall judgments of
value; and (5) an impact on the status of that which is reviewed, depending on the
outcome.

The most common formal professional review system is that of accreditation.
Accreditation is a process by which a recognized professional body evaluates the work
of an organization (e.g., a school, university, or hospital) to determine if such work
meets prespecified standards. If it does, then the organization is approved or accredited.
Examples of accreditation processes with which readers may be familiar are those of the
National Council for Accreditation of Teacher Education (NCATE), which accredits
teacher education programs, including health education programs, and the Joint
Commission on Accreditation of Healthcare Organizations (JCAHO), which accredits
various healthcare facilities.

Participant-Oriented Approaches
In all of the approaches presented so far in this chapter, the primary focus of each has
been on something other than serving the needs of the priority population. It is not that
those who use the previous approaches are unconcerned about the priority population, but
the evaluation process does not begin with the priority population. The participant-oriented
approaches are different. They focus on a process in which the involvement of
participants (stakeholders in that which is evaluated) is central in determining the
values, criteria, needs, data, and conclusions for the evaluation. In addition, their
characteristics of less structure and fewer constraints, informal communication and
reporting, and less attention to goals and objectives may be a drawback for those who
want a more formal, objective-type evaluation.
Fitzpatrick and colleagues (2004) have identified the following common elements of
participant-oriented approaches:
1. They depend on inductive reasoning. Understanding an issue or event or process
comes from grassroots observation and discovery. Understanding emerges; it is not
the end product of some preordinate inquiry plan projected before the evaluation
is conducted.
2. They use a multiplicity of data. Understanding comes from the assimilation of data
from a number of sources. Subjective and objective, qualitative and quantitative
representations of the phenomena being evaluated are used.
3. They do not follow a standard plan. The evaluation process evolves as
participants gain experience in the activity. Often the important outcome of the
evaluation is a rich understanding of one specific entity with all the idiosyncratic
contextual influences, process variations, and life histories. It is important in and
of itself for what it tells about the phenomena that occurred.
4. They record multiple rather than single realities. People see things and interpret
them in different ways. No one knows everything that happens in a school, or in
the tiniest program, and no one perspective is accepted as the truth. Because only
an individual can truly know what she has experienced, all perspectives are
accepted as correct, and a central task of the evaluator is to capture these realities
and portray them without sacrificing the program's complexity.

Table 14.2 Differences between conventional evaluation and participatory evaluation

Who
Conventional evaluation: External experts
Participatory evaluation: Community, project staff, facilitator

What
Conventional evaluation: Predetermined indicators of success, primarily cost and health outcomes or gains
Participatory evaluation: People identify their own indicators of success, which may include health outcomes and gains

How
Conventional evaluation: Focus on scientific objectivity, distancing evaluators from other participants; uniform, complex procedures; delayed, limited access to results
Participatory evaluation: Self-evaluation; simple methods adapted to local culture; open, immediate sharing of results through local involvement in evaluation processes

When
Conventional evaluation: Usually at completion; sometimes also midterm
Participatory evaluation: Merging of monitoring and evaluation; hence frequent small-scale evaluations

Why
Conventional evaluation: Accountability, usually summative, to determine if funding continues
Participatory evaluation: To empower local people to initiate, control, and take corrective action

Figure 14.1 Framework for Program Evaluation


Framework for Program Evaluation
Once evaluators have selected the approach or approaches that will be used in the
evaluation, they are ready to apply an evaluation framework. In 1999, an evaluation
framework to be used with public health activities was published. Since the framework is
applicable to all health promotion programs, an overview of it is provided here.

The early steps provide the foundation, and each step should be finalized before
moving to the next:
Step 1: Engaging stakeholders

This step begins the evaluation cycle. Stakeholders must be engaged to ensure
that their perspectives are understood. The three primary groups of stakeholders are 1)
those involved in the program operations, 2) those served or affected by the program, and
3) the primary users of the evaluation results. The scope and level of stakeholder
involvement will vary with each program being evaluated.

Step 2: Describing the program

This step sets the frame of reference for all subsequent decisions in the evaluation
process. At a minimum, the program should be described in enough detail that the
mission, goals, and objectives are known. Also, the program's capacity to effect change,
its stage of development, and how it fits into the larger organization and community
should be known.

Step 3: Focusing the evaluation design

This step entails making sure that the interests of the stakeholders are addressed
while using time and resources efficiently. Among the items to consider at this step are
articulating the purpose of the evaluation (i.e., gain insight, change practice, assess
effects, affect participants), determining the users and uses of the evaluation results,
formulating the questions to be asked, determining which specific design type will be
used, and finalizing any agreements about the process.

Step 4: Gathering credible evidence

This step includes many of the items mentioned in Chapter 5 of this text. At this
step, evaluators need to decide on the measurement indicators, sources of evidence,
quality and quantity of evidence, and logistics for collecting the evidence.

Step 5: Justifying conclusions

This step includes the comparison of the evidence against the standards of
acceptability; interpreting those comparisons; judging the worth, merit, or significance of
the program; and creating recommendations for actions based upon the results of the
evaluation.

Step 6: Ensuring use and sharing lessons learned

This step focuses on the use and dissemination of the evaluation results. When
carrying out this final step, concern must be given to each group of stakeholders.

In addition to the six steps of the framework, there are four standards of
evaluation. These standards are noted in the box at the center of Figure 14.1. The
standards provide practical guidelines for the evaluators to follow when having to decide
among evaluation options. For example, these standards help evaluators avoid
evaluations that might be accurate and feasible but not useful, or useful and accurate but
infeasible.

The four standards are:

Utility standards ensure that the information needs of evaluation users are satisfied.
Feasibility standards ensure that the evaluation is viable and pragmatic.
Propriety standards ensure that the evaluation is ethical (i.e., conducted with
regard for the rights and interests of those involved and affected).
Accuracy standards ensure that the evaluation produces findings that are
considered correct.

Selecting an Evaluation Design

As noted in the section above, evaluators must give careful consideration to the
evaluation design, since the design is critical to the outcome of the program.

There are few perfect evaluation designs, because no situation is ideal and there
are always constraining factors, such as limited resources. The challenge is to devise an
optimal evaluation, as opposed to an ideal evaluation. Planners should give much
thought to selecting the best design for each situation.

The following questions may be helpful in the selection of a design:
How much time do you have to conduct the evaluation?
What financial resources are available?
How many participants can be included in the evaluation?
Are you more interested in qualitative or quantitative data?
Do you have data analysis skills or access to computers and statistical
consultants?

In what ways can validity be increased?

Is it important to be able to generalize your finding to other populations?

Are the stakeholders concerned with validity and reliability?

Do you have the ability to randomize participants into experimental and control
groups?

Do you have access to a comparison group?

There are four steps in choosing an evaluation design. These four steps are
outlined in Figure 14.2.
Step 1
The first step is to orient oneself to the situation. The evaluator must identify
resources (time, personnel), constraints, and hidden agendas (unspoken goals). During
this step, the evaluator must determine what is to be expected from the program and what
can be observed.

Step 2
The second step involves defining the problem: determining what is to be
evaluated. During this step, definitions are needed for independent variables (what the
sponsors think makes the difference), dependent variables (what will show the
difference), and confounding variables (what the evaluator thinks could explain
additional differences).

Step 3

The third step involves making a decision about the design: that is, whether to use
qualitative or quantitative methods of data collection, or both.

Quantitative Method
The quantitative method is deductive in nature (applying a generally accepted
principle to an individual case), so that the evaluation produces numeric (hard) data, such
as counts, ratings, scores, or classifications. Examples of quantitative data would be the
number of participants in a stress-management program, the ratings on a participant
satisfaction survey, and the pretest scores on a nutrition knowledge test. This approach is
suited to programs that are well defined and that compare outcomes of programs with
those of other groups or the general population. It is the method most often used in
evaluation designs.

Qualitative Method
The qualitative method is an inductive method (individual cases are studied to
formulate a general principle) and produces narrative (soft) data, such as descriptions.
This is a good method to use for programs that emphasize individual outcomes or in
cases where other descriptive information from participants is needed.

Figure 14.2 Steps in Selecting an Evaluation Design


Figure 14.3 Methods Used in Evaluation
Case studies- in-depth examinations of a social unit, such as an individual, family,
household, worksite, community, or any type of institution as a whole

Content analysis- a systematic review identifying specific characteristics of messages

Delphi technique- see Chapter 4 for an in-depth discussion of the Delphi technique

Elite interviewing- interviewing that focuses on a certain type (elite) of respondent

Ethnographic studies- a variety of techniques (participant-observer, observation,
interviewing, and other interactions with people) used to study an individual or group

Films, photographs, and videotape recording (film ethnography)- includes the data
collection and study of visual images

Focus group interviewing- see Chapter 4 for an in-depth discussion of focus group
interviewing

Historical analysis- a review of historical accounts that may include an
interpretation of the impact on current events

In-depth interviewing- a less structured, deeper interview in which the
interviewees share their view of the world

Kinesics- the study of body communication

Nominal group process- see Chapter 4 for an in-depth discussion of the nominal
group process

Participant-observer studies- those in which the observers (evaluators) also
participate in what they are observing

Quality circle- a group of people who meet at regular intervals to discuss
problems and to identify possible solutions

Unobtrusive techniques- data collection techniques that do not require the direct
participation or cooperation of human subjects, such as unobtrusive observation,
review of archival data, and study of physical traces

Rather than choose one method, it may be advantageous to combine quantitative
and qualitative methods. Steckler and colleagues (1992) have discussed integrating
qualitative and quantitative methods, since, to a certain extent, the weakness of one
method is compensated for by the strengths of the other. Figure 14.4 illustrates four ways
that the qualitative and quantitative methods might be integrated.

Figure 14.4 Four Ways to Integrate Quantitative and Qualitative Methods (Methods 1-4)

Experimental and Control Groups

Experimental group- the group of individuals participating in the program that is
to be evaluated. The evaluation is designed to determine what effects the program has on
these individuals.

Control group- should be as similar to the experimental group as possible, but the
members of this group do not receive the program (intervention or treatment) that is to be
evaluated.

Without the use of a properly selected control group, the apparent effect of the
program could actually be due to a variety of factors, such as differences in participants'
educational background, environment, or experience. By using a control group, the
evaluator can show that the results or outcomes are due to the program and not to those
other variables.

Since the main purpose of social programs is to help clients, the clients' viewpoint
should be the primary one. It is important to keep this in mind when considering ethical
issues in the use of control groups. Conner (1980) identifies four underlying premises for
the use of control groups in social program evaluation:

All individuals have a right to status quo services.

All individuals involved in the evaluation are informed about the purpose of the
study and the use of a control group.

Individuals have a right to new services, and random selection gives everyone a
chance to participate.

Individuals should not be subjected to ineffective or harmful programs.

Comparison Group
When participants cannot be randomly assigned to an experimental or control
group, a nonequivalent control group may be selected.

It is important to find a group that is as similar as possible to the experimental
group, such as two classrooms of students with similar characteristics or a group of
residents in two comparable cities. Factors to consider include:

Participants' age

Gender

Education

Location

Socioeconomic status

Experience

Any other variable that might have an impact on program results should also be considered.

Evaluation Designs
Measurements used in evaluation designs can be collected at three different times:
after the program; both before and after the program; and several times before, during,
and after the program.

Measurement is defined as the method or procedure of assigning numbers to
objects, events, and people.

Pretest- measurement before the program begins

Posttest- measurement after the completion of the program

Figure 14.5 Evaluation Designs


Experimental Design
Offers the greatest control over the various factors that may influence the results.
Uses random assignment to experimental and control groups with measurement of
both groups.

Quasi-experimental Design
Results in interpretable and supportive evidence of program effectiveness.
Usually cannot control for all factors that affect the validity of the results.
There is no random assignment to the groups, and comparisons are made between
experimental and comparison groups.

Non-experimental Design
Without the use of a comparison or control group, this design has little control over
the factors that affect the validity of the results.

The most powerful design is the experimental design, in which participants are
randomly assigned to the experimental and control groups. The difference between I.1.
and I.2. in Figure 14.5 is the use of a pretest to measure the participants before the
program begins. Use of a pretest would help ensure that the groups are similar. Random
assignment should equally distribute any of the variables (such as age, gender, and race)
between the different groups. Potential disadvantages of the experimental design are that
it requires a relatively large group of participants and that the intervention may be
delayed for those in the control group.
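Random assignment itself is mechanically straightforward. The following sketch, with a made-up participant roster, shows one simple way it could be done.

```python
# Minimal sketch of random assignment to experimental and control groups.
# The participant roster is hypothetical; any list of identifiers would work.
import random

participants = [f"participant_{i:02d}" for i in range(1, 41)]  # e.g., 40 enrollees

random.shuffle(participants)                   # randomize the order
midpoint = len(participants) // 2
experimental_group = participants[:midpoint]   # receives the program
control_group = participants[midpoint:]        # does not receive the program

print(len(experimental_group), "assigned to the experimental group")
print(len(control_group), "assigned to the control group")
```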

A design more commonly found in evaluations of health promotion programs is
the quasi-experimental pretest-posttest design using a comparison group (II.1 in Figure
14.5). This design is often used when a control group cannot be formed by random
assignment. In such a case, a comparison group (a nonequivalent control group) is
identified, and both groups are measured before and after the program. For example, a
program on fire safety for two fifth-grade classrooms could be evaluated by using pre-
and post-knowledge tests. Two other fifth-grade classrooms not receiving the program
could serve as the comparison group. Similar pretest scores between the comparison and
experimental groups would indicate that the groups were equal at the beginning of the
program. However, without random assignment, it would be impossible to be sure that
other variables (a unit on fire safety in a 4-H group, distribution of smoke detectors,
information from parents) did not influence the results.
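A small, hypothetical illustration of how such a design might be analyzed follows: compare the groups' pretest means to check comparability, then compare their pretest-to-posttest gains. The scores are invented for the example.

```python
# Illustrative analysis of a quasi-experimental pretest-posttest design with a
# comparison group (II.1 in Figure 14.5). All knowledge-test scores are hypothetical.
from statistics import mean

experimental = {"pretest": [62, 58, 70, 65, 61], "posttest": [85, 80, 92, 88, 84]}
comparison = {"pretest": [60, 63, 66, 59, 64], "posttest": [64, 66, 70, 61, 67]}

# Similar pretest means suggest the two groups started out roughly equal.
print("Pretest means:", mean(experimental["pretest"]), "vs", mean(comparison["pretest"]))

# Compare average pretest-to-posttest gains between the two groups.
exp_gain = mean(experimental["posttest"]) - mean(experimental["pretest"])
comp_gain = mean(comparison["posttest"]) - mean(comparison["pretest"])
print(f"Experimental gain: {exp_gain:.1f}  Comparison gain: {comp_gain:.1f}")
print(f"Gain attributable to the program (barring other threats): {exp_gain - comp_gain:.1f}")
```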

Figure 14.6 Staggered Treatment Design


Internal Validity
Internal validity is the degree to which the program caused the change that was
measured. Many factors can threaten internal validity, either singly or in combination,
making it difficult to determine if the outcome was brought about by the program or by
some other cause.

History occurs when an event happens between the pretest and posttest that is not
part of the health promotion program. An example of history as a threat to internal
validity is having a national antismoking campaign coincide with a local smoking
cessation program.

Maturation occurs when the participants in the program show pretest-to-posttest
differences due to growing older, wiser, or stronger. For example, in tests of muscular
strength in an exercise program for junior high students, an increase in strength could be
the result of muscular development and not the effect of the program.

Testing occurs when the participants become familiar with the test format due to
repeated testing. This is why it is helpful to use a different form of the same test for
pretest and posttest comparisons.

Instrumentation occurs when there is a change in the measuring instrument between
pretest and posttest, such as the observers becoming more familiar with or skilled in the
use of the testing format over time.

Statistical regression occurs when extremely high or low scores (which are not
necessarily accurate) on the pretest move closer to the mean or average score on the
posttest.
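Regression toward the mean can be demonstrated with a short simulation in which ability is held constant and only measurement error varies; the extreme scorers' posttest average drifts back toward the overall mean even though no program was delivered. The numbers below are purely illustrative.

```python
# Small simulation of statistical regression (regression toward the mean).
# True ability is constant; only random measurement error differs between tests.
import random

random.seed(1)
true_scores = [random.gauss(70, 10) for _ in range(1000)]
pretest = [t + random.gauss(0, 8) for t in true_scores]   # score = ability + error
posttest = [t + random.gauss(0, 8) for t in true_scores]

# Select the people with the most extreme (lowest) pretest scores.
lowest = sorted(range(len(pretest)), key=lambda i: pretest[i])[:50]

pre_mean = sum(pretest[i] for i in lowest) / len(lowest)
post_mean = sum(posttest[i] for i in lowest) / len(lowest)
print(f"Lowest-scoring group: pretest mean {pre_mean:.1f}, posttest mean {post_mean:.1f}")
# The posttest mean moves back toward the overall mean (about 70) with no program at all.
```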

Selection reflects differences in the experimental and comparison groups,
generally due to lack of randomization. Selection can also interact with other threats to
validity, such as history, maturation, or instrumentation, which may appear to be program
effects.

Mortality refers to participants who drop out of the program between the pretest
and posttest. For example, if most of the participants who drop out of a weight loss
program are those with the least (or the most) weight to lose, the group composition is
different at the posttest.

Diffusion or imitation of treatments results when participants in the control group
interact and learn from the experimental group. Students randomly assigned to an
innovative drug prevention program in their school (experimental group) may discuss the
program with students who are not in the program (control group), biasing the results.

Compensatory equalization of treatments occurs when the program or services are
not available to the control group and there is an unwillingness to tolerate the inequality.
For instance, the control group from the previous example (students not enrolled in the
innovative drug prevention program) may complain, since they are not able to participate.

Compensatory rivalry is when the control group is seen as the underdog and is
motivated to work harder.

Resentful demoralization occurs among participants receiving the less desirable
treatment compared to other groups, and the resentment may affect the outcome. For
example, an evaluation to compare two different smoking cessation programs may assign
one group (control) to the regular smoking cessation program and another group
(experimental) to the regular program plus an exercise class. If the participants in the
control group become aware that they are not receiving the additional exercise class, they
may resent the omission, and this may be reflected in their smoking behavior and attitude
toward the regular program.

The major way in which threats to internal validity can be controlled is through
randomization. With random selection of participants, random assignment to groups, and
random assignment of the type of treatment (or no treatment) to groups, any differences
between pretest and posttest can be interpreted as a result of the program. When random
assignment to groups is not possible and quasi-experimental designs are used, the
evaluator must make all threats to internal validity explicit and then rule them out one by
one.

External Validity (Generalizability)

The other type of validity that should be considered is external validity: the
extent to which the program can be expected to produce similar effects in other
populations. This is also known as generalizability. The more the program is tailored to a
particular population, the greater the threat to external validity, and the less likely it is
that the program can be generalized to another group.

Several factors can threaten external validity. They are sometimes known as
reactive effects, since they cause individuals to react in a certain way. The following are
several types of threat to external validity:

Social desirability occurs when the individual gives a particular response to try to
please or impress the evaluator. An example would be a child who tells the teacher she
brushes her teeth every day, regardless of her actual behavior.

Expectancy effect is when attitudes projected onto individuals cause them to act
in a certain way. For example, in a drug abuse treatment program, the facilitator may feel

that a certain individual will not benefit from the treatment; projecting this attitude may
cause the individual to behave in self-defeating ways.

Hawthorne effect refers to a behavior change due to the special status of those
being tested. This effect was first identified in an evaluation of lighting conditions at an
electric plant; workers increased their productivity when the level of lighting was raised
as well as when it was lowered. The change in behavior seemed to be due to the attention
given to them during the evaluation process.

Placebo effect causes a change in behavior due to the participants' belief in the
treatment.

Blind- a study in which the participants do not know which group (control or type of
experimental group) they are in.

Double blind- a study in which neither the participants nor the program planners
know which group the participants are in.

Triple blind- a study in which this information is not available to the participants,
planners, or evaluators.

It is important to select an evaluation design that provides both internal and
external validity. This may be difficult, since lowering the threat to one type of validity
may increase the threat to the other. For example, tighter evaluation controls make it
more difficult to generalize the results to other situations. There must be enough control
over the evaluation to allow evaluators to interpret the findings, while sufficient flexibility
in the program is maintained to permit the results to be generalized to similar settings.
