JUSTICE EVALUATION JOURNAL
2018, VOL. 1, NO. 2, 151–187
https://doi.org/10.1080/24751979.2018.1552083

Advancing “What Works” in Justice: Past, Present, and Future Work of Federal Justice Research Agencies

Thomas E. Feucht and Jennifer Tyson

National Institute of Justice, Office of Justice Programs, U.S. Department of Justice, Washington, DC, USA

ABSTRACT
Since the 1960s, research on crime, delinquency, and justice has achieved important
milestones regarding program evaluation. The field has made significant strides in
identifying and cataloging evidence-based programs, practices, and policies for juvenile
and criminal justice. These efforts have helped refine our definition of “evidence-based
programs.” Tracing the distinctive role that Federal science agencies have played in
determining what works and in advancing evidence-based approaches to crime and
justice, we highlight key milestones, distinctive features, and the changing landscape of
justice research over the past half-century. We extend our examination of current efforts
to discern future directions for evaluation and evidence work in our field. Our review of
a half-century of justice evaluation to build evidence-based approaches in juvenile and
criminal justice reveals an evolution in our field’s commitment to rigor, our standards of
evidence, and our notions of “what works.” Our review suggests important directions
for the future including the importance of program context, the trade-offs between
implementation fidelity and experimentation, and the added value of supporting
programs with decision-making tools and platforms. We close with some insights into
how current approaches to evaluation may further evolve and grow, especially in the
areas of implementation, program adaptation, and support for local capacity. The payoff
is a deeper understanding of the potential and the limitations of evaluation evidence to
determine what works and what doesn’t.

ARTICLE HISTORY
Received 7 November 2018
Accepted 21 November 2018

KEYWORDS
Design; evaluation; evidence; policy; research

CONTACT Jennifer Tyson jennifer.tyson@usdoj.gov

Disclaimer: Findings and conclusions reported here are those of the authors and do not necessarily reflect the official
policies or positions of the U.S. Department of Justice.
This work was authored as part of the Contributor’s official duties as an Employee of the United States Government
and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection
is available for such works under U.S. law.

Introduction
Across government and the private sector, a vigorous conversation endures about
how to best support evidence-based programs and policies across the public policy
landscape (Blumstein, 2013; Farrington et al., 2018; Nagin and Weisburd, 2013).
There is a granular level to this conversation, focused on whether a particular strategy
or program works (e.g., to prevent gang violence or to reduce recidivism), and what
can be done to improve a specific strategy or program (Drake, Aos, & Miller, 2009;
Lum, Koper, & Telep, 2011; Petersilia, 2004). There is also a broader aspect to the
conversation regarding how we should measure and take stock of our collective
efforts to produce, disseminate, and operationalize into practice the evidence of what
works—and what does not. This broader discussion certainly must inform efforts to
design and implement specific programs and subsequent program evaluations; but it
is a discussion designed to inform the direction of science and program evaluation
more generally. It is for this broader discussion that we have prepared this essay.
In criminal and juvenile justice, CrimeSolutions.gov has become a significant point
of reference for conversations about evidence-based programs and practices.
Established in 2011, the US Department of Justice’s CrimeSolutions.gov comprises
a broad, consistent, and transparent catalog of rigorous evidence for what works in
criminal justice programs and practices. Together with its sibling program in juvenile
justice, the Model Programs Guide (MPG), CrimeSolutions.gov has helped to raise the
public policy discourse about what works in the justice field to an unprecedented
level of scientific sophistication and precision.1
MPG and CrimeSolutions.gov are not the first efforts by the science agencies of
the US Department of Justice to provide a catalog of what works and what doesn’t.
We are confident they will not be the last. Indeed, these two programs are only
the most recent chapters in a story of criminal justice evidence-building that began at
least 50 years ago and is not at all near its conclusion.
This exploration of the history of evidence-based research is replete with
missed opportunities and paths not taken that may yet be viable lines for developing
and improving today’s programs. Finally, a full appreciation of the pace and diverging
lines of a specific research and development effort helps us to form realistic expecta-
tions for the progress we still hope to achieve.
We hope that, like a study of family history, this article provides a view into the
formative work of those who toiled years before us. When properly understood, these
glimpses help us to be wiser – and more humble – caretakers of a body of work that
was established decades earlier. If that wisdom informs our diligence, we may be able
to provide similar lessons for those who will certainly come after us.
Our essay begins 50 years ago and traces the significant developments by the science
agencies of the US Department of Justice to fulfill what was then a new mission of State
and local assistance to control and prevent crime and delinquency. Most importantly,
our essay outlines how establishing a tradition of rigorous evidence on what works has
been a long process marked by landmark achievements as well as failures and setbacks.
Though this process has been beset at times by changes in direction and missed oppor-
tunities that let the goal elude us for a time, it is ultimately a process of steadily
improving evidence – a process that continues to develop and grow.
Sections I and II of the article describe early stages in the process of pro-
ducing and cataloging evidence for what works in criminal justice. Much of

1 Data from Google Analytics demonstrate the reach of CrimeSolutions.gov: during FY 2018, more than 430,000 users
visited CrimeSolutions.gov. As of 31 October 2018, more than 22,000 individuals were subscribed to receive email
alerts whenever CrimeSolutions.gov posts a new Program or Practice.

this work dates back to the Federal agencies that predate the current science
agencies within the Department of Justice. Section III describes recent and
current activity by the National Institute of Justice and the Office of Juvenile
Justice and Delinquency Prevention, including establishing the Model Programs
Guide and CrimeSolutions.gov. Some work in the Department has been
described by Rodriguez (2018); this section fills in much more of the story
and describes important new directions in the way we compile, disseminate,
and translate justice program evaluation results. This includes an exploration of
recent work that goes beyond a simple “copy-and-paste” transplant of a pro-
gram or intervention into a jurisdiction exactly as it was developed, manual-
ized, and tested elsewhere. Section IV states the key findings from our review
of where justice evaluation has been and reflects on the complex array of
stakeholders, historical trajectories, unique jurisdictional quirks, political environ-
ments, and the plethora of interrelated and interdependent programs, strategies,
and priorities that is our justice system. The article concludes with a few add-
itional thoughts and recommendations.

The early years of modern criminal justice evidence


The 1960s were a time of rising fear of crime across the US. For the first time, the
issues of crime and public safety figured significantly in presidential campaigns
(Beckett & Sasson, 2008).2 By mid-decade, these fears had led to
action by the Federal government. Despite apprehensions about Federal involvement in
local crime matters, a Presidential Commission was established to seek new solutions
to the rising threat of local crime.
Alongside the growing fear of crime, however, emerged a bright optimism about
what science and technology could do to solve the problem of crime. This was the
generation that put a man on the moon: surely, the problem of rising crime could be
solved just as quickly.
In some ways, it is not surprising that the positivist spirit of social philosophers and
scientists from previous decades should find a comfortable new perch in the 1960s.
The modern discourse on Western criminal justice policy has well-established roots,
counting among its founding thinkers the great minds of the 18th and 19th centuries,
including Bentham, Beccaria, Durkheim, and others (Jacoby, Severance, & Bruce, 2004).
The work of these early thinkers, which shaped generations of researchers and theorists
in the US and elsewhere, found new resonance with the problem-solving science-
and-technology attitudes of the 1960s.
In the US in the 1960s, these deeply rooted scientific traditions were reinforced by
the emergence of a distinctive role for the Federal government in solving the persist-
ent problem of crime. In a context where crime and justice are primarily matters of
local control, the Federal government’s distinctive role as knowledge broker for
what works in criminal justice policy and practice claimed the high ground of science
for what would become a new division of the US Department of Justice, while trying

2 Republican Barry Goldwater is generally credited with introducing “law and order” rhetoric to presidential politics
during the 1964 Presidential campaign (Beckett & Sasson, 2008).

to steer clear of even the perception of Federal meddling in local affairs.3 This
“knowledge broker” role for justice policy finds its earliest voice in the 1960s, as part
of President Lyndon Johnson’s Great Society:
The problems of crime bring us together. Even as we join in common action, we know
there can be no instant victory. Ancient evils do not yield to easy conquest. We cannot
limit our efforts to enemies we can see. We must, with equal resolve, seek out new
knowledge, new techniques, and new understanding (Johnson, 1966).

The President’s Commission of 1967 would describe further this distinctive knowledge
broker role for the Federal government. The Commissioners noted that “[F]inancial and
technical assistance [from the Federal government] … should be only a small part of the
national effort to develop a more effective and fair response to crime” (President’s
Commission on Law Enforcement and Administration of Justice, 1967, p. xi, emphasis
added). But if not assistance grants, what should be the distinctive role of the Federal gov-
ernment? Among the Commission’s most profound assertions was the need for a national
research agenda on criminal justice and law enforcement. The Commission found that
there were “… many needs of law enforcement and the administration of criminal justice.
But what it has found to be the greatest need is the need to know” (President’s
Commission on Law Enforcement and Administration of Justice, 1967, p. 273).
In its recommendations, the Commission operationalized this need by calling for
the establishment of an independent Federal agency, “a National Foundation for
Criminal Research.” This independent agency would be linked to a new aid program
the Commission outlined elsewhere. It was essential, the Commission stated, that
these new Department of Justice undertakings should “embody a major research com-
ponent, if it is not simply to perpetuate present failures” in the administration of just-
ice and law enforcement (p. 277).
Following the Commission’s suggestion to develop a detailed plan for this criminal just-
ice research agency, the Department of Justice contracted with the Institute for Defense
Analyses (IDA) to draw up what effectively would be the blueprint for the new science agency
named and authorized in the Omnibus Crime Control and Safe Streets Act (1968). The
National Institute of Law Enforcement and Criminal Justice (NILECJ) would be housed
within the Law Enforcement Assistance Administration (the forerunner of today’s Office of
Justice Programs).4 In 1968, the IDA’s report (Blumstein, 1968) placed the first cornerstone
for what would be the distinctive Federal role for research work within a broader Federal
assistance mission to State and local criminal justice and law enforcement agencies:
[T]he principal responsibility for crime reduction and control will continue to rest
with State and local agencies and officials. The Federal Government should support

3 The apprehension of Federal meddling in local policing and justice affairs would be given voice by Congress.
Specifically: “Nothing in this title or any other Act shall be construed to authorize any department, agency, officer,
or employee of the United States to exercise any direction, supervision, or control over any police force or any other
criminal justice agency of any State or any political subdivision thereof” (Omnibus Crime Control and Safe Streets
Act, 1968, Section 809).
4 Today, the IDA might seem an unlikely think tank for such an undertaking, and in a recent essay, Blumstein (2018)
reflected on what seemed like a mismatch of the task to his focus on engineering and operations research. In the
1960s, however, the President’s Commission recognized the model that the Department of Defense could provide
for a research-driven Federal enterprise. Defense, the Commission noted, allocated up to 15% of its revenues for
research and development. This provided a compelling model for what the Commission envisioned for the work of
criminal justice and law enforcement.

them by helping to do those things they are unable to do wholly on their own, espe-
cially when the benefits can be generalized and shared widely, and also by fostering
dissemination of results developed by State and local agencies on their own. In
research and development, in particular, the Institute can perform a central role with-
out infringing on the prerogatives or authority of local agencies (p. 10).
Federal support through research on State and local criminal justice operations
seemed to the Commission a particularly compelling idea. Reflecting the practical
focus of the Commission’s report, the IDA blueprint for NILECJ paid particular atten-
tion to applied research, including program evaluation. IDA saw the evaluation work
of the new Federal research agency as a core function that would continue to inform
State and local practice:
Once an innovation is developed, it must be evaluated for its utility for adoption
elsewhere. The evaluation must consider the degree to which the intended effects are
achieved, but must also take into account any side effects that may be created … The
evaluation process must always continue after the initial introduction of an innovation in
order to assess its performance in the operating context. In view of continually changing
circumstances, it is important that the concept of continual evaluation become an
integral part of criminal justice operations (p. 13).

In this original “blueprint” for NILECJ, a template was established for how the Federal
government could lead State and local criminal justice efforts through research, particu-
larly through program evaluation. The template went so far as to point out what sorts
of evidence standards would matter most in this effort. The IDA report asserted:
Wherever possible in the evaluation process, experimental controls will be required in the
evaluation … the evaluation of both operational and equipment innovations requires
considerable sophistication in social experimentation and in the techniques of
experimental design (p. 16, emphasis added).

The report says little more about program evaluation or evaluation design; it devotes
more attention to setting the substantive focus for a research agenda on law enforcement,
courts, criminal justice technology, and other topics, and on the allocation of an expected
first-year research budget of $10 million across all these topics. But in a few short pages, the
report provided the blueprint for the Federal government’s research agency on criminal just-
ice. Furthermore, here at the beginning of the agency’s mission, the report would articulate
the sine qua non of experimental design in determining what works for criminal justice prac-
titioners and policy makers at the State and local level. (It would take decades for true experi-
mental evidence to receive more than passing attention in the work of the agency.)
Informed by the work of the President’s Commission and the IDA report, Congress
fashioned legislation that would establish the LEAA and its research and evaluation
agency: The National Institute of Law Enforcement and Criminal Justice (Omnibus
Crime Control and Safe Streets Act, 1968).5
Beyond instituting a significant program of Federal assistance, the Safe Streets Act
gave new voice to Congress’ desire to know what works in criminal justice policy

5 By the time Congress set to work to draft legislation to implement the Commission’s lofty designs, the notion of
research may have become something of an afterthought. One account describes how a single Congressman
introduced language to establish the NILECJ. His proposal was met with “massive indifference by senior members of
both parties” (Committee on Research on Law Enforcement and Criminal Justice, 1977, pp. 14–15).

and practice. Just a few years earlier, Congress had funded a small program of
Federal assistance to State and local criminal justice and law enforcement agencies
(Law Enforcement Assistance Act, 1965). In the 2.5 years leading to the 1968 Act,
DOJ’s Office of Law Enforcement Assistance (OLEA) had disbursed more than $20
million to support 359 projects. Now, in 1968, Congress expected some evidence of
the impact of these funds, and it instructed the NILECJ to conduct a study of what
had been learned:
Immediately upon establishment of the [Law Enforcement Assistance] Administration, it
shall be its duty to study, review, and evaluate projects and programs funded under the
Law Enforcement Assistance Act of 1965 (Institute of Criminal Law and Procedure,
1971, p. 1).

The 359 OLEA awards were made with explicit expectations for individual program
evaluation. However, we would know little about the activities of these very first
Federal criminal justice assistance projects if not for what was likely the very first
large-scale post hoc compilation of criminal justice evaluation results undertaken by
NILECJ (or anyone else). Awarded a NILECJ grant to assess the OLEA programs and
their evaluations, the Georgetown University Institute of Criminal Law and Procedure
submitted a final report over 1700 pages in length. It takes only a few pages to dis-
cern the main findings of the study: there was little evidence that could be obtained
to demonstrate efficacy or impact among the 359 criminal justice grants awarded by
OLEA (Institute of Criminal Law and Procedure, 1971).6
The Georgetown report states quite clearly that any expectations OLEA may have
had for learning what works through evaluation activities incorporated into these
grants were not met:
Evaluation is always a difficult area. In most projects the evaluation took the form of
submitting questionnaires to project participants and obtaining their opinions. Rarely did
a project report include an independent professional evaluation by an objective
consultant (p. 46).

The report continues:


Evaluation was a stated requirement of almost every OLEA training or demonstration
grant. Certainly, it would appear that OLEA desired a valid and reliable system for
evaluating the projects. OLEA projects rarely met this standard. Most evaluations
consisted of a questionnaire to the participants in the study asking them whether they
liked the project and whether or not they gained anything from it. And once in a while
the person answering the question was carried away by the spirit of the task. For
example, a letter of commendation was included in one final report as a type of
evaluation. The writer of the letter asserted that he did benefit from the seminar and
workshop because the surrounding hills and forest were so beautiful that he was
inspired (p. 132).

This lack of reliable evaluations likely prompted language in the 1973 amendments to the
Safe Streets Act directing LEAA to evaluate its grant projects. Like previous
authorizing language, the instruction to LEAA is fairly general: it does not stipulate
a need for specific evaluation methods or for evaluations conducted by

6 The report is worth reading, if only to find comfort in seeing that the challenges of criminal justice program
evaluation are not at all new.

independent evaluators (Committee on Research on Law Enforcement and Criminal
Justice, 1977, p. 15).
The desire for better practical evidence about what works was so great that it trig-
gered a significant effort to restructure the still-nascent LEAA in ways that would
make NILECJ’s evaluation work more central. In 1971, a law review article by American
Bar Association executive director Bert Early, entitled “National Institute of Justice: A
Proposal,” suggested “a new type of organization … for an accelerated program of
modernization of our system of law and justice.” Early goes on to state that, despite
some progress, key elements of our legal system remained missing: “Those elements
are focus, continuity, innovation, experimentation, and research, all melded under cap-
able direction and with adequate funding. The catalytic agency to synthesize these
elements can, in this writer’s judgment, be a National Institute of Justice.” In what
must be understood as underscoring the national significance of the issue, the fore-
word to Early’s paper was written by no less a figure than then Chief Justice of the United
States Warren Burger. In his foreword, the Chief Justice makes clear the intended scien-
tific function of a National Institute of Justice by comparing it to the National
Institutes of Health, describing the need to “revitalize the faltering machinery of
justice” (Early, 1971).
Early’s paper galvanized a justice community already in motion. In May 1972,
Senator Hubert Humphrey introduced Senate Bill 3612 “to establish a National
Institute of Justice, in order to provide a national and coordinated effort for reform of
the judicial system … and for other purposes” (A Bill to Establish the National
Institute of Justice, 1972). At a professional gathering in May, Chief Justice Burger
urged the ABA to take further action on the idea of a National Institute of Justice. The
ABA formed a task force and, later, a larger commission, which led to an ABA-spon-
sored national conference in December 1972. Reports from the conference noted “a
strongly shared sense that there should be a National Institute of Justice,” while
acknowledging some divergence as to the details of just what such an organization
would entail (Allen, 1973).
This momentum for a stronger criminal justice research function appears to have
stalled; no significant legislative reconfiguring of LEAA or NILECJ would occur until
1979 and the Justice System Improvement Act, which would replace LEAA with a new Office
of Justice Assistance, Research, and Statistics (Justice System Improvement Act, 1979).
NILECJ and LEAA exhibited a general enthusiasm for evaluation and produced
a steady stream of activities and reports regarding program evaluation. Unfortunately,
this enthusiasm – and the level of resources needed to sustain it – waxed and waned
over the coming decades.

The middle years: Defining and building evidence


The 1970s would bring the publication of NILECJ’s first comprehensive report detailing
the results of the agency’s first 5 years of research and evaluation projects; a LEAA
task force on evaluation policy; a 1977 assessment of the NILECJ by the National
Academy of Sciences; a 1978 report from NILECJ on the corpus of program evaluation
evidence to date; the genesis of the agency’s conference series on evaluation and
research; and an intriguing 1980 report on early experiments on delinquency preven-
tion funded by the (then only recently authorized) Office of Juvenile
Delinquency Prevention (OJJDP). The section concludes with a review of the 1980s,
when declining budgets and other factors slowed the work of Federally sponsored
criminal justice evaluation just when a commitment to true experimental design
seemed finally to take hold.

1974 compilation of projects from NILECJ’s first 5 years


In its first decade, NILECJ amassed an impressive body of work. A 1974 inventory of all
the grants and contracts awarded to date by the NILECJ reveals a great deal about
the practical focus of the agency. Many of these awards aimed to use evaluation to
improve the practices and policies of State and local criminal justice and law
enforcement agencies. In its very first year, along with the grant for the comprehen-
sive study of the earlier OLEA projects (described above), the agency commissioned
scientific work that would improve modern court management; inform the design for
the optimal law enforcement communication system for the tri-state area surrounding
Chicago, Illinois; improve local assessment and planning in state correctional agencies;
study the optimal features of a police patrol car; examine the role and function of
States Attorneys General; conduct a longitudinal study of psychological factors in
police assessment and performance; initiate a study of bail reform; conduct basic
research on urban design and urban street behavior; develop and help implement a
“behavioral systems” approach to prevent and control delinquency; and many other
studies (National Institute of Law Enforcement and Criminal Justice, 1974).
In the agency’s first year (1969) alone, NILECJ initiated 100 research projects. The
array of topics and study designs signals what must have been an urgently felt need
for answers to so many practical criminal justice questions. Many of these initial grants
were very small, as little as $3000. Judging by the number of reports accessioned by
the National Criminal Justice Reference Service (NCJRS), a great deal of research activ-
ity was acquired at little cost (Law Enforcement Assistance Administration, 1969).
The rigor of all this activity, however, was at best dubious. A NILECJ evaluation
report in 1978 would later summarize the subset of these early projects that focused
on evaluation (discussed below). Many of the reports from these early evaluation
grants are not available online, so their findings and scientific rigor are sometimes dif-
ficult to characterize. Those we have suggest that many of these studies lacked meth-
odological rigor. Few of these studies involved anything remotely resembling a true
random assignment experiment, even when the explicit purpose seemed to be an out-
come evaluation. For example:
The CCT concept was evaluated over a period of a year. Types of measurements and
statistical evaluation techniques are described. The experimental results are interpreted as
significant due to the consistency of the results. The CCT performed in an outstanding
manner when compared to the bulk of the Syracuse Police Department. The performance
might have been the result of using above average policemen, superior leadership, the
Hawthorne effect, or a combination of these and other factors. The fact remains that it
has been demonstrated that the effectiveness of the municipal police can be increased
significantly without an increase in manpower or financial resources (Elliott & Sardino,
1970, emphasis added).

Some have observed that the ambitious goal of the Safe Streets Act to learn what
works in criminal justice had to rely on a research community not yet prepared for
the challenge. Universities had neither the research staff nor the empirical literature
with which to undertake such an ambitious mission.7 Even early on, however, critics
knew that poor research could lead to ineffective programs and bad program out-
comes. A 1973 Columbia University symposium concluded that the chaos in LEAA’s pro-
gram activities stemmed chiefly from a lack of useful evaluations of past programs. In
other words, those who do not effectively evaluate the flawed programs of the past
are doomed to fund those same flawed programs in the future (Law Enforcement
Assistance Administration: A Symposium on Its Operation and Impact:
Conclusion, 1973).

1974 LEAA evaluation policy task force


As the early evaluation projects were accruing at NILECJ, the Administrator of the
LEAA formed a task force to provide a more deliberate roadmap for program evalu-
ation. Completed in about 5 months, “The Report of the LEAA Evaluation Policy Task
Force” is a thoughtful and detailed analysis of the path NILECJ and its LEAA partner
agencies would need to follow if answers about what works were to be consistently
produced (Law Enforcement Assistance Administration, 1974).
The task force comprised Federal employees from LEAA and elsewhere along with
representatives from State Planning Agencies (SPAs), the primary recipients of LEAA
assistance formula and block grant funds. The inclusion of non-Federal experts (who
were also recipients of major LEAA formula funds) is a quaint reminder of a time
when such collaborations occurred frequently – and with not a little naïve innocence
about conflicts of interest. However, this spirit of inclusiveness did not extend to aca-
demics: the task force membership included none. Partly, this omission may signal the
under-developed academy of criminal justice research at that time. More likely, it sug-
gests NILECJ’s lack of rapport with the emerging university research community, since
it stands in sharp contrast to the more academic study of NILECJ just a few years later
by the National Academy of Sciences – a review committee replete with highly quali-
fied criminologists and other researchers with an interest in justice issues.

1977 NAS study on NILECJ: A deeper look at the work of NILECJ


As if to provide a poignant (and more academic) counterpoint to the task force’s own
report, the National Academy of Sciences (NAS) would succinctly summarize the work
of the LEAA evaluation task force in its 1977 assessment of NILECJ (Committee on
Research on Law Enforcement and Criminal Justice, 1977).8 The NAS underscores sev-
eral key issues from the evaluation task force report – and program evaluation at

7 See Sherman et al., 1997, “Preventing Crime: What Works, What Doesn’t, What’s Promising” for some thoughtful
observations regarding these early investments in evaluation.
8 The study was funded through a 1976 NILECJ grant to the NAS.

NILECJ more generally – as the agency approached its 10-year anniversary. First, prior
to the time of the task force, program evaluation was not an institutionalized function
of the agency. Not until NILECJ Director Gerry Caplan established the Office of
Evaluation in 1973 was there a section of the agency with the explicit responsibility
for determining what works in criminal justice practices and policies. Second, the task
force aspired to pivot from the weak evaluation designs of the past (what the task
force had characterized as “phase 1” evaluations) to more rigorous “phase 2” evalua-
tions to be undertaken by NILECJ (though the NAS found no evidence that greater
rigor was occurring or would be achieved any time soon). Third, in addition to the fun-
damental work of conducting evaluations, the task force articulated the additional
work of disseminating results and improving evaluation methods by calling for three
distinct evaluation-related offices within NILECJ – for conducting evaluation, dissemi-
nating evidence, and strengthening methods.
The perspective of just a few years provided the NAS analysis with additional insights
into the efforts of NILECJ and the task force. Chief among these is what the NAS
described as the very real tension between, on the one hand, agencies in LEAA that
operate programs and make assistance grants for implementing the programs, and on
the other, the agency (NILECJ) evaluating these assistance programs, particularly when
the evaluations aspired to get at actual impacts of the program:
[The] main goals [of a program assistance agency] in the bureaucratic interactions that
occurred were the maintenance of the good will of the local agencies and the development,
within those agencies, of the feeling of “ownership” of the innovation being demonstrated.
[The Office of Evaluation], on the other hand, was charged by the director with the rigorous
task of structuring a controlled experiment that would result in a reasonably definitive study
of a hypothesis. Furthermore, a demonstration program used as dissemination implies some
value to the technique that is being disseminated. A field experiment, hypothesis testing,
implies that the technique in question is just that – “in question.” The potential for a great
deal of tension between the two offices was present (Committee on Research on Law
Enforcement and Criminal Justice, 1977, p. 152).

It seems that the enthusiasm of the LEAA evaluation task force for a comprehensive
institutionalized structure for evaluation at NILECJ would be short-lived. Beyond the NAS
report of 1977, few further references would be made to the cornerstone proposals of
the task force for the “National Evaluation Program,” the “Model Evaluation Program,” or
its “phase 1/phase 2” approach to building evidence of what works in criminal justice and
law enforcement. NILECJ Director Caplan would indeed establish three offices within the
agency to do the complex work of program evaluation; but not much else of the task
force recommendations seems to have been institutionalized.
Looking beyond program evaluation, the NAS report articulated a bold new vision
for the research agency of the US Department of Justice. It called for a more theoret-
ical approach to understanding crime (not merely evaluating programs) and envi-
sioned an NILECJ that was organized around crime problems rather than one that
focused on solutions (i.e., programs, practices, or policies for purposes of evaluation).
In a section addressing the issue of the agency’s “usefulness,” the NAS report draws a
clear distinction between NAS’s ambitions for the agency and those that called for
practical, applied knowledge-building:

Although the Institute may have been helpful to some practitioners directly, we have
found no evidence of a productive relationship … As a general matter, therefore, the
Committee finds that the Institute has not met its service responsibilities under the Safe
Streets Act. We have already commented on the unrealistic expectations that characterize
that legislation, especially the expectation that the Institute can and should undertake to
provide immediate solutions to problems of crime. The fact that the Institute has not
achieved visibility within the larger LEAA and practitioner community may be due, at
least in part, to these unrealistic expectations (pp. 72–73).

It seems that as the end of the 1970s approached, a greater body of research about
crime and justice was growing outside of NILECJ. NAS criticized the NILECJ for failing
to build “cumulative knowledge” that could find its place within this larger research
context. This failure was, according to the NAS, a result of the agency’s overly practical
focus on providing applied evidence to the problem du jour:
[T]he demand that every piece of research have immediate usefulness can lead to a narrowly
conceived program of applied development or a program heavily weighted toward
immediate-solution research … [T]he Committee found little evidence that the Institute has
been committed to cumulative research. Too many research projects appeared isolated,
developed without any historical context. This is particularly disturbing because the Institute
has repeatedly professed to have committed itself to a coherent research agenda … [The
committee] was unable to locate any evidence of a multiple approach to particular
problems … [T]he subcommittee was not able to determine what orchestration, if any, was
taking place for the purpose of filling out an area of knowledge (p. 77).

The report is replete with criticisms that challenged the assumptions inherent in the Safe
Streets Act that established the agency. Specifically, the report criticizes NILECJ for focusing
too much on narrow applied research questions and not enough on more basic empirical
questions like the causes of crime, deterrence, rehabilitation, and socialization to crime.
The hindsight of decades might suggest that a dialectic had emerged, pitting the clas-
sic perspective of criminology and the other traditional social sciences against the upstart
applied sciences of criminal justice and public policy. If nothing else, the pivotal 1977 NAS
report would mark the beginning of nearly 20 years when it would seem that NILECJ
(and its progeny, the National Institute of Justice) would wander the wilderness, searching
for a path by which research could once again inform practice and policy.

Ten years of evaluation: Taking stock in 1978


Ten years from the establishment of NILECJ, it was time to take stock. “How Well Does
It Work: Review of Criminal Justice Evaluation” (National Institute of Law Enforcement
and Criminal Justice, 1978) surely intended the question to be asked of the programs
of State and local criminal justice agencies that NILECJ had spent a decade evaluating;
but it could as well be asked of the agency itself. According to some who worked
extensively with NILECJ in the late 1970s, “How Well Does It Work” was intended to
be the first in a series of annual catalogs of evidence from program evaluations.9 The

9 This insight was provided by Michael Tonry, who served as the long-time editor of what quickly became NILECJ’s
annual flagship research publication, Crime and Justice, which was also begun in 1978. Tonry has stated that, at the
time of its inception, “How Well Does It Work” was envisioned to be the annual evaluation counterpoint to C&J’s
theory-driven criminological research volumes.

fact that it was produced only once speaks volumes to the shifting ground on which
stood the NILECJ’s evaluation work – and the slow accretion of evaluation evidence to
inform the work of LEAA and the field of criminal justice.
“How Well Does It Work” was NILECJ’s first definitive portrayal of program evalu-
ation findings from its own research grants. Now a decade into the evaluation busi-
ness, NILECJ set about to synthesize the evidence in an array of topics: law
enforcement, corrections, prevention, courts, technology, and several others.
Unfortunately, none of the syntheses was particularly compelling. Once again, in chap-
ter after topical chapter, the 1978 report only served to underscore how little was
known, what little ground had been covered adequately, and how much more evi-
dence was needed.
The 1978 report is of historical note, but it provides very little of substance.
Decades later, it may have served as a model for the more heralded “What Works,
What Doesn’t and What’s Promising” (Sherman et al., 1997, discussed in a later section)
in how it organizes the body of evaluation results around topics such as law enforce-
ment, community crime prevention, juvenile delinquency, courts, corrections, and
others. But even in this, the 1978 volume is primarily aspirational: with only a few
exceptions, such as delinquency research and some sub-topics of policing, the chap-
ters are more laments of what we don’t know than proclamations of what we do.
Although the report concludes with a chapter entitled “Where Do We Go From Here,”
the reader is struck by the ennui, the creeping sense of dissolution that leaks from the
pages. It is as if the authors, knowing what little had been achieved in 10 years of pro-
gram evaluation, despaired of ever learning what works to prevent crime. The report
clearly echoed concerns that “nothing works,” a sentiment famously attributed to
work about the same time by Martinson and others (Cullen, 2013; Lipton, Martinson, &
Wilks, 1975; Martinson, 1974).10
Before examining the pivotal work undertaken by NILECJ (and its successor agency,
the National Institute of Justice) in the 1980s and 1990s, it is important to note the
emergence in the 1970s of a new Federal agency focused on justice research: the
Office of Juvenile Justice and Delinquency Prevention. One of this new agency’s ear-
liest projects reported on some of the best justice evaluation science of the preced-
ing decades.

1980 OJJDP delinquency prevention report


While NILECJ was taking stock of its earliest years, plans were formed for a similar
Federal effort focused on juvenile justice and delinquency prevention.11 The Office of
Juvenile Justice and Delinquency Prevention (OJJDP) came equipped with a research

10 But also see Cullen and Gendreau (2001) for an account of moving beyond “nothing works.”
11 Decades of Federal efforts to address juvenile delinquency preceded establishment of OJJDP. A President’s
Commission on Juvenile Delinquency and Youth Crime was established in 1961, years before Johnson’s Crime
Commission. Subsequent legislation in 1968 and 1972 would address Federal aid for juvenile justice and
delinquency prevention, including research; however, these functions formed first within the Department of Health,
Education, and Welfare. The 1973 Crime Control Act directed LEAA to designate juvenile justice as a top priority,
and LEAA established a Juvenile Delinquency Division in 1973 and a Juvenile Justice Division in early 1974. Finally,
in 1974, Congress authorized the Office of Juvenile Justice and Delinquency Prevention within the Department of
Justice’s LEAA (Juvenile Justice and Delinquency Prevention Act, 1974).

mandate similar to that of NILECJ.12 OJJDP’s late arrival to LEAA might be accounted
for by sensitivities around Federal involvement in issues regarding youths – something
that may have seemed deeply intrusive in a context of States’ rights and local author-
ity over such matters. Whatever the reason for its timing, OJJDP’s formation followed
a wave of research already in motion in the 1970s around issues of juvenile delin-
quency. A strong argument can be made for delinquency prevention research as one
of the brighter spots in early justice program evaluation. Good evidence of this is
found in the OJJDP report published in 1980 recounting the results of ten formative
experiments in delinquency prevention conducted (largely without Federal assistance)
during the 1950s and 60s (Berleman, 1980).
Most of the studies in the 1980 OJJDP report were years in implementation and
study: one, the Cambridge-Somerville Youth Study, ran 8 years. (It was also the ear-
liest, drawing subjects in 1937.) Each of the studies employed a rigorous evaluation
design. The report describes the ten as using the “classic experimental design” (with,
admittedly, some variations). Half of the ten evaluations, however, used designs we
would characterize as quasi-experimental. (No use of this term can be found in the
1980 publication.) The other five studies employed some form of random assignment
of subjects.
The studies evaluated 10 different prevention interventions that were developed
and implemented independently. Perhaps by way of acknowledging the important
role to be played by the new OJJDP, the report author notes that the programs may
have suffered for this lack of coordination, though the rigorous designs (and even the
unrelatedness) of the studies provided credible evidence of program ineffectiveness:
A glance at the “Background” section of each experiment reviewed makes clear that a
coordinated strategy to implement and evaluate delinquency prevention services has
never taken place. Each experiment was one-time and idiosyncratic. So far as can be
determined, no person prominently involved with one experiment ever went on to
become involved with a second experiment. Efficiency may not have been served by
having each experiment conducted in isolation and often in ignorance of similar
experiments. Cumulative experience may have been frustrated in never having some of
these one-time experimenters get a second chance. On the other hand, the very insularity
of each study lends a certain credibility to the cumulative findings of service
ineffectiveness (p. 112).

Remarkably, Berleman provides what could serve as the eulogy marking the demise
of experimental evaluation of delinquency prevention programs (at least for the time
being). Though he lauds the studies for aspiring to the classic experimental method and
clearly understands the advantages for causal attribution of outcomes, he states
firmly that:
The classic experimental design is not the only way to evaluate treatment effectiveness in
a rigorous fashion. In recent years, it has fallen into disuse … More current research has
been strongly influenced by social learning theory and the application of behavioral
modification techniques (p. 6).

12 The Act, signed by President Ford in early FY1975, included provisions for a National Institute of Juvenile Justice,
housed within the OJJDP (https://www.gpo.gov/fdsys/pkg/STATUTE-88/pdf/STATUTE-88-Pg1109.pdf). This provision
was omitted when the Act was reauthorized several years later (http://legcounsel.house.gov/comps/juvenile.pdf).

Berleman goes on to describe this evaluation approach in gauzy terms that may
have militated against subsequent efforts to increase the rigor of program
evaluations:
Typically, the application of behavioral techniques calls initially for a close monitoring of a
selected individual’s behavior in order to establish the frequency of that individual’s
antisocial behavior within a given time frame. This before-treatment frequency count
serves as the individual’s antisocial behavioral baseline against which subsequent
behavioral counts are measured. Should the frequency of antisocial behavior lessen
significantly during and/or at the close of treatment when compared with the before-
treatment frequency count, then it is assumed that the treatment is effective. In essence,
then, these projects relied upon a single-subject, before-after evaluation model. A fall in
the subject’s antisocial rates during and shortly after treatment is taken as an indicator of
treatment effectiveness. Each subject serves as his own control, and the question of
whether treatment was better than no treatment is answered by reference to the before-
after measures (p. 6–7, emphasis added).
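The before-after logic in this passage can be illustrated with a minimal sketch; all names and numbers below are hypothetical, assumed only for illustration, and are not drawn from any of the studies Berleman reviewed:

```python
# Minimal sketch of the single-subject, before-after model described above:
# each subject serves as his or her own control, and "effectiveness" is
# inferred from a drop in the frequency of antisocial behavior after treatment.

def rate_per_month(incident_count: int, months_observed: float) -> float:
    """Frequency of recorded antisocial incidents over an observation window."""
    return incident_count / months_observed

def before_after_effective(baseline_count: int, baseline_months: float,
                           treatment_count: int, treatment_months: float) -> bool:
    """Return True if the during/after-treatment rate falls below the baseline rate.

    The design's weakness is visible here: with no untreated comparison group,
    a decline may reflect maturation, regression to the mean, or other changes
    rather than the treatment itself.
    """
    return (rate_per_month(treatment_count, treatment_months)
            < rate_per_month(baseline_count, baseline_months))

# Hypothetical subject: 9 incidents in a 6-month baseline, 4 incidents during a
# 6-month treatment period; the model would label the treatment "effective."
print(before_after_effective(baseline_count=9, baseline_months=6,
                             treatment_count=4, treatment_months=6))
```

The sketch makes plain why Berleman contrasts this model with the classic experimental design: the judgment rests entirely on the subject’s own before-after comparison, with no randomized, untreated group against which to rule out competing explanations.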

But for the delayed establishment of OJJDP, the agency might have figured more
significantly in helping LEAA and NILECJ set the early course to support rigorous pro-
gram evaluation. Certainly, many of these rigorous experiments on delinquency pre-
vention were extant at the time NILECJ began its work. The path across the 1960s and
70s to provide rigorous evidence for what works might have been favorably altered
had a report such as OJJDP’s 1980 delinquency prevention report been produced 10
or 15 years earlier, injecting compelling evidence into the Federal discourse regarding
the feasibility of rigorous evaluation designs.

The 1980s: A “Golden Age” of criminal justice experiments?


Despite the uncertain commitment to strong evaluation designs that emerged from
the 1970s, the years to follow would provide some new enthusiasm for rigorous pro-
gram evaluations in criminal justice. NIJ’s13 work during the 1980s to build rigorous
evidence for what works is recounted by Farrington (2003). He describes a “meager
feast” of experiments to evaluate criminal justice programs and practices.14 This
includes key experiments like the landmark Minneapolis Domestic Violence
Experiment (to be followed by several replication experiments in other jurisdictions),
randomized controlled trials involving the police (several undertaken independently
by the Police Foundation), and a key study on corrections conducted by RAND.15

13 Through the Justice System Improvement Act of 1979, the agency that had been NILECJ was recreated as the
National Institute of Justice. By 1984, LEAA would be replaced by the Office of Justice Assistance, Research, and
Statistics, later the Office of Justice Programs (Tonry, 1997). Tonry’s essay provides a fascinating review of the policy
context of the US Department of Justice for the early work of LEAA, NILECJ/NIJ, and BJS.
14 Farrington gives much of the credit for these experiments to then NIJ Director James K. “Chips” Stewart. Director
Stewart also found a way to undertake a major community-level study on the nature of crime and offending, the
Chicago Project on Human Development. During a time when funding was limited, the agency still made important
contributions through both applied and basic studies on crime and delinquency. However, the need to support a
balanced research portfolio with limited funding made it difficult to build a substantial body of findings in any
single program area.
15 The June 2003 issue of Evaluation Review in which Farrington’s article appeared provides additional glimpses into
criminal justice evaluation during the 1980s and 90s. By and large, the papers echo Farrington’s characterization of
the “meager feast” of experiments, despite a clear understanding of the need for rigor in program evaluation and of
the evaluation designs that would provide strong evidence.

Farrington and others figured prominently during the 1980s in calling for greater
attention to the role of experiments to evaluate the effectiveness of justice interven-
tions and programs. His 1986 volume, with Lloyd Ohlin and James Q. Wilson, articu-
lated the need for high-quality experiments if causal hypotheses were to be
adequately tested (Farrington, Ohlin, & Wilson, 1986).
This call for increased rigor in criminal justice evaluation is echoed by Lempert and
Visher (1988), who recount the proceedings from a 1987 meeting on the issue of
experiments in criminal justice evaluation. In their NIJ-published paper, Lempert and
Visher outline the key considerations for designing and undertaking experiments to
test criminal justice policies and practices. These include thoughtful selection of the
research question (not all issues lend themselves to experimentation, they pointed
out); attention to legal and ethical aspects of the study; rigorous adherence to a ran-
dom assignment design throughout the project; and close collaboration between
researchers and practitioners.
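
The random assignment to which Lempert and Visher urge rigorous adherence can likewise be sketched in a few lines; the case identifiers and seed below are hypothetical, chosen only for illustration:

```python
# A minimal sketch of random assignment in a field experiment: cases are
# allocated to treatment or control by chance alone, not by practitioner
# discretion, so the two groups are comparable in expectation.
import random

def assign_cases(case_ids, seed=2018):
    """Randomly split cases into treatment and control groups of roughly equal size."""
    rng = random.Random(seed)      # a fixed seed keeps the allocation auditable
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {"treatment": shuffled[:midpoint], "control": shuffled[midpoint:]}

groups = assign_cases(f"case-{i:03d}" for i in range(1, 11))
print(groups["treatment"])
print(groups["control"])
```

The practical challenge the authors emphasize is not writing such a procedure but adhering to it for the life of the project, resisting pressure to override chance assignments for particular cases.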
Unfortunately, the sense of purpose voiced by Farrington, Lempert and Visher, and
others for greater use of experiments would be undermined by falling budgets and shift-
ing priorities. Blumstein and Petersilia (1995) reported that NIJ’s budget was essentially
flat from 1980 until 1994 (and the beginning of the Crime Act). Tonry (1997) described
the “atrophy” of key substantive areas of research during the 1980s as resources
declined, the agency’s attention wavered, and lines of research were abandoned.
NIJ would not again create a catalog of evidence like the 1978 “How Well Does It
Work” report until nearly two decades later – due in part to the slow accretion of
evaluation studies. Many of the experiments of the 1980s would have to wait a dec-
ade or more for inclusion in the pivotal “Preventing Crime: What Works, What Doesn’t,
and What’s Promising” report, published in 1997. By then, whatever earlier despair
had developed around learning what works would be replaced by a swell of growing
enthusiasm (again!) for building the definitive body of program evaluation evidence.
A key strategy to disseminate evaluation results would take form during the 1980s.
The proceedings of the First National Conference in Criminal Justice Evaluation, spon-
sored by the NILECJ, were published in 1981 (Garner & Jaycox, 1981).16 The nearly
400-page proceedings document features a selection of full-length papers that convey
the agency’s broad scope for program evaluation. Sections of the conference included
papers on policing, courts, corrections, and community program evaluations; evalu-
ation methods were covered in a separate set of papers (such as Robert Martinson’s
“Recidivism and Research Design: Limitations of Experimental Control Research”). The
keynote address by Sir Leon Radzinowicz, founding director of the Institute of
Criminology at the University of Cambridge, notes the size of the event: 34 panels,
over 150 reports and papers, and more than 1,400 participants.17
The record of the NIJ conference series during the 1980s is hazy; but annual confer-
ences from 1990 forward are well documented and show the agency’s deep and

16 The meeting occurred in February 1977, in Washington, DC. Given the lapse of several years from conference to
publication, it is surprising to find evidence of a second national conference on criminal justice evaluation the next
year, in November 1978 (though no published proceedings from that second meeting have been located).
17 The complete set of abstracts from the 1977 conference is available at https://www.ncjrs.gov/pdffiles1/Digitization/39313NCJRS.pdf.

persistent commitment to promoting and disseminating key findings on justice evalu-


ation. The annual conferences assembled a catalytic mix of researchers and practi-
tioners for several days of cutting-edge panels and presentations; featured plenaries
by the Attorney General or other Federal leaders along with leading practitioners in
policing, corrections, and other areas; and until discontinued in 2012 were a promin-
ent fixture on the national criminal justice research landscape.18

Evidence-based program lists and repositories: From the 1990s to today


The 1990s would see a resurgence of the Federal role in addressing State and local
crime problems. With the passage of the Violent Crime Control and Law Enforcement
Act of 1994, the US Department of Justice re-entered the criminal justice assistance
arena in a big way. The policy breadth of the Act and the depth of its funding sur-
passed anything since the 1960s. Significantly, the Act provided a new opportunity for
building knowledge about “what works,” and the agencies of the US Department of
Justice were ready to seize that opportunity. The work put in motion during the 1990s
would feed efforts to assemble the growing body of criminal and juvenile justice evi-
dence; these efforts would provide the final building blocks leading to today’s evi-
dence-focused programs.
The 1994 Crime Act provided important funding for new programs at the
Department of Justice’s Office of Justice Programs (OJP). More important, the leader-
ship at OJP set aside portions of these program funds to support research and evalu-
ation activities at the National Institute of Justice (NIJ) to help guide these programs.
With a dramatic infusion of transferred funds, the agency was once again in the busi-
ness of knowledge-building in a significant way.
Many of the research investments undertaken with Crime Act funding involved
ambitious evaluations of national programs, like the hiring programs run by the Office
of Community Oriented Policing Services and the prevention programs of the Violence Against
Women Office (later the Office on Violence Against Women). Years later, reviews by
the Government Accountability Office (discussed in a later section) would point out
the shortcomings of NIJ’s evaluations of these large grant programs: few were able to
demonstrate explicit outcomes of these broad, national programs. In part, the elusive-
ness of outcomes was a reflection of the local discretion the Act provided for how
these funds could be used – an issue that had been a key element of Federal criminal
justice assistance grants from inception. The challenge this poses for evaluation had
been identified decades earlier in the IDA and Georgetown reports; and it would fig-
ure prominently in the policy recommendations of the 1997 Preventing Crime report.
1997 “Preventing crime: What works, what doesn’t, and what’s promising”
At the time of its release, Preventing Crime (Sherman et al., 1997, also known as the "NIJ 'What Works' Report") was widely acknowledged as a pivotal report on criminal justice program evaluation. It was a landmark compilation of the cumulative evidence of what works, most of it sponsored over three decades by NIJ (and its predecessor, NILECJ) and OJJDP. The timing of its release, during the heady days and unprecedented funding levels of the Crime Act, helped propel the report and the broader issues of evidence for what works to control crime into the center of the policy discourse.19

18. Proceedings from most annual conferences in the 1990s can be found on NCJRS at https://www.nij.gov/events/nij_conference/Pages/archive.aspx. Each year, NIJ and OJP covered all costs of participants' travel expenses. As these costs continued to rise amid tightening agency budgets, the conference series came to an end in 2012.
In the time since its publication, that discourse has become richer and more scien-
tifically nuanced. Nevertheless, the work is still impressive and provocative today. It
assembles a huge body of research evidence and includes a remarkable level of detail
across and within program areas. The report establishes – for the first time in the his-
tory of LEAA/OJP – a clear standard for evidence rigor by which to rate evaluations. In
its narrative, the report provides a thoughtful and far-reaching discussion of policy
implications of program evaluation for solving the riddles of crime, delinquency, and
public safety. In 1997, Preventing Crime was clearly the most robust assessment of
justice evaluation evidence to date. Even today, it represents some of the most potent
thinking available about the business of program evaluation for justice policy
and practice.
A review of any of the topical sections reveals the report’s detailed compilation of
the evidence and its rich analysis. In the section on preventing gang membership, for
example, primary evaluations tabulated in the report are scored for rigor, and brief
narratives provide substantive context. The analysis of gang prevention evaluations
takes stock of the surprising iatrogenic effect that resulted from one fairly rigorous
quasi-experiment. The narrative outlines the reasons for not dismissing the surprising
finding, and it provides a thoughtful research path to confirm or refute it. Similar breadth, combined with rich detail, characterizes each of the report's sections.
Using a five-level scale, the report rates the evidence for effective crime prevention efforts in each of seven crime prevention "settings" (e.g., community, family, the police, criminal justice). Based on these evidence ratings, the report provides a list of strategies deemed effective, ineffective, and promising in each setting.20
The report also includes a compelling chapter on the policy choices facing the
research agency charged with building and improving upon the extant evidence. This
concluding chapter of the report should be required reading even today for anyone
with responsibility for Federal programs to support and advance local crime policy
and practice. The chapter notes what was obvious by 1997: we know how to produce
rigorous evidence. The shortcoming, the authors state, is in the legislative plan for
Federal funding that emphasized local control of program funds, the insufficient com-
mitment to employ scientifically recognized standards for evidence, and the unwilling-
ness to shoulder the cost of conducting quality program evaluations.
Preventing Crime set a new standard as a broad and inclusive compilation of justice evaluation results measured against a clear and transparent scale of evidence rigor. In doing so, the report formed a foundation for subsequent evidence registries. Part of the foundation is to identify not only what is "effective" or "promising" but also what is "ineffective." As increasingly powerful on-line registries developed, a multitude of justice evidence scales and standards would be explored, leading eventually to talk of "common standards of evidence" to sort out competing scales and measures of evidence rigor. But for a while, a variety of evidence standards would be employed in criminal and juvenile justice research.

19. Preventing Crime was considered such a pivotal publication it was assigned its own distinctive web address within the National Criminal Justice Reference Service library (www.ncjrs.gov/works).
20. It is difficult, however, to judge from Preventing Crime the degree to which justice evaluations overall had become more rigorous. Though each chapter provides a narrative portrait of the rigor and depth of the available evaluation evidence, due partly to the complexity of the research data across and within the various settings, "What Works" does not provide an overall tally of the evidence.
GAO and NAS assessments of NIJ evaluations
Not long after Preventing Crime was published, NIJ’s evaluation work drew the atten-
tion of the General Accounting Office (GAO),21 which launched several audits of NIJ’s
evaluation work focusing on methodological rigor and evidence of program results. A
2003 GAO report provided a detailed assessment of a sample of 15 outcome evalua-
tions begun and completed between 1992 and 2002. The GAO auditors found that
only about two-thirds of the sampled evaluations were designed from the start with
sufficient rigor to test program outcomes; half of these encountered implementation
problems that made conclusive results impossible to achieve; and the other third of
the studies in the sample were never designed with sufficient rigor to demonstrate
outcomes. Design shortcomings included lack of control or comparison groups, low
response rates, inadequate outcome measures, and lack of baseline data. Thus, only
about a third of NIJ’s outcome evaluations, the GAO reported, were likely to provide
meaningful information about the effectiveness of the program or intervention being
evaluated. Evaluations at NIJ, GAO stated, needed greater attention (US General
Accounting Office, 2003).
Only a few years later, the NAS conducted its second comprehensive assessment of NIJ. Though broad in scope, the NAS report made specific observations criticizing
NIJ’s evaluation work. For instance, the NAS noted that NIJ had re-established a dedi-
cated evaluation unit in 2002 only to see it disappear 3 years later.22 The NAS faulted
NIJ for a lack of strategic planning and poor record-keeping regarding program evalu-
ations. Citing research by Garner and Visher (2003), the NAS report also noted that the
overall investment by NIJ in experiments remained very small: about 3% of all NIJ
research dollars.23 The NAS concluded that greater investment in rigorous evaluation
was constrained by the agency’s competing commitments to program assistance and
capacity-building (Committee on Assessing the Research Programs of the National
Institute of Justice, 2010).24

21. In 2004, the agency was renamed the Government Accountability Office.
22. An evaluation office had existed from the 1970s into the 1990s when the function was eventually absorbed within topic-specific research divisions.
23. A decade later, Telep, Garner, and Visher (2015) provided an updated report showing increased use of experiments in 2001-2013, comprising about 10% of all NIJ research funding for those years.
24. In a formal response to the NAS report, NIJ Director John Laub noted conditions imposed by Congress on NIJ funds (especially on funds appropriated specifically for capacity building). Laub also noted that "NIJ's management of both research programs and capacity-building programs may provide a context for making better decisions about both research and capacity building" and could create "synergy in which each program informs the other" (Laub, 2011).
Blueprints for violence prevention and model programs guide
The 1994 Crime Act was not the only pivotal legislation that year: it was accompanied
by the reauthorization of OJJDP. The reauthorization introduced the Title V Incentive Grants for Local Delinquency Prevention Programs (Office of Juvenile Justice and Delinquency Prevention, n.d.-a), which made it a priority within the grant program to support localities in "developing data-driven prevention plans, employing evidence-based prevention strategies" (Office of Juvenile Justice and Delinquency Prevention, 2002). As with previous justice legislation, however, it did not provide a definition for what constituted an "evidence-based" strategy. Recognizing that there was no widely accepted standard for juvenile justice professionals to use in identifying potential evidence-based programs, OJJDP funded two significant initiatives in
the late 1990s and early 2000s to begin to identify evidence-based delinquency pre-
vention programs: The Blueprints for Violence Prevention (currently the Blueprints for
Healthy Youth Development) and the OJJDP Model Programs Guide.
The Blueprints for Violence Prevention was launched in 1996 by the Center for the
Study and Prevention of Violence (CSPV), at the University of Colorado at Boulder. The
initial effort received funding from the Colorado Division of Criminal Justice, Centers
for Disease Control and Prevention, the Pennsylvania Commission on Crime and
Delinquency, and long-term funding from OJJDP. It identified youth prevention and
intervention programs that met a strict scientific standard of program effectiveness
(Center for the Study and Prevention of Violence, Institute of Behavioral Science
“Background”, n.d.-a).
The standards for evidence that would be established by the Blueprints Program
would be among the field’s most demanding. Ratings of “Promising” or “Model” pro-
grams would be based on the quality of the evaluation, the effect size of the program,
the logical and practical focus of the program, and its readiness for dissemination and
implementation elsewhere (Center for the Study and Prevention of Violence, Institute
of Behavioral Science “Background”, n.d.-b).
Programs determined highly effective by the Blueprints criteria would accrue slowly.
Today, the website lists a total of 87 “promising” and “model” programs.25 (Model pro-
grams must meet a higher standard of at least two RCTs demonstrating program
effectiveness for at least 12 months beyond the program intervention.) These rigorous
standards would provide an important benchmark for evidence in much of what OJP
would do in the years to follow. These standards, however, would remain unique to
the Blueprints program, as even the Model Programs Guide, OJJDP’s second foray into
evidence standards, would establish its own standard for evaluation evidence.
A few years later, in 2000, OJJDP began funding the development of the Model
Programs Guide (MPG) as a component of the Title V community delinquency training
curriculum. Like Blueprints, the MPG would begin by focusing exclusively on preven-
tion programs. OJJDP provided funding to begin developing protocols, standards, and
a list of evidence-based delinquency prevention programs. It developed a unique actu-
arial scoring instrument that emphasized the program’s theoretical framework, the
evaluation design, method of sample assignment, group equivalence, sample size, intervention fidelity, attrition, follow-up period, and outcomes. It used an algorithm to assess the strength of program effectiveness and assign an overall score to a study. The score then translated into three categories of programs: "Promising," "Effective," and "Exemplary." In 2001, OJJDP released the first version of the MPG in print; when MPG transitioned online in 2004, it contained 106 prevention programs (Development Services Group, 2001, 2003).

25. https://www.blueprintsprograms.org/programs, accessed November 19, 2018.
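The mechanics of such a scoring instrument can be illustrated with a brief sketch. The sketch below is not the actual MPG algorithm: the dimension names echo the description above, but the weights, rating scale, and category cutoffs are hypothetical placeholders intended only to show how dimension ratings might be combined into an overall score and translated into categories.

```python
# Illustrative sketch only: dimension names follow the MPG description above,
# but every weight, scale, and cutoff here is hypothetical, not the official instrument.

DIMENSION_WEIGHTS = {
    "theoretical_framework": 0.15, "evaluation_design": 0.20, "sample_assignment": 0.15,
    "group_equivalence": 0.10, "sample_size": 0.10, "intervention_fidelity": 0.10,
    "attrition": 0.05, "follow_up_period": 0.05, "outcomes": 0.10,
}

def overall_score(ratings):
    """Combine 0-4 ratings on each dimension into a weighted 0-100 score."""
    return round(100 * sum(weight * ratings.get(dim, 0) / 4.0
                           for dim, weight in DIMENSION_WEIGHTS.items()), 1)

def category(score):
    """Translate an overall score into program categories (hypothetical cutoffs)."""
    if score >= 80:
        return "Exemplary"
    if score >= 60:
        return "Effective"
    if score >= 40:
        return "Promising"
    return "Not classified"

# Example: a study rated 0-4 on each dimension of the instrument.
ratings = {"theoretical_framework": 4, "evaluation_design": 3, "sample_assignment": 2,
           "group_equivalence": 3, "sample_size": 3, "intervention_fidelity": 2,
           "attrition": 3, "follow_up_period": 2, "outcomes": 3}
print(overall_score(ratings), category(overall_score(ratings)))  # ~71 -> "Effective" under these made-up weights
```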
In 2005, OJJDP expanded the scope of the MPG to include not only delinquency
prevention, but also intervention and aftercare (later “reentry”) programs for youth
involved in the juvenile justice system. In 2006, the scoring instrument was revised for
future program reviews. (At the time, OJJDP decided against re-reviewing programs
that had been reviewed under the previous scoring rubric.) By 2010, there were 225 programs
in OJJDP’s Model Programs Guide.
On 2 February 2005, in the State of the Union Address, President George W. Bush announced the establishment of the Helping America's Youth Initiative (HAY) (Bush, 2005). He designated First Lady Laura Bush to lead the initiative. Under HAY, First Lady Bush convened officials and resources from across the federal government to develop and launch The Community Guide to Helping America's Youth. One of its three central components included a Community Resource Inventory and a Program Tool "that functions as a repository of research-based youth-serving interventions." The Model Programs Guide was identified as a promising, pre-existing source of information, and the HAY Program Tool repository was ultimately designed to simply pull information from the Model Programs Guide via a web service, so that the two resources shared a list of programs generated by OJJDP's MPG.26
The OJP what works repository
At about the same time, OJP was developing an initiative in response to President
George W. Bush’s Task Force for Disadvantaged Youth. In collaboration with the
Coalition for Evidence-Based Policy, OJP undertook an effort to advance evidence-
based approaches to crime and substance abuse prevention policy (Baron, 2003;
Working Group of the Federal Collaboration on What Works, 2005). The initiative’s
three primary goals included developing a “what works” repository, replicating proven
programs through existing resources, and increasing collaboration among federal
agencies around evidence-based efforts. By December 2004, the workgroup for this
initiative developed and presented the design for what was to be called the OJP What
Works Repository. There were six evidence classifications of programs: Effective;
Effective with Reservation; Promising; Inconclusive Evidence; Insufficient Evidence; and
Ineffective. There were also three classifications of dissemination capacity: Fully
Prepared for Widespread Dissemination; Fully Prepared for Limited Dissemination; Not
Ready for Widespread Dissemination.
As a pilot initiative, the OJP What Works rating framework was used to rate 70 pro-
grams, and the initiative appeared to secure funding to continue the effort. In the
end, however, the What Works Repository was never fully implemented or disseminated. Some familiar with the effort have suggested that the standard for evidence may have been so high that too few programs would be judged as evidence-based, leading to waning enthusiasm for the effort in the sponsoring agencies. Others have suggested the effort simply lost momentum amid changes in OJP leadership. Importantly, the What Works Repository would presage a feature of evidence standards that would inform important developments to come: namely its continued commitment to including ratings on programs that were not effective, a feature first introduced in Preventing Crime. (Neither the Blueprints Program nor the MPG registry included a list of programs that failed against the standards of evidence.) This feature would form a cornerstone of CrimeSolutions.gov and would figure prominently in a redesign of MPG (including the rerating of MPG programs using the new CrimeSolutions.gov standards of evidence).

26. The information contained in the Community Guide later transitioned to what is currently www.Youth.gov, which is administered by the Department of Health and Human Services. This collaboration continues today.
The evidence integration initiative and the launch of CrimeSolutions.gov's program database
Laurie O. Robinson was sworn in as Assistant Attorney General (AAG) of the Office of
Justice Programs on 9 November 2009 (US Department of Justice, 2009).27 AAG
Robinson had a clear focus on improving data-driven and evidence-based strategies to
reduce crime and restore the integrity of, and respect for, science (Robinson, 2011).
These goals led to the establishment of the agency-wide Evidence Integration
Initiative (E2I). E2I sets three goals for OJP and the field: (1) to improve the generation of evidence by evaluators and other researchers; (2) to improve the integration of evidence into policies and programs; and (3) to improve the translation of evidence into practice (Office of Justice Programs, n.d.).
A cornerstone of Robinson’s E2I effort was the development of an evidence-based
repository of programs that would address not only juvenile justice, but also criminal
justice and crime victim services. In charging her staff to develop a “clearinghouse”
devoted to what works across the breadth of juvenile and criminal justice, law
enforcement, and victim services, Robinson highlighted the importance of leveraging
previous DOJ investments to establish a comprehensive, reliable, and rigorous catalog
of effective justice-related programs.
On 22 June 2011, the Office of Justice Programs officially announced the launch of
CrimeSolutions.gov with 125 programs (Holder, 2011). CrimeSolutions.gov was
designed to serve an audience of policymakers and practitioners with a stated goal to
use rigorous research to inform them about what works in criminal justice, juvenile
justice, and crime victim services. It uses a systematic, quantitative methodology to
rate based on the available evidence to rate whether a program or practice achieves
its goals and categorizes programs into “Effective,” “Promising,” and “No Effects.”
The CrimeSolutions.gov website also provides extensive details about the scoring
protocol used to render summary ratings (depicted using green, yellow, and red
icons). Using results from up to three studies (focusing on the most rigorous evalua-
tions), the rating instrument (available on the website) scores three dimensions of a program's conceptual framework and seven aspects of evaluation design quality. It provides separate scorings for individual program outcomes and discrete measures for recording program fidelity. Each program summary rating is accompanied by a detailed narrative that explains the strengths and weaknesses of the evidence found. The website provides contact information (where available) for program developers, researchers, and training and technical assistance providers.

27. Robinson was returning to serve as AAG in OJP for the second time: she had previously served as AAG during the late 1990s.
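As a rough illustration of how a protocol of this kind can turn study-level evidence into a single color-coded rating, consider the sketch below. It is not the actual CrimeSolutions.gov instrument; the rigor scale, study-selection rule, and thresholds are hypothetical, and the real protocol scores many more elements than are shown here.

```python
# Hypothetical sketch of turning up to three study reviews into a summary rating;
# the rigor scale, thresholds, and rules are illustrative, not the CrimeSolutions.gov protocol.
from dataclasses import dataclass

@dataclass
class StudyReview:
    rigor: float         # assumed 0-1 design-quality score from a scoring instrument
    outcome_effect: int  # simplified: +1 favorable, 0 null, -1 harmful

ICONS = {"Effective": "green", "Promising": "yellow", "No Effects": "red"}

def summary_rating(reviews, max_studies=3):
    """Use up to the three most rigorous studies to assign a summary rating."""
    selected = sorted(reviews, key=lambda r: r.rigor, reverse=True)[:max_studies]
    credible = [r for r in selected if r.rigor >= 0.5]   # hypothetical rigor floor
    if not credible:
        return "Inconclusive Evidence"                   # screened, but no rating possible
    if all(r.outcome_effect <= 0 for r in credible):
        return "No Effects"                              # null or harmful effects only
    if any(r.outcome_effect > 0 and r.rigor >= 0.8 for r in credible):
        return "Effective"
    return "Promising"

reviews = [StudyReview(rigor=0.85, outcome_effect=1),
           StudyReview(rigor=0.60, outcome_effect=1),
           StudyReview(rigor=0.40, outcome_effect=0)]
rating = summary_rating(reviews)
print(rating, ICONS.get(rating, "no icon"))              # Effective green
```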
More than three-fourths of CrimeSolutions.gov's current 563 programs are rated as
either Effective (17%) or Promising (58%).28 Programs with “strong evidence indicating
that they had no effects or had harmful effects” comprise the No Effects category.29 In
addition, CrimeSolutions.gov provides an inventory of all programs that were screened
for rating but were found to have “Inconclusive Evidence,” insufficient to reach any
rating at all. This list of “Inconclusive Evidence” programs goes a step beyond the
inclusion of No Effects programs and is (as far as we know) unique among evidence
clearinghouses (National Institute of Justice, n.d.).
The world of online evidence established by programs like Blueprints, MPG, and
CrimeSolutions.gov has introduced both opportunities and responsibilities that the
earlier efforts could not anticipate. For instance, CrimeSolutions.gov provides a facility
through which visitors to the site can submit a program nomination or appeal a rat-
ing.30 While the ability to appeal a rating is not limited to program developers or eval-
uators, most of the few appeals to date have come from within one of these
communities. It also provides for a program “re-review” if a new additional evaluation
study of the program is submitted.
Undoubtedly, the online format and transparent documentation of
CrimeSolutions.gov and MPG made the programs’ contrasting rating strategies readily
apparent; inevitably, consolidating the programs around a single standard of evidence became unavoidable. In 2011, OJJDP began aligning MPG's program ratings with the CrimeSolutions.gov evidence standards.
The alignment of model programs guide with CrimeSolutions.gov
CrimeSolutions.gov’s scoring methodology and instrument were based on those devel-
oped for MPG. Once CrimeSolutions.gov was launched, OJJDP started discussions
regarding the future of MPG: the Office of Justice Programs now had two different
websites with similar (but not identical) ratings of programs, but different lists and dif-
ferent databases. Eventually, OJJDP decided to redesign and re-launch Model
Programs Guide with a program database that was completely aligned with
CrimeSolutions.gov. OJJDP would maintain the familiar name of MPG and its add-
itional resources.31 MPG also later expanded to include an additional focus on imple-
mentation resources (discussed in a later section).

28. As of November 15, 2018. www.crimesolutions.gov
29. The "No Effects" rating in CrimeSolutions.gov includes programs with null or negative results; each program narrative provides information describing the specific findings and results.
30. See https://www.crimesolutions.gov/about_evidencerating.aspx
31. OJJDP Meeting Notes on MPG: Strategy and Next Steps, December 11, 2011.
All MPG programs that would align within CrimeSolutions.gov’s rubric were re-
reviewed, and new profiles and ratings were entered into the CrimeSolutions.gov
database. In most cases, the re-review resulted in a similar rating; however, for 20
programs, the re-review resulted in a new classification under CrimeSolutions.gov as "No Effects." For 66 programs, the re-review resulted in no classification because there was "insufficient" evidence. In addition, 21 programs were screened out
prior to classification because they no longer met the scope or minimum evidence cri-
teria established for review under CrimeSolutions.gov. In many of these cases, the pro-
grams were no longer operational and did not appear to have much of an impact.
However, in some cases the change in classification caused confusion among constitu-
ents who used Model Programs Guide as well as concern among program developers.
Finally, some content (on Disproportionate Minority Contact and Deinstitutionalization of Status Offenders) that had been developed under Model Programs Guide no longer fit into OJJDP's next generation of an evidence-based MPG, so OJJDP migrated this content to a training and technical assistance provider.
In November 2013, OJJDP officially re-launched its new Model Programs Guide with
a list of juvenile programs that fully aligned with CrimeSolutions.gov but maintained
the brand and features of the original effort. The two sites now draw on the same underlying database of programs, so juvenile/youth programs appear simultaneously on both.
The Model Programs Guide project still funds the review and input of the vast major-
ity of "juvenile" programs on CrimeSolutions.gov (and MPG and Youth.gov).

Incorporating OJP's "what works in reentry" evidence clearinghouse into CrimeSolutions.gov
Within approximately the same timeframe that CrimeSolutions.gov was developed, OJP's
Bureau of Justice Assistance established an online repository for research on reentry.
The evidence standards for BJA’s What Works in Reentry (WWR) focused on whether a
program or intervention was harmful or beneficial; the strength of the evidence; and
the rigor of the research design. It also incorporated the important advance of
grouping studies together into broad program areas (e.g., employment, education,
family-based programs), presaging a similar approach in CrimeSolutions.gov to include
meta-analyses (discussed in the next section). By 2015, the cost and duplication of
WWR and CrimeSolutions.gov seemed untenable. By 2017, all programs that had
appeared in WWR had been re-reviewed under the CrimeSolutions.gov
methodology.32

CrimeSolutions.gov and the expansion to practices
A unifying feature of the justice-focused evidence clearinghouses was a general focus
on single programs as the unit of analysis.33 From the beginning, OJP had envisioned
that CrimeSolutions.gov would include a schema to aggregate evidence across programs so that "practices" common to similar programs could be characterized for effectiveness.34 Even in the relatively small literature of criminal justice evaluation, systematic reviews and meta-analysis had taken firm root as a means to aggregate individual studies (Boruch, Petrosino, & Morgan, 2015; Petrosino, 2005; Petrosino & Lavenberg, 2007). Eventually, CrimeSolutions.gov developed a protocol to assemble and rate evidence drawn from meta-analyses and systematic reviews; this would be the foundation for a new "module" within CrimeSolutions.gov. In October 2013, CrimeSolutions.gov released the "Practices" module with 15 practices.35 (They currently number 77.36) CrimeSolutions.gov defines practices as "a general category of programs, strategies, or procedures that share similar characteristics with regard to the issues they address and how they address them" (National Institute of Justice, n.d.).

32. See https://www.crimesolutions.gov/faqs.aspx for more information.
33. The obvious exception is where the same program is evaluated separately in more than one setting. This, however, is generally a true replication of the same identical "name-brand" program.
Using results derived from meta-analyses and systematic reviews, practices are reviewed and rated using a process similar to the one for CrimeSolutions.gov programs, but one that employs a scoring instrument specifically designed to assess the rigor of a meta-analysis and the strength of its outcomes rather than single program evaluations.37 There are also two key differences in the way the evidence is analyzed and
presented on CrimeSolutions.gov. First, all acceptable meta-analyses are used for a
practice rating, not just a maximum of three as in the Program module. Second,
Practices provide individual ratings for each of several outcomes rather than a single
overall "effectiveness" rating.

Aggregating evidence to prevent delinquency
While CrimeSolutions.gov invested in identifying and synthesizing findings from meta-
analyses, OJJDP has also invested in a second effort focused on building a platform to
help translate the findings from meta-analyses into practice. In 2009, with support
from OJJDP and other funders, Lipsey published a meta-analysis of a massive body of
research on interventions to prevent juvenile delinquency (Lipsey, 2009). It identified four broad program characteristics that moderated effectiveness: risk level of the
juvenile; therapeutic versus control treatment approach; program type and features;
and dosage and quality of program services (Lipsey, Howell, Kelly, Chapman, & Carver,
2010). While these findings represented important contributions for juvenile justice
program planning, they also represented a change in "evidence-based program" guid-
ance. Lipsey’s work implied that figuring out what to do in a complex justice “system”
may require more than simply prescribing implementation of a single, specific, defined
evidence-based program. The findings in the meta-analysis suggest that even “model”
programs are often poorly implemented; and poorly implemented model programs
may be no more effective than generic programs. It also emphasized that there were
other key contextual factors for justice – the characteristics of the individual as well as the amount and quality of service delivered. Lipsey's approach suggested that the best way to use evidence to improve effectiveness had less to do with identifying any particular evidence-based program; instead, it was more about identifying how to facilitate matching the unique criminogenic risks of a youth to the right program type with the right quality and quantity of services. This may or may not include implementing an identifiable evidence-based program.

34. Office of Justice Programs Request for Quote 2010Q_028 Evidence Assessment of Justice Programs and Practices, 2011.
35. This was also when the program was relocated from the Office of Justice Programs to its current home in the National Institute of Justice.
36. As of November 19, 2018. www.crimesolutions.gov
37. The CrimeSolutions.gov practice scoring instrument is available at: https://www.crimesolutions.gov/about_practicereview.aspx
To facilitate the application of the aggregated findings from his comprehensive
meta-analysis within varied and unique jurisdictional settings, Lipsey produced an ana-
lytic tool – the Standardized Program Evaluation Protocol (SPEP). The SPEP used the
evidence base of Lipsey’s meta-analysis to create a scoring protocol that could be
used to “evaluate” the match between the risk levels of the youth, the services avail-
able, the quality and quantity of those services, and relevant contextual factors. SPEP
provides a quantitative assessment of the degree of fit and identifies gaps as opportunities for program improvement. In 2012, OJJDP provided funding to implement the SPEP tool in three states in order to assess the effectiveness of a wide range of delinquency programs (Liberman and Hussemann, 2016).
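The logic of such a fit-and-gap score can be sketched briefly. The components below mirror the dimensions named in the text (type of service, quantity and quality of service, and the risk level of the youth served), but the point allocations and benchmark values are hypothetical stand-ins rather than the actual SPEP weights.

```python
# Hypothetical sketch of a SPEP-style fit score; the point split and benchmarks
# are illustrative placeholders, not the official protocol values.

MAX_POINTS = {"service_type": 35, "quantity": 25, "quality": 20, "risk_match": 20}

BENCHMARKS = {
    "service_type_effect": {"therapeutic": 1.0, "skill_building": 0.8, "control": 0.2},
    "target_weeks": 25,          # assumed benchmark duration of service
    "target_contact_hours": 40,  # assumed benchmark total contact hours
}

def spep_style_score(program):
    """Score a program against the benchmarks and flag gaps as improvement targets."""
    points = {
        "service_type": MAX_POINTS["service_type"]
            * BENCHMARKS["service_type_effect"].get(program["service_type"], 0.0),
        "quantity": MAX_POINTS["quantity"]
            * min(1.0, (program["weeks"] / BENCHMARKS["target_weeks"]
                        + program["contact_hours"] / BENCHMARKS["target_contact_hours"]) / 2),
        "quality": MAX_POINTS["quality"] * program["quality_rating"],          # 0-1 delivery rating
        "risk_match": MAX_POINTS["risk_match"] * program["share_high_risk_youth"],
    }
    gaps = [name for name, value in points.items() if value < 0.7 * MAX_POINTS[name]]
    return {"total": round(sum(points.values()), 1), "gaps": gaps}

example = {"service_type": "therapeutic", "weeks": 15, "contact_hours": 30,
           "quality_rating": 0.6, "share_high_risk_youth": 0.5}
print(spep_style_score(example))  # roughly 74 points, with gaps flagged in quantity, quality, and risk_match
```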
Essentially, Lipsey’s SPEP is a program decision support tool based on evaluation
results from hundreds of delinquency interventions. Lipsey describes it as a
“framework” for “the integration of a forward-looking administrative model with evi-
dence-based programming" (Lipsey et al., 2010, p. 5). The administrative model is the work of assessing programs in operation for effectiveness and planning forward-looking changes to increase program effectiveness.
Evidence-based repositories, such as the Model Programs Guide, have also begun to recognize, through their Implementation Guides, the greater systemic and community contextual factors that affect implementation. Under OJJDP's Model Programs Guide, Development Services Group (DSG) collected information from juvenile justice practitioners and policymakers regarding
how they use information about evidence-based programs in their daily lives. While
this information collection consisted of a small number of informally structured focus
groups, DSG was able to systematically code and analyze the qualitative data into dis-
cussion themes. The first theme that emerged was the importance of funding. The
second most mentioned theme was “adaptation” (Stephenson, et al., 2014). The juven-
ile justice specialists, members of the state advisory boards, and local program imple-
menters appeared to fully appreciate the concept of fidelity but considered
adaptability a higher priority in evidence-based program decision making. As a result
of this data collection effort, MPG’s I-Guides were launched in 2016. I-Guides outline
10 steps that should be taken in the pre-implementation stage (before identifying or
implementing an evidence-based program or practice). Identifying the evidence-based
program is only one of these steps: the other nine focus on identifying target popula-
tion, program goals, key stakeholders, community needs, and funding factors, and
ensuring the program selected is aligned to these considerations (Office of Juvenile
Justice and Delinquency Prevention, n.d.-b).
Other juvenile justice initiatives have also been focused on systematic translation
methods to apply an evidence-based approach to practice with a broader orientation
than the traditional “evidence-based program model.” One example is the Juvenile
Drug Treatment Court Guidelines (Office of Juvenile Justice and Delinquency Prevention, 2016), which involved two meta-analyses, two systematic reviews, and a pol-
icy and practice scan to examine the relative effectiveness of program types and court
practices on youth substance use and juvenile offending outcomes, as well as the
mediators, moderators, and correlates of that relative effectiveness. The staff involved
in the project then developed a “protocol” that established an analytical framework to
assess the relative strength of evidence findings about certain practices, translate the
types of practices with the strongest findings into a set of guidelines, and integrate
contextual information about implementation (Jarjoura, Tyson, & Petrosino, 2016). The
teams involved in the project then created a "self-assessment tool" that provides an instrument for assessing how closely any local court's practices align with the evidence-based guidelines. OJJDP is funding an ongoing evaluation to examine the
effectiveness of this approach.
The National Mentoring Resource Center has also established a framework whose evidence-based guidance includes, but is not limited to, "model" programs.
Instead, it examines the evidence of those models in combination with practices and
resources as well as a structure of measurement guidance to establish a continuous
quality improvement approach to evidence. Research evidence is not regarded as a
singular, static set of information about a particular program but is instead examined
in a broader context (National Mentoring Resource Center Research Board, n.d.).
While the goals and methods across these initiatives differ, they represent a body
of work that broadens the focal point of federally funded evidence-based approaches to include structured tools to assess closeness of fit between a system's or community's practices and the existing evidence base, which includes but is not limited to
appropriate implementation of a model program.
Findings
In this concluding section, we describe what this historical analysis suggests about past, current, and potential future directions for understanding research evidence in criminal justice. Our review of more than 50 years in pursuit of program evidence
of effectiveness for criminal justice has followed a specific path: the arc traced by the
Federal agencies charged with the mission of leading and guiding this pursuit. These
agencies figure prominently in our conclusions about what lies ahead, though we
acknowledge that much of what will define the future lies beyond their domains.
Success in the future, as in the past, will depend on resources and partners. As much
as anything, success will depend on the vision of those who are placed in positions of
leadership in these Federal agencies and throughout the key agencies and partners
that comprise the tapestry of criminal justice.
Finding 1: “What works” is probably the wrong question. At least, it is not the
complete question
For more than 50 years, federal agencies within the US Department of Justice have
been at work to bring evidence to bear on criminal justice policy and practice. This
history is grounded in foundational legislation that formed federal agencies and their
missions to work with communities, agencies, and individuals across the country. The
belief in the ability of research and evidence to decrease crime and improve public
safety is a theme in the legislation and the evolution of federal evidence-based policy
in criminal and juvenile justice. From its genesis, the persistent research question has
been some form of “what works?”
Despite its persistence – from the 1974 “How Well Does it Work,” to the 1997
“Preventing Crime: What Works, What Doesn’t and What’s Promising,” to the present
vernacular of “what works clearinghouses” like Model Programs Guide and
CrimeSolutions.gov – the question of what works minimizes the complexity of even
the most straightforward program evaluation (Sampson, 2010; Butts and Roman,
2018). Contextual features – for whom, in what dosage, in which communities, and
under what conditions – have always formed a backdrop to the “it works or it doesn’t”
bottom line result of any evaluation. Researchers and program partners often agonize
over choices that determine context: which site, which subjects, what dosage, and so
on. Yet we often fail to note these important programmatic features once the inter-
group t test results are in.
The oversimplified goal of identifying what works in the criminal and juvenile justice
community pervades our field. The legacy of our research history is a language that
emphasizes program effect, fidelity, and replicability over context or adaptation. Even
the program databases in CrimeSolutions.gov and Model Programs Guide (and other evi-
dence repositories) lend themselves to a focus on “what works,” even though they con-
tain much richer information. Users of these databases can easily overlook detailed
information provided about a program’s evaluation and its methodological rigor, varia-
tions among outcomes, the conceptual framework of the program and how it figures
into the program’s assessment, the fidelity with which the program is implemented, and
other valuable program and contextual information. Some early critics of
CrimeSolutions.gov cautioned against the red-yellow-green rating schema precisely
because it minimized many of these important details of the program and its evaluation.
Our view of what "doesn't work" is similarly blinkered. It's likely that most CrimeSolutions.gov
users readily disregard programs rated “No Effects,” perhaps assuming these programs
contain nothing of value. It’s worth remembering that though these programs were
judged to not achieve criminal justice outcomes in a given context, they may “work”
elsewhere or with other outcomes (the inverse of “Effective” programs that sometimes
fail in replication). Another overlooked group of CrimeSolutions.gov programs provides
even more potential value: those classified as “no rating due to inconclusive evidence.”
For these, the failing was often the evaluation or its design, not necessarily the pro-
gram itself. We could learn a lot about how we use – and misuse – our evaluation
tools from a closer inspection of these studies; and there likely are effective programs
to be found and evaluated more rigorously.
Rather than framing evaluation results only in terms of “what works” (or what
doesn’t work), we might address other important questions more completely. Under
what circumstances did the program work and not work? What aspects of the pro-
gram contributed to its effectiveness or lack thereof? Knowledge is cumulative, time
and places are dynamic, and research is iterative. Pretending otherwise limits the value
of what we learn from any single study.

Finding 2: Our evaluations must provide more dynamic guidance and a broader array of tools to inform criminal justice policy and practice
Meta-analysis, especially the comprehensive work done by Lipsey and others, lifts our
attention up from a single study toward more aggregated evidence. We need to rec-
ognize, however, the manifold ways multiple evaluations might be conducted,
assessed, compared, and aggregated – especially when results vary across studies
(Boruch et al., 2015).
In her section of the paper by Farrington et al. (2018), Denise Gottfredson describes a
key tension in our approach to replication and heterogeneity among evaluation findings.
Drawing on work supported by the Society for Prevention Research, Gottfredson notes
two different objectives for repeating a study. The first is to rule out what may have been
chance findings from the first evaluation; the second is to establish generalizability of
effects across different settings, subjects, dosage levels, and other program features.
Gottfredson concludes by pointing out the incompatibility between these objectives: rul-
ing out chance findings requires exact replication, while learning about program general-
izability requires variations to the program being implemented and tested.
In a field where programs frequently evolve and implementation varies, it is neces-
sary not only to assemble and contrast program effects but also program features
(context, dosage, participants, etc.). Lipsey’s comprehensive meta-analysis of more
than 500 delinquency prevention programs – incorporated into his SPEP tool –
revealed that, on average, one broad program type (therapeutic) “worked” and
another (control) “didn’t work.” Lipsey recognized that meta-analysis often leads eval-
uators and program implementers to focus on only the few programs at the top end
of the rating scheme. This focus on what works leaves out the larger portion of evalu-
ation evidence from all the other programs that could be used to discern what it is
consistent among programs that had effects, even if the magnitude of the effect
varies. Lipsey’s SPEP approach aggregates all the evidence of effective program ele-
ments from as many studies as feasible, including the widest possible array of effective
and promising (and perhaps even not-so-promising) programs: not just those that
might be considered effective at the highest level. This work of unpacking, measuring,
and scoring the context and the features of programs linked to program effectiveness
is the first essential step beyond a simplistic decision about “what works.”
If our research knowledge is to be truly cumulative, evaluations that provide only a
works/doesn’t work determination are probably not worth doing. More of our evalua-
tions need to measure and report in greater detail about what's in the experimental "black box": what was the dosage, who were the participants, what was the context of
the intervention, what were the attributes of those who delivered it, what program
elements were included and which were missing or minimized, and so on. These are
the data we often associate with the notion of external validity: which population is
represented by the study subjects, how distinctive or comparable is the site and set-
ting to other possible sites, was implementation similar to or distinct from that of
comparable programs, and so on.
Meta-analyses and other program compilations like CrimeSolutions.gov’s Practices
module reveal the unresolved tension between manualized “name-brand” programs
that seem to demand high-fidelity replication and the range of program variations that always occur (often despite our best efforts to control or prevent them). Our review of
more than 50 years of identifying evidence-based programs in criminal justice shows
that a clear, manualized program with strong, rigorous evaluation evidence will always
have an important role in building the body of evidence. And strict replication of
these programs is needed to rule out chance findings from single studies (Farrington
et al., 2018; Pridemore, Makel, & Plucker, 2018). At the same time, we must propel the
field toward ever-more expansive program variations to find the limits of our current
interventions in order to develop the next generation of ever-more effective programs
and strategies (Lösel, 2018).
Learning how to organize the expanding, dynamic, and evolving body of evidence
to support a more nuanced discussion about using research evidence will be key to
our work of bringing evidence to bear on policy and practice. We have been shown
one way forward through the systemic approaches of SPEP; and juvenile justice
researchers are already expanding on this approach in tools like the Juvenile Drug
Treatment Court Guidelines, National Mentoring Resource Center, and the MPG
Implementation Guides. Undoubtedly, there are more tools and strategies waiting to be
discovered and developed.
Recent developments in the field of medical research suggest such tools will be needed in our own field sooner rather than later. The Institute of Medicine (IOM) consensus
report on “the learning health care system” noted the rapid growth of research data: on
any given day, there are more than 2,000 health-related research studies published, about
10 meta-analyses begun, and about 75 clinical trials started (Institute of Medicine, 2013).
How can clinical practice ever hope to keep pace?
Much of that hope rests on nascent artificial intelligence systems that will continually
incorporate new research data to inform increasingly refined “decision support tools” to
guide clinical practice. At least some of these tools blur the distinction between research
findings and administrative (in some fields, “clinical”) data. For example, the IOM report
highlights Sweden’s hip replacement registry, which was started in 1979 and adds data on
more than 17,000 hips surgeries each year. The registry contains detailed data on each
patient, surgery, hospital, surgeon, after-care provider, and up to 10 years of follow-up out-
come data on the patient’s health and quality of life. Notwithstanding valid concerns
about naïve faith in “big” administrative data (Lynch, 2018), it is worth considering what
our criminal justice equivalent to Sweden’s hip surgery registry might reveal about what is
working and not working to prevent and control crime, violence, and delinquency.
In a policy environment like criminal justice, where many key aspects are determined
locally, programs need to be relevant – valid, essentially – to the local communities that
implement them. Where local resources and local statutes form the backdrop, local deter-
mination of at least some program features must be acknowledged and encouraged for
successful program uptake and effectiveness. Even programs proven highly effective else-
where may require significant adaptation and alteration to match local needs and con-
straints. Moreover, local context and conditions are not immutable, so programs must not
be framed in static terms. Program implementation occurs in an ever-changing temporal
and geo-spatial context. In order to remain relevant and effective, programs and interven-
tions may be continually redesigned through implementation, testing, and improving – a
dynamic process of invention and re-invention.
If research evidence is to have a more prominent role in the future of crime and
justice programs and policies, its interaction with the innovation, implementation, and
redesign continually occurring in thousands of communities across the country must
be much more immediate, integrative, and persistent. CrimeSolutions.gov and Model
Programs Guide have taken formative steps in this direction: in addition to elementary
answers to “what works,” the repositories also include a much broader set of Practices,
Implementation Guides, and literature reviews that start examining the contextual fea-
tures of those program implementations. These important developments need to be
supported and extended further.

Finding 3: There is an enduring need for coordinated leadership to support, strengthen, and aggregate program evaluation, program implementation, and ongoing program assistance
More than 50 years ago, the President’s Commission provided a description of the criminal
justice system that still rings true. The Commission noted it was “not a monolithic or even
consistent system. It was not designed or built in one piece at one time.” Around a core
that was “an adaptation of English common law to America’s peculiar structure of gov-
ernment,” we have added “layer upon layer of institutions and procedures, some carefully
constructed and some improvised, some inspired by principle and some by expedience.” If
that weren’t enough disarray, “[o]ur system of justice deliberately sacrifices much in effi-
ciency and even in effectiveness in order to preserve local autonomy … Sometimes it may
seem to sacrifice too much” (President’s Commission on Law Enforcement and
Administration of Justice, 1967, p. 7).
The Commission’s report and recommendations were a cry for what some have
described as a hoped-for “national coherence”38 on the work of justice and preventing
crime. For the Commission, empirical data and evidence were to form the cornerstone
that would bring order, coherence, and effectiveness to the justice system. “Society
has relied primarily on traditional answers and has looked almost exclusively to com-
mon sense and hunch for needed changes,” the Commission wrote. Noting the aver-
age industrial investment in research (3%, according to the Commission) and the
research investment of the Department of Defense (15%), the Commission laments the
estimated 1% spent nationally at that time on justice research: “There is probably no
subject of comparable concern to which the Nation is devoting so many resources
and so much effort with so little knowledge of what it is doing." (p. 273)39 Research, the Commission averred, would be "the instrument for reform" (p. 273).

38. Stone and Travis (2013) named "national coherence" as a pivotal goal in policing – a part of the justice system particularly riven by intensely local determination and variation. Their paper was part of the "second" NIJ Executive Session on Policing. The second session – and papers like Stone and Travis's – can trace its heritage to the earlier NIJ Executive Session, which was held in the 1980s and was intellectually tied (through concepts like co-production of public safety and what would become the concept of community policing) to the work of the Commission.
39. Though the Commission's report called for "as much as $200 million" in 1967 to fund research and led to the establishment of a new Federal source of funding to support state and local criminal justice (LEAA), the funding ambitions were never realized. LEAA's total program budget in 1973 (the apex of LEAA funding) is estimated to have been not more than 10% of national 1973 expenditures on crime and justice. NILECJ's research portion, of course, was minuscule, less than one-twentieth of LEAA funds. NIJ's budget has never approached even so much as a tenth of a percent of national crime and justice expenditures. (Contact the authors for details on these budget data and estimations.)
In a system as large, diverse, and decentralized as the US justice system, there is a
persistent need for leadership on a number of core concerns. Our decentralized sys-
tem benefits greatly from national bodies, associations, and forums which connect
local practitioners and transmit valuable information, data, and evidence. These con-
nections may be best served by commonly recognized standards of evidence estab-
lished and sustained by “honest brokers” in support of the research community and in
collaboration with justice practitioners. Overall, our disjointed system benefits most
when its unconnected forays in justice experimentation, trial, error, and success are
tracked, recorded, and disseminated nation-wide. We learn fastest when we learn
together, from each other.
There is a critical need to more seamlessly blend evidence into practice, research
into policy, evaluation into implementation and program assistance. Even when the
pace of evaluation has been slow, the infusion of research into practice has lagged
even farther behind and has been incomplete, inexact, or unsuccessful. As the pace of
research builds and the speed of dissemination accelerates, new paradigms are
needed to overthrow conventional distinctions that separate the research enterprise
from the world of policy and practice. A new balance must be struck between the dis-
tance that objectivity warrants and the intimacy that collaboration demands – particu-
larly in the rapidly evolving world of justice.
National professional associations partly fill this pressing need, as do key non-
government justice organizations. In the end, however, justice work is government
work, and state and local government justice agencies (as well as their non-govern-
ment partners) deserve Federal leadership and a Federal “honest broker” attuned to
the dizzying variety and hectic pace that is our nation’s state and local justice system.
Key elements of that Federal evidence leadership already exist. Newly aligned to
reflect the dynamic dance that must occur among research, assistance, practice, and
policy, these elements, embodied in the Federal science agencies that carry on the legacy and the foundational work of the LEAA, will help to provide more effective justice
programs and policies for our nation.
Conclusion
The history of what works in juvenile and criminal justice is a story of evolution in
how we conduct rigorous evaluation and how we define the terms and standards of
evidence. The federal approach and role in this ongoing dialogue has been shaped by
legislation, federal policy, the state of justice research, relationships with local agencies
and organizations, and partnerships across public-private workgroups. The history of
the evolution of the conversation and approach provides an important context for
understanding current efforts to identify and catalog evidence-based approaches.
Compressed to a few pages, our history of building evidence might seem to be
largely about finding “what works” in justice. And it may seem marked by fits and
starts, by good beginnings that faltered, and by fortuitous events that seemed to avert
calamity by steering matters more closely to course, at least for a while. Read this
way, our history of building evidence to inform practice and policy is a bit disappoint-
ing, perhaps even frustrating, for the opportunities missed, the time lost, and the
forward motion that seemed to be interrupted again and again with the answer still
beyond our reach.
However, viewed over more than 50 years of evolving knowledge about context
and implementation, one can see all the countervailing forces and competing prior-
ities not as impediments to progress, but instead as a call to continuous growth and
improvement. From this point of view, one can more see 50 years of evidence-build-
ing as an ongoing (if not consistent) effort, marked by resilience and persistence even
through times of turbulence and falling resources.
Writing in 1967, the Commission noted a 1927 study that concluded that science
was needed in the justice world to “separate the known from the unknown, to divorce
fact from assumption, and to strip biases of every sort of their authority.” The
Commission resignedly acknowledged that not much had changed in the 40 years
since that study. Our review of the 50 years since the Commission confirms just how
difficult this pursuit continues to be.
Perhaps our quest for evidence-based justice will never entirely separate the known
and unknown, fact from assumption, or strip every possible bias from our discourse.
Social science evaluation in the real world will never eliminate all potential threats to
internal or external validity, and application of those findings to an ever evolving set
of circumstances will never be quick enough to be perfectly situated.
A more realistic (and achievable) set of goals might be to maximize the known, dis-
tinguish as clearly as possible what remains unknown, draw clearer boundaries
between fact and assumption, and reduce biases to the greatest extent possible.
Doing this, while remaining attentive to the varied and ever-shifting context in which
programs are implemented and tested, will go a long way toward providing justice
practitioners with the highest confidence possible in our policies, practices, and pro-
gram decisions.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes on contributors
Thomas E. Feucht, PhD, is Executive Senior Science Advisor at the National Institute of Justice
(NIJ), U.S. Department of Justice. He has served as NIJ's Deputy Director for Research and
Evaluation, program director for NIJ’s CrimeSolutions.gov, and science advisor at the National
Institute of Corrections, Federal Bureau of Prisons. He has conducted and published research on
policing and terrorism; substance abuse; intravenous drug use, HIV, and prostitution; prison
drug use; research for policy and practice; and school violence. Before joining NIJ in 1994, Dr.
Feucht was Associate Professor in Sociology and in the College of Urban Affairs at Cleveland
State University. Dr. Feucht received his doctorate in sociology from the University of North
Carolina-Chapel Hill with an emphasis on quantitative research methods and statistics.
Jennifer Tyson is a senior social science analyst in the National Institute of Justice, Office of
Research and Evaluation. In this position, she supports the development, management, and
operation of NIJ’s research and evaluation agenda around diverse juvenile justice system,
juvenile delinquency, and child victimization topics. She previously served as the research
coordinator for the Office of Juvenile Justice and Delinquency Prevention (OJJDP) and focused on
translational research efforts in mentoring, juvenile drug treatment courts, and delinquency
prevention initiatives. She also completed a temporary assignment with the Office of Justice
Programs’ Evidence Integration Initiative (E2I), an initiative that launched CrimeSolutions.gov
and aimed to better integrate research and evidence both within OJP and in the field. Prior to
joining OJJDP, she served as a coordinator for a national training and technical assistance
project at American University and as a program coordinator for a community-based crime
prevention and public safety effort in the Office of the Attorney General, Commonwealth of
Massachusetts. Jennifer holds a B.A. in Philosophy and Psychology from Boston University and
an M.A. in Child Development and Urban Policy and Planning from Tufts University.

Acknowledgements
The authors wish to thank NIJ Director David Muhlhausen for his support of this essay. We also
wish to thank Marcia Cohen, Brecht Donoghue, Scott Hertzberg, Tammy Holt, Amy Leffler, Lee
Mockensturm, Angela Moore, David Muhlhausen, Rianna Starheim, Rachel Stephenson, and
Phelan Wyrick for providing valuable comments on earlier drafts. Scott Hertzberg and James
Fort were instrumental in filling in key gaps in our knowledge by surfacing several historical
documents that were known to exist but were particularly difficult to find.

References
A Bill to Establish the National Institute of Justice, S.3612, 92d Cong., 2nd Sess. (1972). Available from: ProQuest® Congressional; Accessed: 11/20/2018
Allen, R. E. (1973). Considering the NIJ concept. American Bar Association Journal, 59, 1. https://heinonline.org/HOL/Page?collection=abajournals&handle=hein.journals/abaj59&id=170&men_tab=srchresults
Baron, J. (2003). Bringing evidence-based progress to crime and substance-abuse policy: A recommended Federal strategy. Washington, DC: The Council for Excellence in Government. http://coalition4evidence.org/wp-content/uploads/2012/12/BringingEvidenceDrivenProgresstoCrimeSubstanceAbuse.pdf
Beckett, K., & Sasson, T. (2008). The politics of crime. In R. Crutchfield, C. E. Kubrin, & G. S.
Bridges (Eds.), Crime: Readings. Los Angeles: Sage.
Berleman, W. C. (1980). Juvenile delinquency prevention experiments: A review and analysis. U.S.
Department of Justice, Law Enforcement Assistance Administration, Office of Juvenile Justice
and Delinquency Prevention. https://www.ncjrs.gov/pdffiles1/Digitization/71111NCJRS.pdf
Blumstein, A. (1968). A national program of research, development, test, and evaluation on law
enforcement and criminal justice. Arlington, VA: Institute for Defense Analyses.
Blumstein, A. (2013). Linking evidence and criminal justice policy. Criminology & Public Policy, 12(4), 721–730. doi:10.1111/1745-9133.12040. http://eds.b.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=3&sid=9de1398e-9615-476a-9fb2-725e577195f6%40pdc-v-sessmgr06
Blumstein, A. (2018). Science and technology and the President’s Crime Commission.
Criminology & Public Policy, 17(2), 271–282. doi:10.1111/1745-9133.12360
Blumstein, A., & Petersilia, J. (1995). Investing in criminal justice research. In J. Q. Wilson (Ed.), Crime (pp. 465–467). San Francisco: ICS Press.
Boruch, R. F., Petrosino, A., & Morgan, C. (2015). Meta-analyses, systematic reviews, and research syntheses. In Newcomer, K., Hatry, H., & Wholey, J. (Eds.), Handbook of practical program evaluation (pp. 673–698). Hoboken, NJ: Wiley. https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781119171386
Bush, G. W. (2005). President George W. Bush, State of the Union Address, February 2, 2005.
Retrieved from https://georgewbush-whitehouse.archives.gov/stateoftheunion/2005/index.
html
Butts, J. A., & Roman, J. K. (2018). Good questions: Building evaluation evidence in a competitive
policy environment. Justice Evaluation Journal, 1(1), 1–17. doi:10.1080/24751979.2018.1478237.
Center for the Study and Prevention of Violence. (n.d.-a). Background. Retrieved November 20, 2018, from https://cspv.colorado.edu/blueprints/background.htm
Center for the Study and Prevention of Violence. (n.d.-b). Blueprints for healthy youth development. Retrieved November 20, 2018, from http://www.blueprintsprograms.com/
Committee on Assessing the Research Programs of the National Institute of Justice. (2010). Strengthening the National Institute of Justice. Washington, DC: National Research Council. https://www.nap.edu/catalog/12929/strengthening-the-national-institute-of-justice
Committee on Research on Law Enforcement and Criminal Justice. (1977). Understanding crime:
An evaluation of the national institute of law enforcement and criminal justice. Washington, DC:
National Academy of Sciences. https://www.nap.edu/catalog/13536/understanding-crime-an-
evaluation-of-the-national-institute-of-law
Cullen, F. T. (2013). Rehabilitation: Beyond nothing works. Crime and Justice, 42(1), 299–376. doi:
10.1086/670395
Cullen, F. T., & Gendreau, P. (2001). From nothing works to what works: Changing professional
ideology in the 21st century. Prison Journal, 81(3), 313. doi:10.1177/0032885501081003002
Development Services Group. (2001). Title V promising and effective programs guide. Bethesda,
MD: Development Services Group, Inc.
Development Services Group. (2003). Title V training and technical assistance program for state and local governments. Bethesda, MD: Development Services Group, Inc.
Drake, E. K., Aos, S., & Miller, M. G. (2009). Evidence-based public policy options to reduce crime and criminal justice costs: Implications in Washington State. Victims & Offenders, 4(2), 170–196. doi:10.1080/15564880802612615. http://search.ebscohost.com/login.aspx?direct=true&db=i3h&AN=36449212&site=ehost-live
Early, B. H. (1971). National institute of justice – A proposal. W. Va. L. Rev., 74, 226.
Elliott, J. F., Sardino, T. J. (1970). Experimental evaluation of the crime control team organization
concept. Police, May/June, 44–53.
Farrington, D. P. (2003). A short history of randomized experiments in criminology. A meager
feast. Evaluation Review, 27(3), 218–227.
Farrington, D. P., Lösel, F., Boruch, R. F., Gottfredson, D. C., Mazerolle, L., Sherman, L. W., & Weisburd, D. (2018). Advancing knowledge about replication in criminology. Journal of Experimental Criminology, 1–24. https://doi.org/10.1007/s11292-018-9337-3
Farrington, D. P., Ohlin, L. E., & Wilson, J. Q. (1986). Understanding and controlling crime: Toward a new research strategy. New York: Springer-Verlag.
Garner, J. H., & Jaycox, V. (Eds.). (1981). The first national conference on criminal justice evaluation: Selected papers. Washington, DC: National Institute of Justice. https://www.ncjrs.gov/pdffiles1/Digitization/82918-82926NCJRS.pdf
Garner, J. H. and Visher, C. A. (2003). The production of criminological experiments. Evaluation
Review, 27(3), 316–335. doi:10.1177/0193841X03027003006
Holder, E. (2011). Attorney General Eric Holder Speaks at the 2011 National Institute of Justice
Conference. Retrieved from https://www.justice.gov/opa/speech/attorney-general-eric-holder-
speaksat-2011-national-institute-justice-conference
Institute of Criminal Law and Procedure. (1971). Study and evaluation of projects and programs
funded under the Law Enforcement Assistance Act of 1965. Retrieved from https://www.ncjrs.
gov/pdffiles1/Digitization/64601NCJRS.pdf
Jacoby, J. E., Severance, T. A., & Bruce, A. S. (Eds.). (2004). Classics of criminology. Prospect
Heights, IL: Waveland Press.
Jarjoura, R., Tyson, J., & Petrosino, A. (2016). Juvenile drug treatment court guidelines: Research evidence and practice synthesis and translation protocol. Retrieved from https://www.ojjdp.gov/JDTC/protocol.pdf
Johnson, L. B. (1966). Special message to the Congress on crime and law enforcement, March 9, 1966. The Public Papers of the Presidents: Hoover to Obama. Retrieved from http://www.presidency.ucsb.edu/ws/?pid=27478
Justice System Improvement Act of 1979, Pub. L. No. 96-157 § 241 (1979). https://www.gpo.gov/
fdsys/pkg/STATUTE-93/pdf/STATUTE-93-Pg1167.pdf
Juvenile Justice and Delinquency Prevention Act of 1974, Pub. L. No. 93-415 (1974). https://
www.gpo.gov/fdsys/pkg/STATUTE-88/pdf/STATUTE-88-Pg1109.pdf
Laub, J. H. (2011). National Institute of Justice response to the report of the National Research
Council: Strengthening the National Institute of Justice. Washington, DC: National Institute of
Justice. https://www.ncjrs.gov/pdffiles1/nij/234630.pdf
Law Enforcement Assistance Act, Pub. L. No. 89-197 (1965). https://www.ncjrs.gov/pdffiles1/
Digitization/134199NCJRS.pdf
Law Enforcement Assistance Administration. (1969). LEAA first annual report. Washington, DC:
Law Enforcement Assistance Administration. https://www.ncjrs.gov/pdffiles1/nij/2157.pdf
Law Enforcement Assistance Administration: A symposium on its operation and impact: Conclusion. (1973). Columbia Human Rights Law Review, 5, 207–214. https://heinonline.org/HOL/Page?handle=hein.journals/colhr5&id=213&collection=journals
Law Enforcement Assistance Administration. (1974). The report of the LEAA evaluation policy task
force. Washington, DC: Law Enforcement Assistance Administration. https://www.ncjrs.gov/
pdffiles1/Digitization/146855NCJRS.pdf
Lempert, R. O., & Visher, C. (1988). Randomized field experiments in criminal justice agencies.
Washington, DC: National Institute of Justice. https://www.ncjrs.gov/pdffiles1/Digitization/
113666NCJRS.pdf
Liberman, A., & Hussemann, J. (2016). Implementing the SPEP™: Lessons from demonstration sites in OJJDP’s Juvenile Justice Reform and Reinvestment Initiative. Washington, DC: Urban Institute. https://www.ncjrs.gov/pdffiles1/ojjdp/grants/250482.pdf
Lipsey, M. W. (2009). The primary factors that characterize effective interventions with juvenile offenders: A meta-analytic overview. Victims and Offenders, 4(2), 124–147. doi:10.1080/15564880802612573. http://www.episcenter.psu.edu/sites/default/files/community/Lipsey_Effective%20interventions%20-%202009.pdf
Lipsey, M. W., Howell, J. C., Kelly, M. R., Chapman, G., & Carver, D. (2010). Improving the effectiveness
of juvenile justice programs. Washington, DC: Center for Juvenile Justice Reform at Georgetown
University. http://www.njjn.org/uploads/digital-library/CJJR_Lipsey_Improving-Effectiveness-
of-Juvenile-Justice_2010.pdf
Lipton, D., Martinson, R., & Wilks, J. (1975). Effectiveness of correctional treatment: A survey of
treatment evaluation studies. Westport, CT: Praeger.
Lösel, F. (2018). Evidence comes by replication, but needs differentiation: The reproducibility issue in science and its relevance for criminology. Journal of Experimental Criminology, 14(3), 257–278. doi:10.1007/s11292-017-9297-z
Lum, C., Koper, C., & Telep, C. (2011). The evidence-based policing matrix. Journal of Experimental Criminology, 7(1), 3–26. doi:10.1007/s11292-010-9108-2. http://eds.b.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=11&sid=9de1398e-9615-476a-9fb2-725e577195f6%40pdc-v-sessmgr06
Lynch, J. (2018). Not even our own facts: Criminology in the era of big data. Criminology, 56(3),
437–454. doi:10.1111/1745-9125.12182
Martinson, R. (1974). What works? Questions and answers about prison reform. The Public Interest, 35, 22–54. Retrieved from https://search.proquest.com/docview/60863685?accountid=26333
McGinnis, J. M., Stuckhardt, L., Saunders, R., & Smith, M. (2013). Best care at lower cost: The path to continuously learning health care in America. Washington, DC: National Academies Press. http://www.nationalacademies.org/hmd/Reports/2012/Best-Care-at-Lower-Cost-The-Path-to-Continuously-Learning-Health-Care-in-America.aspx
Nagin, D. S., & Weisburd, D. (2013). Evidence and public policy: The example of evaluation research in policing [comments]. Criminology and Public Policy, 12(4), 651–680. http://eds.b.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=13&sid=9de1398e-9615-476a-9fb2-725e577195f6%40pdc-v-sessmgr06
National Institute of Justice. (n.d.). About CrimeSolutions.gov. Retrieved November 20, 2018, from http://www.crimesolutions.gov/About.aspx
National Institute of Law Enforcement and Criminal Justice. (1974). Directory of grants, contracts,
and interagency agreements 1969–1974. Washington, DC: U.S. Government Printing Office.
https://www.ncjrs.gov/pdffiles1/Digitization/19975NCJRS.pdf
National Institute of Law Enforcement and Criminal Justice. (1978). How well does it work?
Review of criminal justice evaluation 1978. Washington, DC: Law Enforcement Assistance
Administration, U.S. Department of Justice. https://www.ncjrs.gov/pdffiles1/Digitization/
64112NCJRS.pdf
National Mentoring Resource Center Research Board. (n.d.). National Mentoring Resource Center.
Retrieved November 20, 2018.
Office of Justice Programs. (n.d.). Evidence Integration Initiative (E2I): Translating reliable
research into policy and practice. Retrieved November 20, 2018.
Office of Juvenile Justice and Delinquency Prevention. (2002). Juvenile Justice and Delinquency
Prevention Act of 2002 as amended, Pub. L. No. 93-415 (1974). Retrieved from https://www.
ojjdp.gov/about/jjdpa2002titlev.pdf
Office of Juvenile Justice and Delinquency Prevention. (2016). Juvenile drug treatment court
guidelines. Retrieved November 20, 2018.
Office of Juvenile Justice and Delinquency Prevention. (n.d.-a). Legislation/JJDP Act. Retrieved
from https://www.ojjdp.gov/about/legislation.html
Office of Juvenile Justice and Delinquency Prevention. (n.d.-b). Model Programs Implementation
Guides. Retrieved from https://www.ojjdp.gov/mpg-iguides/
Omnibus Crime Control and Safe Streets Act of 1968, Pub. L. No. 90-351, 82 Stat. 197, 42 U.S.C. § 3789d (1968). https://www.justice.gov/crt/omnibus-crime-control-and-safe-streets-act-1968-42-usc-3789d
Petersilia, J. (2004). What works in prisoner reentry? Reviewing and questioning the evidence. Federal Probation, 68, 4. http://www.uscourts.gov/federal-probation-journal/2004/09/what-works-prisoner-reentry-reviewing-and-questioning-evidence
Petrosino, A. (2005). From Martinson to meta-analysis: Research reviews and the US offender
treatment debate. Evidence & Policy: A Journal of Research, Debate and Practice, 1(2), 149–172.
doi:10.1332/1744264053730770
Petrosino, A., & Lavenberg, J. (2007). Systematic reviews and meta-analyses: Best evidence on
what works for criminal justice decision makers. Western Criminology Review, 8, 1. http://www.
westerncriminology.org/documents/WCR/v08n1/petrosino.pdf
President’s Commission on Law Enforcement and Administration of Justice. (1967). The challenge
of crime in a free society. Washington, DC: U.S. Government Printing Office. https://www.ncjrs.
gov/pdffiles1/nij/42.pdf
Pridemore, W. A., Makel, M. C., & Plucker, J. A. (2018). Replication in criminology and the social
sciences. Annual Review of Criminology, 1, 19–38. doi:10.1146/annurev-criminol-032317-091849
Robinson, L. O. (2011). Remarks of the Honorable Laurie Robinson, Assistant Attorney General,
Office of Justice Programs, at the Office of Justice Programs Science Advisory Board inaugural
meeting. Washington, DC: Office of Justice Programs. https://ojp.gov/docs/sabrobinson.pdf
Rodriguez, N. (2018). Expanding the evidence base in criminology and criminal justice: Barriers
and opportunities to bridging research and practice. Justice Evaluation Journal, 1(1), 1–14.
doi:10.1080/24751979.2018.1477525. https://www.tandfonline.com/doi/full/10.1080/24751979.2018.
1477525
Sampson, R. (2010). Gold standard myths: Observations on the experimental turn in quantitative criminology. Journal of Quantitative Criminology, 26(4), 489–500. doi:10.1007/s10940-010-9117-3. http://eds.b.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=19&sid=9de1398e-9615-476a-9fb2-725e577195f6%40pdc-v-sessmgr06
Sherman, L. W., Gottfredson, D. C., MacKenzie, D. L., Eck, J., Reuter, P., & Bushway, S. (1997).
Preventing crime: What works, what doesn’t, what’s promising. A report to the United States
Congress. Washington, DC: Office of Justice Programs, US Department of Justice. https://www.
ncjrs.gov/works/
Stephenson, R., Cohen, M., Montagnet, C., Bobnis, A., Gies, S., & Yeide, M. (2014). Model Programs Guide Implementation Guides: Background and user perspectives on implementing evidence-based programs. Washington, DC: Office of Juvenile Justice and Delinquency Prevention, Office of Justice Programs, U.S. Department of Justice. https://www.ojjdp.gov/mpg/implementations/ImplementationGuides.pdf
Stone, C., & Travis, J. (2011). Toward a new professionalism in policing. In National Institute of Justice & Kennedy School of Government-Harvard University (Series Ed.), New perspectives on policing. Washington, DC: National Institute of Justice. https://www.ncjrs.gov/pdffiles1/nij/232359.pdf
Telep, C. W., Garner, J. H., & Visher, C. A. (2015). The production of criminological experiments revisited: The nature and extent of federal support for experimental designs, 2001–2013. Journal of Experimental Criminology, 11(4), 541–563. doi:10.1007/s11292-015-9239-6
Tonry, M. (1997). Building better policies on better knowledge. In USDOJ Office of Justice
Programs (Ed.), Symposium on the 30th anniversary of the President’s Commission on Law
Enforcement and Administration of Justice - The challenge of crime in a free society: Looking
back, looking forward (pp. 93–124). Washington, DC: Office of Justice Programs, U.S.
Department of Justice. https://www.ncjrs.gov/pdffiles1/nij/170029.pdf
U.S. Department of Justice. (2009). Attorney General Eric Holder welcomes Laurie Robinson as
Assistant Attorney General in Office of Justice Programs [Press release]. Retrieved from https://
www.justice.gov/opa/pr/attorney-general-eric-holder-welcomes-laurie-robinson-assistant-attorney-
general-office
U.S. General Accounting Office. (2003). Justice outcome evaluations: Design and implementation
of studies requires more NIJ attention. (GAO-03-1091). Washington, DC: U.S. General Accounting
Office. https://www.gao.gov/assets/240/239877.pdf
Working Group of the Federal Collaboration on What Works. (2005). The OJP What Works
Repository. Washington, DC: Office of Justice Programs, U.S. Department of Justice. https://www.
ncjrs.gov/pdffiles1/nij/220889.pdf
