Experimental Evaluation Design for Program Improvement. Laura R. Peck

      A program is not a single thing: It can vary by setting, in terms of the population it serves, by design elements, by various implementation features, and also over time. The changing nature of interventions in practice demands that evaluation also account for that complexity.1

      Within the field of program evaluation, the concept of impact variation has gained traction in recent years. The program’s average impact is one metric by which to judge the program’s worth, but that impact is likely to vary along multiple dimensions. For example, it can vary for distinct subgroups of participants. It might also vary depending on program design or implementation: Programs that offer X and Y might be more effective than those offering only X; programs where frontline staff have greater experience or where the program manager is an especially dynamic leader might be more effective than those without such staff or leadership. These observations about what makes up a program and how it is implemented have become increasingly important as potential drivers of impact.
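      To make this concrete, consider a minimal sketch using simulated data invented purely for illustration (none of these numbers come from any study discussed here) of how a single average impact can mask meaningful subgroup variation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: 1,000 participants, half randomized to treatment.
n = 1000
treated = rng.permutation(np.repeat([0, 1], n // 2))
subgroup = rng.integers(0, 2, size=n)  # e.g., 0 = offered X only, 1 = offered X and Y

# Build in a larger true impact for subgroup 1 (3.0) than for subgroup 0 (1.0).
outcome = 5 + treated * np.where(subgroup == 1, 3.0, 1.0) + rng.normal(0, 2, size=n)

def impact(mask):
    # Difference in mean outcomes between treated and control members of `mask`.
    return outcome[mask & (treated == 1)].mean() - outcome[mask & (treated == 0)].mean()

everyone = np.ones(n, dtype=bool)
print("Average impact:", round(impact(everyone), 2))
print("Impact in subgroup 0:", round(impact(subgroup == 0), 2))
print("Impact in subgroup 1:", round(impact(subgroup == 1), 2))
```

      The average impact lands between the two subgroup impacts; reporting it alone would obscure that one program configuration is roughly three times as effective as the other.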

      Accordingly, the field has expanded the way it thinks about impacts, becoming increasingly interested in impact variation. Assessments of how impacts vary—what works, for whom, and under what circumstances—are currently an important topic within the field. While the field has expanded its toolkit of analytic strategies for understanding impact variation and addressing “what works” questions, this book will focus on design options for examining impact variation.2

      1 In Peck (2015), I explicitly discuss “programmatic complexity” and “temporal complexity” as key factors that suggest specific evaluation approaches, both in design and analysis.

      2 For a useful treatment of the relevant analytic strategies—including an applied illustration using the Moving to Opportunity (MTO) demonstration—I refer the reader to Chapter 7 in New Directions for Evaluation #152 (Peck, 2016).

      The Ethics of Experimentation

      Prior research and commentary consider whether it is ethical to randomize access to government and nonprofit services (e.g., Bell & Peck, 2016). Are those who “lose the lottery” and are randomized into the control group disadvantaged in some way (and is that disadvantage actually unfair or unethical)? Randomizing who gets served is just one way to ration access to a funding-constrained program. I argue that giving all deserving applicants an equal chance through a lottery is the fairest, most ethical way to proceed when not all can be served. I assert that it is unfair and unethical to hand-pick applicants to serve because that selection can involve prejudices that result in unequal treatment of individuals along lines of race, ethnicity, nationality, age, sex, or sexual orientation. Even a first-come, first-served process can advantage some groups of individuals over others. Random assignment such as a lottery can ensure that no insidious biases enter the equation of who is served.
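      The lottery logic described here is simple enough to sketch in a few lines; the applicant list, slot count, and function name below are all hypothetical:

```python
import random

def lottery(applicants, slots, seed=None):
    # Every eligible applicant gets an equal chance at the available slots.
    rng = random.Random(seed)
    shuffled = list(applicants)
    rng.shuffle(shuffled)
    return shuffled[:slots], shuffled[slots:]  # (served, control)

# Hypothetical example: 120 eligible applicants but funding for only 80.
served, control = lottery([f"applicant_{i}" for i in range(120)], slots=80, seed=7)
print(len(served), "served;", len(control), "randomized out")
```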

      Furthermore, program staff can be wonderfully creative in blending local procedures with randomization in order to ensure that they are serving their target populations while preserving the experiment’s integrity. For example, the U.S. Department of Health and Human Services’ Family and Youth Services Bureau (FYSB) is operating an evaluation of a homeless youth program called the Transitional Living Program (Walker, Copson, de Sousa, McCall, & Santucci, 2019; U.S. Department of Health and Human Services [DHHS], n.d.a). The evaluation worked with program staff to help them use their existing needs-assessment tools to prioritize youth for the program in conjunction with a randomization process that considers those preferences: It is a win-win arrangement. Related scholarship has established procedures for embedding preferences within randomization (Olsen, Bell, & Nichols, 2017), ensuring the technical soundness of the approach while mitigating programs’ ethical concerns.
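      The procedures in Olsen, Bell, and Nichols (2017) are more general than what fits here, but a simplified sketch of one way to embed staff priorities in a lottery (serve the highest-need tiers outright and randomize the tier in which slots run out) might look like the following; all names and numbers are hypothetical, and note that the experimental contrast is identified only within the randomized tier:

```python
import random
from collections import defaultdict

def stratified_lottery(applicants, need_score, slots, seed=None):
    # Serve higher-need tiers first; the tier in which slots run out is
    # randomized, so staff priorities and the experiment can coexist.
    rng = random.Random(seed)
    tiers = defaultdict(list)
    for a in applicants:
        tiers[need_score[a]].append(a)

    served, control = [], []
    for score in sorted(tiers, reverse=True):  # highest assessed need first
        group = tiers[score]
        rng.shuffle(group)
        take = min(slots - len(served), len(group))
        served.extend(group[:take])
        control.extend(group[take:])
    return served, control

# Hypothetical example: needs-assessment scores of 1-3 for ten applicants.
need_score = {f"youth_{i}": 1 + i % 3 for i in range(10)}
served, control = stratified_lottery(list(need_score), need_score, slots=4, seed=7)
print("served:", served)
```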

      Even if control group members either are perceived to be or actually are disadvantaged, random assignment still might not be unethical (Blustein, 2005). For example, society benefits from accurate information about program effectiveness and, accordingly, research may be justified in allowing some citizens to be temporarily disadvantaged in order to gather information to achieve wider benefits for many (e.g., Slavin, 2013). Society regularly disadvantages individuals based on government policy decisions undertaken for nonresearch reasons. An example that disadvantages some people daily is that of high-occupancy vehicle (HOV) lanes: They disadvantage solo commuters to the benefit of carpoolers. Unlike an evaluation’s control group exclusions, those policy decisions (such as establishing HOV lanes) are permanent, not temporary.

      In an example from the private sector, Meyer (2015) argues that managers who engage in A/B testing—where staff are subjected to alternative policies—without the consent of their employees operate more ethically than those who implement a policy change without evidence to support that change. Indeed, the latter seems “more likely to exploit her position of power over users or employees, to treat them as mere means to the corporation’s ends, and to deprive them of information necessary for them to make a considered judgment about what is in their best interests” (Meyer, 2015, p. 279).

      Moreover, in a world of scarce resources, I argue that it is unethical to continue to operate ineffective programs. Resources should be directed toward program improvement (or in some cases termination) when evidence suggests that a program is not generating desired impacts. From this alternative perspective, it is unethical not to use rigorous impact evaluation to provide strong evidence to guide spending decisions.

      It is worth noting that policy experiments are in widespread use, signaling that society has already judged them to be ethically acceptable. Of course, it is always essential to ensure the ethics of evaluation research, not only in terms of design but also in terms of treatment of research participants. Moreover, I acknowledge that there are instances where it is clearly unethical—in part because it may also be illegal—to randomize an individual out of a program. For example, entitlement programs in the U.S. entitle people to a benefit, and that entitlement cannot and should not be denied, even for what might be valuable research reasons. That does not imply, however, that we cannot or should not continue to learn about the effectiveness of entitlement programs. Instead, the kinds of questions that we ask about them are different from “Do they work?” That is, the focus is less on the overall, average treatment effects and more on the impact variation that arises from variation in program design or implementation. For instance, we might be interested to know what level of assistance is most effective for achieving certain goals. A recent example of this involves the U.S. Department of Agriculture’s extension of children’s food assistance into the summer. The Summer Electronic Benefits Transfer for Children (SEBTC) Demonstration, which compared stipends of $30 or $60 per month against the status quo of no summer cash/near-cash assistance, is indeed an ethical (and creative) way to ascertain whether such assistance reduces hunger among vulnerable children when school is out of session (Collins et al., 2016; Klerman, Wolf, Collins, Bell, & Briefel, 2017).
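      A sketch of the multi-arm logic such a design implies follows; the outcome model and all numbers are invented for illustration, and the real SEBTC analyses are far richer:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical three-arm assignment mirroring a benefit-level comparison:
# $0 (status quo), $30, or $60 per month. All numbers are invented.
arms = np.array([0, 30, 60])
n = 900
assignment = rng.choice(arms, size=n)  # equal-probability random assignment

# Invented outcome: a food-insecurity score that falls as the stipend rises.
food_insecurity = 10 - 0.05 * assignment + rng.normal(0, 2, size=n)

for arm in arms:
    mean = food_insecurity[assignment == arm].mean()
    print(f"${arm}/month arm, mean food-insecurity score: {mean:.2f}")
```

      Because every arm receives at least the status quo, comparing arm means answers “How much assistance works best?” without denying anyone an entitlement.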

      This leads to my final point about ethics. Much of the general concern is about randomizing individuals into a “no services” control group. But, as the remainder of this book elaborates, conceiving the control group that way is unnecessary. Increasingly, experimental evaluation designs are being used to compare alternative treatments to one another rather than compare some stand-alone treatment to nothing. As such, ethical concerns are largely assuaged. As we try to figure out whether Program A is better or worse than Program B, or whether a program should be configured this way or that way, eligible individuals get access to something. When research shows which “something” is the better option, all individuals can begin to be served through that better program option.

      What This Book Covers

      This book considers a range of experimental evaluation designs, highlighting their flexibility to accommodate a range of applied questions of interest to program managers. These questions about impact variation—what drives successful programs—have tended to be outside the purview of experimental evaluations. Historically, they have been under the purview of nonexperimental
