Impact Evaluations 101

LEO’s preferred research modality is the impact evaluation. Why? Because we want to know whether it is truly the intervention, and not some other factor, that produces the outcomes the program aims to achieve.

A randomized controlled trial (RCT) is the gold standard in evaluation work. It has several important elements:  

  • A well-defined intervention.  
  • Measurable outcomes for the intervention.
  • An ample sample size, meaning that there are enough recipients to show that it is the intervention that causes the change.
  • Random assignment into services, so that other variables (such as changes to the environment, policy changes, and changes in the economy) affect the treatment and control groups equally, and you know that it was your intervention that caused the desired outcomes.

Most agencies don’t have enough resources to serve everyone who needs their services. So they turn people away or distribute slots for a service on an ad hoc basis. Putting rigorous randomization procedures in place allows an organization to distribute slots in a program more fairly, in a way that also allows us to learn.

Let’s say an organization has 200 slots for a particular program, but 500 people come through its doors asking for this service. Randomization requires the organization to screen everyone the same way and then use a lottery to determine which of the 500 people will receive the service (and join the treatment group) and which, because of the program’s limited capacity, will not (and join the control group).
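In practice, the lottery itself can be as simple as a seeded random shuffle. The sketch below shows one way to do it in Python; the applicant list, slot count, and seed are illustrative, not part of any particular protocol.

```python
import random

# A minimal lottery sketch: 500 screened applicants, 200 program slots.
# (The names, counts, and seed are illustrative.)
applicants = [f"applicant_{i}" for i in range(1, 501)]

rng = random.Random(42)       # a fixed seed makes the lottery reproducible and auditable
rng.shuffle(applicants)

treatment = applicants[:200]  # offered a slot (treatment group)
control = applicants[200:]    # not offered a slot (control group)

print(len(treatment), len(control))  # -> 200 300
```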

When we study both groups, we can attribute differences in outcomes between them solely to the intervention. We are able to do this knowing that, because a random lottery was used to determine who got the service, the two groups looked the same at the start.
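That baseline similarity can also be checked directly with intake data. Below is an illustrative Python sketch of a simple balance check; the fields and records are invented for the example.

```python
import statistics

# Illustrative balance check: randomized groups should "look the same" at
# the start. The fields and records below are invented; in practice there
# would be one record per screened applicant, drawn from intake data.
records = [
    {"group": "treatment", "age": 34, "months_homeless": 6},
    {"group": "control",   "age": 31, "months_homeless": 7},
    # ... remaining intake records ...
]

for field in ("age", "months_homeless"):
    treat = [r[field] for r in records if r["group"] == "treatment"]
    ctrl = [r[field] for r in records if r["group"] == "control"]
    # With real data, these group averages should be close to each other.
    print(field, statistics.mean(treat), statistics.mean(ctrl))
```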

Think about how the social service sector often measures success: by measuring outputs, or the number of people served in a year. A more sophisticated approach is to measure success based on outcomes—the results that clients have experienced because of the organization’s services. 

An RCT is distinct in that it helps an organization understand what would have happened in the absence of the organization’s intervention.

Let's illustrate with a couple of examples.

Consider a homeless shelter. An output-based evaluation of the shelter might define success as sheltering a certain number of people on a given night, looking at how many people were helped and the intervention they received (nights at a homeless shelter).

By contrast, an outcome-based evaluation of the same homeless shelter might look at the percentage of clients who moved out of the shelter and into their own housing, and maintained that housing for 12 months. The program might have a goal to help 70% of its clients transition out of the shelter and into housing that they maintain for 12 months.  

Outcome-based measurements take evaluation to the next level, yet they can still be misleading. What if one year the organization helped 80% of its clients transition to housing, and the next year only 20%? Did the program lose its effectiveness? Maybe. Or maybe not. What else was going on?

For example, what if the federal government gave the organization a lot of resources to help clients transition to housing the first year, but those resources didn’t exist at the same level in the second year? Is the program a failure? No. The decline in clients transitioning to housing was due to a resource issue. That is why impact evaluation through RCTs is critical.

An impact evaluation takes our analysis of the homeless shelter’s services a step further. Let’s say the homeless shelter has 500 beds and, this year, 125 housing vouchers for people to use to rent their own apartments. Let’s say that 300 people staying in the shelter were ready to transition into their own housing. The organization could randomly assign the 125 housing vouchers among the 300 clients. Then, we could assess what happens to both sets of clients: those who got the vouchers and those who didn’t.

If outcomes for the 175 people who didn’t get the vouchers and the 125 who did look relatively similar, the evidence would point to a lack of effectiveness for the housing voucher intervention. However, if the 125 people who got the vouchers have better outcomes than the people who didn’t, then that difference can be attributed to the voucher intervention.
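To make that comparison concrete, here is a toy Python simulation of the voucher example. The outcome measure (still housed after 12 months) matches the discussion above, but the outcome model and all probabilities are invented for illustration.

```python
import random
import statistics

# Toy simulation of the voucher comparison: 300 shelter clients, 125 vouchers.
# The outcome model and all probabilities below are invented for illustration.
rng = random.Random(0)

clients = list(range(300))
rng.shuffle(clients)
vouchers = set(clients[:125])  # vouchers assigned by lottery

def housed_after_12_months(client):
    # Invented model: vouchers raise the chance of staying housed.
    base = 0.35
    boost = 0.25 if client in vouchers else 0.0
    return rng.random() < base + boost

outcomes = {c: housed_after_12_months(c) for c in clients}
treat = [outcomes[c] for c in clients if c in vouchers]
ctrl = [outcomes[c] for c in clients if c not in vouchers]

# Because assignment was random, the gap between these two rates estimates
# the impact of the voucher itself.
print(f"voucher group:    {statistics.mean(treat):.0%} housed")
print(f"no-voucher group: {statistics.mean(ctrl):.0%} housed")
```

In a real study, the invented outcome model would be replaced by outcomes observed in the field, and the comparison would typically include a statistical test of whether the gap could have arisen by chance.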

We see the same ideas played out in a study of a community college completion program. A community college may conduct an output evaluation that looks at the number of students it is serving. By contrast, an outcome evaluation may look at the number of students who persist through their course load and complete a degree or certificate program. An RCT would look at the outcomes of persistence and completion by comparing a group of students who got an intervention, such as intensive case management, with a group that didn’t. This would allow us to understand whether the intensive case management worked: did the students who had the extra help persist and complete community college at much higher rates, thus justifying the investment in case managers in the first place?

Social service agencies have limited resources. Randomized controlled trials allow them to understand where best to put those resources to make the greatest impact toward their missions.