The US government commissioned an evaluation of its Talent Search program, designed to fight the college enrollment gap by providing advice (mostly on financial aid) to disadvantaged students. The evaluation is pretty thorough, rigorous, and straightforward about what it did and didn’t find, as we’ve come to expect with government-funded evaluations. (This is probably because the government is under pressure to actually produce something; “government” doesn’t tend to melt people’s hearts, and brains, the way “charity” seems to.)
This well-done study gives a great illustration of how misleading a poorly done study can be. What do I mean? Let’s focus on Texas (one of the three states it examined).
On page 47 of the PDF (page 23 as numbered in the study itself), you can see comparison statistics for Talent Search participants and other students in the state. Talent Search participants were far more likely to be economically disadvantaged … but … they also had much better academic records. I’d guess that’s because while money has its perks, it isn’t everything – motivation also matters a lot, and it’s easy to see why the Texans who bothered to participate in Talent Search might be more motivated than average kids.
How would you react if a charity running an afterschool program told you, “80% of our participants graduate high school, compared to 50% for the state as a whole?” Hopefully I just gave you a good reason to be suspicious. Kids who pick up an afterschool program and stick with it are already different from average kids.
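To see how powerful that effect can be all by itself, here’s a toy simulation in Python. The numbers are invented and have nothing to do with the actual study; the point is just that when motivated kids are more likely both to sign up and to graduate, participants come out way ahead of everyone else even though the “program” in this model does literally nothing.

```python
import random

random.seed(0)

def toy_simulation(n=100_000):
    """Each simulated student has a hidden 'motivation' level. Motivated
    kids are more likely to sign up AND more likely to graduate; the
    program itself has zero effect."""
    participant_grads = participant_n = 0
    other_grads = other_n = 0
    for _ in range(n):
        motivation = random.random()                # unobserved, 0..1
        signs_up = random.random() < motivation     # self-selection
        graduates = random.random() < 0.3 + 0.5 * motivation
        if signs_up:
            participant_grads += graduates
            participant_n += 1
        else:
            other_grads += graduates
            other_n += 1
    return participant_grads / participant_n, other_grads / other_n

p_rate, o_rate = toy_simulation()
print(f"participants: {p_rate:.0%}, everyone else: {o_rate:.0%}")
# Prints roughly "participants: 63%, everyone else: 47%" -- a big gap
# created entirely by who chose to sign up.
```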
But there’s more. The study authors spotted this problem, and did what they could to correct for it. Page 52 shows how they did it, and they did about as good a job as you could ask them to. They used an algorithm to match each Talent Search participant to a non-participant who was as similar as possible (what I dub the “evil twin” approach) … they ended up with a comparison group of kids who are, in all the easily measured ways, about as similar to the participants as you could hope for.
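In case it helps to see the idea in code, here’s a bare-bones sketch of that kind of matching. To be clear, this is my illustration, not the study’s actual procedure: real matching studies typically standardize variables or match on a propensity score, and the variables below are hypothetical.

```python
# The "evil twin" idea in its simplest form: for each participant, find
# the non-participant whose observed characteristics are closest
# (nearest-neighbor matching, with no standardization of the features).

def nearest_twin(participant, pool, features):
    """Return the record in `pool` closest to `participant` on the given
    numeric features, by squared distance."""
    def distance(a, b):
        return sum((a[f] - b[f]) ** 2 for f in features)
    return min(pool, key=lambda candidate: distance(participant, candidate))

# Hypothetical records; the study matched on its own list of measured
# characteristics (academic record, economic status, and so on).
features = ["gpa", "test_score", "low_income"]
participant = {"gpa": 3.1, "test_score": 78, "low_income": 1}
non_participants = [
    {"gpa": 2.0, "test_score": 60, "low_income": 1, "graduated": 0},
    {"gpa": 3.2, "test_score": 80, "low_income": 1, "graduated": 1},
    {"gpa": 3.9, "test_score": 95, "low_income": 0, "graduated": 1},
]
print(nearest_twin(participant, non_participants, features))
# Picks the second record -- the closest match on paper. Whatever isn't
# in `features` (like motivation) plays no role in the match.
```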
So, if Talent Search participants outperformed their evil twins, Talent Search must be a good thing, right? Not so fast. As page 55 shows, Talent Search participants had an 86% graduation rate, while their evil twins were only at 77%. The authors hedge a bit on this, but to me it’s very clear that you can’t credit the Talent Search program for that nine-percentage-point difference. The program is centered on financial aid and college applications, not academics; to think that it would have any significant effect on graduation rates is a huge stretch.
The fact is, that invisible “motivation” factor is a tough thing to control for entirely. Even among students with the same academic record, some can be more motivated than others, and can thus have a brighter future, with or without any charity’s help.
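You can see this in the toy simulation from earlier. Suppose we also observe a noisy academic “record” that partly reflects motivation, and we compare participants only to non-participants with the same observed record (a crude stand-in for the study’s matching). Again, every number here is invented. The gap shrinks, but it doesn’t go away, because within any given record the kids who signed up are still the more motivated ones.

```python
import random

random.seed(1)

def matched_toy_simulation(n=200_000):
    """Same made-up model as before (the program does nothing), but now
    we also observe a noisy 'record' that partly reflects motivation,
    and we compare participants only to non-participants with the same
    observed record."""
    # record -> [participant grads, participants, other grads, others]
    buckets = {}
    for _ in range(n):
        motivation = random.random()
        record = round(motivation + random.gauss(0, 0.25), 1)  # observed, noisy
        signs_up = random.random() < motivation
        graduates = random.random() < 0.3 + 0.5 * motivation
        b = buckets.setdefault(record, [0, 0, 0, 0])
        if signs_up:
            b[0] += graduates
            b[1] += 1
        else:
            b[2] += graduates
            b[3] += 1
    # Use only records where both groups exist, and weight the comparison
    # group's rate by participant counts -- a rough mimic of matching.
    usable = [b for b in buckets.values() if b[1] and b[3]]
    total = sum(b[1] for b in usable)
    p_rate = sum(b[0] for b in usable) / total
    o_rate = sum(b[2] / b[3] * b[1] for b in usable) / total
    return p_rate, o_rate

p_rate, o_rate = matched_toy_simulation()
print(f"participants: {p_rate:.0%}, matched comparison group: {o_rate:.0%}")
# The matched gap is smaller than the raw one but not zero: holding the
# observed record fixed, the kids who signed up are still more motivated.
```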
This is why Elie and I are so prickly and demanding when it comes to evidence of effectiveness. These concerns about selection bias aren’t just some academic technicality – they’re a real issue when dealing with education, where motivation is so important and so hard to measure. If you believe, as we do, that closing or even denting the achievement gap is hard, you have to demand convincing evidence of effectiveness, and studies with sloppily constructed comparison groups (or no comparison groups at all) simply don’t provide it. We want charities that understand these issues; that know how difficult it is to do what they’re doing; and that are unafraid – in fact, determined – to take a hard, unbiased look at what they’re accomplishing or failing to accomplish. It’s psychologically hard to measure yourself in a way that really might show failure, but how else are you going to get better?