More on weighing the evidence on pre-K

Andrew Coulson of the Cato Institute has a blog post commenting on the debate between me and Russ Whitehurst over what evidence to believe about the effects of pre-K programs.

Coulson’s argument is that the only reliable evidence for ascertaining the effects of “large-scale” public pre-K programs is the randomized control trials for Head Start and Tennessee pre-K, and that these studies reach a consensus: “program effects fade out by the elementary school years…” In Coulson’s view, this is the “evidence that matters when discussing proposals for expanding government pre-K”.

What evidence doesn’t matter, in Coulson’s view? First, he doesn’t regard the Perry Preschool and Abecedarian randomized control trial studies as mattering, because these were “tiny programs”, and it is “difficult to massively replicate any service without compromising its quality”. Second, he doesn’t regard any non-experimental study as mattering.

The main thing that Coulson’s blog post overlooks is that not all randomized control trials provide evidence of equal quality, and not all non-experimental studies are of equal quality either. Some non-experimental studies provide evidence on the true causal effects of public policies that is superior to the evidence from some randomized control trials.

As mentioned in a previous blog post, the key problem that all pre-K studies are trying to address is “selection bias”. We are concerned that children and families participating in pre-K may differ from those who do not participate. We can control for observed characteristics of the children and families, but unobserved characteristics of children and families may differ between the treatment and comparison groups. These unobserved differences could be causing different outcomes, which would bias the estimated effects of pre-K by some amount that is potentially large and of unknown sign.

In theory, a perfectly run randomized control trial addresses this selection bias issue. Because participation is determined randomly, the treatment and comparison groups would be expected, on average, to be similar in unobserved characteristics.

But this advantage holds fully only for the original, complete sample. If attrition in the randomly chosen groups is large enough, the final sample of treatment and comparison households can easily differ greatly in unobserved characteristics. The final observed sample in that case is no longer truly randomly chosen.
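To make this concrete, here is a toy simulation I constructed for illustration. All numbers in it (the effect size, retention rates, and the strength of the unobserved characteristic) are hypothetical and are not drawn from the Tennessee study or any other study discussed here. It shows how a randomized trial recovers the true effect in the full sample, and how differential attrition can then bias the estimate even though the initial assignment was random.

```python
import random

random.seed(1)
N = 100_000
TRUE_EFFECT = 5.0  # hypothetical true effect of pre-K on a test score

# Each child has an unobserved "family resources" trait that also raises
# test scores; assignment to pre-K is a pure coin flip, so the trait is
# balanced across groups in the full sample.
sample = []
for _ in range(N):
    unobs = random.gauss(0, 1)        # unobserved family characteristic
    treated = random.random() < 0.5   # random assignment
    score = 50 + 3 * unobs + TRUE_EFFECT * treated + random.gauss(0, 5)
    sample.append((treated, unobs, score))

def diff_in_means(rows):
    """Treatment-minus-control difference in mean scores."""
    t = [s for tr, _, s in rows if tr]
    c = [s for tr, _, s in rows if not tr]
    return sum(t) / len(t) - sum(c) / len(c)

full_est = diff_in_means(sample)  # should sit close to TRUE_EFFECT

# Now model differential attrition: control-group families with fewer
# unobserved resources are likelier to drop out before follow-up testing
# (hypothetical retention rates), so the surviving control group looks
# artificially strong and the estimated effect shrinks.
retained = [(tr, u, s) for tr, u, s in sample
            if tr or random.random() < (0.9 if u > 0 else 0.5)]
attrited_est = diff_in_means(retained)

print(f"full sample estimate:     {full_est:.2f}")
print(f"after attrition estimate: {attrited_est:.2f}")
```

In this particular setup the attrition pushes the estimate downward, but with a different (equally plausible) attrition pattern it would push the estimate upward, which is why I describe the resulting bias as being of unknown sign.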

In the case of pre-K, as I pointed out in a previous blog post, the Tennessee pre-K study has some serious problems with large and differential attrition. Therefore, I don’t think it is at all appropriate to cite it as meeting some “gold standard” of providing evidence that is more reliable than any study that makes a serious effort to control for observable characteristics. The evidence from the Tennessee pre-K study is suggestive but not definitive, as is true for all studies that can only control for observable characteristics.

On the other hand, non-experimental studies can persuasively deal with selection bias if they rely on a “natural experiment”, in which access to the pre-K program varies with geography, age, or some other factor that is plausibly unrelated to unobserved child and family characteristics. Not all non-experimental studies are created equal. Some can only control for observable characteristics. Others exploit variation in access that, while not randomly assigned, may be almost as good as random for identifying the causal effects of the pre-K program. Any one such study may be an outlier, but multiple studies increase confidence that we are identifying true causal effects of pre-K on outcomes.
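A stylized sketch may help show why such variation can be almost as good as random. The simulation below is my own illustration, with entirely hypothetical numbers (effect size, age window, bandwidth); it is not modeled on any specific study cited here. It mimics an age-cutoff design of the regression-discontinuity type: children born just before an enrollment cutoff got a year of pre-K, children born just after did not, and near the cutoff the two groups should be nearly identical in unobserved characteristics.

```python
import random

random.seed(2)
TRUE_EFFECT = 4.0  # hypothetical true pre-K effect on a test score

# Birthdate relative to the cutoff determines pre-K access; the unobserved
# family characteristic is unrelated to birthdate, which is what makes the
# comparison near the cutoff credible.
data = []
for _ in range(50_000):
    days_from_cutoff = random.uniform(-180, 180)
    treated = days_from_cutoff < 0      # born early enough to enroll
    unobs = random.gauss(0, 1)          # unrelated to birthdate
    score = (50 + 0.01 * days_from_cutoff + 3 * unobs
             + TRUE_EFFECT * treated + random.gauss(0, 5))
    data.append((days_from_cutoff, score))

# Simple discontinuity estimate: compare mean scores in a narrow window on
# either side of the cutoff. (Real studies fit local regressions; a raw
# window comparison is enough to show the idea.)
BANDWIDTH = 30  # days, hypothetical
below = [s for d, s in data if -BANDWIDTH <= d < 0]   # treated side
above = [s for d, s in data if 0 <= d <= BANDWIDTH]   # untreated side
rd_est = sum(below) / len(below) - sum(above) / len(above)
print(f"discontinuity estimate near cutoff: {rd_est:.2f}")
```

Even though no one was randomly assigned, the estimate near the cutoff lands close to the true effect, because birthdate timing within a narrow window plays the role that a coin flip plays in an RCT.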

We have many “natural experiments” that show large effects of pre-K. These include: Ladd et al.’s study of North Carolina pre-K; Ludwig et al., Currie et al., and Deming’s studies of Head Start; the various studies of the Chicago Child-Parent Center program; the various “regression discontinuity” studies of state and local pre-K programs. This evidence matters.

The Head Start and Tennessee evidence also matters. The Tennessee evidence is only suggestive, and it is not particularly supportive of the effectiveness of that state’s program, which may be funded at too low a level per child to be effective. The Head Start RCT evidence also suggests that Head Start during that period may not have been particularly effective relative to the other pre-K alternatives available to the control group, which included state pre-K programs. Changes over time in Head Start quality and in Head Start alternatives are the most straightforward way to reconcile the fading effects in the Head Start RCT with the earlier natural experiments suggesting that Head Start has long-term effects. Recent Head Start reforms may have improved its quality relative to its quality at the time of the RCT. Head Start reforms to improve quality should continue.

Two other points. First, I would challenge the argument that Perry and Abecedarian don’t matter. These are programs whose characteristics and delivery models are well understood. Today’s Educare program is quite similar to the Abecedarian program, and Perry is essentially a smaller-class-size, two-year version of today’s state and local pre-K programs. I think the Perry and Abecedarian evidence, in conjunction with the natural experiments, suggests that high-quality pre-K can make a difference.

Second, fading test-score effects are found in a variety of pre-K programs, including Perry, Abecedarian, and the Chicago CPC program. Such fading test-score effects appear to be consistent with large long-term effects on adult outcomes. One theory to account for this fading and re-emergence is that pre-K also builds so-called “soft skills”, such as social skills and character skills, which are important in determining educational attainment and adult earnings.

A broad view of the research evidence on pre-K suggests that a variety of programs have strong effects, although this should not be interpreted as meaning that every program always works. There is a lot of variation in program quality and program effects over space and time. For other recent research reviews that also take a broader view of the research evidence, and conclude that pre-K can work, see the review by Yoshikawa et al., and the review by Kay and Pennucci (Report 14-01-2201) for the Washington State Institute for Public Policy.

About timbartik

Tim Bartik is a senior economist at the Upjohn Institute for Employment Research, a non-profit and non-partisan research organization in Kalamazoo, Michigan. His research specializes in state and local economic development policies and local labor markets.