The Cato Institute, a well-known libertarian think tank, sponsored a discussion of research on pre-K on January 7, 2014. I watched a live stream of the event. The discussion featured George Mason professor David Armor, Brookings Institution researcher Russ Whitehurst, and Georgetown professors Deborah Phillips and Bill Gormley. Conor Williams provided some coverage of this event.
The discussion was in part a debate, with Armor and Whitehurst arguing that the research evidence is insufficient to support widespread expansion of pre-K programs, and Phillips and Gormley arguing that the research supports the effectiveness of high-quality pre-K in changing the life course of disadvantaged children. The arguments included the following: Armor emphasized what he sees as the deficiencies of the regression discontinuity studies, an argument I have previously discussed. Whitehurst emphasized the statistical insignificance of the Head Start random assignment results. Phillips emphasized what we know about child development in early childhood, as well as the research consensus on pre-K’s effectiveness summarized in the recent report by a group headed by Yoshikawa and Weiland, a group that also included Gormley and Phillips. Gormley argued that the regression discontinuity studies are valid because there is no sign of attrition bias: the treatment and comparison groups are similar on observable variables.
What occurred to me is that the debate over expanding pre-K is in part a philosophical debate, not one that hinges solely on the details of empirical studies. At one point, Whitehurst stated that the Head Start random assignment experiment showed “no sustained impacts”. Later, he stated that after the pre-K year, there were “no effects”. (Whitehurst previously said something similar in a Brookings post last January: “There is no measurable advantage to children in elementary school of having participated in Head Start…. Head Start does not improve the school readiness of children from low-income families.”)
That’s not exactly what the Head Start random assignment study shows. What it shows is that the point estimate of the effect of Head Start on cognitive skills is not statistically significantly different from zero as of third grade. The point estimates of the effects of Head Start in this study decline by over 70% from the end of Head Start to the end of third grade. The resulting point estimate at third grade would predict that Head Start would improve future earnings by a little over 1%, which is not a trivial amount of money over a lifetime. But we cannot statistically reject the possibility that the true effect is zero. Nor can we statistically reject the possibility that the true effect is 2 or 3 times as large. (Note to wonks: this uses data from the Head Start final impact report to calculate average effect sizes of 0.22 at the end of Head Start and 0.06 at the end of 3rd grade on the PPVT, the WJ III Letter-Word ID test, and the WJ III Math Applied Problems test. These test score effects are then combined with estimates by Chetty et al. of how test scores affect adult earnings.)
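For readers who want to follow the arithmetic, here is a minimal sketch of that back-of-envelope calculation. The effect sizes are those from the note above; the 20% earnings gain per standard deviation of test scores is an illustrative assumption standing in for the Chetty et al. estimates, not their exact figure.

```python
# Back-of-envelope sketch of the fade-out and earnings calculation above.
# The 20% earnings gain per SD of test scores is an illustrative
# assumption, not the exact Chetty et al. estimate.

effect_end_head_start = 0.22  # average effect size, end of Head Start year
effect_end_grade3 = 0.06      # average effect size, end of 3rd grade

fade_out = 1 - effect_end_grade3 / effect_end_head_start
print(f"Decline in point estimate: {fade_out:.0%}")  # ~73%, i.e., over 70%

earnings_gain_per_sd = 0.20  # assumed earnings gain per 1 SD of test scores
predicted_earnings_gain = effect_end_grade3 * earnings_gain_per_sd
print(f"Predicted earnings gain: {predicted_earnings_gain:.1%}")  # ~1.2%
```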
In addition, past studies of early childhood programs, including Perry, the Chicago Child-Parent Study, kindergarten class quality (Chetty et al.), and Head Start itself, suggest that the test score effects of these interventions often fade, yet the programs still have larger effects on adult outcomes than the faded test score effects would predict. It is certainly possible that this will occur in the Head Start experiment.
Therefore, the Head Start random assignment study is hardly strong evidence in favor of the effectiveness of Head Start, as it operated in 2002, in improving third grade outcomes. On the other hand, given that there is much other empirical evidence in favor of pre-K programs, and even for Head Start, and given the statistical uncertainty in these Head Start results, the Head Start random assignment experiment is not strong evidence against the effectiveness of all publicly-funded pre-K.
But how do we interpret these results? One possibility is that we have a strong prior belief that the effect of pre-K programs is zero, either because we are generally skeptical of government intervention, or because we don’t think that academic intervention at age 4 makes sense from a child development standpoint. In addition, perhaps we are concerned about the danger of wasting money on a pre-K program that doesn’t work, driving up either deficits or taxes and adding to our fiscal problems.
Another possibility is that we believe, based on the child development literature, that it is plausible that more time in educational programs at age 4 can make a difference. Research evidence from a number of studies supports that hypothesis, even with test score effects fading. In addition, perhaps we are concerned about the dangers of NOT expanding pre-K funding. Income inequality is a pressing problem that is difficult to address. Developing human capital seems like a key way to address income inequality. Adding more early learning time is a straightforward policy that develops human capital. We know how to add high-quality early learning time, and we have done so in a number of state and local areas. Failure to do so may have a large opportunity cost.
In other words, is the greater danger from expanding a pre-K program that doesn’t work? Or is the greater danger from not expanding pre-K programs that could make a major difference to many children’s futures?
There will always be some policy uncertainties. It is more difficult to precisely estimate long-run effects of programs than short-run effects, and more difficult to precisely estimate aggregate effects of programs than effects on a specific group of individuals. Random assignment experiments will always be scarce because they are difficult and expensive to run.
How we resolve policy uncertainty is a choice. That choice is based in part not on the empirical evidence, but on our prior beliefs about child development, government intervention, and the relative dangers of excessive government spending versus increased income inequality.
One way to reduce the risk of doing the wrong thing is to expand pre-K, but to do so in a way that maximizes the probability that the intervention is high quality. This suggests that we should err on the side of spending more per child, and that we should be doing a great deal of monitoring of quality and results in pre-K programs.
Tim, I would like to make the additional points that the control group kids took up ECE, and that the findings indicate no difference in the effect of Head Start against that control group ECE experience – this is true even for IOT analyses; that the IOT analyses do show some small 3rd grade benefits of HS in reading and vocab, and that there are health and family benefits, and these benefits should not be discounted; and that HS programs provide capacity in a field where there isn’t enough. There is only one thing to do and that is make Head Start classrooms better. C
You are right that the control group often enrolled in pre-K programs other than Head Start. As of the spring of the age 4 year, the Head Start participation rate was 77% in the age 4 treatment group, versus 14% in the control group. The treatment group’s participation in other types of centers was 11%, versus 35% in the control group. If one thought that the impact of the non-Head-Start centers was the same as for Head Start, then what is crucial is what the experiment did to total center participation: 88% in the treatment group, versus 49% in the control group. Under that assumption, any impacts estimated in the Head Start study should be multiplied by one over 0.39 (= 0.88 − 0.49), or by about 2.5, to get the impacts of Head Start relative to no preschool.
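Here is a minimal sketch of that scaling arithmetic, under the stated assumption that the non-Head-Start centers have the same impact as Head Start:

```python
# Sketch of the scale-up arithmetic above, assuming non-Head-Start
# centers have the same impact per child as Head Start.

treatment_center_rate = 0.88  # total center participation, treatment group
control_center_rate = 0.49    # total center participation, control group

# The experiment raised total center participation by this much:
first_stage = treatment_center_rate - control_center_rate  # 0.39

# Scale the experiment's estimated impacts by 1/0.39 to approximate
# the impact of Head Start relative to no preschool at all.
scale_factor = 1 / first_stage
print(f"Scale-up factor: {scale_factor:.2f}")  # ~2.56, i.e., about 2.5
```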
The main criticism that could be made of this line of reasoning is that the non-Head-Start centers did not in fact seem to be higher quality than Head Start, at least on average, based on measures such as teacher credentials. However, this doesn’t preclude the possibility that the non-Head-Start centers included some very good centers that could have pushed up the control group’s performance quite a bit.