Dr. Grover Whitehurst’s latest criticisms of Obama’s preschool plan, posted at the Brookings Institution’s Brown Center website, have drawn some attention. He has written numerous posts criticizing Obama’s preschool plan, some of which I’ve responded to in previous posts.
Dr. Whitehurst’s latest criticism is based on recent evidence from the Vanderbilt study of the Tennessee Voluntary Pre-K Program. This study used a randomized controlled trial methodology. The results suggest that most of the academic and behavioral effects of Tennessee’s pre-K program had faded by the end of kindergarten and the end of first grade.
Dr. Whitehurst argues the following in his concluding paragraph:
“I see these findings as devastating for advocates of the expansion of state pre-k programs. This is the first large scale randomized trial of a present-day state pre-k program. Its methodology soundly trumps the quasi-experimental approaches that have heretofore been the only source of data on which to infer the impact of these programs. And its results align almost perfectly with those of the Head Start Impact Study, the only other large randomized trial that examines the longitudinal effects of having attended a public pre-k program. Based on what we have learned from these studies, the most defensible conclusion is that these statewide programs are not working to meaningfully increase the academic achievement or social/emotional skills and dispositions of children from low-income families. I wish this weren’t so, but facts are stubborn things. Maybe we should figure out how to deliver effective programs before the federal government funds preschool for all.”
I have a number of detailed responses to this argument. But to sum up:
Incomplete findings from one good but imperfect study of one state’s quite imperfect pre-K program do not trump the many good studies of many pre-K programs that show that such programs can be effective, with the right resources and design. It is unwise for either opponents or proponents of expanded pre-K to over-react to one study; rather, decisions should be based on the overall weight of the research evidence.
What follows are more detailed responses:
1. I agree with Sara Mead’s comment at Education Week that one can hardly view the latest Tennessee Pre-K results as an argument in favor of pre-K. On the other hand, I also agree with her point that this one item of data does not trump all the other good evidence for pre-K effectiveness, from numerous studies.
2. Dr. Whitehurst is of the opinion that randomized controlled trials trump all other evidence by far. I disagree, for two reasons. First, randomized controlled trials are hard to run perfectly in practice, which often limits their advantages over non-randomized studies.
Second, many non-randomized studies have good comparison groups. For example, the Chicago Child-Parent Center studies compare similar neighborhoods with different pre-K access; some Head Start studies compare siblings in the same family with different Head Start enrollment, or counties with different Head Start access due to federal policies; and the regression discontinuity studies of state pre-K compare kids with differential timing of pre-K access based on birth date. All of these studies with good comparison groups find some good evidence of pre-K’s effectiveness.
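For readers unfamiliar with the regression discontinuity logic, here is a deliberately simplified simulation of the birth-date design. Every number in it is invented for illustration; it is a sketch of the general idea, not the exact design used in any of the state studies.

```python
import random
from statistics import mean

random.seed(2)

# Toy regression-discontinuity (RD) setup: children born just before an
# enrollment cutoff date got pre-K a year earlier; children born just
# after did not. All numbers here are hypothetical.
N = 20_000
TRUE_PREK_EFFECT = 0.30  # invented score boost from pre-K

def child():
    day = random.uniform(-180, 180)   # birthday minus cutoff, in days
    attended_prek = day < 0           # born before cutoff -> had pre-K
    # Older children score somewhat higher regardless of pre-K (an age
    # trend), which a naive treated-vs-untreated comparison mixes in.
    score = (-0.002 * day
             + (TRUE_PREK_EFFECT if attended_prek else 0.0)
             + random.gauss(0, 0.5))
    return day, score

kids = [child() for _ in range(N)]

# Naive comparison: all pre-K children vs. all others (age trend included).
naive = (mean(s for d, s in kids if d < 0)
         - mean(s for d, s in kids if d >= 0))

# RD comparison: only children in a narrow window around the cutoff, where
# the age difference is tiny and eligibility is as good as random.
BW = 15  # bandwidth in days on each side of the cutoff
rd = (mean(s for d, s in kids if -BW <= d < 0)
      - mean(s for d, s in kids if 0 <= d < BW))

print("naive estimate: %+.3f" % naive)
print("RD estimate:    %+.3f" % rd)
```

In this toy world the naive comparison badly overstates the pre-K effect because it absorbs the age trend, while the narrow-window RD comparison lands near the true value of 0.30, which is the sense in which birth-date designs provide a good comparison group.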
We don’t just throw all this info out because of one study of one program in one state. This would be true even if this Vanderbilt study had no issues and one thought that Tennessee’s program was the best program in the country.
3. In the particular case of the Tennessee evaluation, there were some problems with the randomized trial, particularly in Cohort 1 (2009-2010 pre-K participants): parental consent rates were low and differed between Tennessee pre-K participants and non-participants. In Cohort 1 of the study, the researchers had parental consent to look at the data for only 46% of pre-K participants and 32% of non-participants. This improved in the second cohort (2010-2011 pre-K participants), with 74% of participants and 68% of non-participants providing parental consent. Dr. Whitehurst explicitly says that he focuses his attention on the evidence from this consented subsample, for which more data are available.
The problems with parental consent mean that for most of the comparisons, the actual children on whom data were collected no longer constitute a pure random assignment experiment, particularly in Cohort 1. In other words, it could well be that in Cohort 1, although the full treatment sample and full control sample might on average be similar in unobserved characteristics (e.g., parent motivation), as the initial assignment was determined randomly, this might not be at all true of the 46% of pre-K participants and 32% of non-participants for whom most of the data are available. Parental consent may not be random with respect to unobserved characteristics of children and families.
The Vanderbilt researchers tried very hard to control for this problem, using appropriate methods. However, these methods, such as propensity score matching and statistical controls, are the same methods that people use WITHOUT random assignment data, and have the same issue — one can only control for variables one observes, not variables one does not observe. Furthermore, there are many modeling choices in dealing with these issues, and different modeling choices may yield different results.
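To make the consent problem concrete, here is a toy simulation. The 46% and 32% overall consent rates match Cohort 1 of the study, but everything else, the unobserved "motivation" trait, its effect on scores, and the per-group consent probabilities, is invented for illustration. In this toy world pre-K truly does nothing, and the full randomized samples correctly show no difference, yet the consented subsamples show a sizable spurious difference:

```python
import random
from statistics import mean

random.seed(0)

# Assignment to pre-K is random, but scores are only observed for
# children whose parents consented. Consent is assumed to correlate
# with an unobserved "motivation" trait that also raises scores.
N = 10_000
TRUE_EFFECT = 0.0  # pre-K truly does nothing in this toy world

def simulate_arm(consent_if_motivated, consent_if_not, n=N):
    all_scores, consented_scores = [], []
    for _ in range(n):
        motivated = random.random() < 0.5       # unobserved parent trait
        score = (1.0 if motivated else 0.0) + random.gauss(0, 0.5)
        all_scores.append(score)
        p = consent_if_motivated if motivated else consent_if_not
        if random.random() < p:
            consented_scores.append(score)
    return all_scores, consented_scores

# Suppose consent skews toward motivated families more strongly in the
# control group; the overall rates work out to roughly 46% and 32%.
# The per-group probabilities below are pure inventions.
treat_all, treat_obs = simulate_arm(0.55, 0.37)   # ~46% consent overall
ctrl_all, ctrl_obs = simulate_arm(0.50, 0.14)     # ~32% consent overall

full_diff = mean(treat_all) - mean(ctrl_all)      # unbiased: near zero
obs_diff = mean(treat_obs) - mean(ctrl_obs)       # biased by consent

print("full randomized sample difference: %+.3f" % full_diff)
print("consented subsample difference:    %+.3f" % obs_diff)
```

The point of the sketch is not that bias of this direction or size occurred in Tennessee, only that non-random consent can manufacture a difference, in either direction, that no amount of controlling for observed variables is guaranteed to remove.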
4. It is of interest that for one of the variables, retention in kindergarten, information is available for the full sample, and the pre-K program appears to cut kindergarten retention from 8% to 4%. That is curious if there are really no end-of-kindergarten effects on achievement or behavior, which is what the data on the smaller sample suggest. Why would the retention rate be cut in half? Something must be going on to produce this result that we don’t observe in the smaller sample.
Furthermore, in the smaller sample for which parental consent WAS obtained, retention was only cut from 6% to 4% — which is a curious discrepancy between the smaller sample, on which Dr. Whitehurst bases his conclusions, and the full sample.
5. The retention differences mean that more of the weaker pre-K students get promoted to first grade on time, which may be good for them, but which will tend to depress end-of-first-grade scores in the treatment group relative to the control group.
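A toy simulation makes the arithmetic of this composition effect concrete. The 8% and 4% retention rates come from the full-sample results; the ability scale and the retention rule are invented, and pre-K is given zero true effect on scores so that only the composition effect shows up:

```python
import random
from statistics import mean

random.seed(1)

# Children below an ability cutoff may be held back in kindergarten;
# retained children are absent from the on-time first-grade comparison.
N = 10_000
WEAK_CUTOFF = -1.0  # bottom ~16% of a standard-normal ability scale
WEAK_SHARE = 0.16

def on_time_first_graders(retention_rate, n=N):
    """Return the latent abilities of children promoted on time, assuming
    retention falls entirely on the weakest children."""
    scores = []
    for _ in range(n):
        ability = random.gauss(0, 1)
        weak = ability < WEAK_CUTOFF
        if weak and random.random() < retention_rate / WEAK_SHARE:
            continue  # held back: not in the on-time first-grade tests
        scores.append(ability)
    return scores

control = on_time_first_graders(0.08)  # 8% retained (full-sample control)
treated = on_time_first_graders(0.04)  # 4% retained (full-sample pre-K)

print("control on-time mean ability: %+.3f" % mean(control))
print("pre-K   on-time mean ability: %+.3f" % mean(treated))
```

Even though the two groups are drawn from identical ability distributions, the pre-K group's on-time first-grade average comes out lower, simply because more of its weaker students are in the tested pool. A real first-grade score gap of zero could therefore be masking a genuine pre-K benefit.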
6. Tennessee’s pre-K program appears to spend, according to NIEER, about $5,814 annually per child for a full-day program. Data from the Institute for Women’s Policy Research suggests that high-quality full-day pre-K might cost $9,000 or so annually per child. Tennessee has a lower cost of living and lower teacher salaries, but there does seem to be some gap there. NIEER estimated that Tennessee probably needs to spend at least $2,000 extra per child to consistently deliver quality.
7. As Steve Barnett of NIEER has pointed out, Tennessee’s program results at the end of the pre-K year were on the low end compared to some other state pre-K studies. Perhaps end of pre-K results are more likely to persist if the initial end of pre-K results are larger. Perhaps there is some critical size of effects that one needs to get at the end of pre-K before one can expect much persistence.
8. Sara Mead also raises the point that there may be effects of collective pre-K that differ from individual pre-K. That is, if one puts an entire class through pre-K, and combines this with the right K-3 policies, then teachers in K-3 can teach the entire class more effectively to a higher level. On the other hand, if we just put a few kids through pre-K, then teachers may find that they have to teach the same curriculum at the same pace to meet the needs of the overall class. This result may tend to drag down any initial advantages for the pre-K kids, particularly if the initial advantages at kindergarten entrance are small.
It is of interest here that the Chicago Child-Parent Center study essentially was comparing kids in different neighborhoods that were similar in neighborhood characteristics except for whether they had the CPC program. Did CPC help allow subsequent classroom teaching to improve? Maybe.
9. In general, one has to ask why studies sometimes find fade-out, with the control or comparison groups catching up to the treatment group. Some of it may be that all kids are experiencing the same curriculum, which over time will tend to reduce performance differences in the individual comparisons. Another possibility is that teachers are intervening to provide extra help to kids who are behind. If there are initially more such kids in the control group, then more kids in the control group will get such help. But this is actually another benefit of pre-K: it may reduce the need for teachers to provide remedial help to the pre-K kids, and free up teacher time to do other things.
10. Having said all that: the latest Tennessee Pre-K results do not provide any strong evidence in favor of pre-K. Maybe it is due to lack of full data on all survey respondents or limitations of Tennessee’s program or the lack of community effects in such a study, to reiterate the points mentioned above. It is hard to be sure without better data, ideally on the entire Tennessee sample, and more in-depth studies of what is going on in Tennessee, for example compared to Tulsa or New Jersey or Boston.
On the other hand, I don’t think the latest Tennessee results provide any strong evidence against the general consensus of the research literature, that many state and local pre-K programs are quite effective.
11. Is the implicit message from Dr. Whitehurst that a pre-K program for which we ONLY have evidence for effects at pre-K exit or kindergarten entrance is of no use? Does that really make sense? Is that really a tenable position? Is that the attitude of most middle-class parents — “We don’t care about whether our child is ready for kindergarten, because we’re sure that any initial advantages will fade.” This needs to be thought through. And one needs to think through why fade-out might occur and what it might mean.
12. Finally, what we really should be talking about is how we can replicate state and local pre-K programs that show much larger effects than in Tennessee, such as the programs in Tulsa, Boston, or New Jersey.