Russ Whitehurst of the Brookings Institution has still another blog post attacking President Obama’s preschool proposal. (I have responded to three previous blog posts on this topic by Whitehurst.)
The most recent post by Whitehurst, co-authored with David Armor, was posted on July 24, 2013. The post is entitled “Obama’s Preschool Proposal is Not Based on Sound Research”.
The Whitehurst-Armor argument can be summarized as follows: Head Start does not have large enough test score effects, states with extensive pre-K programs do not have large enough boosts in 4th grade test scores, and existing studies of state and local pre-K programs are methodologically deficient.
My counter-argument can be summarized as follows: Head Start has shown large long-run effects even with test-score fading; even if we think Head Start effects are too small, Head Start evidence is of limited relevance to the Obama proposal; even boosts in 4th grade test scores that appear modest can have economic and social benefits that far exceed pre-K costs; there are many methodologically sound studies of state and local pre-K programs that show large short-run and long-run effects.
In other words, the Whitehurst-Armor post ignores the extensive evidence that contradicts their thesis, and that is most directly relevant to evaluating the Obama Administration preschool proposal.
The Whitehurst-Armor blog post begins by again citing the Head Start random assignment study:
“The most credible recent study of pre-K outcomes, the federal Head Start Impact Study, found only small differences at the end of the Head Start year between the performance of children randomly assigned to Head Start vs. the control group, e.g., about a month’s superiority in vocabulary for the Head Start group. There were virtually no differences between Head Start and the control group once the children were in elementary school.”
My first point is that it is questionable whether the most relevant recent research for the Obama Administration’s preschool proposal is a study of Head Start. The Obama Administration’s preschool proposal is not a proposal that relies on expanding Head Start. Rather, the Obama Administration proposal is to expand support for state pre-K programs for 4-year-olds. Head Start resources would be shifted towards 3-year-olds. I discussed this previously in a blog post on a Wall Street Journal editorial, which also tried to use the Head Start impact study results to criticize proposals to expand state pre-K programs.
The better state and local pre-K programs show larger short-run effects on test scores than Head Start does. (See my review of this evidence in Table 1 and surrounding text of my recent paper on a Kalamazoo pre-K program.) Whitehurst questions some of this evidence, and I respond below. But Head Start evidence is certainly of limited relevance in attacking a proposal that expands quite different preschool programs, which are sometimes more educationally focused than some Head Start programs and which have a quite different governance structure.
My second point is that, to the extent that the Head Start research evidence is relevant to a proposal to expand state pre-K programs, Whitehurst and Armor are ignoring the significant research evidence that shows long-run effects of Head Start. As I have pointed out in previous blog posts, there is research evidence with good comparison groups that shows long-run Head Start benefits. This includes research that compares similar counties with different early access to Head Start, and studies that compare siblings in which one sibling participates in Head Start and the other does not.
What is particularly interesting is that this long-run research evidence also shows long-run benefits of Head Start even when short-term effects fade. For example, Deming’s research shows quite large effects of Head Start in increasing educational attainment and employment rates, and reducing crime involvement. These benefits are sufficient to predict a long-run wage gain due to Head Start of 11%, and a rate of return to the public investment in Head Start of almost 8% in real terms. But Deming’s research also shows that the initial test score effects of Head Start fade to statistical insignificance as former participants go through the K-12 system. Apparently this fading of test score effects does not determine other long-term outcomes.
Whitehurst and Armor then mention Whitehurst’s previous blog post that argued that states with more pre-K enrollment did not have significantly higher 4th grade test score results. As I argued in a previous response, the correlation between state pre-K enrollment and 4th grade test scores is actually sufficiently strong to imply a 5 to 1 benefit-cost ratio for adding enrollment in typical state pre-K programs.
Whitehurst and Armor then go on to argue extensively against the methodology used in studying many (not all!) state pre-K programs, which are labeled as “regression discontinuity” studies. What these studies do is administer the same tests to pre-K entrants as they enter pre-K, and to pre-K graduates as they enter kindergarten. All the students administered tests are similar in whatever observed and unobserved characteristics determine selection into the pre-K program. But the pre-K graduates differ in two respects: they are a year older; they have experienced a year of pre-K. Because pre-K programs and kindergarten programs use an age cut-off to determine enrollment, we have children within each group who differ in age by up to a year. In fact, we have children in the pre-K entrant group who are just a few days younger than children in the pre-K graduate group. Therefore, we can use the evidence in the sample on how test scores vary with age to separate out the effects of aging on test scores from the effects of pre-K. One way to understand this is that we see how test scores “jump” when comparing students who just missed the pre-K cutoff the previous year versus students a few days older who just made the pre-K cutoff the previous year. This “jump” is the estimated effect of pre-K. (For more extensive discussions of this methodology, see my paper on Kalamazoo pre-K, or my paper with Gormley and Adelstein on Tulsa pre-K.)
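To make the "jump" idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the score scale, the aging slope, the noise level, and the "true" pre-K effect are invented, and fitting a separate straight line on each side of the cutoff is just one simple way to estimate the jump, not necessarily the specification used in the Kalamazoo or Tulsa papers.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n_per_group=2000, true_effect=8.0, seed=0):
    """Simulated (days_from_cutoff, test_score) pairs; all numbers are invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_per_group):
        # Pre-K entrants: missed last year's age cutoff, so no pre-K yet.
        days = -rng.uniform(1, 365)
        rows.append((days, 50 + 0.03 * days + rng.gauss(0, 5)))
    for _ in range(n_per_group):
        # Pre-K graduates: made last year's cutoff, so one year of pre-K.
        days = rng.uniform(0, 364)
        rows.append((days, 50 + 0.03 * days + true_effect + rng.gauss(0, 5)))
    return rows


def estimate_jump(rows):
    """Difference between the two fitted lines at the cutoff (days == 0)."""
    below = [(d, s) for d, s in rows if d < 0]
    above = [(d, s) for d, s in rows if d >= 0]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


rows = simulate()
print(f"Estimated pre-K effect at the cutoff: {estimate_jump(rows):.2f}")
```

The slope of each fitted line captures how scores rise with age, so the gap between the two lines at the cutoff is what remains after aging is accounted for. In this simulation the estimate should land near the invented effect of 8 points.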
My response to Whitehurst and Armor’s critique of regression discontinuity methodology can be summarized as follows: first, regression discontinuity studies do provide reliable and policy-relevant estimates of the short-term effects of age 4 pre-K programs versus no such programs. Second, their critique ignores the many good studies of state and local pre-K programs that do not rely on regression discontinuity.
Whitehurst and Armor’s first methodological comment is that regression discontinuity studies are making a somewhat different comparison than would be made by a random assignment study of pre-K. This is true, but is less policy relevant than they imply. Regression discontinuity studies are using a comparison group of children who just missed the pre-K age cutoff for 4-year-olds. In contrast, the comparison group in a random assignment study is age-eligible for 4-year-old preschool programs. As they point out, the comparison group in a random assignment study is more likely to participate in preschool and more likely to participate in more educationally-oriented preschool than is true of children who miss the age-4 pre-K cutoff.
Whitehurst and Armor’s point is that what a state legislator should want to know is how expanding pre-K access will affect children, compared to not expanding access. Therefore, the relevant counterfactual is what these 4-year-olds would be doing without the state program. They argue that this is what the random assignment experiment estimates.
However, their argument overlooks that what the state legislator really should want to know is how benefits and costs compare between two groups, one of which has greater access to cheaper and higher-quality pre-K than the second group. So in the random assignment study, the benefit-cost analysis would have to look not only at the benefits of the greater preschool access, but also at the cost savings for existing preschool programs from greater public preschool access. Many of the children in the control group are also in preschool programs with large costs paid for by the government or by parents. These cost savings should be subtracted from the costs of expanding pre-K to obtain its net costs. Therefore, the benefit-cost analysis from a random assignment study should be comparing the CHANGE in benefits from the change in preschool access with the CHANGE in costs from the change in preschool access. It would under-estimate the benefit-cost ratio if we compared this CHANGE in benefits with the total public costs of the pre-K expansion, as this overlooks the reduction in costs of existing preschool programs.
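The difference between the two ratios can be sketched in a few lines. The dollar figures below are purely hypothetical, chosen only to show the direction of the bias; they are not estimates from any study, and the function name is my own.

```python
def benefit_cost_ratio(change_in_benefits, gross_public_cost,
                       savings_on_existing_programs=0.0):
    """Ratio of the CHANGE in benefits to the CHANGE (net) in costs."""
    net_cost = gross_public_cost - savings_on_existing_programs
    if net_cost <= 0:
        raise ValueError("net cost must be positive for a meaningful ratio")
    return change_in_benefits / net_cost


# Hypothetical per-child figures, in dollars (illustrative only).
change_in_benefits = 24_000   # added benefits from greater preschool access
gross_public_cost = 8_000     # total public cost of the new pre-K slot
savings = 2_000               # reduced spending on existing preschool programs

naive = benefit_cost_ratio(change_in_benefits, gross_public_cost)
adjusted = benefit_cost_ratio(change_in_benefits, gross_public_cost, savings)
print(f"Ignoring cost savings: {naive:.1f} to 1")
print(f"Using net costs:       {adjusted:.1f} to 1")
```

With these made-up numbers, dividing by gross costs gives 3 to 1, while dividing by net costs gives 4 to 1. The size of the gap depends entirely on how much spending on existing programs is displaced.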
But exactly this same calculation can be done using the estimates from the regression discontinuity studies. Regression discontinuity studies come closer than random assignment studies to estimating the impact of preschool versus no preschool for children; in contrast, random assignment studies compare a particular pre-K program versus currently available pre-K programs. But if we examine a proposed expansion of pre-K programs, we can estimate how this will affect the total number of kids enrolled in reasonable quality pre-K programs, either by plausible assumptions before the program is implemented, or by actual data after the program is implemented. The regression discontinuity studies provide useful estimates of the kindergarten readiness test score effects of having more kids involved in reasonable quality pre-K programs. These estimates can be compared with the net incremental costs of funding these additional slots.
Although it might seem that this benefit-cost calculation from regression discontinuity research evidence requires more assumptions, exactly the same exercise would have to be done using evidence from random assignment studies. Any random assignment study’s estimated effects for pre-K programs depend in part on what other pre-K programs are around. But what pre-K programs are around is always changing due to different parent behavior or changes in a wide variety of government subsidies for pre-K and child care. So random assignment studies can only be used for policy analysis if we adjust the raw estimates for changes over time in what other pre-K programs are available. There is no getting around the need for us to adjust our benefit-cost estimates for the current educational environment.
Whitehurst and Armor’s second methodological point is that regression discontinuity studies may suffer from differential attrition in the “treatment group” versus the comparison group, which may bias the results. This is true, but is also true of any real-world random assignment study. In random assignment studies, it is almost always true that not all participants in the treatment and control groups can be tracked down, and that attrition could possibly vary based on the effects of the “treatment”.
However, in both regression discontinuity studies and random assignment studies, we can make some attempt to see how attrition may bias the results by looking at how observable variables differ between the treatment group and the comparison group who remain, after attrition. We can never test for possible differences between the two groups in “unobservable” variables. However, it seems plausible that if attrition led to some differences in unobservable variables, it would also lead to some differences in observable variables.
These tests for differences in observable variables have been regularly done in regression discontinuity studies of state and local pre-K programs. For example, my paper with Gormley and Adelstein did such a test. We found no evidence that attrition or anything else led to any differences in observable variables between the treatment and comparison groups, the pre-K graduates versus the pre-K entrants.
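A check of this kind on observables can be sketched as follows. This is one generic way to compare groups, a standardized mean difference, and is not necessarily the test used in the paper with Gormley and Adelstein. The data are invented, and the 0.1 threshold is a common rule of thumb that I am assuming here, not something taken from the studies discussed.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(treated, comparison):
    """Difference in group means, scaled by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(treated) ** 2 + stdev(comparison) ** 2) / 2)
    return (mean(treated) - mean(comparison)) / pooled_sd


# Invented data: 1 = eligible for a free lunch, 0 = not eligible,
# for children who remain in each group after attrition.
prek_graduates = [1, 1, 0, 1, 0, 1, 0, 1]
prek_entrants = [1, 0, 1, 1, 0, 1, 0, 1]

diff = standardized_difference(prek_graduates, prek_entrants)
balanced = abs(diff) < 0.1  # assumed rule-of-thumb threshold
print(f"Standardized difference: {diff:.2f}, balanced: {balanced}")
```

In practice the same comparison would be repeated for each observable variable, such as race, gender, or free-lunch status.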
Whitehurst and Armor’s final methodological point is that “age-cutoff regression discontinuity designs produce implausibly large estimates of effects.” Why are they implausibly large? Because they are much larger than the Head Start effects! The reasoning here is somewhat circular. It appears that no regression discontinuity estimates for state pre-K programs will be accepted by Whitehurst if they significantly exceed the random assignment Head Start estimates, even though these estimates are for quite different programs.
As the discussion above suggests, we would expect regression discontinuity studies to yield somewhat larger raw effects for pre-K programs than would be true for random assignment studies. Regression discontinuity studies estimate the effect of pre-K versus no pre-K, whereas random assignment studies estimate the effect of a pre-K program versus the status quo of what is available. However, once these estimates are embedded in a benefit-cost analysis that compares the differences in pre-K access and costs in two different scenarios, these differences will disappear.
Are the estimated effects from regression discontinuity studies implausibly large? The very studies that Whitehurst and Armor cite don’t suggest this. They cite a Tulsa study showing test score improvements of 9 months, a New Jersey study showing improvements of 4 months, and a Boston study showing improvements of 6 months. If typical students improve by 9 months during the pre-K year without pre-K, then what these studies suggest is that pre-K, versus no pre-K, improves learning during the school year by anywhere from 44% (New Jersey, an additional 4 months on a base of 9 months) to 67% (Boston) to 100% (Tulsa). These improvements do not sound implausible to me: increases in learning pace of roughly 40% to 100% seem intuitively plausible in an educationally-focused program.
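The arithmetic behind these percentages is simple enough to write out. The 9-month baseline is the assumption stated above (what a typical student gains over the year without pre-K); the variable names are mine.

```python
BASELINE_MONTHS = 9  # assumed gain over the pre-K year without pre-K

# Extra months of test score improvement reported in the cited studies.
extra_months = {"New Jersey": 4, "Boston": 6, "Tulsa": 9}


def percent_increase(extra, baseline=BASELINE_MONTHS):
    """Extra learning as a percentage of the baseline year's learning."""
    return 100.0 * extra / baseline


increases = {site: round(percent_increase(months))
             for site, months in extra_months.items()}
print(increases)  # {'New Jersey': 44, 'Boston': 67, 'Tulsa': 100}
```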
There are two other pieces of evidence that I would present for the reliability of the regression discontinuity evidence. First, there is a study in Tennessee that uses both regression discontinuity methods and random assignment methods to study a state pre-K program. Both approaches show statistically significant effects of Tennessee’s pre-K program (see, for example, Lipsey et al., 2010). The regression discontinuity estimates are somewhat larger: an effect size averaging 0.64 across the various tests, versus an effect size of 0.34 from random assignment. But we would expect this, as the regression discontinuity estimates are measuring the effects of preschool versus no preschool, whereas the random assignment estimates are measuring expanded preschool versus what preschool is currently available. The estimates are reasonably consistent.
Second, my estimates with Gormley and Adelstein using regression discontinuity methods for Tulsa’s pre-K program show a pattern of results across half-day and full-day pre-K programs that is reasonable. For example, our estimates show that half-day pre-K increases test scores for children eligible for a free lunch by 12 percentiles, whereas full-day pre-K increases test scores for children eligible for a free lunch by 18 percentiles. This pattern is very reasonable, as one would expect a greater return to full-day than half-day pre-K, but perhaps not a doubled return. If regression discontinuity estimates were seriously biased, we would not necessarily expect these biases to result in such a reasonable pattern in estimated test score effects.
Finally, Whitehurst and Armor’s critique of regression discontinuity methods ignores the research evidence from studies of state and local pre-K programs that use other research methods. There are a wide variety of such studies. The studies with the best long-run evidence are for the Chicago Child-Parent Center program. Arthur Reynolds and his colleagues have done a series of studies of this program showing large long-run effects in boosting educational attainment, reducing crime, and boosting earnings. The implied benefit-cost ratios are quite large.
I don’t see how any balanced discussion of Obama’s proposal to expand state and local pre-K programs can ignore the Chicago Child-Parent Center program and its research evidence. The CPC program is much more similar to the higher-quality state and local pre-K programs around the U.S. than is true of Head Start.
In sum, there is significant research evidence that supports efforts to expand high-quality state and local pre-K programs for 4-year-olds. This evidence goes well beyond the Head Start evidence to consider numerous regression discontinuity and other studies of state and local pre-K programs.
Is the evidence perfect? No, but in the real world, evidence for any policy intervention will never be perfect. Whitehurst and Armor go on to advocate for more “demonstration” research projects on pre-K rather than implementing pre-K on a large scale. If the case for pre-K is plausible, this position has a tremendous “opportunity cost”: all the children who could have benefitted from an expansion of pre-K, but who will not do so because we are waiting around for the elusive definitive study that will answer all questions.
In my view, the research case for state and local pre-K is strong enough that a better course of action is implementing greater access to higher quality pre-K on a large scale while continuing to study how to improve pre-K quality. While more research is always needed, the need for long-term research should not trump the needs of children today.