The appeal of universal programs rests in part on simplicity

A summary of my paper with my colleague Marta Lachowska on the Kalamazoo Promise was recently published in Education Next. (The summary even received a tweet from Arne Duncan!) The Kalamazoo Promise is a program begun in 2005, under which anonymous private donors promised to provide all graduates of Kalamazoo Public Schools with up to 4 years of free tuition at public colleges and universities in Michigan.

Our paper relied on one aspect of the Kalamazoo Promise that provides a “natural experiment”. Promise eligibility requires that students be continuously enrolled in Kalamazoo Public Schools since the beginning of 9th grade. Our paper compared the behavior and academic achievement of high school students who were “Promise eligible”, versus high school students who were “Promise ineligible”, based on length of enrollment in KPS, from before to after the Promise announcement in 2005.
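The comparison described above has the structure of a difference-in-differences calculation: the change over time for eligible students, minus the change over the same period for ineligible students. The following is only a minimal sketch of that arithmetic. The function name and the GPA values are hypothetical, invented for illustration, and the paper's actual estimates presumably come from regressions with additional controls.

```python
def diff_in_diff(eligible_before, eligible_after, ineligible_before, ineligible_after):
    """Change for the eligible group minus change for the ineligible group."""
    change_eligible = eligible_after - eligible_before
    change_ineligible = ineligible_after - ineligible_before
    return change_eligible - change_ineligible


# Hypothetical mean GPAs, invented for illustration only (not from the paper).
example_effect = diff_in_diff(
    eligible_before=2.50, eligible_after=2.75,
    ineligible_before=2.50, ineligible_after=2.50,
)
print(example_effect)
```

Subtracting the ineligible group's change removes any trend that affected all students in the district over the same years, which is why the before-and-after comparison alone would not be enough.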

We found statistically significant and large effects of the Promise on improving the behavior of all students, and on improving high school GPA for African-American students. The point estimates of effects on all students’ GPA were positive, but not statistically significantly different from zero. Some of the estimated effects are large. For example, in the 2007-2008 school year, the estimated Promise effect on the GPA of African-American students is an increase of 0.7 points, on a four-point scale.

I think several points from these findings might be relevant to early childhood education advocates.

First, this study, like many other studies, shows that there are definitely many interventions after early childhood that can make a difference in educational attainment and life prospects. I think it is both a political mistake and substantively wrong to argue that early childhood education inherently has a higher rate of return than later interventions. There are many later interventions with high rates of return, for example some high school tutoring and counseling programs, and demand-oriented adult job training programs. The argument for early childhood education is that it has a high benefit/cost ratio or a high rate of return, not that other interventions don’t also have a high rate of return.

Second, I do think it is true that many later interventions are more complicated to implement than early childhood education. In early childhood education, we are essentially adding learning time. The research evidence suggests that if this is implemented reasonably well, by a typical government agency, we get long-term benefits that significantly exceed costs.

For many later interventions, implementation is more complex, and more politically and substantively difficult. For example, improving teacher quality and school quality in K-12 education is a huge challenge.

The Kalamazoo Promise is an exception to this general pattern for later interventions. The program is simple: graduate from high school and get into a college, and you get a scholarship that pays the tuition. The form required to get the Promise is one page long.

Third, I think the Promise points out one of the virtues of universal programs: simplicity. The Promise would be much more complicated to implement if eligibility depended on family income, student performance in high school, etc. And a more complicated program would be more difficult to explain to parents and students, which makes it less likely to affect attitudes and behavior.

Targeted programs are more complicated to administer, and hence more costly. They are harder to explain to those who are eligible, which restricts participation. Targeting also imposes an implicit tax on earnings, which may discourage labor force participation.

Having said that, targeting is obviously justified if the benefits are demonstrably greater for the targeted group, and if we can organize the targeting so that it is as simple to administer as possible, and so that it minimizes the implicit tax on earnings.  But I think the administrative issues with targeting should not be under-emphasized. Any discussion of universality versus targeted programs in early childhood education needs to consider these practical implementation issues.

Posted in Distribution of benefits, Early childhood program design issues, Early childhood programs

Dealing with uncertainty in research on pre-K

Jason Richwine, in a recent blog post at “The Corner” blog of National Review, expressed surprise at my interpretation of the estimated effects in the Head Start randomized control trial.

I had pointed out that the impact estimates, while not statistically significantly different from zero, are also not statistically significantly different from predicting a 2 to 3% increase in adult earnings, which would probably be sufficient for Head Start to pass a benefit-cost test from earnings effects alone.

Richwine argues that the estimates and their confidence intervals also can’t rule out that Head Start has negative effects. He interprets my comments as arguing that the Head Start impact estimates are “large”. He concludes by arguing the following:

“Such analysis reverses the traditional burden of proof: Rather than showing that government preschool works, advocates now demand proof that it doesn’t work.”

These comments raise some interesting issues about how policymakers should make policy when given research that inevitably has some uncertainty about its estimates.

In making policy decisions, concepts such as the “burden of proof” are more confusing than helpful. The “burden of proof” is a legal concept used in court cases. In making policy, what we have are estimates with some uncertainty, and we have to decide what policy rules are likely over the long haul to maximize net social benefits.

If the only evidence on public pre-K were the Head Start experiment, policymakers would face a difficult policy decision with considerably uncertain evidence. The point estimates of test score effects at the end of 3rd grade suggest a little more than a 1% increase in adult earnings. This is a modest-sized effect, in my opinion, not a “large effect”, although what is “large” or “modest” is a highly subjective judgment, not a rigorous scientific judgment. But because adult earnings are so large over an entire career, even a 1% increase would sum to many thousands of dollars. The present value of this earnings gain would probably exceed $5,000. Head Start costs more than that, but then Head Start also clearly has benefits in the value of the child care services it provides to parents. So the point estimate implies a close call on net benefits.
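The present-value reasoning in the paragraph above can be sketched in a few lines. This is only an illustration of the mechanics. The baseline earnings, discount rate, years until the child starts working, and career length are all placeholder assumptions, not figures from the Head Start study or from my own calculations.

```python
def present_value_of_earnings_gain(pct_gain, annual_earnings, discount_rate,
                                   years_until_work, career_years):
    """Discount a constant percentage earnings gain back to the preschool year.

    pct_gain is a fraction (0.01 means 1%). Every input is a caller-supplied
    assumption; none of the defaults a user might pick come from the study.
    """
    gain_per_year = pct_gain * annual_earnings
    total = 0.0
    for t in range(years_until_work, years_until_work + career_years):
        total += gain_per_year / (1.0 + discount_rate) ** t
    return total


# Placeholder inputs, chosen only to show how the function is called.
pv = present_value_of_earnings_gain(
    pct_gain=0.01,          # a 1% earnings gain
    annual_earnings=40000,  # hypothetical baseline annual earnings
    discount_rate=0.03,     # hypothetical real discount rate
    years_until_work=16,    # hypothetical gap between preschool and first job
    career_years=40,        # hypothetical career length
)
print(round(pv))
```

The result is sensitive to the discount rate and to how far in the future the earnings begin, which is one reason a modest percentage effect can still translate into a present value in the thousands of dollars.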

Furthermore, there is significant uncertainty in these estimates.  The confidence interval includes zero and negative effects, as well as positive effects two or three times as large. How should policymakers deal with such uncertainty?

One approach is to take a skeptical attitude, and assume effects are zero until proven otherwise. But this skeptical approach would not be a particularly good policy rule to adopt if one were faced with many policy decisions over a long period of time. If a policymaker were simply trying to maximize the expected present value of net benefits over thousands of policy decisions, each with evidence from only one experiment, then the optimal decision rule would be to use each experiment’s point estimate to guide decisions, regardless of the confidence intervals. If we use the point estimates, which represent the mean expected impact of each intervention, then over time we will maximize net social benefits by following this rule.
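A stylized simulation can illustrate this argument. The sketch below assumes that the true net benefits of the candidate programs are drawn from a distribution centered on zero, and that each experiment gives an unbiased but noisy estimate. Both assumptions, along with every parameter value and function name, are hypothetical. Under a more pessimistic prior about programs, the best cutoff would shift away from zero.

```python
import random


def simulate_policy_rules(n_decisions=20000, true_sd=1.0, std_error=1.0, seed=0):
    """Compare total realized net benefits under two adoption rules.

    Rule 1: adopt whenever the point estimate of net benefits is positive.
    Rule 2: adopt only when the estimate is statistically significant
            (greater than 1.96 standard errors above zero).
    """
    rng = random.Random(seed)
    total_point_rule = 0.0
    total_significance_rule = 0.0
    for _ in range(n_decisions):
        true_net_benefit = rng.gauss(0.0, true_sd)            # unknown to the policymaker
        estimate = true_net_benefit + rng.gauss(0.0, std_error)  # one noisy experiment
        if estimate > 0:
            total_point_rule += true_net_benefit
        if estimate > 1.96 * std_error:
            total_significance_rule += true_net_benefit
    return total_point_rule, total_significance_rule


point_total, significance_total = simulate_policy_rules()
print(point_total, significance_total)
```

In this setup the significance rule rarely adopts a harmful program, but it also passes up many programs with positive true net benefits, so its total over many decisions is lower than the total from following the point estimates.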

In other words, the legal “burden of proof” principle is not a particularly good guide to making policy decisions over time. The legal rule that we should convict someone of a crime only if they are guilty “beyond a reasonable doubt” is ultimately based on the judgment that we find it socially abhorrent to deprive someone of their life or liberty based on any lesser standard. The huge social cost of convicting an innocent is not really relevant to deciding whether to spend a little more or less on some social or educational program. The costs of mistakenly expanding a social or educational program are not as great as the cost of locking someone up because the probability is 51% that they are guilty.

Another important point is that the Head Start experiment is NOT the only good evidence on the effects of pre-K. We have good evidence from two randomized experiments, Perry and Abecedarian, that pre-K can have large long-run effects. For example, long-run earnings effects are 19% in Perry. We also have good evidence from some natural experiments of long-run earnings effects, for example 8% in the Chicago Child-Parent Center study and 11% in Deming’s study of Head Start.  Finally, we have some good natural experiments, for example in Tulsa and Boston, that show short-run test score effects of pre-K that are larger than found in the Head Start experiment.

In social science, or for that matter natural science, how we interpret any new experiment is influenced by what we already know. If we have substantial reasons from prior research to believe that variable X affects outcome Y, then in considering new evidence, our prior belief is not that X has no effect on Y. In interpreting the new research, we would ask whether the estimated effects in the new research are consistent not only with a null hypothesis of zero effects, but also with a null hypothesis of the estimated effects implied by prior research. Both of these null hypotheses are interesting to explore.  If the new research shows lower effects of X on Y than implied by prior research, this should influence us going forward towards believing that X has lower effects.
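The idea of checking one estimate against two different null hypotheses can be sketched as follows. The function names are hypothetical, and the standard error is invented for illustration, since the discussion here does not report one. The check uses a normal-approximation 95% confidence interval.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Normal-approximation confidence interval (z=1.96 gives roughly 95%)."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


def consistent_with(estimate, std_error, null_value, z=1.96):
    """True if null_value lies inside the confidence interval, meaning the
    estimate is not statistically significantly different from it."""
    low, high = confidence_interval(estimate, std_error, z)
    return low <= null_value <= high


# Hypothetical numbers in percentage points of adult earnings.
estimate, std_error = 1.0, 1.1
print(consistent_with(estimate, std_error, null_value=0.0))  # null of no effect
print(consistent_with(estimate, std_error, null_value=2.5))  # null implied by prior research
```

When both checks return True, the experiment alone cannot distinguish between the two hypotheses, and prior research carries more of the weight in deciding which to believe.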

In the case of the Head Start experiment, the modest effects found should influence researchers towards believing that at least some pre-K programs have considerably smaller effects than found by Perry or the Chicago Child-Parent Center study or the Tulsa or Boston studies. It should also influence us towards wondering whether Head Start as of the 2002 experiment might have lower effects than it did in the past. And it might influence us towards desiring to reform Head Start to increase its effectiveness, in part by imitating the practices of pre-K programs that have larger estimated effects.  As Barnett has pointed out, there is some evidence that Head Start has increased its educational effectiveness since the time of the 2002 experiment.

The Head Start experiment by itself is not strong evidence in favor of public pre-K. But it is not the only evidence, and it is not necessarily inconsistent with this other evidence. On the whole, the weight of the evidence, as suggested by a number of reviews of the research, is that high-quality pre-K programs can make a significant difference in improving the opportunities of children.  The estimated benefits in the bulk of the research are sufficient to be significantly greater than program costs.

Posted in Early childhood programs

More on weighing the evidence on pre-K

Andrew Coulson of the Cato Institute has a blog post commenting on the debate between me and Russ Whitehurst over what evidence to believe about the effects of pre-K programs.

Coulson’s argument is that the only reliable evidence for ascertaining the effects of “large-scale” public pre-K programs is the randomized control trials for Head Start and Tennessee pre-K, and that these studies reach a consensus: “program effects fade out by the elementary school years…” In Coulson’s view, this is the “evidence that matters when discussing proposals for expanding government pre-K”.

What evidence doesn’t matter, in Coulson’s view? First, he doesn’t regard the Perry Preschool program and Abecedarian “randomized control trial” studies as mattering, because these were “tiny programs”, and it is “difficult to massively replicate any service without compromising its quality”.  Second, he doesn’t regard any non-experimental study as mattering.

The main thing that Coulson’s blog post overlooks is that not all randomized control trials provide evidence of equal quality, and that not all non-experimental studies are of equal quality.  Some non-experimental studies provide evidence on the true causal effects of public policies that is superior to that of some randomized control trial studies.

As mentioned in a previous blog post, the key problem that all pre-K studies are trying to address is “selection bias”. We are concerned that children and families participating in pre-K may differ from those who do not participate. We can control for observed characteristics of the children and families, but unobserved characteristics of children and families may differ between the treatment and comparison groups. These unobserved differences could be causing different outcomes, which would bias the estimated effects of pre-K by some amount that is potentially large and of unknown sign.

In theory, a perfectly run randomized control trial addresses this selection bias issue. Because participation is determined randomly, the treatment and comparison groups would be expected, on average, to be similar in unobserved characteristics.

But this advantage only holds fully for the full original sample. If there is “large enough” attrition in the randomly chosen groups, the final sample of treatment and comparison households could easily differ greatly in unobserved characteristics.  The final observed sample in that case would no longer really be randomly chosen.

In the case of pre-K, as I pointed out in a previous blog post, the Tennessee pre-K study has some serious problems with large and differential attrition. Therefore, I don’t think it at all appropriate to cite it as meeting some “gold standard” of providing evidence that is more reliable than any study that makes a serious effort to control for observable characteristics. The evidence from the Tennessee pre-K study is suggestive but not definitive, as is true for all studies that can only control for observable characteristics.

On the other hand, non-experimental studies can persuasively deal with selection bias if these studies rely on a “natural experiment”, in which access to the pre-K program varies due to geography or age or some other factor that is plausibly unrelated to unobserved child and family characteristics. Not all “non-experimental studies” are created equal. Some non-experimental studies can only control for observable characteristics. Other non-experimental studies have variations in access that, while not randomly assigned, may be almost as good as random for identifying the causal effects of the pre-K program. One such study may be an outlier, but having multiple such studies increases confidence that we are identifying true causal effects of pre-K on outcomes.

We have many “natural experiments” that show large effects of pre-K. These include: Ladd et al.’s study of North Carolina pre-K; Ludwig et al., Currie et al., and Deming’s studies of Head Start; the various studies of the Chicago Child-Parent Center program; the various “regression discontinuity” studies of state and local pre-K programs. This evidence matters.

The Head Start and Tennessee evidence also matters. The Tennessee evidence is only suggestive, but is not particularly supportive of the effectiveness of that state’s program, which may have too low funding per child to be effective. The Head Start RCT evidence also suggests that Head Start during that period may not have been particularly effective relative to the other pre-K alternatives available to the control group, which included state pre-K programs. Changes over time in Head Start quality and Head Start alternatives are the most straightforward way to reconcile the fading effects in the Head Start RCT with the previous natural experiments that suggest that Head Start has long-term effects. Recent Head Start reforms may have improved its quality relative to Head Start’s quality at the time of the RCT. Head Start reforms to improve quality should continue.

Two other points. First, I would challenge the argument that Perry and Abecedarian don’t matter. These are programs with characteristics and delivery models that are well-understood. Today’s Educare program is quite similar to the Abecedarian program. Perry is essentially a smaller class size two-year version of today’s state and local pre-K programs. I think the Perry and Abecedarian evidence, in conjunction with the natural experiments, suggests that high-quality pre-K can make a difference.

Second, fading test-score effects are found in a variety of pre-K programs, including Perry, Abecedarian, and the Chicago CPC program. Such fading test-score effects appear to be consistent with large long-term effects on adult outcomes.  One theory to account for this fading and re-emergence is the importance of so-called “soft skills”, such as social skills and character skills. Such soft skills are important in determining educational attainment and adult earnings.

A broad view of the research evidence on pre-K suggests that a variety of programs have strong effects, although this should not be interpreted as meaning that every program always works. There is a lot of variation in program quality and program effects over space and time. For other recent research reviews that also take a broader view of the research evidence, and conclude that pre-K can work, see the review by Yoshikawa et al., and the review by Kay and Pennucci (Report 14-01-2201) for the Washington State Institute for Public Policy.

Posted in Early childhood programs

Weighing the preschool research evidence

Professor Bruce Fuller had an op-ed on preschool in the Washington Post on February 9. Professor Fuller’s interpretations of preschool research omit some important research.

Specifically, Professor Fuller argues that “youngsters from middle-class and well-off homes benefit little from preschool”.  He goes on to say that “young children attending quality half-day programs display the same learning gains as those attending full-day programs”.  Therefore, “we must avoid squandering scarce dollars on full-day programs for children who gain little from preschool”.

Professor Fuller cites some studies that support his arguments. But he fails to mention other studies that go against his arguments.

For example, Professor Fuller does not mention the research studies in Tulsa and Boston that find that universal preschool produces benefits for middle-class children that are only slightly less than the benefits for low-income children. Professor Fuller also does not mention a research study from New Jersey that finds significantly greater benefits from full-day preschool compared to half-day preschool.

An obvious and important question is: which studies should you believe? Should we believe the studies that Professor Fuller cites, or the studies that I cite? Or should we just say that the evidence is mixed and uncertain, which can be interpreted as an argument for inaction until more research is done?

The key problem in any preschool research is what social scientists call “selection bias”. The families that choose preschool differ from those who do not choose preschool, due to both family characteristics that we can observe, and family characteristics that we can’t observe. In addition, programs may choose to select preschool participants due to both observed and unobserved family characteristics.

For example, perhaps families that are more ambitious choose preschool. Or perhaps some preschool programs try to choose children who are easier to manage. Either source of selection would mean that preschool participants tend to do better than non-participants because of pre-existing family and child characteristics, above and beyond the true effect of the preschool program. Selection bias in estimating program effects would be positive.

Alternatively, perhaps families that are having more trouble with their children tend to try to put their children in preschool. Or perhaps preschool programs with a social mission try to choose needier children. These sources of selection will tend to produce a negative selection bias in estimating the true effects of preschool.
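The two directions of selection bias described above can be illustrated with a small simulation. This is a sketch under invented assumptions. A single unobserved trait raises outcomes, and it either raises or lowers the chance of enrolling. The function name and all parameter values are hypothetical.

```python
import random


def naive_estimate(true_effect, selection_direction, n=20000, seed=0):
    """Simple difference in mean outcomes, participants minus non-participants.

    selection_direction = +1: children with a higher unobserved trait are more
                              likely to enroll (positive selection).
    selection_direction = -1: children with a lower unobserved trait are more
                              likely to enroll (negative selection).
    selection_direction =  0: enrollment is unrelated to the trait.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        unobserved = rng.gauss(0.0, 1.0)  # trait the researcher cannot measure
        participates = selection_direction * unobserved + rng.gauss(0.0, 1.0) > 0
        outcome = unobserved + rng.gauss(0.0, 1.0)
        if participates:
            treated.append(outcome + true_effect)
        else:
            untreated.append(outcome)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


true_effect = 0.5
print(naive_estimate(true_effect, +1))  # overstates the true effect
print(naive_estimate(true_effect, -1))  # understates the true effect
print(naive_estimate(true_effect, 0))   # close to the true effect
```

Because the researcher does not know which direction of selection dominates in a given program, the naive comparison can be off in either direction.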

How can this selection bias be dealt with? If there are infinite resources and time, the ideal method is a large and perfectly-run randomized control trial. Preschool applicants would be randomly divided into a treatment and control group. As a result, we would expect average observed and unobserved characteristics in both the treatment and control group to be similar, and as the sample size gets larger, that expectation is increasingly likely to be realized.

But randomized trials are expensive and difficult to run, particularly on a large scale. Therefore, an alternative is to rely on natural experiments, in which some aspect of the world has resulted in different children having differing access to preschool, for reasons that have nothing to do with unobserved characteristics of the child and his or her family. The treatment and comparison groups, with different access to preschool, will differ in preschool participation, but not in observed or unobserved characteristics, and therefore we can interpret the outcome differences as being due to preschool, not pre-existing differences between the two groups.

A third method of trying to control for selection bias is to control for observed characteristics of the child and family.  Such controls help, but by their very nature cannot control for unobserved pre-existing differences between the treatment and comparison groups. Hence, such estimates may be subject to selection biases of unknown size and sign.

The Tulsa and Boston evidence that I am citing on middle-class benefits is based on natural experiments. Access to preschool and to kindergarten is based on an age cutoff.  The essence of the methodology used in these two studies is to compare the test scores of children who just missed the kindergarten age cut-off and are therefore just entering preschool, with test scores of similar children who just made the kindergarten age cut-off, who are just entering kindergarten, and who participated in preschool the preceding year.  These two groups are arguably similar in unobserved as well as observed characteristics because they were similarly selected into the same preschool program. The timing of their preschool access was based on age, and a few days of age in either direction should not make a big direct difference in test scores. The “jump” in test scores that is observed for the slightly older group in such studies is therefore reasonably attributable to the preschool participation the preceding year.
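The age-cutoff comparison described above is a regression discontinuity design. The sketch below simulates it under invented assumptions: a smooth age trend in test scores plus a jump at the cutoff for children who attended preschool the preceding year. The trend is removed by fitting a separate line on each side of the cutoff. The data, parameter values, and function names are hypothetical, and the published Tulsa and Boston studies use more careful specifications.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_jump(days_from_cutoff, scores):
    """Difference between the two fitted lines evaluated at the cutoff (x = 0)."""
    below = [(x, y) for x, y in zip(days_from_cutoff, scores) if x < 0]
    above = [(x, y) for x, y in zip(days_from_cutoff, scores) if x >= 0]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_above - intercept_below


def simulate_scores(true_jump, n=4000, seed=0):
    """Hypothetical data: x >= 0 means the child made the kindergarten cutoff
    and therefore attended preschool the preceding year."""
    rng = random.Random(seed)
    days, scores = [], []
    for _ in range(n):
        x = rng.uniform(-180, 180)               # age in days relative to the cutoff
        jump = true_jump if x >= 0 else 0.0      # effect of last year's preschool
        days.append(x)
        scores.append(50.0 + 0.02 * x + jump + rng.gauss(0.0, 5.0))
    return days, scores


days, scores = simulate_scores(true_jump=3.0)
print(estimate_jump(days, scores))
```

Fitting the age trend separately on each side matters because older children would score somewhat higher even without preschool, and only the discontinuity at the cutoff is attributed to the program.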

The New Jersey evidence I am citing on full-day versus half-day preschool is based on a randomized control trial. Excess applicants for a full-day preschool opportunity were randomly assigned to either receive full-day preschool, or only receive half-day preschool. The results showed significantly greater test score effects of full-day preschool. In Bartik (2011), I used these estimates to predict that full-day preschool produces 56% greater earnings benefits than half-day preschool.  Therefore, there are some diminishing returns to preschool time (benefits are not doubled), but there are benefits to full-day preschool over half-day preschool.

Most of the evidence that Professor Fuller cites is from the third category of studies, which only can control for observable child and family characteristics. These studies may be biased upwards or downwards by selection bias. Therefore, I would not weigh these studies as heavily.

In my view, the research studies that should receive the greatest weight use randomized or natural experiments to examine the causal effects of preschool, which avoids problems due to selection bias. The research studies that use such evidence support middle-class benefits of preschool, and support greater benefits for full-day programs.

Posted in Distribution of benefits, Early childhood program design issues, Early childhood programs

What the available evidence shows about middle-class benefits of early childhood education

At the recent Education Writers Association conference on early childhood education, Russ Whitehurst of the Brookings Institution cited Tulsa and Boston studies as evidence that the benefits of early childhood education are much greater for low-income children than for middle-class children.

This is incorrect. The Tulsa and Boston studies actually provide evidence that the benefits of early childhood education are only modestly less for middle-class children than for lower-income children. In Tulsa, the research, in a paper on which I was a co-author, suggests that the test score boost from full-day pre-K for middle class children is about 88% of the boost for lower income children. In Boston, Weiland and Yoshikawa’s research suggests test score benefits of Boston’s full-day pre-K program for middle-class children are 71% of the benefits for lower-income children.

These test score benefits for middle-class children are sufficient to predict adult earnings gains that are more than twice program costs. The Tulsa study calculates that the ratio of the present value of future adult earnings benefits to program costs for full-day pre-K is 2.82 for middle-class children, which is only modestly less than the 3.09 ratio for children eligible for a free lunch. For Boston, my analysis of Weiland and Yoshikawa’s findings suggests that the ratio of the present value of future adult earnings benefits to costs for Boston’s full-day pre-K program is 2.30 for middle class children, versus 3.22 for children eligible for a subsidized lunch.
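As a quick arithmetic check on how close the two groups are, the benefit-cost ratios quoted above can be compared directly. The ratios are the ones stated in the text. The function names are hypothetical helpers introduced for illustration.

```python
# Ratios quoted in the text: present value of adult earnings gains / program cost.
RATIOS = {
    "Tulsa": {"middle_class": 2.82, "lower_income": 3.09},
    "Boston": {"middle_class": 2.30, "lower_income": 3.22},
}


def relative_ratio(city):
    """Middle-class benefit-cost ratio as a share of the lower-income ratio."""
    r = RATIOS[city]
    return r["middle_class"] / r["lower_income"]


def net_gain_per_dollar(ratio):
    """Earnings benefits in excess of cost, per dollar spent."""
    return ratio - 1.0


for city in RATIOS:
    print(city, round(relative_ratio(city), 2))
```

The middle-class ratio works out to about 0.91 of the lower-income ratio in Tulsa and about 0.71 in Boston, and in both cities every ratio is well above the break-even value of 1.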

A key point in both findings is that the ratio of predicted future adult earnings benefits for middle class children to program costs is much greater than one. Providing free, high-quality pre-K to middle class children can be rationalized because economic benefits exceed costs. Universal pre-K may also win middle-class votes and support, but universal pre-K can be rationalized on its economic merits rather than just on political expediency.

I know of no other evidence that allows a direct comparison of the relative benefits of pre-K for middle-class and lower-income children. There is one study of pre-K for middle-class children in Utah that shows some benefits.

There might be various reasons why the social benefits of pre-K for lower-income children are much greater than for middle class children, even if the dollar earnings benefits are similar. Lower-income children would be predicted to have baseline adult earnings that are lower, so a similar dollar benefit will be a larger percentage boost to adult earnings.  In Tulsa, our study predicts that the percentage boost to adult earnings for children eligible for a free lunch is over 10%, whereas the percentage boost for middle-class children is between 5 and 6%. We might judge that providing extra dollars to lower income children is more valuable because it has a more dramatic impact on their future well-being.  In addition, it is a plausible hypothesis that pre-K may have greater benefits in reducing crime and welfare usage for children from lower-income families than for middle-class children, although I know of no empirical evidence for or against such greater relative benefits.

For child care programs, Duncan and Sojourner’s study of the Infant Health and Development program suggests that this program only boosts test scores for lower-income children. For parenting programs, studies of the Nurse Family Partnership suggest that NFP only works for lower-income families, not middle-class families.  Pre-K may be different from child care and parenting programs because pre-K may provide social and cognitive learning in a group setting that is hard for many middle class families to duplicate on their own.

The evidence is sparse on the absolute and relative benefits of early childhood education for middle-class children. This evidence is always likely to be sparse because there is not great interest from government or the philanthropic community in sponsoring extensive research on how early childhood education affects the middle class.  But the available evidence provides some economic support for universality in pre-K programs, while the pattern of benefits for children would argue for targeting child care and parenting programs on lower-income families.  Considering how programs benefit parents might alter these calculations for relative benefits and costs for different income groups, and is an important topic for future research.

Posted in Distribution of benefits

The research consensus on early childhood education

On February 3, 2014, I spoke at a conference on early childhood education sponsored by the Education Writers Association. Later, the conference heard from many other speakers, including Russ Whitehurst of the Brookings Institution.

Whitehurst expressed uncertainty about whether early childhood education has lasting impacts. Journalists listening to Whitehurst might conclude that there is no research consensus on the impacts of early childhood education.

In my opinion, there is a research consensus that high-quality early childhood education can have lasting impacts. Whitehurst’s perspective represents a minority of researchers who dissent from what the bulk of the research shows.

Whitehurst’s position is based on emphasizing two studies, while downplaying all other research studies. He emphasized the Head Start randomized control trial and the Tennessee pre-K randomized control trial. These studies found immediate impacts of pre-K, which faded over time.

However, most good studies do find lasting impacts of early childhood education. These studies include the two best-publicized randomized control trials of early childhood education, the Perry Preschool study and the Abecedarian study.

Whitehurst’s presentation implied that Perry and Abecedarian are less relevant because they were done a long time ago. But early childhood studies that look at adult impacts up to 36 years later necessarily must have been started a long time ago.

Whitehurst also implied that the two studies are less relevant because they provided services that differ greatly from what early childhood education programs are today. But the Abecedarian program is quite similar in services offered to the present-day Educare program.

And Perry Preschool does not differ in kind from many pre-K programs today. It had lower class sizes than most of today’s programs, and provided services for ages 3 and 4 rather than just age 4.  But it is similar to many of today’s programs, for example programs in Tulsa, Boston, and the Chicago Child-Parent Center in using certified teachers paid public school wages. And Perry was only a half-day program whereas Boston’s program is full-day and Tulsa’s program includes many full-day centers.

Perry’s program had an estimated 19% effect in increasing adult earnings. The larger class size and one-year nature of many of today’s pre-K programs might somewhat decrease adult earnings impacts, while a full-day program might increase adult earnings impacts. Based on what we know about how class size, full-day vs. half-day, and two years versus one year affect impacts, we might think that many of today’s pre-K programs might have somewhat lower adult earnings impacts than 19%. But even an impact of 6% or more (which is what we tend to find in the studies of pre-K reviewed below) would have a very high ratio of benefits to costs.

More importantly, there are many other studies than Perry and Abecedarian that show that early childhood education can have lasting impacts.  These include the Infant Health and Development Program, also a randomized control trial, as shown in a recently published study by Duncan and Sojourner.

These other studies also include many that are not randomized control trials, but that do have very good comparison groups. These studies are “natural experiments”, in which whether or not a child participates in early childhood education is determined by accidents of geography or age or other circumstances that are likely unrelated to unobserved characteristics of the child or family. These natural experiments provide good evidence because the lasting differences between the treatment group and the comparison group are most plausibly attributed to the program’s true causal effects, as there is no good reason to think that there are significant unobserved pre-existing differences between the two groups. Natural experiments that show lasting impacts of early childhood education include the many Chicago Child-Parent Center studies, studies of Head Start by Deming, Currie et al., and Ludwig et al., and a study of North Carolina’s early childhood programs by Ladd et al.

Many of the research studies that find lasting impacts of early childhood education also find that cognitive test score impacts fade over time.  Test score fading is found in Perry Preschool, the Abecedarian program, the Chicago Child-Parent Center program, and in Deming’s study of Head Start. Despite test score impacts that fade during the K-12 years, all these studies find large impacts of early childhood education on adult outcomes, with these impacts being “large” in the sense that they either directly show large percentage adult earnings impacts, or have educational attainment impacts that would predict large percentage adult earnings impacts.

Therefore, contrary to the impression left by Whitehurst’s presentation, there is a significant research basis for believing that even if there is fading in the Head Start randomized control trial and the Tennessee pre-K randomized control trial, there may well be later large effects on adult outcomes.  The most plausible theory for test score fading but long-term adult benefits is that early childhood education leads to lasting impacts on “soft skills” (social skills, character skills). These lasting soft-skill effects are extremely important in determining adult outcomes in higher educational attainment and higher employment rates and wage rates.

In the case of Head Start, the 3rd grade cognitive test score impacts, while mostly statistically insignificantly different from zero, are large enough that they would predict over a 1% impact on adult earnings, which would be a lot of money over a person’s entire working career. If these are faded impacts that significantly underpredict adult earnings impacts, the true adult earnings impact could be much greater. In addition, the confidence interval on these 3rd grade test score impacts is large enough that while it cannot rule out zero impacts, it also cannot rule out test score impacts that would predict a 2 to 3% increase in adult earnings. Therefore, the Head Start impacts are consistent with both zero test score impacts at third grade, and with test score impacts that would be large enough to be relevant for policy purposes.

The research literature also suggests that the early post-program impacts of early childhood education on test scores are better predictors of long-term impacts on adult earnings than are later, faded test score impacts. This finding occurs in the Chicago CPC study, the Abecedarian study, the Perry study, and Deming’s Head Start study. This finding also occurs in Chetty et al.’s study of the adult earnings impacts of higher kindergarten quality.

Therefore, the many recent “regression discontinuity” studies of state and local pre-K that show large effects on kindergarten entrance test scores add some additional support to the notion that pre-K has large impacts. (These studies include studies by the National Institute for Early Education Research in seven states, and studies of Tulsa, Boston, and Kalamazoo.) These studies do not directly show lasting impacts of pre-K. But they show much larger immediate impacts of many state and local pre-K programs than are found in the Head Start randomized control trial or the Tennessee randomized control trial. Based on the studies that show that early test score impacts predict long-term adult earnings effects, there is good reason to think that these state and local pre-K programs will significantly increase adult earnings.

We should also recognize that even if randomized control trials are the “gold standard” for research evidence, natural experiments meet a good “silver standard” that should also be considered in deciding on the research consensus. No one study is perfect in methodology, and therefore we should consider what the bulk of studies show. Furthermore, most studies look just at one program, and therefore if we want to know whether early childhood education in general tends to work, we also need to look at the bulk of studies, rather than just one study of one program.

Like most studies, the Head Start randomized control trial and the Tennessee randomized control trial have some methodological limitations. In the Head Start experiment, an unusually large share of the control group (about half) participated in some other pre-K program. The experiment is therefore better interpreted as a study of whether Head Start as of 2003-04 was on average better in its test score impacts than the average quality of other pre-K alternatives, including state pre-K programs.

The Tennessee study had large and differential attrition from the treatment group and the control group. For example, in the first cohort, the analyses were based on data from only 46% of the treatment group and 32% of the control group. Although the original treatment and control groups might be similar in unobserved characteristics, it is quite possible that the much smaller groups on which the research was largely based differ substantially between treatment and control, which raises questions about whether the study meets the gold standard. The researchers tried to control for observable differences between the two groups, but it is impossible to statistically control for unobservable differences, which are what a randomized control trial is designed to balance, and what a natural experiment is argued to balance. There are some signs that the full sample showed larger impacts on kindergarten retention than the smaller sample after attrition, which suggests that fears of possible biases may be warranted.

In addition, neither Head Start as of 2003-04 nor the Tennessee pre-K program necessarily represents the impact of the highest quality pre-K programs. As mentioned, many regression discontinuity studies of state and local pre-K programs show higher immediate impacts than Head Start (see also Wong et al.). Tennessee’s pre-K program is judged by the National Institute for Early Education Research to spend $2,000 per child less than is thought to be a reasonable amount per child to facilitate quality pre-K services.

Overall, the bulk of the research evidence from many studies has convinced most researchers that high-quality early childhood education can have large and lasting impacts.  The Head Start randomized control trial and the Tennessee randomized control trial do not provide additional evidence that supports the effectiveness of early childhood education. But contrary to Whitehurst and a minority of researchers, these two studies are insufficient to overturn the bulk of the research evidence from multiple studies of a wide variety of programs.


Why early childhood education can significantly reduce income inequality

President Obama’s State of the Union address Tuesday night is rumored to talk about a variety of measures to reduce income inequality (perhaps reframed as building “ladders of opportunity” for the poor and middle-class), including early childhood education. I thought it useful to review again why early childhood education can be of particular help in boosting the economic prospects of lower income groups, thereby reducing economic inequality.

First, even if pre-K is universally available, the evidence suggests that high-quality pre-K provides a similar dollar boost to future earnings for children from all income classes. But because children from lower-income families tend to have lower baseline future economic prospects, the percentage boost to earnings from universal pre-K is much greater for children from lower-income households.
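The arithmetic behind this point can be sketched in a few lines of Python. The dollar figures below are hypothetical, chosen only to illustrate the logic, and do not come from any of the studies discussed here: the same dollar boost is simply a larger share of a smaller baseline.

```python
def percent_boost(baseline_earnings, dollar_boost):
    """Return the earnings boost as a percentage of baseline earnings."""
    return 100.0 * dollar_boost / baseline_earnings

# Hypothetical figures for illustration only (not estimates from any study).
dollar_boost = 2_000  # assumed identical annual dollar gain for every child
baselines = {"lower-income": 20_000, "middle-income": 50_000}

boosts = {group: percent_boost(base, dollar_boost)
          for group, base in baselines.items()}

for group, pct in boosts.items():
    print(f"{group}: {pct:.1f}% boost to earnings")
```

Under these assumed numbers, the identical dollar gain is a 10% boost for the lower-income group but only a 4% boost for the middle-income group.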

Second, the evidence suggests that high-quality child care programs and high-quality parenting programs are much more effective in boosting future incomes for children from lower-income families than for children from other income groups. As a result, it makes sense for child care and parenting programs to be targeted at lower-income groups, as the benefit-cost ratio for these programs will be far greater for these groups.

How much good can early childhood education do to boost income prospects for children from low-income families? Full-day pre-K for one school year at age 4 can boost long-run earnings by 10%.  A more expensive full-time child care and pre-K program from birth to age 5 can boost future earnings for children from lower-income families by 26%. High-quality parenting programs such as the Nurse Family Partnership could boost earnings another 3%.  Therefore, the earnings boost from a comprehensive package of early childhood education programs could be as great as 29%.
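As a quick check on the arithmetic, the 29% figure is the simple sum of the 26% and 3% estimates above. The sketch below reproduces that sum and, purely as an illustrative assumption not made in the text, shows what the total would be if the two effects compounded instead of adding.

```python
# Percentage boosts to adult earnings quoted in the paragraph above.
child_care_and_prek_pct = 26  # birth-to-age-5 child care plus pre-K
parenting_pct = 3             # high-quality parenting program

# Simple sum, which is how the 29% figure in the text is obtained.
additive_pct = child_care_and_prek_pct + parenting_pct

# Alternative, assumed only for comparison: the two effects compound.
compounded_pct = ((1 + child_care_and_prek_pct / 100)
                  * (1 + parenting_pct / 100) - 1) * 100

print(f"additive:   {additive_pct}%")
print(f"compounded: {compounded_pct:.1f}%")
```

The two approaches differ by less than one percentage point, so the conclusion does not hinge on which is used.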

I think most people would regard a 29% boost to earnings as a large earnings boost.  Empirically, such an earnings boost would be roughly sufficient to offset the amount that income growth for the lowest income quintile in the U.S. has lagged behind average income growth since 1979.

Why does early childhood education tend to reduce income inequality? Parenting and child care programs provide services that many lower-income families are unable to provide adequately on their own, unlike higher-income families. For pre-K programs, all children benefit from a service that helps promote cognitive skills as well as social skills and character skills in a group setting. But it’s inherently harder to provide the same percentage boost to middle-class earnings from a service delivered with uniform quality to all children, as the baseline earnings for middle-class groups are so much higher.  As writers such as Lane Kenworthy have emphasized, universal public services of high quality for all are inherently redistributive even if they benefit all income groups.

Early childhood education by itself does not solve all the problems of income inequality and limited upward mobility. The broader K-12 system and higher education and training system need reform to be more effective and provide broader access to all. We need to figure out how to revitalize the American system of job creation to make full employment a reality. A variety of measures from higher minimum wages to expanded wage subsidies are needed to increase take-home wages. But greater access to higher-quality early childhood education can help provide all children with the skills they need to take advantage of the opportunities provided by a better educational and economic system.
