What would it cost to transform “The Hell of American Day Care”?

Jonathan Cohn, a senior editor for The New Republic, wrote an outstanding article there a few weeks ago, entitled “The Hell of American Day Care: An investigation into the barely regulated, unsafe business of looking after our children”. The entire article is essential reading, even though some of it is troublesome reading. Cohn frames the article around the deaths of some children due to a fire in a Texas day care center, which was caused by the neglect of the day care center’s director.  But he then goes on to raise broader policy issues about what we as a society are failing to do to provide better child development for our nation’s children.

I want to focus in this blog post on one issue Cohn raises in a follow-up interview with Dylan Matthews of the Washington Post: what would it cost to make quality child care/preschool available to all families who need this assistance from birth to age 5? Cohn says in that interview that “I talked to some experts about what a true universal child-care program would cost. Nobody felt comfortable giving me a solid estimate.” I want to provide at least one estimate, and comment on its implications.

To summarize my conclusions: Providing access to quality child care for all children from birth to age 5 would probably cost around $100 billion annually in additional government funding. This amount is affordable, and would still leave U.S. child care and preschool funding somewhat below that of other leading countries, but it would be politically difficult. This increases the importance of research on how we might hold down costs through changes in child care and preschool design that could provide quality services at lower cost. It also increases the importance of policies that would raise the economic position of lower-income families and thereby reduce the need for subsidies. Finally, I think the political cost of comprehensive birth-to-5 services for kids increases interest in an incremental approach that would focus on age 4 preschool, which probably has the greatest ratio of benefits to costs.

The U.S. has around 19.7 million children under the age of 5. Of those children, around 25% are in families below the poverty line, 22% are in families with incomes between 100% and 200% of the poverty line, and 16% are in families with incomes between 200% and 300% of the poverty line.  Let’s assume that this 63% of families with incomes below 300% of the poverty line are the families for whom we are most concerned to make sure that quality child care and preschool is available and affordable.

The Abecedarian program was a high-quality full-time child care and preschool program from birth until age 5. The program has good random assignment research that suggests that the program increases long-run earnings of former child participants by an average of 10%. The program also provides considerable earnings benefits for parents, both in the short-run and the long-run, by allowing parents to accumulate more work experience and education. The Abecedarian program cost around $16,000 per child per year. The Educare  program that is supported by the Ounce of Prevention Fund and the Buffett Early Childhood Fund is very similar in design to the Abecedarian program.

Suppose we wanted to set up an Abecedarian program that would be available to all families on a sliding fee scale.  To pick somewhat arbitrary fees, assume that the government subsidy would be 90% of costs for families below the poverty line, 80% of costs for families between 100% and 200% of the poverty line, and 60% of costs for families between 200% and 300% of the poverty line.  Beyond 300% of the poverty line, families would have to pay full costs. Assume further that about 75% of all families receiving subsidies would participate in the program.

Under these assumptions, the government subsidy costs for this program would be about $118 billion per year. Of this total, $52 billion would go to families below the poverty line. There would be some cost savings offsets. With this new program, we would no longer need to spend money on Head Start, the Child Care Development Block Grant, or almost all the funds for state pre-K programs. This would save around $20 billion in annual costs. So the net costs of this program would be around $98 billion per year.
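The arithmetic behind this estimate can be sketched in a few lines of Python. This is a rough reconstruction that uses only the rounded figures quoted above, so it lands close to, but not exactly on, the $118 billion and $52 billion figures; the variable and function names are illustrative, and the assumption that the 75% participation rate applies equally to every income band is mine.

```python
# Back-of-the-envelope reconstruction of the subsidy estimate.
# All inputs are the rounded figures quoted in the post.
CHILDREN_UNDER_5 = 19.7e6   # U.S. children under age 5
COST_PER_CHILD = 16_000     # Abecedarian cost per child per year, in dollars
TAKE_UP = 0.75              # assumed participation rate (applied to every band)
OFFSETS = 20e9              # savings on Head Start, CCDBG, and state pre-K

# income band -> (share of all children under 5, government subsidy rate)
BANDS = {
    "below poverty line": (0.25, 0.90),
    "100-200% of poverty line": (0.22, 0.80),
    "200-300% of poverty line": (0.16, 0.60),
}


def subsidy_by_band():
    """Annual government subsidy cost, in dollars, for each income band."""
    return {
        band: CHILDREN_UNDER_5 * share * TAKE_UP * COST_PER_CHILD * rate
        for band, (share, rate) in BANDS.items()
    }


by_band = subsidy_by_band()
gross = sum(by_band.values())
net = gross - OFFSETS

for band, dollars in by_band.items():
    print(f"{band}: ${dollars / 1e9:.1f} billion")
print(f"gross: ${gross / 1e9:.1f} billion, net: ${net / 1e9:.1f} billion")
```

Changing the subsidy rates or the participation rate in this sketch shows how sensitive the total is to those somewhat arbitrary assumptions.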

$98 billion is of course a lot of money. It is over $300 per capita averaged over the entire U.S. population, so funding it would require somehow collecting revenues of $1200 per year for the “average” family of four. I think that’s a very hard sell.

On the other hand, $98 billion represents between 2 and 3 percent of total federal, state, and local government receipts per year. $98 billion is around 8% of total state and local taxes. This amount of spending is equal to about 17% of total K-12 spending.  Therefore, what we’re talking about is a major expansion of about one-sixth in our spending on child education, which would require about an 8% tax increase if financed at the state and local level, but less than a 3% expansion of government revenues if we also got the federal government involved.  So if the financing is broader than just individual income taxes for the median household, then the proposal seems more affordable.

Based on OECD figures, such a proposal, if enacted, would increase U.S. spending on child care and education programs for preschool-age children from 0.4% of our Gross Domestic Product to about 1% of U.S. Gross Domestic Product, or an increase in our commitment of about two-and-a-half times. This would still leave the U.S. somewhat behind other leading countries in the percent of the national economy devoted to preschool-age programs.  For example, France, Finland, the United Kingdom, Denmark, Sweden, and Norway all spend 1.1% or more of their Gross Domestic Product on preschool and child care programs for preschool-age children. Therefore, it is not unusual for a leading industrial country to provide government subsidies of 1% or more of GDP in programs for preschool-age children.

Given the political difficulties in the U.S. of finding an extra $100 billion for child care and preschool, it seems wise to consider some alternatives that might make progress more politically attainable.  One alternative is doing research to see whether we can still increase quality preschool access, but at somewhat lower costs. We could use some good experiments that looked at different class-size ratios and different teacher training and teacher credential requirements at different ages. For example, the Educare model has 3 adults for every 8 infants and toddlers, and 3 adults for every 17 preschoolers – what are the cost/quality tradeoffs from tweaking those ratios? Right now, we don’t have enough evidence on this topic.

It would also be helpful if policies could lower the percentage of families that need large subsidies for child care and preschool. It is certainly disturbing that one-quarter of all American preschoolers are below the poverty line, almost half are below 200% of the poverty line, and over 60% are below 300% of the poverty line.  Job creation programs and training programs that would raise employment rates, and minimum wage and expanded wage subsidies that would raise earnings, would help. Government encouragement of paid family leave would reduce the needs for infant care, which is very expensive.

The enormous cost of providing full access to Abecedarian/Educare subsidies for needy preschool children is one reason for the interest in age-4 preschool as a political strategy. Universal half-day pre-K for 4-year olds might cost about $14 billion in additional funding per year.  This is about one-seventh the cost of full implementation of an Abecedarian style model. An extra $14 billion in funding is obviously more politically feasible than getting an extra $98 billion in funding, and in addition, universal pre-K would provide direct services to all children.

I think gross benefits and net benefits would be higher from full implementation of Abecedarian/Educare than from universal pre-K for four-year-olds. Depending upon what assumptions are made, the gross increase in earnings from full implementation of Abecedarian/Educare is probably two to four times the increase in earnings from universal pre-K for four-year-olds. But the benefits are not seven times as great, while the costs are. So the benefit-cost ratio is somewhat higher for the four-year-old pre-K approach.
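The comparison of benefit-cost ratios can be made concrete with a small sketch. It considers only the earnings benefits and the additional-funding figures quoted in this post ($98 billion versus $14 billion), and the function name is illustrative.

```python
# Relative benefit-cost ratio of full Abecedarian/Educare versus
# universal pre-K for four-year-olds, using the post's figures.
FULL_PROGRAM_COST = 98.0   # $ billions per year, net additional funding
PRE_K_COST = 14.0          # $ billions per year, additional funding

COST_RATIO = FULL_PROGRAM_COST / PRE_K_COST


def relative_benefit_cost(benefit_ratio, cost_ratio=COST_RATIO):
    """Full program's benefit-cost ratio as a multiple of pre-K's.

    benefit_ratio: full-program earnings benefits divided by pre-K
    earnings benefits (the post suggests a range of two to four).
    """
    return benefit_ratio / cost_ratio


for benefit_ratio in (2, 3, 4):
    rel = relative_benefit_cost(benefit_ratio)
    print(f"benefits {benefit_ratio}x as large -> "
          f"benefit-cost ratio {rel:.2f} times that of pre-K")
```

Any value below 1 means the four-year-old pre-K approach has the higher benefit-cost ratio, even though the full program has the larger total benefits.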

A hybrid political approach is to begin with expanding 4-year old preschool, but at the same time seek to increase quality standards and access for child care and preschool for younger children. This should be coupled with research that would enable us to say more about quality and cost tradeoffs, and policies that would boost living standards for lower-income families with young children.  A package of policies may be needed to move us away from “hell”, and towards better child development for all American children.

Posted in Early childhood program design issues, Early childhood programs

My preschool and economic development presentation is TED’s “talk of the day”

On May 6, 2013, TED posted a fifteen minute presentation by me on pre-K and economic development as its “Talk of the Day”. TED slightly re-edited this from a TEDx talk I gave at Miami University last year.

In just a few hours, this TED posting led to over 10,000 views. The comments at TED after the video are interesting. I’ve posted a few responses there, and may comment further at this blog later on.

Posted in Early childhood programs, Economic development

Expanded pre-K is fiscally sustainable

The popular Washington Post blog “Wonkblog” had a post on April 11, 2013 from Brad Plumer that got my attention with this headline: “Funding preschool with a cigarette tax is unsustainable”.

The gist of the article is as follows: Although the proposed cigarette tax increase of $1.95 per pack would fully pay for the Obama Administration’s proposal for expanded preschool programs and expanded home visiting programs over a ten-year budget horizon, the annual revenue at the end of the 10-year period is insufficient to pay for the program on an ongoing basis. Over the 10-year time horizon, total revenue is $78 billion and spending is $77 billion. But in the last year of the time horizon, 2023, revenue is $6.1 billion and spending is $11.6 billion.
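A short sketch of the figures in that summary shows the size of the gap being discussed. It uses only the four numbers quoted above; the variable names are illustrative.

```python
# Cigarette tax revenue versus program spending, in $ billions,
# as quoted from the Wonkblog summary of the budget proposal.
TEN_YEAR_REVENUE = 78.0
TEN_YEAR_SPENDING = 77.0
REVENUE_2023 = 6.1
SPENDING_2023 = 11.6

ten_year_balance = TEN_YEAR_REVENUE - TEN_YEAR_SPENDING   # surplus over the window
gap_2023 = SPENDING_2023 - REVENUE_2023                   # shortfall in the final year
coverage_2023 = REVENUE_2023 / SPENDING_2023              # share of spending covered
uncovered_share_2023 = 1 - coverage_2023                  # share needing other financing

print(f"10-year balance: ${ten_year_balance:.1f} billion")
print(f"2023 shortfall: ${gap_2023:.1f} billion")
print(f"2023 spending covered by the tax: {coverage_2023:.0%}")
print(f"2023 spending not covered: {uncovered_share_2023:.0%}")
```

The last figure is the share of annual spending that would have to come from other sources, or from fiscal offsets, for the program to be self-financing in its final budgeted year.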

The reason for this is two-fold: First, the preschool program is gradually expanded over time, which is eventually offset, but only partly, through a reduced federal share of the funding. Second, real revenue from the cigarette tax declines over time because the tax will reduce smoking more and more over time.

However, these figures do not take into account plausible positive fiscal feedbacks from pre-K. It is conventional practice of the Office of Management and Budget (OMB) and the Congressional Budget Office (CBO) to not consider most of the feedback effects of revenue and spending proposals from their effects on the economy and other fiscal categories. Thus, we would consider how a cigarette tax would affect smoking and hence cigarette tax revenues, but not how this would affect health care spending. For pre-K funding, we would consider likely take-up rates by states of the program, but would not consider any effects of the program in reducing special education spending, reducing prison costs, or increasing tax revenues from effects on parental earnings or the earnings as adults of former child participants.

The rationale for not doing “dynamic scoring” is that this might lead to temptations for budget gamesmanship, where politicians encourage budget agencies to make unrealistic assumptions about dynamic responses to reduce or eliminate the costs of political proposals. For example, if one makes extreme enough assumptions about how labor supply and capital investment respond to taxes, one can reduce or even eliminate negative effects of tax rate reductions on government revenues. 

However, although perhaps we don’t want OMB or CBO to be tempted by dynamic scoring, this is no reason for journalists or policy wonks to ignore plausible, research-proven economic effects of budget proposals in evaluating the proposals. In the case of pre-K, we have very good evidence, for example from the Chicago Child-Parent Center program, that high-quality preschool reduces special education assignment rates by over 40%. Pre-K also reduces crime rates by over 25%, which will reduce prison costs and other criminal justice system costs. Finally, pre-K will increase the adult earnings of former participants by over 7%, which will increase government tax revenues.

In their analysis of the benefits and costs of the Chicago Child-Parent Center program, Art Reynolds and his colleagues find that the present value of the fiscal benefits of the CPC program outweighs the costs. The calculated present value of fiscal benefits is almost three times the costs, at 288%. (This includes their estimates of tax revenue increases, savings on special ed, savings on grade retention, reduced criminal justice system expenditures, savings in the child welfare system, and increased college tuition subsidies, but excludes estimates for possible savings on treatment of depression and substance abuse.) The three biggest categories of savings are: criminal justice system cost savings (present value of 106% of program costs); increased tax contributions as adults of former child participants (75% of program costs); and reduced special education costs (63% of program costs).
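To see how these categories fit together, here is a small sketch that adds up the three largest items and compares them with the 288% total. It uses only the percentages quoted above, expressed as multiples of program cost; the dictionary keys are shorthand labels of my own.

```python
# Fiscal benefits of the Chicago CPC program as multiples of program cost
# (present values), using the percentages quoted in the post.
TOTAL_FISCAL_BENEFITS = 2.88

TOP_CATEGORIES = {
    "criminal justice savings": 1.06,
    "tax contributions of former participants": 0.75,
    "special education savings": 0.63,
}

top_three = sum(TOP_CATEGORIES.values())
remainder = TOTAL_FISCAL_BENEFITS - top_three   # net effect of all other categories
net_fiscal_gain = TOTAL_FISCAL_BENEFITS - 1.0   # fiscal benefits beyond program cost

print(f"top three categories: {top_three:.2f} x program cost")
print(f"all other categories (net): {remainder:.2f} x program cost")
print(f"net fiscal gain: {net_fiscal_gain:.2f} x program cost")
```

On these figures, the three largest categories alone would more than cover program costs in present value terms.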

Many of these fiscal offsets occur outside the 10-year time window that is often considered in OMB or CBO budget analyses. But the 10-year window is arbitrary. The figures given above are adjusted to present value terms, so the figures do attempt to discount these future fiscal benefits to reflect both inflation and the likely reduced value of future dollar benefits when viewed from the perspective of today.

Furthermore, some of these fiscal offsets will start occurring almost immediately. For example, this is true of special education cost savings. In projections done for chapter 7 of my book Investing in Kids, I made relatively conservative assumptions about special education cost savings. As of 10 years after a universal preschool program is initiated, my estimates suggest that special education cost savings offset about 34% of the program’s annual costs. Beyond the 10-year window, this increases until special education cost savings offset 48% of a universal pre-K program’s gross costs in the long run. The percentage cost savings would probably be greater for a more income-targeted pre-K program, which is largely what the Obama Administration’s pre-K proposal will be funding.

Other fiscal impact projections show even greater immediate fiscal benefits of pre-K programs. For example, Robert Lynch’s 2007 book, “Enriching Children, Enriching the Nation”, projects that the fiscal “break-even” of an income-targeted pre-K program will occur after about 9 years. This is probably the most relevant simulation to the Obama Administration proposal, which mostly funds pre-K for lower-income children. But even for a more universal program, Lynch’s simulation suggests that as of 10 years after the program is begun, fiscal benefits offset over half of program costs.

A realistic analysis of large-scale pre-K programs, based on good research evidence, suggests that expanding pre-K is one of the most fiscally sustainable policy options for the U.S. Even without a cigarette tax to finance pre-K, large-scale pre-K programs would pay for themselves in the long-run, and would pay a sizable share of costs even by the end of a ten-year time horizon.  Expanding pre-K actually would reduce the long-run ratio of government debt to U.S. economic output.  Deficit hawks should be strong supporters of expanding pre-K as a way to reduce long-run government budget deficits, even if we ignore the many benefits of pre-K for private individuals and private businesses.

Posted in Early childhood program design issues, Early childhood programs, Timing of benefits

My presentation at the Wisconsin Family Impact Seminar

I made a presentation on early childhood programs and state economic development in Madison, Wisconsin, as part of a program for state legislators and state policymakers sponsored by the Wisconsin Family Impact Seminar, on February 13, 2013. This presentation included some Wisconsin-specific material, but mostly would be applicable to any state or local area in the U.S.

At the website for the Wisconsin Family Impact Seminar, they have posted various materials associated with my presentation, including:  a 12-page briefing report that provides a summary for policymakers on my research on this topic;  my PowerPoint presentation; an audio of my 25 minute presentation; a video of my presentation.

Posted in Early childhood program design issues, Early childhood programs, Economic development

Recent research on how educational benefits of high-quality child care vary by income

An excellent recent paper by Greg Duncan and Aaron Sojourner has important implications for understanding the effects of different types of early childhood programs for different income groups.

Duncan and Sojourner look at the effects of the Infant Health and Development Program (IHDP) on later IQ and test scores of children from different income groups.  This program was modeled after the Abecedarian program, which provided full-time full-year child care and preschool from birth to age 5. However, the IHDP program was more focused on early childcare. The main service that the IHDP program provided was high-quality child care at ages 1 and 2. IHDP also provided some home visiting services from birth to age 3. However, services ended at age 3, unlike the Abecedarian program, which went on to provide preschool at ages 3 and 4.

The IHDP was run as a random assignment experiment. Because of random assignment, we would expect that any differences between the treatment group which received program services, and a control group which did not, are most likely to be due to the IHDP program, and not due to unobserved differences between these two groups.

IHDP was targeted at low-birth-weight children. However, Duncan and Sojourner focus their attention on results for the “heavier” low-birth-weight children. They argue that with some weighting of their sample, the control group’s developmental pattern is similar to the development of typical U.S. children. Therefore, they argue that their estimates of IHDP effects for “heavier” low-birth-weight children might reflect what this program would do in the general U.S. population.

Although IHDP was targeted at low-birth-weight children, it was not targeted explicitly by income. Therefore, IHDP is unusual among early childhood programs in having random assignment evidence for the effects of a program for children from different income groups.

Duncan and Sojourner’s results imply that an IHDP-style program provided to the general U.S. population would have substantial effects in reducing the educational achievement gaps between different income groups. They estimate that for the “heavier” sample of children, IHDP had large effects in both the short-run and long-run on  improving IQ and test scores for low-income children (from families below 180% of the poverty line), but did not have statistically significant long-run effects on higher-income children (from families above 180% of the poverty line).  (There were some short-run positive effects of the IHDP program for higher-income children, but long-run effects were statistically insignificant, with a tendency towards negative point estimates.)

For example, as of age 8 (third grade), an IHDP-style program targeted at low-income children would be expected to increase educational achievement among this group by a sufficient amount to eliminate between one-third and three-fifths of the expected achievement gap at 3rd grade between low-income and higher-income children.  These empirical estimates for 3rd grade, 5 years after program services ceased, imply that the program moves the achievement of low-income children up by about one-half grade level.  (Effect sizes are 0.30 in reading, 0.44 in math, which are about half the grade 2 to grade 3 gains in those subjects estimated by Bloom et al.)

Using estimates from Chetty et al. on how 3rd grade test scores affect later adult earnings, and estimates from Bartik, Gormley and Adelstein of expected adult income of children from different income groups, I project that for low-income children, the 3rd grade test score effects estimated by Duncan/Sojourner would be consistent with a lifetime increase in adult earnings of about 13%. This is a large effect that would have significant effects on the income distribution.

I also project from the Duncan/Sojourner estimates, and the Abecedarian results, that an IHDP program targeted at low-income families is likely to pass a benefit-cost test. The program appears to have annual costs similar to the Abecedarian program, of around $16,000 per year for the two years of child care provided, with some additional funds for the home visiting. $35,000 per child would probably cover total three-year program costs in an established, permanent IHDP program. Based on the Abecedarian program (see calculations reported in chapter 4 of my book Investing in Kids), savings in costs for other subsidized child care programs might offset at least 15% of program costs, so the net incremental costs of IHDP might be $30,000 per child.

On the benefit side, I project, based on Duncan and Sojourner’s estimated test score effects, and Chetty et al.’s estimated effects on adult earnings, that the IHDP might increase the present value of future adult earnings for former participants by $33,000 per participant. There also might be other benefits associated with program participation, such as lower crime by former participants. (Some suggestive but inconclusive evidence for anti-crime effects is reported by McCormick et al.) In addition, we would expect considerable earnings benefits for parents due to the free child care provided by IHDP, which will increase parental work experience in the short-run. This short-run increase in work experience will develop job skills, and thereby increase parental earnings in the long-run. Based on the Abecedarian program, such parental benefits might have a lifetime present value of around $30,000 per family participating in the program. (This is calculated as 2/5ths of the parental benefits that I calculated for the Abecedarian program in chapter 4 of Investing in Kids.)
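Putting the cost and benefit figures side by side gives a rough benefit-cost ratio. The sketch below uses only the per-child dollar figures quoted in this post, counts only the two earnings benefits (not possible crime reductions), and uses variable names of my own choosing.

```python
# Rough benefit-cost comparison for an IHDP-style program, per child,
# in present-value dollars, using the figures quoted in the post.
GROSS_COST = 35_000           # total three-year program cost
CHILD_CARE_OFFSET = 0.15      # share of cost offset by savings on other subsidized care
CHILD_EARNINGS_GAIN = 33_000  # projected gain in participants' adult earnings
PARENT_EARNINGS_GAIN = 30_000 # projected gain in parents' lifetime earnings

net_cost = GROSS_COST * (1 - CHILD_CARE_OFFSET)   # the post rounds this to $30,000
total_benefits = CHILD_EARNINGS_GAIN + PARENT_EARNINGS_GAIN
benefit_cost_ratio = total_benefits / net_cost

print(f"net cost per child: ${net_cost:,.0f}")
print(f"earnings benefits per child: ${total_benefits:,.0f}")
print(f"benefit-cost ratio: {benefit_cost_ratio:.2f}")
```

On these figures, the child earnings gain alone slightly exceeds the net cost, and adding parental earnings roughly doubles the benefits.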

How does this fit into the broader research literature on early childhood programs? We only have limited evidence on how effects on child development of early childhood programs vary with the child’s family income. As argued in Bartik, Gormley, and Adelstein, the evidence from universal preschool programs such as Tulsa’s suggests that there are similar test score effects and future dollar earnings effects for children from different income groups. On the other hand, estimates from the Nurse Family Partnership suggest that home visiting programs from the pre-natal period up to age 2 have stronger effects on children from disadvantaged families.

Therefore, based on what we know now, it might be a reasonable hypothesis that early childhood development programs that intervene via home visiting or child care prior to age 3 have more significant effects on child development and life course for children from lower-income families,  whereas children from higher income families have much smaller benefits from these very early interventions. On the other hand, preschool, at least at age 4, seems to have broader benefits across children from different income groups.

This makes some sense in terms of what parents from different income groups can typically provide for children at different ages. High-quality developmental early-age child care services and parenting programs might be a more valuable supplement on average for at least some low-income families, whereas on average more middle-income families might be more readily able to do just as well in promoting such early-age child development on their own. But preschool may provide services that are difficult for parents from a wide variety of income groups to provide on their own. For example, preschool may help develop a child’s skills in dealing with larger groups of peers and with non-parental authority figures, which may be useful skills in school and later on in life.

In sum, we shouldn’t over-generalize about the effects of early childhood programs by income group. The pattern of income group effects appears to vary by type of program and by what ages are being considered.

Does this mean that early-age child care programs such as IHDP should be targeted only at low-income families? Not necessarily, for at least two reasons. First, as argued by Duncan and Sojourner, IHDP provided child care in income-integrated settings. It is possible that part of the positive effects of IHDP were related to positive peer effects from such income-integration. It is possible for a targeted child care program to include income integration (for example, by providing vouchers to programs that include other income groups), but perhaps more challenging to do so.

Second, even if early-age child care programs’ benefits for former child participants are smaller for higher-income families, we should also consider effects on parents. The free child care may have large benefits for parents in higher-income families.

One policy option would be to explore programs similar to those in some Scandinavian countries, in which universal child-care programs are combined with family-income-based fees. This would encourage income integration of child-care programs and provide some help for parents from all income groups, while targeting more governmental assistance for these very expensive child care services on lower-income children. As children get to preschool age, public assistance might be broadened to more income groups, to reflect the broader benefits of preschool across income groups.

Posted in Distribution of benefits, Early childhood program design issues, Early childhood programs

What does research say about the proposed expansion of Michigan’s Great Start Readiness Program?

Michigan Governor Rick Snyder recently proposed a major expansion of the state’s pre-K program, called the Great Start Readiness Program (GSRP).  From reports in Gongwer News Service, legislators and others have expressed various doubts about the proposed expansion. This blog post attempts to clarify what research says about this proposed expansion.

As discussed in my previous blog post, the proposal would increase Michigan’s preschool funding from $109 million in fiscal year 2013, to $174 million in fiscal year 2014, and $239 million in fiscal year 2015. The proposal both increases the state’s funding per preschool slot and increases the number of slots. Funding per half-day slot for one child would increase from $3400 to $3625. The percentage of Michigan four-year-olds in the program would increase from 20% currently to 42% two years from now. By two years from now, the number of slots would come close to providing full access to quality pre-K for the main group targeted by GSRP, which is families below 300% of the poverty line. However, the state would still be below the leading states in providing access to all four-year-olds; for example, Oklahoma provides state-subsidized pre-K for 74% of all four-year-olds.

What research findings are most relevant to the proposed GSRP expansion? What misconceptions about what research shows need to be clarified?

Here is the summary of my response to these questions:

1. Studies with disparate methodologies support the short-term and long-term effectiveness of GSRP.

2. Other large-scale state and local pre-K programs similar to GSRP have also shown evidence of short-term and long-term benefits.

3. The Abecedarian Program and Perry Preschool provide rigorous evidence that more intensive early childhood programs than GSRP can yield even larger effects on adult earnings and other outcomes, but we don’t need adult earnings effects anything close to those of these intensive programs for less costly programs such as GSRP to pass a benefit-cost test.

4. The more mixed educational results of Head Start suggest that more educationally focused programs such as GSRP can achieve more consistent educational results.

5. Programs similar to GSRP show strong benefits for middle-class as well as low-income children; this finding and peer effects suggest that there may be benefits for the state from relatively broad income eligibility standards for GSRP.

6. Keeping class size down, and keeping teacher quality up, helps improve benefit-cost ratios in programs such as GSRP.

7. Expanding half-day preschool programs to full-day programs, or one-year preschool programs to two-year preschool programs, probably has net benefits, but probably also reduces benefit-cost ratios somewhat (that is, there are diminishing returns). However, expanding half-day preschool programs to full-day programs also makes it easier for some families to access preschool programs, with the proportion of families in this situation varying across different local areas. Therefore, GSRP’s focus on one-year programs, with local flexibility on the mix of half-day versus full-day programs, probably makes sense.

8. GSRP per-student funding has steadily lost ground to inflation over the past 10 years. The proposed increase to $3625 per half-day slot makes up for only a small part of this decline in real funding. This decline in real funding creates challenges in maintaining access and quality, and particularly threatens the ability of the program to fund quality private preschool programs.

9. The proposal funds expanded GSRP out of the School Aid Fund.  Funding expanded GSRP out of reduced K-12 spending may offset roughly two-fifths of the positive educational and economic effects of expanded GSRP funding, although the exact offset depends upon assumptions about the productivity of K-12 spending. This funding source does not maximize the use of state educational resources to develop a better quality state labor force.

Now, the further details elaborating on these responses and providing or linking to evidence:

1. Studies with disparate methodologies support the short-term and long-term effectiveness of GSRP.  A number of studies from the High Scope Foundation compare GSRP participants with non-participants who are similar in observed pre-program characteristics such as family income, whether single parent family, etc. These studies have found both short-run and long-run effects of GSRP in improving various academic outcomes, most notably in improving on-time high school graduation of GSRP participants by reducing grade retention.

Although GSRP participants and non-participants in these studies are similar in observable characteristics, it is certainly possible that GSRP participants and non-participants differ in unobserved characteristics. The econometrics buzzword for this possible problem is that the estimates might be subject to “selection bias”, either because GSRP families self-select their child into the program, or because of the criteria that GSRP sites use for selecting participants.

This selection bias could go in either direction.  For example, it is possible that GSRP participants tend to come from families with more ambitious or aware parents, which might mean that GSRP participants would have unobserved characteristics that would have promoted greater success even without the GSRP program. On the other hand, the GSRP state office encourages local programs to select GSRP participants on the basis of greater risk factors, which we do not fully measure for non-participants, and such risk factors might cause GSRP participants to do worse than observationally similar non-participants without the program.

But GSRP’s effectiveness is also supported by other research that is less likely to be subject to possible selection bias. Research by Wong et al. has evaluated GSRP by comparing test scores of children who are just entering GSRP with those of children who have just entered kindergarten after participating in GSRP the previous year. This methodology relies on the fact that Michigan uses an age cut-off to determine eligibility for participation in GSRP and kindergarten. Therefore, these researchers can observe similar children whose families chose GSRP, and whom GSRP sites selected for enrollment, but who differed in age, which therefore determined what year they enrolled in the program. The researchers can observe how test scores increase with the age of the child, and test for whether there is an abrupt jump in test scores at the age cut-off for being able to enroll in GSRP the previous year. Intuitively, this methodology relies on comparing the test scores of children whose main difference is only a few days in age: some children were just old enough to participate in GSRP last year, and are entering kindergarten now; other children just missed the age cut-off for GSRP last year, and are just entering the GSRP program this year. The difference between these two groups of children is hypothesized to be due to one group participating in GSRP for one year, versus the other group participating in other non-GSRP activities during the previous year. The remaining age differences can be controlled for because we observe students who differ in age within each group.

This methodology for evaluating GSRP, which is called a “regression discontinuity” methodology (the “discontinuity” is the “jump” in test scores), is a well-established and respected econometrics methodology.  If random assignment is the “gold standard” for evaluating programs, regression discontinuity is a good “silver standard” methodology. Such regression discontinuity studies are less subject to selection bias because all the children in the study have been selected into the same program, with just some difference in the timing of that participation due to the age cutoff.  Such regression discontinuity methodologies have been used for evaluating pre-K programs in many state and local areas other than Michigan, including New Jersey, New Mexico, West Virginia, Oklahoma,  South Carolina,  and Tulsa.

This “regression discontinuity” study of Michigan’s GSRP program finds that GSRP significantly increases test scores on various academic tests. The average increase in test scores due to GSRP is estimated to be an “effect size” of 0.42. “Effect size” is education statistics jargon for measuring test score effects as a proportion of the typical variation across test scores for a given cohort of students. An effect of 0.42 in effect-size units corresponds to increasing the “percentile” test score of students by about 16 percentile points. That is, if the student would have otherwise scored better on tests entering kindergarten than 34% of all entering kindergartners, then GSRP would be expected to advance them to the 50th percentile or the median performance of all entering kindergartners. I think that most parents would regard such a test score gain as a major improvement. It corresponds to about a 55% increase in what we would normally expect children to learn without pre-K during the year before kindergarten.
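The conversion from effect-size units to percentile points can be sketched in a few lines. This is only an illustration: it assumes test scores are normally distributed (an assumption of the sketch, not something stated in the study), and the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_after_gain(start_percentile: float, effect_size: float) -> float:
    """Percentile reached after a gain of `effect_size` standard deviations.

    Assumes normally distributed test scores (an assumption of this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_start = std_normal.inv_cdf(start_percentile / 100.0)
    return 100.0 * std_normal.cdf(z_start + effect_size)


GSRP_EFFECT_SIZE = 0.42  # estimate quoted in the text

end_percentile = percentile_after_gain(34.0, GSRP_EFFECT_SIZE)
gain = end_percentile - 34.0
print(f"34th percentile -> {end_percentile:.1f} (gain of {gain:.1f} points)")
```

Under the normality assumption, a child starting at the 34th percentile ends up at roughly the median, which matches the 16-point gain described above. The gain in percentile points would be smaller for a child starting far from the middle of the distribution.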

Based on research by Chetty et al. that looks at the effects of kindergarten test scores on adult earnings, we would expect this 16 percentile gain in test scores to increase future adult earnings by about 8%. For each child participating in GSRP, the average discounted present value of his or her expected future earnings would increase by over $25,000. (This extrapolates my methodology used with Gormley and Adelstein for Tulsa to the Michigan context, which is obviously conservative given that earnings are somewhat higher in Michigan.) This is obviously far in excess of the per-child cost of GSRP.

We can add to these earnings benefits the benefits of GSRP in reducing the costs to the taxpayer of grade retention, which requires an extra year of funding during the K-12 years. The High Scope studies estimate that the cost savings from GSRP’s effects in reducing grade retention might eventually cover over 40% of the state’s costs of GSRP. From both the earnings benefits and grade retention cost savings, research directly on GSRP suggests that the benefits of GSRP are many multiples of its costs.

2. Other large-scale state and local pre-K programs similar to GSRP have also shown evidence of short-term and long-term benefits.  There are a variety of large-scale state and local preschool programs around the U.S. that are similar to GSRP, both in program design (a half-day or full-day program at age 4) and in costs per student (on the order of $4,000 to $6,000 for a half-day program; counting what local school districts pay to subsidize GSRP, its total costs per child for a half-day program have probably been close to $5,000 in 2012 dollars for much of its history).  Research evidence for short-run benefits of these programs in improving kindergarten test scores has been found for such programs in West Virginia, New Jersey, South Carolina, Oklahoma, New Mexico, Tennessee, Tulsa, and Chicago. Research evidence for longer-term benefits on later test scores has been found in North Carolina, Georgia, New Jersey, and Chicago.

Probably the most important outside research evidence relevant to GSRP is from the Chicago Child-Parent Center program. This program, operated by Chicago Public Schools, provided half-day preschool for one or two years for children in low-income neighborhoods. The program was evaluated by comparing children in neighborhoods provided such preschool services with children in similar neighborhoods without such services. The program has now done follow-up with participants and non-participants up to age 28. This long-term follow-up with a good comparison group provides useful information in attempting to see whether GSRP is likely to have positive benefits in adulthood.

Research has found significant benefits of CPC in improving educational attainment and adult earnings, and in reducing crime.  The overall calculated benefit-cost ratio for the program is almost $11 per dollar of costs. Of these benefits, about one-third are due to higher earnings for former participants. About half the benefits are due to the program’s effects in reducing former participants’ involvement in crime, which reduces costs to the criminal justice system as well as costs to victims of crime.

CPC is quite similar to GSRP in many of its design features. CPC and GSRP have similar class size limits and child to adult ratios (CPC has a maximum of 17 children to 2 adults, while GSRP for most students has a maximum class size of 18 with a maximum child to adult ratio of 8 to 1).  GSRP is a half-day program for most students, as is CPC.  CPC tends to have more classroom days than GSRP; CPC is a 5-day-a-week program for the 9-month school year, and sometimes also includes a six-week summer program; GSRP operates for a minimum of 4 classroom days per week for 30 weeks, with the remaining day for planning and home visits.  As I review below, this lesser time might reduce GSRP results somewhat compared to CPC, but would be expected to do so by less in percentage terms than the cut in preschool time, which may actually increase GSRP’s benefit-cost ratio per child above CPC’s.  Adjusted for Michigan prices in 2012, CPC costs about $5,600 per child per year. This is above what GSRP currently receives per child from the state, but GSRP programs receive considerable subsidies from local school districts, and for most of GSRP’s history, its real funding per child has been considerably higher than it is today. CPC is a two-year program (ages 3 and 4) for about 55% of its participants, and a one-year program (age 4) for about 45% of its participants, which is different from GSRP’s focus on age 4. However, the available evidence on CPC suggests that the benefit-cost ratio for the CPC participants who only participate for one year at age 4 is actually higher, at over $13 in benefits per dollar of program costs. Adding a second year does not seem to quite double benefits, although benefits do exceed costs from this expansion.  I will return to this topic of program design later.

Overall, these research studies of other state and local programs suggest that preschool programs can produce significant short-run and long-run benefits. These benefits are estimated for programs run at a large scale by a variety of state agencies and a variety of local school districts. These benefits also are occurring for programs whose costs per student are in the $4,000 to $6,000 range, similar to GSRP, and for programs that are similar to GSRP in design. All this evidence from other state and local areas is consistent with direct research evidence for the GSRP program.

3. The Abecedarian Program and Perry Preschool provide rigorous evidence that more intensive early childhood programs than GSRP can yield even larger effects than GSRP on adult earnings and other outcomes, but we don’t need adult earnings effects anything close to those of these intensive programs for less costly programs such as GSRP to pass a benefit-cost test.

The Abecedarian program is full-time full-year high quality child care and preschool from birth to age 5 for highly disadvantaged families. The program shows strong benefits (e.g., estimated adult earnings benefits of former child participants of 14%), based on a rigorous random assignment methodology. However, because the program is considerably more intensive than GSRP (e.g., birth to five full-time rather than a mostly half-day school-year program at age 4), the main relevant finding for GSRP is that more intensive programs may yield larger effects.  However, preschool programs don’t need earnings effects as great as 14% to pass a benefit-cost test, given that preschool is considerably cheaper than the Abecedarian program, which has gross costs of around $80,000 per child (e.g., $16,000 per year for five years).

Perry Preschool also was a more intense program than GSRP. Perry was similar to GSRP in offering only a half-day program. But its class size ratio was 13 students to 2 teachers.  And almost all students in Perry participated for 2 school years, at both age 3 and age 4.  Perry’s gross cost per student per year was around $11,000 in today’s dollars, which is considerably more expensive than GSRP.

Perry shows large benefits (e.g., about a 19% increase in adult earnings of former participants), based on evidence from a rigorous random assignment experiment.  Because of the differences between class size and duration of Perry versus GSRP, what Perry directly implies for GSRP is that smaller class sizes and longer duration might increase the adult earnings impact of a preschool program.  However, we do not need adult earnings effects of anything close to 19% to get net benefits from a program as cheap as GSRP.  I will return to this topic of program design later.

4. The more mixed educational results of Head Start suggest that more educationally focused programs such as GSRP can achieve more consistent educational results.

Head Start has a more diverse mission based on its history and design, seeking to improve family health, not just kindergarten readiness.  As I have reviewed before, Head Start’s research evidence is more mixed. Meta-analysis of Head Start suggests immediate test score effects of an “effect size” of 0.31, about one-fourth less than the 0.42 estimated for the GSRP program.  Several analyses suggest that much of these test score effects fade during K-12, although the exact timing of this fading is disputed. This fading is also found in other early childhood programs. However, studies have also found strong Head Start benefits that re-emerge during adulthood (e.g., one study predicted an 11% increase in adult earnings of former Head Start participants), possibly due to Head Start’s effects on social skills.

Why might GSRP get somewhat greater kindergarten readiness effects than Head Start? In part, because GSRP is more focused on kindergarten readiness. In addition, GSRP is less income-segregated than Head Start, because it includes families up to 300% of the poverty line, whereas Head Start is almost all families below 100% of the poverty line. Research suggests that there are peer effects in preschool, so that preschool is more effective for low-income children when preschool classes also include middle-class children. Finally, it is possible that the greater local control in GSRP may increase effectiveness compared to Head Start, which has extensive federal regulatory requirements that may be well-intentioned, but may distract from focusing on the highest priorities for improving student kindergarten readiness.

5. Programs similar to GSRP show strong benefits for middle-class as well as low-income children; this finding and peer effects suggest that there may be benefits for the state from relatively broad income eligibility standards for GSRP.

Tulsa’s state-funded pre-K program is similar to GSRP in design and funding level. The program is a half-day or full-day program for one school year at age 4. Tulsa’s program operates for 2.5 hours per day for a half-day program, but for five days per week for a school year; Michigan’s half-day GSRP is 3 hours per day for a minimum of 4 days per week for a minimum of 30 weeks.  The total number of hours is about a fourth higher in Tulsa than the Michigan minimum, which we would expect to increase total effects of the program by somewhat less than one-fourth.  (More on this later.)  Total Tulsa state or local funding per student, adjusted to year 2012 Michigan prices, is about $5200 for a half-day program. This is probably above what state and local funding per student is for Michigan’s program, although it is more similar to Michigan’s historical levels of funding.
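The comparison of classroom hours can be checked with a quick calculation. The text gives Michigan’s minimum in weeks but describes Tulsa only as running “for a school year”, so the 36-week Tulsa school year below is an assumption made for illustration.

```python
# Michigan GSRP half-day minimums, as stated in the text.
MICHIGAN_HOURS_PER_DAY = 3.0
MICHIGAN_DAYS_PER_WEEK = 4
MICHIGAN_WEEKS = 30

# Tulsa half-day program; the 36-week school year is an ASSUMPTION of this sketch.
TULSA_HOURS_PER_DAY = 2.5
TULSA_DAYS_PER_WEEK = 5
TULSA_WEEKS_ASSUMED = 36

michigan_hours = MICHIGAN_HOURS_PER_DAY * MICHIGAN_DAYS_PER_WEEK * MICHIGAN_WEEKS
tulsa_hours = TULSA_HOURS_PER_DAY * TULSA_DAYS_PER_WEEK * TULSA_WEEKS_ASSUMED
hours_ratio = tulsa_hours / michigan_hours

print(f"Michigan minimum: {michigan_hours:.0f} hours per year")
print(f"Tulsa (assumed 36 weeks): {tulsa_hours:.0f} hours per year")
print(f"Tulsa relative to Michigan: {hours_ratio:.2f}x")
```

With a 36-week year, Tulsa’s total comes out one-fourth above the Michigan minimum; a somewhat shorter or longer school year would shift the ratio accordingly.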

Based on research by me and Gormley and Adelstein, Tulsa’s pre-K program increases kindergarten entrance scores by similar amounts for low-income students and middle class students. Kindergarten entrance scores go up about 11 or 12 percentiles for a half-day program, and about 18 to 19 percentiles for a full-day program.

We estimate that these percentile increases in kindergarten test scores would increase the present value of future earnings by about $18,000 per middle-income participant for a half-day program, and $29,000 per middle-income participant for a full-day program. The estimated earnings increases for low-income participants are only slightly higher, at $21,000 and $32,000. (This adjusts our paper’s estimates to 2012 Michigan prices, using the Consumer Price Index and estimates of relative regional prices. Free-lunch students are “low income”; full-price lunch students are “middle income”. )  These earnings increases are in the range of 3 to 4 times preschool program costs for both low-income and middle income groups.

Middle-income children may benefit from pre-K because even the best parents have challenges in providing on their own all the services of a high-quality pre-K program, such as extended work and play with other students that will build both academic and social skills. Furthermore, many middle-class parents may have trouble affording a quality pre-K program that may cost around $5,000 for a half-day program for one school year.  Obviously at some income level such an investment is readily affordable without subsidy, but that point is probably well above median family income levels.

Based on this research, a state’s investment in broadening quality pre-K access to middle-income students is likely to pay off in higher skills and earnings for state residents. In addition, it may be easier to keep the quality high in pre-K programs that have a broad range of family income among students.  There is evidence of positive peer effects in preschool, which may help low-income students be more successful in a pre-K program that includes middle-class students.

Therefore, it seems strange for state policymakers to be discussing, as reported by Gongwer, the possibility of further targeting GSRP to low-income students. Such increased targeting would make GSRP more like Head Start, reducing positive peer effects, and would not realize the possible gains for the state by improving the future labor supply of a broad range of state residents.

6. Keeping class size down, and keeping teacher quality up, helps improve benefit-cost ratios in programs such as GSRP.  

As reviewed in chapter 5 of my book Investing in Kids, the evidence suggests that reducing class size in preschool probably increases benefit cost ratios, at least if we’re considering class sizes in the range down to a class size of 15 students to 2 teachers.  Whether there are benefits beyond that point is uncertain.  GSRP as currently constituted has a maximum class size of 18, and by requiring a maximum 8 to 1 child to teacher ratio, has some incentives for keeping class size at 16 to 2, as adding an additional student requires adding another teacher.

Most successful preschool programs have had some teacher credential requirements, including the Perry Preschool Program, the Abecedarian program, the Chicago Child-Parent Center program, Tulsa’s preschool program, and most state-funded preschool programs. GSRP’s requirements for most programs it funds are not unusual in requiring lead teachers to be certified teachers with an early childhood endorsement.

The role of teacher credentials in preschool quality is contested. Some studies suggest that higher educational credentials improve a preschool’s effectiveness, whereas other studies do not support this hypothesis (see chapter 5 of my book Investing in Kids for citations). What is generally agreed is that specific training and knowledge by teachers about early childhood education improves effectiveness. In addition, reducing teacher turnover among preschool teachers would be expected to increase preschool effectiveness.  This may mean that there is some interaction among teacher salaries and credentials and preschool effectiveness. If teacher salaries among preschool teachers are inadequate, increasing teacher credential requirements may actually be counterproductive, as it means that the credentialed preschool teachers will have other educational job options.  On the other hand, higher preschool teacher salaries may reduce turnover, and allow preschools to better compete for higher-quality teachers.

Even modest increases in teacher quality have huge payoffs.  It doesn’t take much of an increase in kindergarten readiness to increase the present value of predicted future earnings by a large amount. For example, suppose some teacher quality initiative in preschool raises kindergarten readiness test scores by one percentile for the average student. This one-percentile test score gain will increase the present value of earnings per student by about $1,800. If we multiply this by a preschool class size of 15 students, the total increase in the present value of earnings for an entire preschool class is $27,000. It is worth seeing if some combination of education, training, and salary improvements might help achieve such classroom gains.

7. Expanding half-day preschool programs to full-day programs, or one-year preschool programs to two-year preschool programs, probably has net benefits, but probably also reduces benefit-cost ratios somewhat (i.e., there are diminishing returns).  However, expanding half-day preschool programs to full-day programs also makes it easier for some families to access preschool programs, with the proportion of families in this situation varying across different local areas. Therefore, GSRP’s focus on one-year programs, with local flexibility on the mix of half-day versus full-day programs, probably makes sense.

The research evidence suggests that full-day pre-K probably adds 60% to the benefits of half-day pre-K. Two-year pre-K (adding age 3 to an age 4 program) probably adds about 50% to the benefits of one year of pre-K at age 4. In both cases, net benefits are probably positive. But because benefits don’t come close to doubling, the benefit cost ratio of these program expansions is less than the ratio for a half-day, one-year preschool program. (For sources for these estimates, see chapter 5 of my book Investing in Kids).
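The diminishing-returns logic can be made concrete with a small sketch. Two inputs here are not from the text and are purely illustrative: the base benefit-cost ratio of 3.5 is hypothetical, and the sketch assumes that going to a full day or adding a second year roughly doubles program cost.

```python
def expansion_ratios(base_bcr: float, benefit_increase: float,
                     cost_increase: float = 1.0) -> tuple[float, float]:
    """Return (marginal BCR of the added services, overall BCR after expansion).

    base_bcr: benefit-cost ratio of the half-day, one-year program.
    benefit_increase: added benefits as a share of base benefits (0.6 = +60%).
    cost_increase: added costs as a share of base cost; the default of 1.0
        (cost doubles) is an assumption of this sketch.
    """
    marginal = base_bcr * benefit_increase / cost_increase
    overall = base_bcr * (1.0 + benefit_increase) / (1.0 + cost_increase)
    return marginal, overall


BASE_BCR = 3.5  # hypothetical ratio, for illustration only

full_day_marginal, full_day_overall = expansion_ratios(BASE_BCR, 0.60)
two_year_marginal, two_year_overall = expansion_ratios(BASE_BCR, 0.50)

print(f"Full-day: marginal {full_day_marginal:.2f}, overall {full_day_overall:.2f}")
print(f"Two-year: marginal {two_year_marginal:.2f}, overall {two_year_overall:.2f}")
```

In this sketch the added services still return more than a dollar per dollar spent, so net benefits rise, but both the marginal and the overall ratio fall below the base program’s ratio. The expansion stops paying for itself only if the base ratio is below about 1.7 (for +60%) or 2 (for +50%).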

In the case of full-day preschool, one advantage is that full-day programs are easier for many parents to access, as full-day preschool involves fewer complications in trying to arrange wraparound child care. Therefore, one additional benefit from full-day preschool is that a higher percentage of parents will voluntarily choose full-day programs, which will help improve child outcomes and the state economy if these preschool programs are high-quality.

It seems likely that the availability of wraparound services varies greatly across different local areas and even across different neighborhoods. Therefore, there is some rationale for providing a range of options across different local areas and within local areas, to meet the diversity of needs.

8. GSRP per student funding has steadily lost ground to inflation over the past 10 years.  The proposed increase to $3625 per half-day slot only makes up for a small part of this decline in real funding. This decline in real funding creates challenges in maintaining access and quality, and particularly threatens the ability of the program to fund quality private preschool programs.

From 1990-91 to 2003-04, GSRP generally received increased per student funding every 3 or 4 years, often a sizable enough amount to offset intervening inflation. Over this time period, in 2012-13 dollars, GSRP funding averaged $4300 per half-day slot. In 2012-13 dollars, GSRP funding per half-day slot has declined steadily every year since 2000-2001, when real funding per half-day slot in today’s dollars was $4378. The bump in nominal funding in 2007-08 from $3300 to $3400 didn’t even make up for inflation. The proposed increase in funding for 2013-14 to $3625 is the first GSRP increase to exceed inflation since 2000-01. However, even with this increase, real funding would be projected to stay well below the program’s typical funding per slot for most of its history.

One question that this raises is whether GSRP’s current funding is adequate to maintain quality. For K-12 districts, GSRP can be cross-subsidized from the district’s K-12 operating funds, and therefore the question is whether districts are able and willing to afford to provide such cross-subsidies.

For private preschool providers, in most cases such cross-subsidies are infeasible. Therefore, the issue is whether private preschool providers can provide quality preschool services at the provided per child amount. This issue is more acute because the proposed program expansion suggests requiring that 20% of all funds go to private providers. It may be possible for private preschool programs to find some quality teachers at relatively low salaries. It is more difficult to find many such quality teachers who are willing to work steadily at low salaries. Scaling is a big problem when you are doing things on the cheap.

Establishing a clear standard for what GSRP programs should receive in funding per child to ensure quality is difficult. Presumably there is not an abrupt cliff, but rather a range of quality levels that gradually change as funding gradually changes.  The Institute for Women’s Policy Research has provided a report that examines how costs of a pre-K program vary with various characteristics. Based on their report, for a half-day program that operates every day of one school year, and pays salaries competitive with what is paid to public school kindergarten teachers,  a program with similar adult to child ratios as GSRP would cost $5,121 in 2013-2014 projected prices, and $5,227 in 2014-15 projected prices. (This calculation uses data from Table 2 of IWPR’s report, “Meaningful Investments in Pre-K: Estimating the Per-Child Costs of Quality Programs”.  I used the average of class size ratios of 15 to 2 and 17 to 2 to reflect GSRP’s standard of an 8 to 1 child to teacher ratio. I used their Bachelor’s Degree I estimates.  I adjusted to future prices by assuming that future inflation would be the same as it has been from 2011 to 2012. )

The proposed GSRP funding of $3625 for a half-day slot for  2013-14 and 2014-15, up from $3400 today, is about 30% less than these figures from IWPR of over $5,000 for a half-day slot. IWPR’s estimate of over $5,000 is for a program that would operate 5 days a week for an entire school year.  GSRP has minimum standards of 4 days per week for 30 weeks. The lesser time would presumably reduce GSRP costs somewhat. But would it reduce these costs by 30%? That seems doubtful.  Even with a modest reduction in the length of the classroom program, GSRP still needs to pay annual salaries that would help attract and retain quality teachers who in many cases have options to teach in the K-12 system.
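The size of the gap follows directly from the figures quoted above. The sketch below simply compares the proposed slot amount with the two IWPR-based cost estimates; the function name is hypothetical.

```python
PROPOSED_SLOT_FUNDING = 3625.0  # proposed state funding per half-day slot

# IWPR-based cost estimates per half-day slot, as quoted in the text.
IWPR_COST_2013_14 = 5121.0
IWPR_COST_2014_15 = 5227.0


def shortfall_share(funding: float, estimated_cost: float) -> float:
    """Share by which funding falls short of the estimated cost."""
    return 1.0 - funding / estimated_cost


gap_2013_14 = shortfall_share(PROPOSED_SLOT_FUNDING, IWPR_COST_2013_14)
gap_2014_15 = shortfall_share(PROPOSED_SLOT_FUNDING, IWPR_COST_2014_15)

print(f"2013-14 shortfall: {gap_2013_14:.1%}")
print(f"2014-15 shortfall: {gap_2014_15:.1%}")
```

Both years come out close to 30%, with the gap widening slightly in the second year because the proposed amount stays flat while projected costs rise.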

9. The proposal funds expanded GSRP out of the School Aid Fund.  Funding expanded GSRP out of reduced K-12 spending may offset roughly two-fifths of the positive educational and economic effects of expanded GSRP funding, although the exact offset depends upon assumptions about the productivity of K-12 spending. This funding source does not maximize the use of state educational resources to develop a better quality state labor force.

K-12 funding as well as pre-K funding can be productive in improving educational outcomes and future earnings.  The challenge is in determining what productivity assumptions for K-12 spending are reasonable, and can be tied to rigorous research.

In chapter 7 of my book Investing in Kids, I considered a scenario in which expanded early childhood programs were financed by reduced K-12 spending. I assumed that the K-12 spending had productivity effects similar to the effects of changes in class size in early elementary school. The advantage of this assumption is that we have reliable evidence, from the Tennessee Class-Size Study, of the short-run and long-run effects of class size changes in early elementary school. This is an educational intervention for which we know K-12 spending makes a difference, and in which expanded K-12 spending will have economic benefits that exceed costs.  But class size reduction is also a relatively expensive intervention. It is an intervention in which large changes in spending yield moderate changes in educational outcomes, and these moderate changes in educational outcomes cause significant changes in future earnings.

Based on this assumption, the negative effects of reducing K-12 spending would offset about 2/5ths of the positive effects of expanding preschool. While this reallocation boosts the overall efficiency of how we use educational resources, it is counter-productive if the goal is to maximize cost-effective ways of improving the overall quality of Michigan’s labor force.  


An analysis of the Dalmia/Snell Wall Street Journal article on Georgia and Oklahoma, or the difficulties of case study analysis

A recent opinion column (March 1, 2013)  in the Wall Street Journal, by Shikha Dalmia and Lisa Snell of the Reason Foundation, criticized proposals for universal preschool programs on the basis of the experience of Georgia and Oklahoma, which have been among the leaders in expanding access to preschool.

Their main argument is that if universal preschool is so great, why haven’t Oklahoma and Georgia done better on various social and educational indicators?  Citing statistics for various years, they argue that

“…Neither state program has demonstrated major social benefits…. A … realistic report card for the two states:

Lowering teen births: Oklahoma, Fail; Georgia, C.

Raising graduation rates: Oklahoma, Fail; Georgia, Fail.

Raising fourth-grade NAEP reading scores: Oklahoma, Fail; Georgia, C.

Closing the minority achievement gap: Oklahoma, Fail; Georgia, C.”

What this analysis fails to reckon with are the many other social and economic forces that frequently shift state educational and social indicators. Once one accounts for the statistical uncertainty due to such forces, it is very difficult from case studies of one or two states to determine the effects of even major educational and social interventions with the needed degree of precision.

In making this argument, I am going against ordinary human intuition. We human beings think in terms of specific examples and anecdotes. We love to generalize from a particular individual to everyone, or from a particular state to the world.  Both liberals and conservatives do this. Politicians do it when they argue for re-election on the basis that the national or state economy is doing well. Voters do it when they reward politicians on that basis.

But from a social science perspective, case study analysis is quite difficult to do with sufficiently good statistical precision. If we’re trying to estimate whether any one policy in one state (or one metro area, or one school district) has made a “statistically significant” difference to the state, detecting such a difference requires both very large effects and statistically sophisticated procedures.

Let me take as an example the case of Oklahoma preschool and detecting its effects on 4th-grade test scores on the National Assessment of Educational Progress. Prior to 1998, Oklahoma had a targeted preschool program that typically enrolled 11% or less in pre-K. In 1997-98, Oklahoma enrolled about 5% of all 4-year-olds in state-supported pre-K.  In 1998-99, this jumped to 38% in the state pre-K program.  The program has since expanded more gradually, and today Oklahoma enrolls about 74% of all 4-year-olds in state preschool. But the big jump occurred between 1997-98 and 1998-99.

Given all the other things affecting 4th grade test scores, if we want to detect the aggregate effect of this preschool expansion, we need data on 4th grade test scores based on children who were age 4 “before” this 1998-99 “big jump” in preschool enrollment, and also 4th grade test scores for children who were age 4 “after” this 1998-99 “big jump” in preschool enrollment. To help control for other factors affecting 4th grade test scores, it would probably be best to consider the closest observations we can get “before and after” the big jump in preschool enrollment. If two observations are closer in time, we can hope that “other factors” affecting test scores will not have changed as much.

In the case of the NAEP, we happen to have reading and math results for 4th graders in the winters of 2003 and 2005, which would correspond to children who were four-year-olds in the 1997-98 and 1999-2000 school years.  Over that two-year time period, Oklahoma preschool enrollment went up from about 5% of all 4-year-olds to around 51%, a jump of 46 percentage points.

What would we expect to happen to NAEP 4th-grade test scores due to an increase in Oklahoma preschool enrollment of 46 percentage points? According to the NIEER study of five state pre-K programs, Oklahoma’s program raises both literacy and math test scores at kindergarten entrance of pre-K participants by about an “effect size” of 0.35. (This literacy test score effect averages results across a vocabulary test and a “print awareness” test; the math test score effect comes from a single test.)  This effect size calculates the test score change as a proportion of the standard deviation of the test scores across different students. This corresponds to an increase of between 13 and 14 “percentile points”, which certainly seems “large” in that most parents would care about such an improvement for their child. (We’ll see later how this translates into benefit-cost analysis of the program.)

But this increase only occurs for the additional pre-K participants, who are 46% of the four-year-old population. So we would expect average kindergarten entrance scores over ALL 4-year-olds in Oklahoma to go up by only 0.46 times 0.35, or an increase in “effect size units” of 0.16.

But the NAEP data are for fourth grade. The evidence suggests that we should expect to see some considerable depreciation of initial cognitive effects of preschool between kindergarten and fourth grade. This is true even for programs, such as Perry, Head Start, and the Chicago Child-Parent Center program, that show a considerable “bounceback” in program effects in adulthood.

It is certainly plausible that the initial effects of Oklahoma pre-K could depreciate by half by fourth grade. If so, the average “effect size” we would expect to see at 4th grade would be an increase in math and reading test scores of 0.08.

On the NAEP, an effect size of 0.08 corresponds to an increase in the NAEP score of a little less than 3 points. (The NAEP standard deviation is around 35 points, and 0.08 times 35 is 2.8.)
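The chain of arithmetic above can be written out as a short sketch. All four inputs are the figures used in the text (46-point enrollment jump, 0.35 effect size, fading by half, NAEP standard deviation of about 35); the function name and constants are hypothetical labels for illustration.

```python
ENROLLMENT_JUMP = 0.46       # added share of 4-year-olds in pre-K
EFFECT_SIZE_AT_ENTRY = 0.35  # effect on participants at kindergarten entrance
SHARE_RETAINED = 0.5         # assumes effects fade by half by 4th grade
NAEP_SD = 35.0               # approximate NAEP standard deviation, in points


def expected_naep_gain(enrollment_jump: float, effect_size: float,
                       share_retained: float, naep_sd: float) -> float:
    """Expected change in a state's average 4th grade NAEP score, in points."""
    cohort_effect_at_entry = enrollment_jump * effect_size   # about 0.16
    cohort_effect_grade4 = cohort_effect_at_entry * share_retained  # about 0.08
    return cohort_effect_grade4 * naep_sd


gain = expected_naep_gain(ENROLLMENT_JUMP, EFFECT_SIZE_AT_ENTRY,
                          SHARE_RETAINED, NAEP_SD)
print(f"Expected 4th grade NAEP gain: {gain:.1f} points")
```

Without the intermediate rounding to 0.08, the product is fractionally above 2.8, which does not change the conclusion of a gain of roughly 3 points.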

What do we actually observe for Oklahoma? From 2003 to 2005, Oklahoma’s 4th grade reading NAEP test score increased by 0.5 points LESS than the U.S. But over the same time period, Oklahoma’s NAEP math test score results for 4th grade increased by 1.9 points MORE than the nation as a whole.

But there are two sources of uncertainty in interpreting these estimates as showing the effect of Oklahoma’s preschool program.  The first is that we don’t have an infinite sample size of students taking the NAEP test in Oklahoma and the U.S. Because we have a finite sample, there is some uncertainty about whether the Oklahoma and U.S. results accurately represent the population of students in Oklahoma and the U.S.  in 2003 and 2005.

This sampling uncertainty by itself is sufficient to make it hard to tell whether the NAEP results deviate from what we would expect. The “95% confidence interval” for the Oklahoma reading test score decline of 0.5 points relative to the U.S. is plus or minus 3.3 points.  Therefore, 19 times out of 20, the true number for the population as a whole would be somewhere in the range from minus 3.8 points to plus 2.8 points.

Similarly, for the math test score gain in Oklahoma relative to the nation of 1.9 points from 2003 to 2005, the 95% confidence interval is plus or minus 2.7 points. If we had an infinitely sized sample, the probability is 95% that Oklahoma’s advantage over this time period would be somewhere between minus 0.8 points and plus 4.6 points.

But this is only one source of uncertainty about the estimates. Even if we had an infinite sample for both Oklahoma and the U.S., we know that there could be a wide variety of influences (changes in demographic mix, changes in school funding, changes in cultural trends, etc.) that might cause test scores to change. The same is true of other educational and social trends such as high school graduation rates and teen pregnancy.

We can gauge the possible size of this additional uncertainty, due to unobserved variables that cause test scores or other state outcomes to fluctuate, by observing how much test scores or other social outcomes in various states fluctuate from year to year, above and beyond what we would expect due to limited sample sizes. When economists have looked at these fluctuations across states, they frequently find that such fluctuations widen confidence intervals by two-fold to five-fold. That is what is found in Conley and Taber’s paper on this topic, which looks at influences on state college attendance. And it is found in Fitzpatrick’s paper on Georgia, which looks at how preschool influences 4th grade NAEP test scores.

Thus, the true confidence intervals are not plus or minus 3 points. They are probably at least plus or minus 6 points, and possibly much larger.
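A small sketch shows how the intervals behave when they are widened. It uses the reading and math estimates quoted above (minus 0.5 plus or minus 3.3, and plus 1.9 plus or minus 2.7) and simply multiplies the half-widths by two and by five. That multiplication is a simplification for illustration, not the method used by Conley and Taber or by Fitzpatrick, and the helper names are hypothetical.

```python
def interval(estimate: float, half_width: float, inflation: float = 1.0):
    """Return (low, high) after widening the half-width by `inflation`."""
    widened = half_width * inflation
    return (estimate - widened, estimate + widened)


def contains(bounds, value: float) -> bool:
    """True if value lies inside the (low, high) interval."""
    low, high = bounds
    return low <= value <= high


# Oklahoma's change relative to the U.S., 2003 to 2005, in NAEP points:
# (point estimate, sampling-only 95% half-width)
ESTIMATES = {"reading": (-0.5, 3.3), "math": (1.9, 2.7)}
EXPECTED_GAIN = 2.8  # predicted preschool effect from the text

for subject, (estimate, half_width) in ESTIMATES.items():
    for inflation in (1.0, 2.0, 5.0):
        low, high = interval(estimate, half_width, inflation)
        print(f"{subject:8s} x{inflation:.0f}: {low:+.1f} to {high:+.1f}")
```

With the half-widths doubled, both the zero effect and the expected gain of about 2.8 points sit comfortably inside the reading and math intervals, which is why the data cannot distinguish between them.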

We can intuitively understand this by noticing that test scores in states such as Oklahoma fluctuate by many points over relatively short-term time periods, even when preschool access is not changing. For example, from 2000 to 2003, Oklahoma’s 4th grade math NAEP test scores increased by 5 points. From 1998 to 2002, Oklahoma’s 4th grade reading NAEP test scores dropped by 6 points. These are time periods that match up with time periods 5 years earlier, when these 4th-graders were 4 years old, during which preschool access in Oklahoma was not significantly changing. Apparently there are many other educational, demographic, and social trends that can cause quite dramatic short-run fluctuations in test scores.

The bottom-line statistical conclusion from this discussion is that we cannot tell whether the jump in preschool access from 1997-98 to 1999-2000 had the expected effect of increasing Oklahoma’s NAEP scores relative to the nation by about 3 points. The estimated differences between Oklahoma and the U.S. changes in test scores probably have confidence intervals of plus or minus 6 points at least, and the changes we actually observe are therefore quite consistent with the expected effect of Oklahoma’s preschool program over this time period. Unfortunately, the confidence intervals are so large that we cannot tell whether the test score gains due to preschool match expectations, or are zero, or are larger than expected.

But, one could argue, if we can’t tell in Oklahoma’s aggregate statistics that preschool is making a difference, isn’t it the case that preschool is having too minor an effect for it to be a worthwhile investment?  No, that does not follow. For example, suppose that the NIEER estimates are right that preschool increases kindergarten entrance test scores by about 13 or 14 percentile points. Not only is this large from the perspective of parents, but it would be predicted, based on Chetty et al.’s results, to yield large percentage effects in earnings. Chetty et al.’s results imply that such a kindergarten test score increase would be expected to increase adult earnings by about 7%. I think most people would regard this as a large effect.

From a benefit-cost standpoint, it certainly is a large effect. In my paper with Gormley and Adelstein, we used Chetty et al.’s numbers to calculate that a 1 percentile increase in kindergarten test scores would increase the present value of future earnings by about $1500 (in 2005-06 prices). A 13 percentile increase would increase the present value of adult earnings by almost $20,000. This is for a preschool program that in Tulsa had total costs of $4400 for a half-day program and $8800 for a full-day program.
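The comparison can be laid out as a quick calculation. The dollar figures are the ones quoted above; the ratio computed here counts only the predicted earnings gain against program cost, so it is an illustrative earnings-only ratio of my own construction and not the full benefit-cost ratio from the paper. The function name is hypothetical.

```python
EARNINGS_PER_PERCENTILE = 1500  # present value per percentile point, 2005-06 dollars
PERCENTILE_GAIN = 13            # kindergarten test score gain from pre-K
HALF_DAY_COST = 4400            # Tulsa half-day program cost per child
FULL_DAY_COST = 8800            # Tulsa full-day program cost per child


def earnings_benefit(percentile_gain: float) -> float:
    """Present value of added adult earnings for a given percentile gain."""
    return percentile_gain * EARNINGS_PER_PERCENTILE


benefit = earnings_benefit(PERCENTILE_GAIN)
print(f"Present value of earnings gain: ${benefit:,.0f}")
print(f"Earnings-only ratio, half-day: {benefit / HALF_DAY_COST:.1f}")
print(f"Earnings-only ratio, full-day: {benefit / FULL_DAY_COST:.1f}")
```

This sketch applies the same 13-percentile gain to both program lengths purely to show the arithmetic; the text does not say that half-day and full-day programs produce identical gains.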

Why aren’t such large earnings effects easier to detect in 4th grade test scores? In sum, these effects are hard to detect for three reasons: effects are spread across an entire cohort of children, many of whom do not experience increased access to preschool, and this lessens the average effect on the cohort; test score effects commonly fade but then re-emerge in improved adult outcomes; and there are a lot of forces causing test scores to fluctuate.

Does this mean it is impossible to tell whether preschool works? No, it is possible to detect preschool’s effects, but to do so we need better statistical evidence than is provided by a case study of one state’s aggregate data. We can learn a lot more if we have a program group and good comparison groups in the same state, which lessens the problem of other factors driving test scores. Or, we can learn a lot more if we have observations on many states that have dramatically expanded high-quality preschool during different time periods, which makes it easier to disentangle preschool’s influence from other forces that affect test scores.

Does this mean that preschool does not provide a miracle solution that immediately solves all social ills? Yes, it does mean that preschool is not a miracle solution, but that is quite different from saying that preschool does not pass a benefit-cost test. If preschool had effects on participants at kindergarten entrance that were, for example, five times as great as were estimated by NIEER for Oklahoma, say an effect size of 1.75 at kindergarten entrance, then even when these effects are dissipated by being measured over an entire cohort, and even if the test scores depreciated over time, and even if other factors affected test scores, then we could still readily detect preschool’s effects in aggregate state test score trends at 4th grade. But then we would be talking about preschool moving students from the 4th percentile to the 50th percentile, or from the 50th percentile to the 96th percentile, which are extraordinarily large effects, well beyond what it is reasonable to expect. But we don’t need effects anywhere near that large for preschool to pass a benefit-cost test, given that the cost of a quality half-day preschool program is only around $5,000 per year for one student.
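The hypothetical in this paragraph can be checked with the same arithmetic used earlier. The sketch below assumes normally distributed scores, the same 46-point enrollment jump, fading by half, and a NAEP standard deviation of 35; the plus-or-minus 6 point threshold is the rough lower bound on the confidence interval suggested above. Names are hypothetical.

```python
from statistics import NormalDist

HYPOTHETICAL_EFFECT = 5 * 0.35  # five times the NIEER estimate, i.e. 1.75
ENROLLMENT_JUMP = 0.46          # added share of 4-year-olds in pre-K
SHARE_RETAINED = 0.5            # assumes effects fade by half by 4th grade
NAEP_SD = 35.0                  # approximate NAEP standard deviation
CI_HALF_WIDTH = 6.0             # rough "at least" half-width from the text

std_normal = NormalDist()

# Percentile of a child 1.75 standard deviations below or above the median,
# assuming normally distributed scores.
low_percentile = 100.0 * std_normal.cdf(-HYPOTHETICAL_EFFECT)
high_percentile = 100.0 * std_normal.cdf(HYPOTHETICAL_EFFECT)
print(f"1.75 SD below the median: about percentile {low_percentile:.0f}")
print(f"1.75 SD above the median: about percentile {high_percentile:.0f}")

# Cohort-wide 4th grade effect in NAEP points under the hypothetical.
aggregate_gain = HYPOTHETICAL_EFFECT * ENROLLMENT_JUMP * SHARE_RETAINED * NAEP_SD
print(f"Aggregate 4th grade gain: {aggregate_gain:.1f} NAEP points")
print("Larger than a 6-point half-width:", aggregate_gain > CI_HALF_WIDTH)
```

An effect five times the estimated size would clear a 6-point half-width, though it could still be obscured if the true interval were several times wider than that.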

In sum, the key problem with the Dalmia/Snell article is that it does not provide strong evidence from a social science perspective. A case study of one state’s aggregate data will rarely provide convincing evidence for or against any social intervention, even in cases where that social intervention has a high benefit-cost ratio.  Case studies of one or two states may be a persuasive political argument, but evaluating the benefits and costs of any policy usually requires other, better evidence that can more accurately detect policy-relevant effects.
