What works in early childhood education panel | Gerald R. Ford School of Public Policy
 

March 21, 2016

Daphna Bassok, Howard Bloom, Christina Weiland and Hirokazu Yoshikawa discuss what works - and doesn't - in early childhood education. March 2016.

Transcript:

>> So thank you all for joining us today. I'm Susan Dynarski, and along with my colleague Brian Jacob at the Education Policy Initiative at the Ford School, we're happy to co-host this event with the School of Education. We thank Charles H. and Susan Gessner for their generous support of this event, and the staff who did all the work to make this happen so we can enjoy it. So this is a very exciting time to be talking about early education. The White House has been advocating for preschool expansion ever since the State of the Union in 2013, when the president announced his universal preschool initiative. And since then there's been a burst of energy in early learning at both the state and the federal level. Even here at the University of Michigan we've been paying attention to this issue quite a lot lately. Just a few weeks ago, we had Greg Duncan speaking at the Institute for Social Research, and our colleagues at the Ford School hosted a discussion of child care policy in Ontario and Michigan earlier this week. So with so much interest in these topics, it's a good time to get the design of these systems right, and that's what these distinguished researchers are going to be talking about. So let me tell you right now who they are. To my right here, and starting us off, is going to be Howard Bloom, who used to be at Harvard, at the Kennedy School. Howard Bloom is Chief Social Scientist at MDRC, a research institute based in New York, and Howard leads the development of experimental methods for evaluating program impacts, including a current reanalysis of the National Head Start Impact Study. To his right, following on, will be Christina Weiland. She's an assistant professor right here at the School of Ed, and she focuses on the effects of early childhood interventions and public policies on children's development, especially kids from low-income families.
Next, we've got Daphna Bassok, who is visiting from warmer climes, Virginia; she's an assistant professor in education policy at the Curry School of Education. And she was an undergraduate here at the University of Michigan. And she is less loyal to Michigan than she is to Zingerman's, where she worked all the way through her undergrad career. That's what she told me. Her research includes work on the effects of pre-kindergarten on educational outcomes, the early childhood teacher labor force, and trends in kindergarten becoming more academically demanding. And finally we've got Hirokazu Yoshikawa, the Courtney Sale Ross Professor of Globalization and Education at NYU Steinhardt. He's a community and developmental psychologist who studies the effects of public policies and programs on children's development. So let me tell you about the format. The speakers will speak, and then there's going to be a Q&A moderated by Brian Jacob. You've got note cards to write questions down on; we will collect those periodically, and Brian will read from them and ask the questions. And if you're the type of person who wants to go on Twitter, you can also put your question on Twitter using the hashtag #EPIearlyED. You can also snark there or say nice things on Twitter. OK. And so now I'd like to present our first speaker. Howard will get us started. Thank you.

[ Applause ]

>> Thank you, Susan, and thank you all for coming today, and thank you, Chris, for inviting me to come. I am definitely not an early childhood education expert, but I'm going to talk about some work that Chris and I have been doing over the last four or five years, which is a reanalysis of something called the National Head Start Impact Study. It's a very important study. It's got a lot of information about Head Start and its alternatives. And in many ways it's really a good case study in terms of how to study variation in impact. What I do know something about is how to study impact variation. I've been working with a number of colleagues over the last several years trying to develop methods and ways of thinking about variation in impacts of programs or interventions across individuals, or subgroups of individuals, and across sites. And so the work that we've been doing on reanalyzing the National Head Start Impact Study is an application of that kind of thinking to substance that's highly relevant for today's discussion. The Head Start Impact Study was a congressionally mandated evaluation of the national Head Start program. It is the first large-scale randomized study of Head Start and, quite frankly, the only large-scale randomized study of Head Start. It was conducted in the program year 2002-2003, so you get a sense of when the program I'm going to be talking about was being fielded and evaluated. And it was conducted, and this is very unusual in evaluation research, not unique but almost unique, in a nationally representative sample of sites, sites being Head Start centers; I'm assuming everybody knows about Head Start centers.
But it's a nationally representative sample of oversubscribed Head Start centers, which was most, but not all, of the Head Start centers. Oversubscribed in the sense that they had more applicants than they had seats, so it was possible, both logistically and on political and ethical levels, to justify a randomized experiment. And I'm assuming most people here know what a randomized experiment is. You're randomizing applicants into the treatment, which in this case would be Head Start and an offer to attend a Head Start center, or control status, which in this case would be not getting the Head Start offer. That creates an ability to compare a treatment group of kids who, in a sense, won that lottery, if you will, to a control group of kids who lost that lottery, who in all respects, on average, overall, should be the same in both measured and unmeasured ways. That gives you a kind of rigor that you don't get in evaluation designs that aren't randomized. So that's why the randomized part of it is important. It's an unusually rigorous way to say that whatever difference we see between the treatment and control groups later on, in terms of their outcomes, is arguably caused by the offer of Head Start versus not getting an offer of Head Start. So that's a really important starting point for all of these things. The study produced a sample of roughly 4,400 kids that were randomized in about 350 Head Start centers. The analysis we're going to be talking about is a subset of that, more like 3,500 kids in about 300 centers, but it's a lot of kids and a lot of centers that we're talking about, in a nationally representative sample. And this data is now public, well, not quite publicly available. It's what's called a restricted use file, so under certain guidelines you can use this data, and it's been used by a number of researchers.
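The lottery logic described here can be sketched in a few lines of simulation. This is an illustration only, not the study's actual data or code: the sample size roughly matches the study, but the outcome scale and the 0.2 offer effect are invented. Under randomization, a simple difference in mean outcomes between lottery winners and losers estimates the causal effect of the offer (the "intent-to-treat" effect).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated lottery: each applicant is randomized to a Head Start offer
# (offer = 1) or no offer (offer = 0), so the two groups are comparable
# on average in both measured and unmeasured ways.
n = 4400                                      # roughly the study's sample size
offer = rng.integers(0, 2, size=n)            # 1 = won the lottery, 0 = lost
outcome = 0.2 * offer + rng.normal(0.0, 1.0, size=n)  # simulated test score

# Intent-to-treat estimate: difference in mean outcomes between groups.
itt = outcome[offer == 1].mean() - outcome[offer == 0].mean()
print(f"Estimated effect of the Head Start offer: {itt:.2f}")
```

With a sample this large, the estimate lands close to the simulated true effect of 0.2; in a non-randomized design the same comparison could be badly biased by unmeasured differences between the groups.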
So I'm not going to talk just about what Chris and I have done and what we've found; I'm trying to look across what others who have been using this data set seem to have concluded, and a little bit about why. But mostly about our stuff, since we did it, but also what others are finding, what that means, and why they think that what they've found is to be believed. OK. So, I don't know how to separate these pages, so we'll look at them at the same time. My presentation goal, as I just said, is to briefly synthesize and summarize what's been learned about impact variation from the Head Start Impact Study. I'm going to jump from the conclusions to the findings. So first, I have conclusions about impact variation across individual kids. I have conclusions about impact variation across sites. And then I have a couple of conclusions about the role of early childhood education quality, because a lot of people have been trying to deal with the issue of what's the relation between quality and outcomes and impacts and such. So I want to talk a little bit about each of those things. All right. The first conclusion, then, is the first sub-bullet, if you will, that thing right there: impacts tend to compensate for limited prior English. This is the finding that Chris and I came upon, and I want to demonstrate to you the results that we based that conclusion on. And what I mean by compensation is that kids who, before the experiment started, did less well than others on a pretest of, as it turns out, receptive vocabulary had bigger impacts than kids who did better than them on the pretest, but only inside of dual language learners. And this is a finding that's sort of unique to our work. Some people disagree with it; some people simply aren't speaking to that point. OK. But let me show you the findings. The findings are right next to it. OK.
There are two sets of findings. The top panel is one set of findings, and the bottom panel repeats those findings for another cut through the data. Let me describe what these findings mean. The two columns are for different outcome measures; they're different tests of kids' ability to do cognitive things. One is something called the PPVT, the Peabody Picture Vocabulary Test. This is receptive vocabulary: can people understand what they're hearing? Another test is for early numeracy: that's the Woodcock-Johnson Applied Problems Test. Can they deal with early numerical kinds of things? OK. We look at other outcomes, but on these two outcomes we see this very, very pronounced pattern of effects. We break the sample into dual language learners and everyone else. At baseline, a number of characteristics of the kids were enumerated, the data were taken on them, and so some kids were designated as dual language learners or not. So dual language learners are one subset of this sample, and English-only is the rest of the kids in the sample. It's a binary thing; it's one group or the other. And inside of the dual language learners, we split that subsample into two sub-subsamples, if you will: those dual language learners who were what we call low pretest performers, defined in a particular way, as in this panel, as scoring at or below the 33rd percentile of all the kids in the study, versus everybody else. So those who scored, on a pretest given pretty much before Head Start started, in the lower third, versus those who weren't in the lower third. That's the split inside of dual language learners: low pretest performers and other sample members. Likewise, the same split with the same criteria for English-only sample members.
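The four-way split described here can be sketched on simulated data. Everything below is invented for illustration (the sample size echoes the analysis subset, but the score distribution and the DLL assignment are made up): sample members are divided by dual-language-learner status, and within each group by whether their pretest score falls at or below the study-wide 33rd percentile.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3500
dll = rng.integers(0, 2, size=n).astype(bool)   # dual language learner flag
pretest = rng.normal(0.0, 1.0, size=n)          # simulated pretest score

# The cutoff is the 33rd percentile of ALL kids in the study, so "low
# pretest performer" means the same thing in both language groups.
cutoff = np.percentile(pretest, 33)
low = pretest <= cutoff

groups = {
    "DLL, low pretest": dll & low,
    "DLL, other": dll & ~low,
    "English-only, low pretest": ~dll & low,
    "English-only, other": ~dll & ~low,
}
for name, mask in groups.items():
    print(f"{name}: {mask.sum()} children")
```

The key design choice is that the percentile is computed over the whole sample, not within each language group; average impacts are then estimated separately for each of the four cells.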
And then we estimated the average impacts for each of those four subgroups. Are you with me? OK. And this is the pattern that we see. For the dual language learners, the low pretest performers have an average impact measured in what's called the standardized mean difference effect size. Most people who do education research know what that means, and those of you who consume education research have read findings that are put out in that metric; it's a particular metric that is meaningful to researchers and not as meaningful to other people. But an effect size of 0.36 is a very, very large effect. People differ, but I would say anything above 0.15 is something to be relatively pleased about. OK. And if you're talking about more than double that, that's a very large average effect. That's on the PPVT for the dual language learners who are low pretest performers. The other sample members, who are not low pretest performers, have a 0.19. That's not nothing, but it's not anywhere near 0.36. And the fact that those two are bolded means those two findings are statistically significantly different from each other. So there's a big difference in the actual values of those estimates, and it is statistically significant. They are more different than could have happened by chance. Over here for the other outcome, the applied math, you see a similar pattern: a big positive effect for the low pretest performers amongst dual language learners, and a negative effect that is not statistically significant, so you can't actually say it's different from zero with any confidence. But the difference between those two is absolutely statistically significant and quite large. All right. So there's a compensatory pattern. The word compensatory, as people are using it, means you're compensating those who started off toward the bottom more than you're helping the kids who weren't at the bottom.
So that's the word people are using all the time. You see it amongst the dual language learners; can everybody see that finding amongst the dual language learners? You do not see it amongst the rest of the sample. Among the English-only sample members, these are the comparable findings, and they're all small to modest at best, and there's no clear pattern at all between the low pretest performers and the rest of the sample for the English-only group. We think that is basically evidence that Head Start is compensating for lack of prior access to the English language, because we're talking about receptive vocabulary here as one of the outcomes, and then math, where the test was given in English, so you've got to understand English to do the math test. There's a group out of UC Irvine, Marianne Bitler and her colleagues, who talk about finding a compensatory pattern, but they're not talking about English language; they're not specifying what it's compensating for, it's just compensating. They look at this pattern in a different way, and they don't think they're finding this effect. I don't think, however, that you can explain this effect away. Now, one other thing we did: we wondered whether this pattern of findings is sensitive to where you make the cutoff for low pretest. The lower third is low, but you could go lower. So we did the same analysis for a different pretest threshold, the lowest 20% of the pretest scores. Are you with me? It's the same analysis, only now calling low pretest performers those who are in the lowest 20% of the pretest scores, and you see exactly the same pattern, only a bit more extreme.
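The metric these subgroup impacts are reported in can be sketched directly. This is a minimal illustration on simulated data (the group sizes and score distributions are invented): the standardized mean difference divides the treatment-control gap in an outcome by the pooled standard deviation, which is what makes impacts on different tests, like the PPVT and Applied Problems, comparable on one scale.

```python
import numpy as np

rng = np.random.default_rng(2)
treat = rng.normal(0.36, 1.0, size=600)    # e.g., a treated subgroup's scores
control = rng.normal(0.0, 1.0, size=600)   # its randomized control group

def effect_size(t, c):
    """Standardized mean difference: mean gap over the pooled standard deviation."""
    pooled_sd = np.sqrt((t.var(ddof=1) + c.var(ddof=1)) / 2.0)
    return (t.mean() - c.mean()) / pooled_sd

print(f"Effect size: {effect_size(treat, control):.2f}")
```

On this scale, by the rough benchmark quoted in the talk, anything above about 0.15 is worth being pleased about, which is why 0.36 reads as a very large average effect.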
So we think that's evidence of compensation for the English language. Now, one of the things we did, and we're not quite sure what to make of it, and we've been changing the way we analyze the data, and I don't have a slide to go with this, is follow the kids up; there are later tests. These outcomes were measured at the end of the Head Start year. The kids were randomized before Head Start, there was a Head Start year, 2002 to 2003, and the posttest was given in the spring of 2003. These findings are based on the end of the Head Start year. There were later waves of follow-up for the Head Start Impact Study where a lot of these findings faded away, and some of you who saw Greg Duncan talk here recently heard him talk about fade out; this is one of numerous examples of that. One of the questions we have, on which we haven't finalized the analysis yet, is whether there is fade out of this pattern amongst the dual language learners. For the dual language learners, there's less fade out than for the other people. I mean, there's a bigger impact to start with, and there is a lot of fade out, but there may be some residual effect or not. We haven't really decided what we think yet on that, but the policy implications are profound. If it fades out, what does that mean? If it doesn't fade out, what does that mean? But it's very clear to us that there's this English language compensation going on. That's one set of findings. The other thing I want to talk about, and then I'll try to talk about other people, but I'm not going to talk about other people given that I just saw I had one minute, but that's OK, we can do that in questions and answers, is cross-site impact variation, which is where we spent a lot of our time, both developing methodologies and applying them.
The question here is how much the impacts of Head Start, on six different outcomes, four cognitive outcomes and two social-emotional outcomes, vary across sites. So what you're looking at is a slide where the first column is the cross-site average impact, again in this effect size metric. What you are looking at in the second column is the estimated standard deviation of impacts across sites. So for example, this finding here, and then I'll stop because I know I have to stop: it says for receptive vocabulary on the PPVT, the overall average that we estimate, and this is the effect of attending Head Start, is 0.17. The estimated cross-site standard deviation is 0.17 in the same metric. What that suggests, if the cross-site distribution of impacts of Head Start is anywhere near normally distributed, that'll [inaudible] anywhere near basically, is that 95% of the sites are somewhere within 0.17 plus or minus 2 times 0.17, which by my calculation, since I don't do arithmetic particularly well, is somewhere between minus 0.17 and 0.51. OK. It's a big range; most of it is positive, a little of it is negative, but it's a very, very big range. So then the question is, why all that variation? And I'll stop in a second. The other aspect of this, which would be worth looking at if we had more time, or if somebody asked a question perhaps, is that you've got an early numeracy outcome with a decent-sized average and much smaller variation. Chris and others think that might be due to lack of variation in the way early math was taught in Head Start, or anywhere, at that point in time, and I'll stop there, just sort of suggesting that stuff.
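The interval quoted here is a direct arithmetic consequence of the normal approximation, and can be worked through explicitly: if cross-site impacts are roughly normal with mean 0.17 and cross-site standard deviation 0.17 (both in effect-size units), about 95% of sites fall within the mean plus or minus two standard deviations.

```python
# The two numbers from the slide, in effect-size units.
mean_impact = 0.17
cross_site_sd = 0.17

# Under an approximately normal cross-site distribution, ~95% of sites
# lie within two standard deviations of the mean.
low = mean_impact - 2 * cross_site_sd
high = mean_impact + 2 * cross_site_sd
print(f"~95% of site impacts lie between {low:.2f} and {high:.2f}")
# prints: ~95% of site impacts lie between -0.17 and 0.51
```

That range, mostly positive but dipping below zero, is what motivates the follow-on question of why site-level impacts vary so much.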

[ Applause ]

>> There we go. Hi everybody. It's great to be here having this conversation and getting a chance to hear questions about the work. Thanks for the invitation. My birthday was recently, and so I feel like it was a big present to get to invite these wonderful colleagues here to have this conversation. So I volunteered to go after Howard Bloom, which is something I rarely do and try not to do, because he's always a hard act to follow, but the content of the presentation that you saw about Head Start really sets the stage for what you can think of as the preschool 1.0 question: if we increase access to preschool, who benefits, right? And so you saw some really interesting findings around dual language learners with low baseline test scores in particular benefiting, and about variation in impacts across centers as well. And I'm going to talk today a little bit about the 2.0 question, which is jumping beyond access to thinking about how you scale high quality in particular. So I'm going to talk about what specific program elements work best for ensuring high quality and promoting initial and lasting gains for children. So to launch into that, let me define what we mean generally in early childhood when we talk about high quality. We conceptualize it as being in two buckets. The first is structural features, like class size, ratios, teacher education and training. These things are fairly easy to regulate and monitor from a policy perspective. We have done so, and what we've seen is that we have moved to a place where very few programs, public programs at least, do poorly on this domain of quality. They are sort of in a middling range.
Harder to regulate are process features. These are your high-quality interactions within the classroom, particularly rich learning opportunities: are the children being challenged in a way that's developmentally appropriate and is pushing them along a continuum of skill building that they are fully capable of at this age? This is harder to regulate, and what we know about the relationship between structural and process quality is that structural quality sets the stage for process quality to occur, but it alone isn't sufficient, right? So class size isn't enough. And nationally, on process quality, from the data that we have, we do have some good news in terms of emotional support, which is a piece of process quality. We do a pretty good job in our public programs of making kids feel supported emotionally in the classroom. So these are data from five public systems around the country that are large-scale: the Boston pre-k, Tulsa pre-k, Tulsa Head Start, national Head Start, and the 11-state pre-k study. That dotted line up there is about the good threshold on this measure, and pretty much all of these systems are clearing it. If you look at instructional support, however, we have a problem on our hands nationally. That black line is the adequate line, and as you can see, folks are pretty much not clearing it, or are well below it, with the exception of Boston, which I'll talk about in more detail. OK. So if we're thinking about what works in early childhood education, I think that's the problem we have to think about, because that's probably the one we're struggling with the most. And Hiro, who's on the panel today, and a group of experts reviewed the literature in 2013, as this proposal from Obama was coming out, to give some guidance to policy makers about what we know works in early education, and identified the strongest-hope model across a set of about 12 to 15 RCTs that have occurred over the last decade.
There's been a pattern of emerging success, which is that you take a domain-specific curriculum that was developed by somebody who's an expert in a particular area of early childhood development; Nell Duke, here at the School of Ed, has one of these curricula, one that works for literacy, for example, and, you know, she knows that domain like the back of her hand, so it's not a surprise that that curriculum in particular works well. If you pair that with regular in-classroom coaching by a supportive mentor, that seems to be a winning package. So we have this pattern in which, if you're going to tell people what to do in their program, that's probably the thing you would tell them to do. We also have some examples, which are really important, of combining multiple domain-specific curricula. Preschool teachers are asked to improve children's learning across a variety of things, not just, say, literacy and language but also math and socio-emotional development. So we do have some important examples of folks putting different curricula together and having success with doing so. And I'm going to talk specifically about one of the places where we've seen success, and that's the Boston pre-k program, which I, with a lot of folks in this room, have actually been investigating: Hiro, Howard, students, Shana Rochester, who's in the back, Anna Shapiro, if she is here, Sania Zaidi, Becky Unterman, who's here from California, and Howard, too. Boston is an interesting model because it's not one that was tightly controlled by researchers or by curriculum developers. It's a district that looked at the literature and said, what are we going to do to improve our model, and sort of ran this improvement system themselves. So to take you through this: at the beginning of the program's history, in 2005 when they began the program, they made really strong structural quality investments.
So teachers were paid on the same scale as K-12 teachers, and they were subject to the same educational requirements as K-12 teachers, including a master's degree within five years. Those are fairly rare features, particularly in 2005, within our systems. However, in 2006, when the quality of the program was measured by an outside group, there was a finding that the instructional quality was pretty low. It looked like other places that we have nationally now, and the headline on the front page of the Boston Globe is what you see on the slide up there, "Boston preschools falling far short of goals, hobbled by mediocre instruction," which is a pretty scary headline on the front page of your hometown paper if you're a new pre-k program. So they took those findings and moved forward with them. They put into place proven language, literacy, and math curricula that they combined for teachers, so teachers get teaching guides. And they developed a coaching system in which a coach came in to watch the teacher's instruction on a weekly to biweekly basis across the district and to give them feedback that was supportive and not punitive. And when the research teams that I've been a part of came in and did work on this model, what we saw was that within two years of making this switch, the program had the highest instructional quality we've seen nationally in a large-scale system, and it had impressive child impacts. So I'll show some of these; Howard already explained to you what an effect size is, which is very nice. These are the impacts on the domains that were directly targeted by this model. The impacts for vocabulary and math in particular are the largest we've seen in a large-scale public pre-k program, and the early reading one as well, around alphabet knowledge and that kind of thing, was a large impact.
We also saw spillover of the impacts onto other domains, particularly executive function skills. This was not directly targeted by the model but, you know, it's one kid, one brain, and these things are linked, and so we saw some evidence that there was spillover here. We also saw that the children who particularly benefited from the model were children who are low income and children who are Latino, but everybody benefited from the model. And we also found that two-thirds of our control group weren't just home with their parents; they were actually in other preschool programs around the city. So this is a pretty strong counterfactual relative to some of the other preschool programs that we have, where maybe the options in the control group are not quite as robust. As we look nationally, though, and think about what works, we don't see too many places making the decisions that Boston made. Nationally, most programs aren't using domain-specific evidence-based curricula. They're using whole-child curricula, and so I've put up the most popular choice, which is the Creative Curriculum, and this is its effectiveness rating in the What Works Clearinghouse. For these important scales, mathematics, oral language, phonological awareness, print knowledge, the effectiveness rating is zero. That's in contrast to Building Blocks, which is the math curriculum that's used in Boston and in some other large-scale systems, where you see an improvement index of 36 percentile points and an effectiveness rating at the most positive end of that scale. OK. Coaching, again, isn't a practice that's particularly widespread. We don't have great data on this, but it's not something that is really commonly implemented within large-scale programs, although that is changing somewhat.
And the history of why that is, is not particularly definitive, but it's probably due to the fact that some of these curricula are newer; for the ones that we have a stronger evidence base for, it takes time for people to pick up on the latest thing. We also have requirements in some systems that teachers have to pick a curriculum that covers every child domain, which is going to lead you to one of those whole-child curricula. And we also have some programs that require you to collect data on students who are in your program, and those tend to be tied to whole-child curricula. So if you are a district and you have to buy something, maybe it's better to buy one thing than two things and try to integrate them yourself. And so we do have a lot of unanswered questions in thinking about the best advice for where we should go forward; I have about five slides on what we don't know. But just to call out a couple of things: we don't understand entirely how folks are making these curriculum and PD decisions at the pre-k level. We also have a promising new curriculum from Nell Duke and Doug Clements and others, in which domain-specific experts have come together to make one curriculum, so that you're not tasked with cobbling together a bunch of different curricula, but it hasn't been tested yet. We also don't really know how to best sequence things from preschool to third grade in ways that recognize that preschool is part of a pipeline, just the beginning of the education system, and how to sequence it appropriately.
But I think, with all of those unknowns and given where we are in the literature right now, one thing that I'm thinking a lot about is the potential to work within the Every Student Succeeds Act, which is our new federal education law, to potentially nudge localities to adopt evidence-based curricula and coaching. Within that, we have for the first time a definition of what evidence-based means, and some incentives that are intended to help nudge folks toward using things with a higher evidence base. I think there's a lot of work still to be done to see what the policy will look like at the ground level, and there are some negotiations happening around the rules. But that is one thing: as we go forward, we may see some movement toward the models that have more evidence. We also have a lot of folks working on active ingredients in preschool right now; I won't go into that. That will help us, in five years, have better answers about this what-works question. And I think nationally we are seeing a shift that's hopeful, around not just talking about whether we should have preschool or not but at the same time talking about what it should look like, which is a very important question. It's not enough to just have access; you have to have access and quality at the same time. So, thank you. I look forward to the questions.

[ Applause ]

>> OK. Those were two total lead-ins into what I was going to talk about, which will help. I'm going to talk about some recent policy efforts to create some of the quality changes that Chris was talking about. So, as Chris mentioned, we've moved from "should we have access, should we provide preschool" to more around quality, and that was kind of my starting point as well. This first figure I'm showing is the preschool enrollment of children who are 3 to 5 in the United States. The blue bar on the top is the national trend in school enrollment for young kids, so this is not kindergarten but any sort of preschool-type experience, and you can see it's been rising very rapidly since the '80s. But something to note: that purple line kind of went up until the mid '90s and then has been pretty flat. That's private enrollment in preschool, so parents paying for their kids to go to child care. And that green line is kids enrolled in public preschool, so a really large public investment in providing preschool for kids. And going to some kind of nonparental care is very, very commonplace for kids in our country right now. The first three bars on the left are children 3 to 5, and this is from 2012; you can see that about 80% of kids are in nonparental care between the ages of 3 and 5, and 60% of them are in something like a child care center. These are 3- to 5-year-olds. And then there's another 12% who are being taken care of by a non-relative in some sort of home-based setting. And if you look all the way to the right, which is all children 0 to 5, there are lots of kids going to child care on a regular basis. Even including infants, 60% of kids are in nonparental care and about 50% of them are in non-relative care. So, basically, going to preschool, and public support for going to preschool, is a thing now. It's happening, and the support for it is high.
Where we're moving now is more of a discussion around what that should look like and how we ensure that the quality is in place so that these preschool investments lead to the kind of positive benefits that people tout around early childhood. So, we know from some of the studies that Chris discussed that high-quality early childhood experiences can be linked to a host of positive benefits, both short- and very long-term, but a lot of the programs that people are attending today are mediocre to not very good, particularly for kids in low-income communities. And as Howard pointed out, the variation is really tremendous. And the variation that Howard and Chris talked about so far has been really around variation within the Head Start program or variation within the pre-k programs. But there's also tremendous variation, which I'll talk about, across the entire sector, which also includes licensed child care centers that are generally of much lower quality than either of the large publicly funded programs. And especially because of some of the results of the Head Start Impact Study that have suggested fade-out, and recently of a study of a large-scale preschool program in Tennessee which suggested that by third grade the kids who went to preschool were not doing any better than the kids who didn't, there's been a lot of talk about, well, how do we take this idea of high-quality preschool to scale and how do we make it work, and the focus has really been on quality. So, to give you a sense of the fragmentation: these are the rules for teacher education across state pre-k and Head Start, which are more highly regulated, relative to the programs with less regulation. So in a family child care home, this is a person who is taking care of children in their house.
Only 18 states require that the person leading these have a high school diploma, and even when you look at private child care centers, 36 states require a high school diploma, and no states require an associate's degree or a bachelor's degree. So this is a very low level of education required compared to a Head Start program, where essentially nearly 100% of the teachers have a degree and 73% have a college education, and in state pre-k, 53% have a BA requirement for teachers. And the regulations are pretty powerful. So, in 1997, if you looked at Head Start, only a third of the teachers had a degree at that time, and through two reauthorizations of Head Start, there were pieces in the legislation that said, first, by 2003 we need 50% of Head Start teachers to have an associate's degree, then by 2013 we need 50% to have a bachelor's degree. The regulation was in place, and you can see that today all the teachers in Head Start do have that education level, but it required a large investment. And despite that investment within one sector, there are still huge differences across the sectors in what kids experience. So this is just one example. This is teachers' years of education across sectors, and those purple bars on the left are the teachers who are working with 2-year-olds. And so you can see that the first one is formal care, so this is taking your child to some sort of center-based care, and there you have teachers who on average have one more year after high school. And in the home settings, it's basically just a high school diploma. If you look at the formal sector, there's quite a bit more education when you're looking at the teachers of 4-year-olds than 2-year-olds. So the teachers of 2-year-olds have about one year post-high school, whereas the teachers of 4-year-olds have three years post-high school. And then finally, those blue bars on the end are the variation across the sectors with informal care.
So, these are the programs that 4-year-olds attend. There are private centers, Head Start, and pre-k, and you see that the pre-k teachers, who are oftentimes linked to the public schools and have the same kind of requirements as the K-12 system, have substantially more education. And this is a very similar picture for whether you have a degree in early childhood. And again, if you're looking at the experiences of the 2-year-olds in this country, even in the formal sector only 20% of the teachers working with young kids have a degree. Going to the 4-year-olds, about 60% do. So that's a huge disparity in the educational level of the person working with toddlers versus 4-year-olds. And again, you see that the private child care centers have much lower levels of education relative to the pre-k programs. OK. So, accountability has come up as a strategy to address some of these quality problems. And what do we mean by accountability? Basically, the idea is to create a set of quality standards that go across these sectors and say, "Here's what we mean when we talk about a high-quality program," and measure it, and provide both financial incentives and supports for programs to try to improve over time, and disseminate the information to parents and other stakeholders, so hopefully they can make a decision to select the care that has a higher level of quality. The Race to the Top-Early Learning Challenge expanded interest in these programs by requiring that in order to get the money, which was $1 billion that has been distributed for early childhood programs, you needed to design and implement a tiered quality rating system. And today 40 states have them, most of which started since 2011. So this is just a map of where these programs are located; you see 40 have them, and the states that don't are working on it right now, so this is becoming statewide.
And the idea is you measure quality, you provide these ratings, parents and providers respond, and over time we work towards improved outcomes, both because the centers will get better and because parents will opt for the higher-quality programs and the lower-quality ones will leave the market. OK. So really quickly, I just wanted to talk through four of what I think are the big central issues around whether these accountability systems are likely to have the desired impact in early childhood. One big, sort of philosophical question is: what should we be incentivizing in these early childhood systems? So, if you think about K-12 accountability, what we have been incentivizing is test score gains, and that is just not in the cards for an early childhood accountability system, both because we don't want to be testing 0- to 5-year-olds and because it is difficult and expensive and challenging to do that. But if not that, what should be the things that we are measuring? Should it be the structural kinds of quality measures that Chris mentioned? Should it be something about the quality of instruction? And the tradeoff there is that the structural features are not very good predictors of kids' learning. So, knowing things about how the classroom looks won't necessarily tell you much about how much the kids are learning. The quality of instructional interactions is much better, but also very expensive and time-consuming to collect. What states actually are collecting is a lot of varied measures: health and safety, curriculum, developmental screenings, family partnerships, professional development, education levels, ratios, environmental ratings, and this is just a smattering. States are basically collecting a ton of data. So, that brings me to the second question: how do you combine all these things you're collecting into something meaningful that's going to help you improve the system?
Ideally, you would want to make a system where, if you're going to call something a three-star program versus a four-star program, it's because the four-star program facilitates better learning or some other outcome that we care about. But we actually don't know very well how to create a recipe out of these many ingredients. We might think each of these things is individually important, but we don't know exactly how to link them all up together to create what we want, and certainly not in five little bins where this amount is better than that amount. So just as an example, this is Michigan's star rating system, and you can see it's moving from one star, where the programs don't meet many of the quality requirements, to five stars, where they meet all of them. This is a little hard to see, but within their program there is a bunch of different points assigned to different kinds of things, like family partnerships, the administration, the physical environment, a lot of different items. And they recently did a study where they changed the point allocations to different things, and it completely overturned which programs were ranked as high quality versus low quality. So, there is a lot of struggling around how to define the quality ratings. And in a national study, people took national data and tried to look at the different ways states are combining quality measures to predict four outcomes: math, pre-reading, language, and social skills. And the findings were very striking. The first finding was that each of the individual pieces was not a terribly good predictor of the outcomes they cared about. So the staff quality, the ratios, the family partnerships, and the environment did not predict any of the outcomes that they cared about. And really, only the interactions mattered.
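To make the point about point allocations concrete, here is a minimal Python sketch of how a QRIS-style weighted index maps programs into star bins. The measures, scores, weights, and cutoffs are all invented for illustration; the only point is that reallocating points can reorder which programs look "high quality," the kind of reshuffling described here for Michigan.

```python
# Hypothetical sketch of a QRIS-style point system: programs earn points on
# several quality measures, and the weighted point total maps to a star bin.
# Measures, scores, weights, and cutoffs are all invented for illustration.

def star_rating(scores, weights, cutoffs=(20, 40, 60, 80)):
    """Map a program's weighted point total to a 1-5 star bin."""
    total = sum(weights[m] * s for m, s in scores.items())
    return 1 + sum(total >= c for c in cutoffs)

programs = {
    "A": {"interactions": 9, "environment": 4, "family_partnership": 3},
    "B": {"interactions": 4, "environment": 9, "family_partnership": 9},
}

# One (hypothetical) point allocation that emphasizes interactions...
w1 = {"interactions": 6, "environment": 2, "family_partnership": 2}
# ...and a reallocation that shifts points toward the other measures.
w2 = {"interactions": 2, "environment": 4, "family_partnership": 4}

for name, scores in programs.items():
    print(name, star_rating(scores, w1), star_rating(scores, w2))
```

Under the first allocation, the two hypothetical programs tie at four stars; under the second, program A drops to three stars while program B rises to five, even though nothing about either classroom changed.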
And then when they took a bunch of state models of how to combine these pieces into an index, they found that it was pretty much a smattering and not predictive at all of kids' learning. So, there's this big puzzle: if we're going to collect this quality information, how do we make it predict the things we care about? A third big one is, basically, can these programs work, do they create incentives that lead programs to improve? And we do have new research that suggests that they do, that basically being randomly assigned a three- versus a four-star rating incentivizes programs to try to improve and make quite a bit of change in quality over time. So, that's encouraging. And the fourth one, and then I'll wrap up, has to do with whether it's really the case that parents can respond to these, given all their other constraints, and will vote with their feet for the higher-quality program. So, to give you a sense, the information is becoming much more common. Preschools will note on their websites that they received a good rating, there are newspaper articles, and there is a lot of emphasis on trying to inform parents about the ratings, but there is no empirical evidence yet of the extent to which parents are responding to this information, especially low-income parents, who have a lot to balance in selecting child care and have many other considerations, like hours, transportation, and services provided, that might trump the kind of quality measures being considered here. But we do know that parents in general tend to think of their child care as being quite good. So, 74% of parents in a large study said that their child care center was either perfect or excellent. And so parents are not being particularly discerning about the quality.
And in work that I'm doing in Louisiana, we saw that 80% of parents indicated that the center where their child was enrolled was their top choice. Two-thirds never visited another center, 40% never even considered another center, and I think together that does suggest that there is potentially a really important informational asymmetry, so that this information could come in and be very useful. So to wrap up, scaled-up preschool initiatives really need a focus on quality, and accountability initiatives are one way to work towards it. I think there's a lot of potential there around reducing the fragmentation, particularly in the child care sector relative to the public preschools and pre-k, but everything relies on knowing how to measure quality, and our understanding of exactly what to measure and how to rate it is hard. And in addition, without really focusing on supporting programs to improve, the "I" in these improvement systems, they're unlikely to create the results that people seek. I'll stop there.

[ Applause ]

>> Yes. Great. So, actually, I think the sequence of these talks really seems preplanned, and it kind of was. So, I'll be talking about an eight-year project in Chile, what a long-term research project has shown us in that context around this really quite difficult struggle to improve the process quality that I think all three speakers talked about. Before that: this is a very collaborative project across Harvard University, NYU, an NGO in Chile, and a local university there in Santiago. So, the story around the United States is actually very similar to the one in low- and middle-income countries. There have been a lot of high expectations built up in the field around the long-term impacts of early childhood education. But as much as the United States struggles with quality, in low- and middle-income countries that struggle is perhaps even more magnified. The expectations are reflected in the past 20 years of research from the evaluation sciences and also neuroscience, and in the new Sustainable Development Goal target 4.2, which is under education and learning, and which states that by 2030 the goal is to ensure access to quality early childhood development, care, and pre-primary education so that children are ready for primary education. So you do see the word quality in there for the first time. And early childhood development was represented in the 2000 to 2015 Millennium Development Goals only in terms of infant mortality and maternal mortality. So this is an advance, to think beyond survival about these issues of learning and development. On the other hand, that raises the challenge of what quality means. So, I think Chris and Daphna both covered this issue of quality, so I'm just going to skip over that slide.
But the context of this study is Chile, which has recently made it into the OECD, so it transitioned from being a middle-income country to a high-income country, and yet it shows all the patterns of inequalities in school readiness and learning that are actually quite similar to those in the United States. And there's been a rapid expansion of early childhood access under the first Bachelet administration, through the Piñera administration, and into the second Bachelet administration, so that now over 78% of 4-year-olds are attending pre-primary education. And they have a grade structure that's similar to ours, in the sense that 5-year-olds attend kindergarten, it's actually called kindergarten, and 4-year-olds attend pre-kindergarten. So, this is a project that started in 2006 and 2007 with an extensive stakeholder process around coming together on what would be the goal of a project, in Chris's terms a preschool 2.0 project, to improve the quality of pre-primary education in Chile. And so there was a wide stakeholder process, piloting, and the actual setting of goals: that language and early pre-literacy skills would be a major focus of an effort to improve quality, with secondary emphasis on health and on socio-emotional development. And between 2008 and 2012, after a year of piloting and implementation, we conducted the first school-level RCT of educational improvement in the country of Chile, with about 64 preschools and about 2,000 kids. What was the intervention? This was actually interesting, because Chile at that time did not have evidence-based curricula that really met the kinds of standards Chris was talking about, in terms of sequenced activities based on developmental evidence from that country. Instead, what was asked for and what was provided was a sense of what are good instructional strategies to promote vocabulary development, oral comprehension, and the kind of traditional focus on some early literacy kinds of skills.
And so at that point there was no ability to suggest, for example, how frequently teachers should do things like read books to children. There were suggestions on how to do interactive book reading, but not how often to do it, because that was not acceptable within the major systems of public preschools in Chile at that time. So this is a little bit like coaching plus good instructional strategies, but perhaps not curricula. This was the first test of coaching, provided twice a month to teachers, with feedback and observation in the classroom. So, what were the results? There were positive impacts, and now you're familiar with the effect size metrics, so between 0.4 and 0.8 on the CLASS, which is the most widely used observational measure of process quality. In the United States, it is actually the monitoring instrument for Head Start, and it is part of many of these QRIS systems. And what we found, for example, was that it had exactly the same psychometric properties as it does in the United States. So, it divides into these areas of emotional support, classroom organization, and instructional support. And before you look at these bars: what we found was exactly the same pattern as in the United States, which is fairly good emotional support and classroom organization, which is kind of like the organization of the routine of the classroom, but much lower levels of instructional support, in fact a little lower than the average in many of these studies in the United States, like in Head Start or in the 11-state pre-k study. But these are the actual effect sizes of this intervention. And so, they look like they're really quite large by Howard's standards on classroom quality, but why did they not then produce subsequent impacts on children's language outcomes, which is the lower graph?
Well, if you look at the American evidence and our own study, where we linked the CLASS to child outcomes using the standard approach to doing this (for folks who want to get into the technicalities, we can talk about it; it involves controlling for earlier child skills), the relationship between the CLASS and child learning outcomes is small, which means that a one standard deviation increase on the CLASS is generally associated with about a 0.10 to 0.12-ish standard deviation improvement in child cognitive skills by the end of preschool. So if you keep that in mind, that tells us something about the fact that you can get fairly robust effects on the CLASS that are still not sufficient to drive statistically significant improvements or substantively meaningful effects. Now, we do start seeing a 0.09, which is like the hint of an effect, on that early measure of pre-literacy, which is decoding: understanding, being able to identify letters and words. We did do a follow-up. We did find that there were quite high rates of absenteeism from preschool, and when we adjust for that and look at the impacts of this program on the kids who were most likely to attend consistently, we saw some more positive indication that the program was producing somewhat more robust positive language and pre-literacy effects for those kids. So, average levels of absenteeism in Chile were about 23% on any given day, and measuring on 15 randomly selected days across the year, any kid followed individually missed about that amount, about a quarter of days, so that's substantial absenteeism. So what did children actually experience? We were very interested in that, and luckily, what we did to observe classroom quality was we actually videotaped. And so we have gone back to those videotapes again and again and again.
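The back-of-the-envelope arithmetic implied by this passage can be written out. A small sketch using the effect sizes quoted in the talk (0.4 to 0.8 SD on the CLASS, and roughly 0.10 to 0.12 SD of child learning per CLASS SD); the chaining is my illustration, not the study's actual analysis:

```python
# Even a large effect on observed CLASS quality implies only a small expected
# effect on child outcomes, because a 1 SD gain on the CLASS is associated
# with only about 0.10-0.12 SD of child learning.

class_effects = (0.4, 0.8)      # intervention's effect on the CLASS, in SD
class_to_child = (0.10, 0.12)   # association: child SD per CLASS SD

implied = [round(e * a, 3) for e in class_effects for a in class_to_child]
print(implied)
```

So even the largest quality effect here implies an expected child-outcome effect of under 0.1 SD, which is easy for a study of this size to miss statistically.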
And they are a wonderful source of dissertations. And one study (well, this one is not a dissertation, this was a side study): with an army of coders, Susana Mendive at the Católica university and Chris conducted a minute-by-minute video coding of both the targeted and non-targeted teaching strategies from this program, across the experimental and control groups, so we could actually look at experimental effects on the number of minutes of targeted instruction. And the denominator is 80 minutes; because of the way that the CLASS works, you pick 20-minute segments across a randomly selected preschool day. And what we see here is that the number of minutes, first of all in the control group, is distressingly small. Just think of this within that 80-minute denominator: the average number of minutes of targeted strategies, which are good language instruction strategies, was about eight or nine. And before you get super distressed about Chile, you can get distressed about the United States, because this is actually not that different from the data on minutes of good language instruction in preschools in the United States. So we're not too far off. Now, this produced, again, significant increases, but only up to levels of about 12 or 13 minutes of good language instruction. The good news was that non-targeted things, things like simply repeating syllables, declined over time. So, what happened in the middle of the experiment was an opportunity to scale the program into another region of Chile. And that was before the experimental results came out, so we had to decide: how do we scale?
And the approach that we picked, because our health director Mary Catherine Arbour, who leads this phase of the work (this is after the experiment), had connections with the Institute for Healthcare Improvement, which has developed an approach to improving health care systems at scale that has been used worldwide and has reduced, for example, infant mortality nationwide in Ghana. And this came out of the corporate world, actually, originally in the 1960s and '70s. But it is about bringing stakeholder groups together to set quantifiable goals for quality improvement within a given intervention, whether it's a health care system or, in this case, the area of early childhood education, where we applied it for the first time. So if you've seen these kinds of cycles, Plan-Do-Study-Act cycles are about a group coming together, setting quantifiable goals, and then sharing information on their progress towards those goals. And it really has a lot of links to other kinds of things, like design thinking and, I think, some of these rapid-cycle innovation models. And so this was piloted in 14 schools, with networks of teachers, principals, parents, aides, and school leadership. And the idea is that this group of 14 schools actually gets together once every two or three months, first to set goals within the theory of change of this model, this approach in Un Buen Comienzo, and they set goals for what they want to improve. So for example, they knew that the average number of minutes of language instructional strategies within the main model was about 13, they're actually given this information, and they set the goal of: let's get to 30 minutes per day of those instructional strategies. And so, well, OK, these are technicalities that I'm not going to go into. So I'm going to instead show you the following.
So these are 14 schools, and we ended up comparing them to 49 schools that used the basic model but without this continuous quality improvement process. So, on the setting of goals, I'm going to give an example of how this set of schools actually set a particular vocabulary-based goal. And this was to introduce one new vocabulary word per day, with rotating strategies for incorporation of the new word. The idea was to get beyond introducing new vocabulary with simply a definition: OK, here is a badger, a badger is a kind of animal that looks like this, and then just stopping, or not even introducing a new vocabulary word at all. The idea was to actually link that word outward: what are other animals that look like a badger? Have you ever seen a badger? Let's draw one. Let's use multiple ways to link this to children's everyday experiences and to other kinds of words that are related in a kind of conceptual network. And so the idea was to try to build these more sophisticated vocabulary strategies. And they developed a measure. What is the measure to track improvement in this? The number of kids within the classroom every day who use this new vocabulary word with versus without an adult's help. So what we want to see is spontaneous use of new vocabulary by children as an indicator of quality, and this was measured in this kind of rapid-cycle approach every day by teachers. They would fill this out and then share their Excel spreadsheets three months later and see where did this work and where did this not work. And ultimately, that approach is what we have started to evaluate, because it turns out this continuous quality improvement strategy has actually not been evaluated causally, and so we are starting to use quasi-experimental methods, propensity score methods, to look at this.
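As a rough illustration of what those propensity score methods are doing, here is a minimal inverse-probability-weighting sketch in Python. All of the schools, propensities, and outcomes below are invented, and this shows one common propensity-score estimator (ATT weighting), which may differ from the exact estimator the team used:

```python
# Minimal sketch of a propensity score comparison: reweight comparison schools
# to resemble the treated (CQI) schools on observables, then compare outcomes.
# All propensities and outcomes are invented for illustration.

schools = [
    # (treated?, estimated propensity, vocabulary outcome in SD units)
    (1, 0.8, 0.45), (1, 0.6, 0.30), (1, 0.7, 0.40),
    (0, 0.8, 0.20), (0, 0.3, 0.05), (0, 0.2, 0.00), (0, 0.6, 0.10),
]

# ATT-style weights: treated schools get weight 1; comparison schools get
# e(x) / (1 - e(x)), so comparisons that "look treated" count for more.
treated = [y for t, e, y in schools if t]
ctrl = [(e / (1 - e), y) for t, e, y in schools if not t]

att = (sum(treated) / len(treated)
       - sum(w * y for w, y in ctrl) / sum(w for w, _ in ctrl))
print(round(att, 3))
```

Note that the weighting pulls the estimate below the naive treated-minus-control difference here, because the comparison schools that most resemble the treated ones already have better outcomes, which is exactly the selection problem the method is trying to address.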
I'm not going to explain what this slide is, but it gets at the purpose of what a propensity score is trying to do, which is to bring a treatment and control comparison somewhat closer to an experiment. And the good news is we are starting to see effects on these language outcomes that look like they're moving in the right direction. So, an effect size on vocabulary of about 0.31, comparing children who experienced this model with the continuous quality improvement to those in the model without continuous quality improvement. So we think that this was an approach to develop buy-in in the new region for this model, and teachers had, judging from focus groups, a very positive experience, feeling supported and working with peers towards quality improvement. And we are starting to see, again, language and literacy outcomes moving in the right direction. There are limitations, as in any study, so I'm looking forward to the discussion. But the idea is that with these forms of quality improvement, we need to be much more creative about thinking about how to improve systems that are already at scale, as early childhood education increasingly is around the world. Thanks very much.

[ Applause ]

>> OK. Well, I'd like to thank all of the panelists for a great set of opening remarks and reports of different results. My name is Brian Jacob. I'm the co-director of EPI. Welcome, everybody. I have the fun task of getting to ask questions, and I of course have some of my own, but I also have lots that have come from the audience. So to start out, I wanted to ask the panelists to talk about what they think the goals of early childhood education should be. I think there may be an implicit assumption underlying a lot of this. Is the assumption that we should be maximizing standardized math and reading scores, receptive vocabulary, and so forth, or other things? And how does that interact with various measures of quality, and how do you think about that? So--

>> Sure, so I will jump in, and then other folks please do also. So in terms of what we should be maximizing in preschool, I think this is a really great question. I think one of the things that we don't want to do is just focus on these early academic skills, right? And I think most people agree with that. It's not just about knowing the alphabet; we need more focus on unconstrained skills, so building language. Because what we know is that we're pretty good at teaching most kids to decode, and they've got that pretty much down by the end of third grade. But most kids don't learn to comprehend in a way that puts them above proficiency levels on the tests that we have. And so I think, you know, focusing on those kinds of critical thinking skills and background knowledge and vocabulary is something that a lot of early childhood folks would get behind when you talk to them, the practitioners on the ground. But when we walk into a classroom, we often don't see nearly as much in the way of asking kids to solve problems in multiple ways, or pushing those kinds of rich conversations with children. And the time-use studies we have of early childhood classrooms are not that encouraging on this.

>> So I agree with that. I guess I think that one goal of the program is to basically support families, especially in targeted programs, families that have a very complex set of circumstances, oftentimes single parents who are dealing with a lot of issues simultaneously, and to give kids and families a safe place for kids to be: a place where they can be engaged, where they can have a lot of the kinds of experiences that my kids have on a daily basis at home. So, exposure to new experiences, learning how to interact with other kids, learning how to have challenging experiences in a safe place, learning how to talk to lots of new people, things like that. And then I think there are lots of skills that could fall into that, but making kids comfortable in social situations and getting them to a place where, when they are transitioning into a school setting, they already know how to be in groups and they already know how to interact and make friends, and basically learn, is a key goal.

>> I think the only big theme I would add is to think, as we do about public education, about early childhood education as a potential lever to reduce inequality. Now, that's as difficult an issue for early childhood education as it is for public education. So, should we be surprised that we're running into the same issues that affect public education at primary, secondary, and higher levels? I don't think we should be surprised. But I think in a way the history of early childhood program evaluation started off with such a bang and such a positive sense from these early studies, like the Perry Preschool or the Abecedarian program, that I think for some folks it feels perhaps like a harsh awakening that the systems issues, accountability issues, what quality is and how to promote learning and development across multiple domains, play out in a different way in an even more fragmented system, where only in certain places is there a move towards universality. So, as I think Daphna's presentation showed, the fragmentation across types of care is vast. And another pattern that's coming from three or four studies using different methods on the Head Start Impact Study is that if you compare the effects of Head Start to kids who are staying at home, that's when the impacts of Head Start are the most robust on cognitive skills. And that's an important point: many kids are in informal care settings, many kids are at home, many kids are in centers, but these systems themselves are not coordinated and have vastly different levels of support, if we're going to think about these issues of inequality.

>> OK. Another question coming from several folks in the audience-- they wanted to hear more about Howard's cross-site variation. So you can--

>> So somebody took me up on my request.

>> -- tell us, what were some of the things that were kind of predicting cross-site variation?

>> Well, we don't-- OK. So here's the thing: it's very difficult to predict cross-site variation. We actually will be doing, are doing, some of it, but a fellow by the name of Chris Walters, whom some of you know personally, has looked at cross-site effect variation using the Head Start Impact Study, and argues that he's found positive effects on cognitive outcomes of full-day versus half-day, and positive effects on social-emotional outcomes of more versus less home visiting. And there are two studies that really point to the fact that the impacts of Head Start are much, much greater for kids who otherwise would have stayed at home than for kids who otherwise would have been in a center. He looks at it, in one of his papers, kind of in passing. But the study that looks at it most closely is one by Avi Feller and Lindsay Page and others, where they look at it very, very closely, and they compare impacts of Head Start on the PPVT in particular, the receptive vocabulary measure, so a cognitive thing, and on a social-emotional thing. They get really big impacts on the PPVT for kids who, if assigned to Head Start, would have gone to Head Start and who, if not assigned to Head Start, would have stayed at home. They get big impacts for them, probably 0.3, I forget the exact effect size numbers, and virtually nothing for the kids who, if assigned to Head Start, would have gone to Head Start and, if not assigned, would have gone to other center-based care. So that's a really important theme. And I'll just add one note on the impact variation findings as we try to think about them. One of the reasons they're hard to predict is that you're looking at Head Start centers across the United States in the Head Start Impact Study.
And an impact is a comparison of what the outcome was under Head Start versus what the outcome was under the alternative, if you didn't get the offer of Head Start. OK? Now, the impact is the difference between those two outcomes, and impact variation is how that difference changes across the country. And there are two sources of that variation. One is how Head Start itself, the program, varies across the country; that will affect the way in which its impacts vary across the country. But what most people aren't thinking about and aren't looking at as they try to interpret the results of this study is how the effectiveness of the counterfactual-- the alternative-- varies across the country. And I think one of the reasons it's been so hard to predict the impact of Head Start is that you have to take into account both Head Start and its alternative, and there aren't good measures of that at all in the Head Start Impact Study.
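The arithmetic behind that point can be sketched in a few lines. This is a toy illustration with made-up numbers, not data from the Head Start Impact Study: two sites running an identical program can show very different impacts purely because the counterfactual differs.

```python
# Toy illustration (hypothetical numbers): impact is the difference between
# the outcome under the program and the outcome under the alternative.

def impact(outcome_head_start, outcome_counterfactual):
    """Impact = outcome with the Head Start offer minus outcome without it."""
    return outcome_head_start - outcome_counterfactual

# Site A: the alternative for most kids is staying at home (weak counterfactual).
site_a = impact(outcome_head_start=0.50, outcome_counterfactual=0.20)

# Site B: identical program, but the alternative is other center-based care.
site_b = impact(outcome_head_start=0.50, outcome_counterfactual=0.45)

print(round(site_a, 2))  # large impact, even though the program is the same...
print(round(site_b, 2))  # ...and near-zero impact here, driven by the alternative
```

The point of the sketch: without measuring the counterfactual side, cross-site impact variation is uninterpretable, because the same program quality is compatible with both numbers.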

>> OK. Great. I want to combine two or three things that came up in some of the questions. One is that a big issue within the early childhood area is fade out, and how that relates to the early schooling experience, K-3. So one part of that is-- I assume we are hoping, but do we have any evidence on whether there is less fade out in the high quality, meaning high initial gain, places? And then second, more broadly, are there ways that we need to think about restructuring K-3, for example, that could interact with some of the early childhood work that's now happening?

>> So, I'll quote Greg Duncan's slide, for those of you who were able to attend that talk, where it said, "Fade out is a mess." I think it is a mess, and that's a pretty good summary of what we know about this. But as far as the gains lasting longer if you're in a higher quality program, I think a good recent example is the Tennessee program, which is lower quality compared to, say, the Tulsa program. There are some remaining benefits at the end of third grade on math for the Tulsa program, while for the Tennessee program it actually looks like the effects may be negative. So based on the observational quality measures we have for those two contexts, we are seeing a different pattern. And we see from older studies-- programs that were higher quality on the basis of their inputs, because we didn't have these observational quality measures then, so we don't know the instructional quality of the older studies the way we do now-- that the ones with higher quality did maintain impacts on some measures of academic achievement along the way, right? So the pattern is not clear-- "fade out is a mess" is where we really are-- but I'd say there is some suggestion that fade out may be tempered by the size of the initial impact in particular. There has to be something to last, right? So you need a larger initial impact, and that comes from a higher quality program.

>> Just to support that: in a meta-analysis that Greg and Katherine Magnuson and Holly Schindler and I have been involved in, the rate of fade out, which is about 0.02 effect size per year, didn't interact with the initial, immediate posttest effect. So I think that suggests that, yes, with a larger initial boost, if the fade-out rate is no different, it will simply take longer to get down to virtual equivalence or convergence. There are many, many issues. Exactly as Howard said, you have to think about what the "treatment condition" is, and the fact that the control group in the United States now no longer receives "nothing." All kids are learning no matter what context they're in, and the vast array of settings they're in, in these comparison or control groups, makes that question really quite difficult to think about. And then I think it is very important to think about what exactly is going on with instruction in the later years-- kindergarten, first grade, second grade, third grade-- for kids who did not have that particular preschool experience and those who did. Teachers could be doing all kinds of things, and we're not exactly sure how they're targeting certain aspects of instruction to certain subgroups of children, for example. So that's where it really is just a big area for many of you to write wonderful dissertations.
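The back-of-the-envelope implication of a constant fade-out rate can be made concrete. This is a simple linear sketch under the stated assumptions (a fixed ~0.02 effect-size decline per year that does not interact with the initial impact); the initial effect sizes are illustrative, not from the meta-analysis.

```python
# Linear fade-out sketch: constant decline of ~0.02 effect-size units per
# year, independent of the initial boost (per the meta-analysis finding that
# fade-out rate didn't interact with the immediate posttest effect).

FADE_RATE = 0.02  # effect-size units lost per year (assumed constant)

def effect_after(initial_effect, years, fade_rate=FADE_RATE):
    """Remaining effect size after a given number of years, floored at zero."""
    return max(0.0, initial_effect - fade_rate * years)

def years_to_convergence(initial_effect, fade_rate=FADE_RATE):
    """Years until the effect converges to zero under linear fade-out."""
    return initial_effect / fade_rate

# Hypothetical comparison: a 0.30 initial impact vs. a 0.10 initial impact.
print(round(years_to_convergence(0.30)))   # the larger boost persists longer...
print(round(years_to_convergence(0.10)))   # ...the smaller one converges sooner
print(round(effect_after(0.30, years=3), 2))  # remaining effect at end of grade 3
```

Under these assumptions, tripling the initial impact triples the time to convergence, which is the speaker's point that a larger initial boost simply takes longer to fade to equivalence.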

>> I've been told we have time for just one more question. So, politically and policy-wise, how do we make some of this happen? What are some of the most important changes-- policy changes or resources-- at the top of your list, that you would recommend to further work that you think is useful?

>> Oh, I'm going to [inaudible]. I'm not sure. Sorry.

>> Sure.

>> I think the quality rating system approach-- I personally feel we shouldn't give up on it, because if we do end up with more sensitive measures of quality that are also feasible to scale within large-scale monitoring systems, that would help with the information problem in the United States. I think we also need to create the kinds of messages around what quality means and its importance, both for policy makers and the public. I feel like we have an effective message for whether to invest in early childhood-- the brain science, these messages around the value of early investment, have really gotten through, I would say worldwide, to increase investments in early childhood education. But if we're not going to replicate, on the global side, the problems with universal access to primary education, which produced not a lot of great learning gains, we need to message quality in some way, and that has to be linked to measurement of quality at scale that can be embedded within these monitoring systems and information-based policies.

>> So I think that's a really good point. The other thing is, I think the very low quality of the experiences very young children have is one high-leverage area. The places that toddlers in this country are getting taken care of are of extremely low quality, and some of that has to do with the fragmentation, and some of that with the very low levels of regulation for family child care homes, where lots of toddlers are spending their time. So improving quality there-- changing the regulations, but also bringing more access to more highly regulated centers for young kids-- is important, and so is helping parents navigate the hardship of linking these complex systems. Parents are trying to do a lot, and they need systems that work together. There are a lot of challenges around finding a full day of care for your child, and oftentimes the place that's going to work for your life is not the one that's going to work for your child's development, and the fact that those things are so at odds is a real challenge. So trying to think about women and women's work, and single moms especially, and how their lives can be supported in a way that also supports their kids, is a really important piece.

>> OK.

>> And I would just add that I am going to be very curious to see ESSA hit the ground, and to see where things land as the various rules get negotiated from state to state. There's a big emphasis on state and local decision making, and there will probably be variation in the capacity that folks have to take on the flexibility it offers around adopting different models. But I think that is something that has arrived that will be really interesting to watch-- how that flexibility works or doesn't work in promoting equality.

>> OK. Well, I'd like to thank our panelists for a great discussion and I hope to see you at the next event.

[ Applause ]