West Coast Stat Views (on Observational Epidemiology and more)

Tuesday, August 6, 2013

At this point, seeing even inaccurate coverage of OTA television seems like a triumph

This Marketplace story does leave a bit to be desired in terms of accuracy and completeness. In the audio version, the reporter, Sally Herships, like almost everyone covering this story, seems confused by digital versus amplified antennas and TV antennas versus satellite, leading her to greatly overstate the price range of the equipment -- think under $10 (if not cheaper) if you don't need amplification. She also failed to mention that the picture quality is actually better for free TV, that in the region being discussed you can get close to one hundred channels over the air, or that with an adapter you can watch and digitally record those channels on your laptop.

All of that aside, this is still a story about doing without cable (or at least certain channels on cable), airing on one of the biggest and most respected financial news programs and it lists rabbit ears as the first option. I've been following this story closely for years now and I honestly can't recall this happening before. There have been a few voices in the wilderness on the subject -- Rajiv Sethi being perhaps the first and most notable -- but that was the blogosphere. A story like this in mainstream media is a very promising development, particularly given that the last time I heard the topic broached on Marketplace it was immediately dismissed by their guest expert (from the NYT, of course) who claimed that you can only get "a handful of channels" with an antenna in LA.

The story of digital broadcasting has always come down to the question of whether the superior technology could establish a foothold before regulatory capture and the huge imbalance of hype drove it out of business. In that context, this is a good day for the medium.

Metablogging -- stag hunts, misalignment, principal agents and all that jazz

Shortly after Josh Marshall posted this analysis of recent events in the GOP, Joseph called me up to compare reactions. We've been having this conversation for so many years that much of it has devolved into a self-referential shorthand. As an illustration, at one point, after a fairly long-winded comment by me, Joseph simply said "Stag hunt." and tossed the ball back to me and we moved on to the next topic.

We agreed (with, I assume, fingers crossed on both ends of the line) that we'd write some posts on the subject, but I'm starting to think that it might be more useful to step back for a minute and talk about how we've been framing the question of what's going on in the Republican Party (politically, not socially or in terms of policy; those are entirely different metaposts).

For years now, the two of us have been talking about the post-Tea Party GOP in terms of a multi-player stag hunt. Over the past few years the stakes (particularly the costs of failure) have increased. At the same time, participation rates required to take down the stag have also increased. As a result, progressively smaller groups have gained the power to kill the enterprise. (In a different conversation Joseph pointed out that, in a military context, shooting deserters is also a predictable result of this situation.) We could dig deeper into examples and implications (particularly with respect to the trade-off between the power of an alliance vs. its stability) but for now I want to limit the discussion to framing.
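The link between rising participation requirements and shrinking veto groups can be made concrete with a toy threshold model. This is only a sketch under assumptions that are not part of the conversation described above: a hypothetical group of 100 members, a hunt that succeeds only if at least `required` of them cooperate, and an illustrative function name.

```python
def min_blocking_group(n_players: int, required: int) -> int:
    """Smallest number of defectors that makes the hunt fail.

    Assumption: the hunt succeeds only if at least `required` of the
    `n_players` cooperate, so it fails as soon as more than
    n_players - required members walk away.
    """
    if not 1 <= required <= n_players:
        raise ValueError("required must be between 1 and n_players")
    return n_players - required + 1


if __name__ == "__main__":
    n = 100  # hypothetical group size
    for required in (51, 60, 90, 98):
        blockers = min_blocking_group(n, required)
        print(f"need {required} of {n}: any {blockers} defectors can kill the hunt")
```

In this toy setup, every increase in the participation threshold shrinks the smallest group that can sink the enterprise by the same amount.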

(There might also be a place here to talk about symmetry breaking, but I'd need to give that some thought first.)

Another way of looking at the story that we've found useful is to look at misalignment of interests, especially what looks to us two non-economists like a particularly nasty two-level principal agent problem, where a small group of big donors determines the pool of viable candidates and a relatively small but coherent subgroup of the primary voters makes the purchasing decisions for the entire party. You'll notice that, as with the stag hunt frame, under this scenario small groups can acquire disproportionate power.

And of course there's the mandatory Influence reference, framing the story in terms of social psychology. If you check out the chapters on commitment and consistency, social proof, and scarcity you'll find all sorts of applicable discussions of the ways groups united by a common belief system deal with ideological challenges and the loss of dominance.

Nothing particularly fresh or profound here, but these ideas have proven a pretty good framework recently. I'm not saying they should be the basis of the standard narrative -- I'm not sure there should be a standard narrative -- but they do come in handy. More importantly, I think you can make the case that too little of the public discourse is spent examining underlying assumptions and asking about the different ways to frame our questions.

The real challenge of Libertarian populism

According to Wikipedia, the night-watchman state (a foundation of Libertarianism) is defined as:

A night-watchman state, or a minimal state, is variously defined by sources. In the strictest sense, it is a form of government in political philosophy where the state's only legitimate function is the protection of individuals from assault, theft, breach of contract, and fraud, and the only legitimate governmental institutions are the military, police, and courts.

The bolding was added by me.

I think the real challenge of Libertarianism is not issues like Libertarian populism but rather the requirement that a Libertarian state protect against fraud. I think this becomes more important the weaker the government gets: the fewer functions a state has, the more critical each one becomes. Yet modern corporations are increasingly being allowed to get away with fraud (and with changing of contracts). These are huge problems because if you are going to enforce contracts against the politically weak, you also need to enforce them against the politically strong or you really do get feudalism.

Trials like that of the S&P ratings schemes (where they argued ratings were puffery) are critical because a reasonable person might well think that misrepresentation is a form of fraud. Saying that "lies=marketing" seems to be a rather obvious attempt to evade the antifraud portion of the state's function.

In the same sense, changing the rules under which a corporation operates can be an extremely dodgy move. Changing the rules can be okay, but the optics are terrible when a major corporation shifts the rules when an initial attempt to do something fails. It doesn't matter that the main victim is Carl Icahn, who is hardly at risk of poverty.

To me, this is really the central issue of the modern Libertarian approach. It isn't impossible that you could have a Libertarian populism, but I think the backbone of this approach would have to be that the same rules apply to everyone (even if they have a lot of money).

Monday, August 5, 2013

Price discrimination at Ralph's

This is a new one on me.

I was in the grocery store last night when the thought hit me that I hadn't had a root beer float in a long time, too long in fact, but as I walked back and forth along the soda aisle I could not find any cheap root beer. There were various Coke and Pepsi products but to get a reasonable price you had to buy four (I don't drink much soda -- eight liters is a bit much). There were also expensive specialty brands, but no house brands. Annoyed, I grabbed a bottle of A&W and went on with my shopping only to see the store brands I was looking for one aisle over mixed in rather incongruously with the sports drinks.

Normally you'd expect to see the house brands near the corresponding name brand products, but I suspect that the previously mentioned "buy four" promotions prompted the shift, that given the choice between paying the inflated list price or dealing with the inconvenience of buying far more than they needed, too many customers were saying to hell with it and grabbing a Shasta or a Big K which had no restrictions and was even cheaper than the discounted name brands.

Given the competition from places like Wal-Mart and Food-4-Less and the 99 cents store, stocking cheap store brands can keep price sensitive customers coming back but making those brands hard to find can keep the less price sensitive customers spending more than they have to.

Sunday, August 4, 2013

Worst Statistic of the Day

Via Lawyers, Guns, and Money comes the piece most detached from reality I've seen this year:

When Josh McFarland graduated from Stanford he owed $40,000 in student loans and couldn't fathom a way he'd ever pay it off and have a future for himself - not unusual for the typical young adult these days. Then he went to work for Google.

As a product manager, he got stock options and cashed them in over the five years he worked there. He married a fellow Google employee, so she had stock too. Then she moved on to Yelp, and he quit to launch TellApart, which provides technology solutions for e-commerce sites. Now 33, McFarland has a 3-year-old and a newborn and no longer has to think about his student loan: His company has $17.75 million in venture capital investment. While he doesn't consider himself retire-now rich, his piece of the company affords him what he calls "breathing room" and what other people might call wealth.

McFarland is on the starting end of Generation Y, the cohort born in the United States after 1980 that is typically portrayed as saddled with massive student debt, underemployed and underpaid. More than a third of the 80 million group of so-called millennials live with their parents, according to the Pew Research Group.

But McFarland is part of the sizeable minority that is doing quite well: 12 million Gen Y-ers make more than $100,000, according to the Ipsos MediaCT's Mendelsohn Affluent Survey. Many of them, in technology fields, live frugal work-based lifestyles and are not saddled with the six-digit student debt held by doctors and lawyers.

Paul Campos researched this statistic and discovered where it came from:

The source of the wrongheaded statistic appears to be this: the cited survey claims 59 million Americans live in households with incomes of $100,000 plus, and that 20% of “affluent consumers” are in the Gen Y cohort (this appears to mean that 20% of the adults who live in households with $100,000+ incomes are of this age). The story’s author then did some very bad math to generate the claim that “12 million Gen Y-ers make more than $100,000.”

But that is totally different from these people making more than $100,000 per annum. Those households could also include people who are utterly unemployable living in their parents' basement. Or a trophy wife/husband of a hedge-fund billionaire. Or it could be 5 young adults sharing a house together, all making $22,000 a year as Starbucks employees.
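The arithmetic behind the bad statistic is short enough to spell out. The sketch below uses only the figures quoted above (59 million people, a 20% Gen Y share, five housemates at $22,000 each). The function names are illustrative, and the multiplication is presumably what the story's author did, since Campos himself only says this "appears" to be the source.

```python
def gen_y_in_affluent_households(people: int, gen_y_percent: int) -> int:
    """People in $100,000+ households times the Gen Y share (integer percent)."""
    return people * gen_y_percent // 100


def household_income(individual_incomes: list[int]) -> int:
    """Household income is the sum over everyone living in the household."""
    return sum(individual_incomes)


if __name__ == "__main__":
    # Figures from the survey as described by Campos.
    in_households = gen_y_in_affluent_households(59_000_000, 20)
    print(f"Gen Y-ers living in $100,000+ households: {in_households:,}")

    # The shared-house scenario from the post: nobody earns six figures,
    # yet the household clears the $100,000 bar.
    housemates = [22_000] * 5
    print(f"Shared house income: ${household_income(housemates):,}")
    print(f"Highest individual income: ${max(housemates):,}")
```

The first number is a count of people living in affluent households. It says nothing about how many of them individually earn $100,000.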

At the very least it uses a highly non-representative sample and portrays the wealth of young adults in a completely misleading way. This does not add to the conversation and it appears in the New York Times. If only they had a prominent economist on staff who could consult over this type of data . . .

Saturday, August 3, 2013

Weekend blogging -- Patents and product testing

Of course, these days I'm sure someone has patented the general idea of making a death ray. Perhaps that's an upside of having trolls.

Friday, August 2, 2013

"An astonishing act of statistical chutzpah" -- details on the Tony Bennett scandal

Jordan Ellenberg at Slate and Anne Hyslop at the New America Foundation have posts up explaining exactly how the Indiana Department of Education managed to change the grade of an influential donor's charter school from a C to an A. It appears to be your basic DATA COOKING 101. After seeing results you don't like, go back, try different weightings, look for excuses to drop bad scores, then apply these changes selectively.

The 'selectively' part is particularly important and hasn't gotten the attention it deserves:

Two Indianapolis Public Schools might never have been taken over by the state if then-Superintendent of Public Instruction Tony Bennett had offered the district the same flexibility he granted a year later to the Christel House Academy charter school.

The issue was similar in both cases. Christel House had recently added ninth and 10th grades, and IPS’ Howe and Arlington had added middle school grades. The students who filled those seats posted poor enough scores to drag down the schools’ overall ratings.

In the case of Christel House, emails unearthed by The Associated Press show Bennett’s staff sprung into action in 2012 when it appeared scores from the recently added grades could sink the highly regarded school’s rating from an A to a C. Ultimately, the high school scores were excluded and the school’s grade remained an A.

But in 2011, after IPS’ then-Superintendent Eugene White demanded Bennett consider the test scores of high school students separately from those of middle school students so the high schools could avoid state takeover, Bennett was unmoved.

As for the specifics, Hyslop goes into more detail (and has a wonderfully apt movie quote to make her point), but Ellenberg probably does the better job summing it up:

Here’s where Bennett’s team found the loophole big enough to drive a charter school through. A normal person would do exactly what Chief Accountability Officer Jon Gubera did—give Christel the weighted average of its elementary/middle school score, according to the rules for elementary/middle schools, and its high school score, according to the rules for high schools.

But Bennett had a better idea. Christel was, technically speaking, not a high school, so the statutory formula for the high school grades didn’t apply. But it also didn’t have all four high school measures, so, he argued, the rules for combined schools didn’t apply either. There were just 13 schools in the state that had both middle school and high school grades but no seniors. For these schools, Bennett reasoned, the Indiana education poobahs should have a free hand to set the grades however they pleased. You can guess what happened next: Bennett ruled that the ninth- and 10th-graders in these schools didn’t count at all. So it was that the offending algebra grades vanished in a puff of bureaucratic smoke...

This was an act of astonishing statistical chutzpah. Suppose the syllabus for my math class said that the final grade would be determined by averaging the homework grade and the exam grade, and that the exam grade was itself the average of the grades on the three tests I gave. Now imagine a student gets a B on the homework, gets a D-minus on the first two tests, and misses the third. She then comes to me and says, “Professor, your syllabus says the exam component of the grade is the average of my grade on the three tests—but I only took two tests, so that line of the syllabus doesn’t apply to my special case, and the only fair thing is to drop the entire exam component and give me a B for the course.”
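Ellenberg's classroom analogy is easy to put into numbers. The sketch below makes assumptions that are not in his piece: a common 4.0 grade-point scale (B = 3.0, D-minus = 0.7), a missed test counted as 0.0, and illustrative function names.

```python
GRADE_POINTS = {"B": 3.0, "D-": 0.7, "missed": 0.0}  # assumed 4.0 scale


def grade_by_syllabus(homework: str, tests: list[str]) -> float:
    """Average of the homework grade and the mean of all three tests."""
    exam = sum(GRADE_POINTS[t] for t in tests) / len(tests)
    return (GRADE_POINTS[homework] + exam) / 2


def grade_by_loophole(homework: str) -> float:
    """The student's proposal: drop the exam component entirely."""
    return GRADE_POINTS[homework]


if __name__ == "__main__":
    tests = ["D-", "D-", "missed"]
    print(f"By the syllabus: {grade_by_syllabus('B', tests):.2f}")
    print(f"By the loophole: {grade_by_loophole('B'):.2f}")
```

Under these assumed grade points the syllabus yields a course grade well below a B, while the loophole hands back the homework grade untouched, which is the move Ellenberg is describing.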

Ellenberg then made an observation that echoed some of my earlier points about how the mindset of the reform movement can enable these ethical lapses:

The saddest part is that I’m guessing Bennett sincerely felt he was doing the right thing. In his mind, he knew Christel was a great school, so if the scores said otherwise, the scores had to be wrong. In this respect, ironically, he ends up echoing his policy opponents, adopting the position that a mechanistic testing and scoring procedure can’t be allowed to override firsthand knowledge about teachers and schools.

The Op-ed no one wanted to print

When it comes to the education beat, this guy has one hell of a resume.

John Merrow began his career as an education reporter with National Public Radio in 1974, with the weekly series, “Options in Education.” In 1984, Merrow branched out into public television. He served as host of The Merrow Report, an award-winning documentary series, and currently is the Education Correspondent for PBS NewsHour. Merrow’s work has taken him from community colleges to kindergarten classrooms, from the front lines of teacher protests to policy debates on Capitol Hill. His varied reporting has continually been on the forefront of education journalism.

Of course, all of these qualifications don't necessarily make a person right, but they generally do make a person publishable. That's what makes this story curious, because Merrow can't seem to get anyone to publish the following op-ed. This is the second time in a row that he's found newspapers and magazines reluctant to publish reporting on Michelle Rhee.

As mentioned before, Rhee has gone from feted to indefensible so quickly that it's difficult for most journalists and pundits to cover her current activities without looking comically gullible for swallowing her previous rhetoric. (And journalists don't like looking gullible.)

CAVEAT EMPTOR: MICHELLE RHEE’S EDUCATION REFORM CAMPAIGN

"Today, too many of America’s children are not getting the quality education they need and deserve. StudentsFirst is helping to change that with common sense reforms that help make sure all students have great schools and great teachers." (StudentsFirst press release, emphasis added)

Michelle Rhee created StudentsFirst after leaving her post as Chancellor of Washington, DC’s Public Schools in the fall of 2010. She announced her intentions on “Oprah” that December: to fix America’s schools by enrolling one million members and raising one billion dollars.[2]

Easily America’s most visible education activist, she has been crisscrossing the country lobbying for change and donating money to candidates whose policies she supports. StudentsFirst claims to have helped pass 110 ‘student-centered policies’ in 18 states.

Because Ms. Rhee is trying to persuade the rest of the country to do as she did in Washington, it’s worth asking what her ‘common sense reforms’ accomplished when she had free rein to do as she wished.

She was definitely in charge. Her boss, a popular new mayor, told his Cabinet that trying to block his Chancellor was a firing offense. The business community, a public fed up with school failure, and the editorial pages of The Washington Post were enthusiastic supporters. Moreover, she had virtually no opposition: the local school board had been abolished when the Mayor took over, and the teachers union, reeling from its own financial scandals, had an untested rookie president. She knew how lucky she was.

"I’m living what I think education reformers and parents throughout this country have long hoped for, which is, somebody will just come in and do the things that they felt was in the best interest of children and everything else be damned. (Interview, fall 2007)"

She lived that dream for 40 months. She opened schools on time, added social workers, beefed up art, music and physical education, and dramatically expanded preschool programs. The latter may represent her greatest success, because children who began their schooling in the expanded preschool program tend to do well on the system’s standardized test in later years.

Ms. Rhee made her school principals sign written guarantees of test score increases. It was “Produce or Else” for teachers too. In her new system, up to 50% of a teacher’s rating was based on test scores, allowing her to fire teachers who didn’t measure up, regardless of tenure. To date, nearly 600 teachers have been fired, most because of poor performance ratings. She also cut freely elsewhere–closing more than two-dozen schools and firing 15% of her central office staff and 90 principals.

When Ms. Rhee departed in October 2010, her deputy, Kaya Henderson, took over. She has stayed the course for the most part, although test scores now make up–at most–35% of a teacher’s rating score.

Some of the bloom came off the rose in March 2011 when USA Today reported on a rash of ‘wrong-to-right’ erasures on standardized tests and the Chancellor’s reluctance to investigate. With subsequent tightened test security, Rhee’s dramatic test scores gains have all but disappeared. Consider Aiton Elementary: The year before Ms. Rhee arrived, 18% of Aiton students scored proficient in math and 31% in reading. Scores soared to nearly 60% on her watch, but by 2012 both reading and math scores had plunged more than 40 percentile points.

But it’s not just the test scores that have gone down. Six years after Michelle Rhee rode into town, the public schools seem to be worse off by almost every conceivable measure.

For teachers, DCPS has become a revolving door. Half of all newly hired teachers (both rookies and experienced teachers) leave within two years; by contrast, the national average is understood to be between three and five years. Veterans haven’t stuck around either. After just two years of Rhee’s reforms, 33% of all teachers on the payroll departed; after 4 years, 52% left.

It has been a revolving door for principals as well. Ms. Rhee appointed 91 principals in her three years as chancellor, 39 of whom no longer held those jobs in August 2010. Some chose to leave; others, on one-year contracts, were fired for not producing quickly enough. Several schools are reported to have had three principals in three years.

Child psychiatrists have long known that, to succeed, children need stability. Because many of the District’s children face multiple stresses at home and in their neighborhoods, schools are often that rock. However, in Ms. Rhee’s tumultuous reign, thousands of students attended schools where teachers and principals were essentially interchangeable parts, a situation that must have contributed to the instability rather than alleviating it.

Although Ms. Rhee removed about 100 central office personnel in her first year, the central office today is considerably larger, with more administrators per teachers than any of the districts surrounding DC. In fact, the surrounding districts reduced their central office staff, while DC’s grew. The greatest growth in DCPS over the years has been in the number of central office employees making $100,000 or more per year, from 35 when she arrived to 99 at last count.

Per pupil expenditures have gone up sharply, from $13,830 per student to $17,574, an increase of 27%, compared to 10% inflation in the Washington-Baltimore region. So have teacher salaries; DC teachers now earn on average more than their counterparts in nearby districts in Virginia and Maryland.

Enrollment declined on Ms. Rhee’s watch and has continued under Ms. Henderson, as families continue to enroll their children in charter schools or move to the suburbs. The year before she arrived, DCPS had 52,191 students. In school year 2012-13 it enrolled about 45,000, a loss of roughly 13%.

Even students who have remained seem to be voting with their feet, because truancy in DC is a “crisis” situation, and Washington’s high school graduation rate is the lowest in the nation. The truancy epidemic may be the most telling data point of all, because if young people in this economy are not going to school, something is very wrong. They are not skipping school to work–because there are no jobs for unskilled youth.

Ms. Rhee and her admirers point to increases on the National Assessment of Educational Progress, an exam given every two years to a sample of students under the tightest possible security. And while NAEP scores did go up, they rose in roughly the same amount as they had under her two immediate predecessors, and Washington remains at or near the bottom on that national measure.

The most disturbing effect of Ms. Rhee’s reform effort is the widening gap in academic performance between low-income and upper-income students, a meaningful statistic in Washington, where race and income are highly correlated. On the most recent NAEP test (2011) only about 10% of low-income students in grades 4 and 8 scored ‘proficient’ in reading and math. Since 2007, the performance gap has increased by 29 percentile points in 8th grade reading, by 44 in 4th grade reading, by 45 in 8th grade math, and by 72 in 4th grade math. Although these numbers are also influenced by changes in high- and low-income populations, the gaps are so extreme that it seems clear that low-income students, most of them African-American, generally did not fare well during Ms. Rhee’s time in Washington.

English Language Learners in Washington’s schools are also struggling. Title III of ESEA requires progress on three distinct measures: progress, attainment and what ‘No Child Left Behind’ calls ‘adequate yearly progress.’ DC failed on two out of three last year.

DC doesn’t fare well in national comparisons either. Between 2005 and 2011, black 8th graders in large urban districts gained five points in reading, while their DCPS counterparts lost two points, according to a study by the DC Institute of Public Policy released this spring. Between 2005 and 2011 in large, urban districts, Hispanic eighth-graders gained six points in reading (from 243 to 249), black eighth-graders gained five points (from 240 to 245), and white eighth-graders gained three points (from 270 to 273). In District of Columbia Public Schools, however, Hispanic eighth-graders’ scores fell 15 points (from 247 to 232), black eighth-graders’ scores fell two points (from 233 to 231), and white eighth-graders’ scores fell 13 points (from 303 to 290).

The states that have adopted her approach, and others now being lobbied, might want to make their own data-driven decisions.

Thursday, August 1, 2013

A final (?) news wrap-up on the Bennett story

As previously mentioned:

INDIANAPOLIS (AP) — Former Indiana and current Florida schools chief Tony Bennett built his national star by promising to hold “failing” schools accountable. But when it appeared an Indianapolis charter school run by a prominent Republican donor might receive a poor grade, Bennett’s education team frantically overhauled his signature “A-F” school grading system to improve the school’s marks.

You can get Joseph's reaction here and mine here and here.

Despite considerable support from the reform movement, you can add another 'former' to that paragraph.

Bennett said he resigned “because I don’t believe it would be fair to be distracted” by what he characterized as “malicious and unfounded” reports.

Just yesterday, Gov. Scott told Channel 5 in West Palm Beach that Bennett is “doing a great job.”

In what was already a bad news day for Bennett:

In June of 2011, Tony Bennett, then Indiana’s superintendent of public instruction, picked a for-profit education company in Florida to run a group of Indianapolis public schools.

The company, Charter Schools USA, set up operations in Indianapolis soon after the announcement and officially began running Manual High School, T.C. Howe High School and Emma Donnan middle school in the late summer of 2012. Millions of Indiana tax dollars have since flowed to the company, which has received many good reviews for its work in Indianapolis.

But a recent hiring decision by Charter Schools USA is sure to raise eyebrows and questions about conflicts of interest, particularly now that Bennett is embroiled in a massive controversy centering on special treatment given to certain Indiana schools during his tenure.

The decision: Charter Schools USA earlier this year hired Tony Bennett’s wife, Tina, as a regional director based in Florida, where Tony Bennett was hired late last year as commissioner of education. And, so, the bottom line is this: Tina Bennett is now earning a paycheck from the company her husband hand-picked to take over schools in Indiana, a decision that was very good for the company’s financial fortunes.

It’s important to note that Tina Bennett is a longtime educator, a former school administrator and counselor. She is also an advocate of the type of school choice efforts that Charter Schools USA is built on. In Indiana, she faced criticism and sometimes cruel treatment for taking a job with education groups tied to her husband’s former office. But it’s understandable that she would seek work in the education field.

To provide some context (and a bit of schadenfreude) for Bennett's fall, here's a reminder of where Bennett ranked in the reform firmament.

Value added testing without a gold standard outcome

From a Megan Pledger comment on StatChat comes this paper (pdf) on value added testing models for evaluating teachers. The following concerns were brought up:

In the real world of schools, data is frequently missing or corrupt. What if students are missing past test data? What if past data was recorded incorrectly (not rare in schools)? What if students transferred into the school from outside the system?

The modern classroom is more variable than people imagine. What if students are team-taught? How do you apportion credit or blame among various teachers? Do teachers in one class (say mathematics) affect the learning in another (say science)?

Every mathematical model in sociology has to make rules, and they sometimes seem arbitrary. For example, what if students move into a class during the year? (Rule: Include them if they are in class for 150 or more days.) What if we only have a couple years of test data, or possibly more than five years? (Rule: The range three to five years is fixed for all models.) What’s the rationale for these kinds of rules?

Class sizes differ in modern schools, and the nature of the model means there will be more variability for small classes. (Think of a class of one student.) Adjusting for this will necessarily drive teacher effects for small classes toward the mean. How does one adjust sensibly?

While the basic idea underlying value-added models is the same, there are in fact many models. Do different models applied to the same data sets produce the same results? Are value-added models “robust”?

Since models are applied to longitudinal data sequentially, it is essential to ask whether the results are consistent year to year. Are the computed teacher effects comparable over successive years for individual teachers? Are value-added models “consistent”?
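The class-size concern in the list above (small classes are noisier, so adjusted estimates get pulled toward the mean) can be illustrated with a standard shrinkage formula. This is a generic sketch, not the adjustment used in the paper or in any particular value-added model; the variance values and function name are hypothetical.

```python
def shrunken_effect(class_mean: float, n_students: int, grand_mean: float,
                    between_var: float, within_var: float) -> float:
    """Pull a class's raw mean toward the grand mean.

    weight = between_var / (between_var + within_var / n_students)
    Small classes have a noisy mean (within_var / n is large), so they get
    a small weight and are shrunk harder toward the grand mean.
    """
    if n_students < 1:
        raise ValueError("n_students must be at least 1")
    weight = between_var / (between_var + within_var / n_students)
    return grand_mean + weight * (class_mean - grand_mean)


if __name__ == "__main__":
    # Hypothetical numbers: every class shows the same raw gain of 10 points.
    for n in (1, 5, 30):
        est = shrunken_effect(class_mean=10.0, n_students=n, grand_mean=0.0,
                              between_var=25.0, within_var=400.0)
        print(f"class of {n:>2}: raw 10.0 -> adjusted {est:.2f}")
```

How hard to shrink depends entirely on the assumed variances, which is one concrete place where the "arbitrary rules" worry shows up.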

A lot of these concerns have been independently voiced by Mark P. However, what is especially concerning is the idea that we could iterate through these assumptions to find a school ranking that satisfies some prior. This can be good under some circumstances -- Thomas Lumley gives an example of a model that clearly mixed up rankings of some kind of sports team (this isn't my area of expertise so I apologize that I don't recognize the teams or sport involved). But it does show how difficult these models are, even with the best faith involved. Still, in the case of Dr. Lumley's example there is a universal outcome that has broad agreement (does this team win games) that is being predicted. In education we lack this very clean outcome which is where it gets tricky -- in a sense we are modeling a latent variable (student outcomes).

All of this suggests that we should be cautious about these models and perhaps this would be an appropriate time to put some serious effort into student outcomes ascertainment so that it will be easier to calibrate these statistical models (making the outcome the test score seems clever but merely hides the problem rather than solving it unless we are confident that the score is a very good measure of outcomes).

Wednesday, July 31, 2013

Stories I should probably be writing about

Thoreau has a glowing review of this book on pedagogical fads. It looks interesting though given the cost and the quantity I assume they are writing them out by hand.

The Bennett scandal continues to reverberate both in Indiana and Florida.

Democracy Prep and its founder and superintendent Seth Andrew have been media darlings but I'm seeing some things that trouble me, both about the school and the founder, particularly when you read between the lines.

Kevin Drum addresses our old friend, peer effects.

Take a look at this school designed largely to give master's degrees to people who came through Teach for America and similar programs. As you might expect, I see some connections between this and the previously mentioned issue of grooming TFAers for leadership positions.

Which segues nicely into lapsed TFAer Gary Rubinstein's blog and its informative series of posts on a recent visit to a KIPP school in New York. He also often addresses the previously mentioned TFA cultural issues that concern me and many of his other comments (like this one) track with my experience almost perfectly.

I'm a big fan of Kaiser Fung, but the Moneyball analogy strikes me as a bad framework for an analytic approach to education reform, bad enough to cause real damage. (I see from the queue that Joseph is planning to address some of these issues tomorrow)

And, to take a break from the education beat, there's this post from Naked Capitalism on a shadow credit reporting system.

p.s. Should have included a link to this Washington Post interview with "the world’s most famous teacher." It nicely lays out the tension between some of our best veteran teachers (in this case one of the major models for KIPP) and the education reform movement.

General versus particular cases

Andrew Gelman did a very interesting article in Slate on how being overly reliant on statistical significance can lead to spurious findings. The authors of the study that he was critiquing replied to his piece. Andrew's thoughts on the response are here.

This led to two thoughts. One, I am completely unimpressed by the claim that a paper must be sound because it appeared in a peer-reviewed journal -- peer review is a screen, but even good tests have false positives. All this convinces me of is that the authors were thoughtful in the development of the article, not that they are immune to problems. But this is true of all papers, including mine.

Two, I think that this is a very tough area to take a single example from. The reason is that any one paper could well have followed the highest possible level of rigor, as Jessica Tracy and Alec Beall claim they have done. That doesn't necessarily mean that all studies in the class have followed these practices or that there were not filters that aided or impeded publication that might enhance the risk of a false positive.

For example, I have just finished publishing a paper where I had an unexpected finding that I wanted to replicate (an association was expected a priori, but its direction was the reverse of the a priori hypothesis). I found such a study, added additional authors, added additional analysis, rewrote the paper to be a careful combination of two different cohorts, and redid the discussion. Guess what, the finding did not replicate. So then I had the special gift of publishing a null paper with a lot of authors and some potentially confusing associations. If I had just given up at that point, the question might have been hanging around until somebody else found the same thing (I often use widely available data in my research) and published it.

So I would be cautious about multiplying the p-values together for a probability of a false positive. Jessica Tracy and Alec Beall:

The chance of obtaining the same significant effect across two independent consecutive studies is .0025 (Murayama, K., Pekrun, R., & Fiedler, K. (in press). Research practices that can prevent an inflation of false-positive rates. Personality and Social Psychology Review.)

I suspect that this would only hold if the testable hypothesis was clearly stated before either study was done. It also presumes independence (it is not always obvious that this will hold as design elements of studies may influence each other) and that there isn't a confounding factor involved (that is causing both the exposure and the outcome).
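To make the arithmetic concrete, here is a minimal sketch of where the .0025 comes from and how quickly it can erode. The .05 threshold is inferred from the quoted figure (.05 x .05 = .0025). The second calculation is purely an illustrative assumption of mine, not something taken from the paper or the reply: it supposes each study allowed five independent ways of testing the hypothesis, any one of which would have been reported as a success. The function names are hypothetical.

```python
def chance_all_significant(alpha: float, n_studies: int = 2) -> float:
    """P(every study is significant | null is true).

    Assumes the studies are independent and that each tests one
    hypothesis stated in advance.
    """
    return alpha ** n_studies


def chance_any_significant(alpha: float, n_looks: int) -> float:
    """P(at least one of n_looks tests is significant | null is true).

    Assumes the looks are independent, which is a simplification.
    """
    return 1 - (1 - alpha) ** n_looks


alpha = 0.05

# The idealized case behind the quoted number: two independent,
# pre-specified tests.
ideal = chance_all_significant(alpha)

# Hypothetical case: five independent looks per study, two studies.
flexible = chance_any_significant(alpha, n_looks=5) ** 2

print(f"pre-specified, independent: {ideal:.4f}")
print(f"five looks per study:       {flexible:.4f}")
```

Under these made-up assumptions the chance of two "significant" results in a row rises from .0025 to roughly .05, about a twentyfold increase. None of this says that is what happened in this particular paper; it only shows how much the multiplication depends on the hypothesis being fixed ahead of time.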

Furthermore, I think as epidemiologists we need to make a decision about whether these studies are making strong causal claims or advancing a prospective association that may lead to a better understanding of a disease state. We often write articles speaking in the latter mode but then lapse into the former when being quoted.

So I guess I am writing a lot to say a couple of things in conclusion.

One, it is very hard to pick a specific example of a general problem when it is possible that any one example might happen to meet the standards required for the depth of inference being made. This is very hard to ascertain within the standards of the literature.

Two, the decisions about what to study and what to publish are also pretty important steps in the process. These decisions can have a powerful influence on the direction of science in a way that is very hard to detect.

So I want to thank Andrew Gelman for starting this conversation and the authors of the paper in question for acting as an example in this tough dialogue.

The other side of the ethical failures of the education reform movement

There's an old denominational joke that ends with the punchline "Just don't let them see you. They think they're the only ones up here."

As mentioned before, the culture of the education reform movement is exceptionally strong and cultural identity plays a major role in the lives of movement reformers, particularly those associated with certain institutions like TFA and KIPP. This isn't necessarily a bad thing. There are a lot of smart people out there working very hard to improve education because of those cultural forces. Unfortunately, these forces can also make the movement prone to blind spots, often including the belief that they're the only ones up here.

Here's a pertinent passage from Gary Rubinstein, himself a lapsed TFAer.

The KIPP high school has a large area in the middle with a lot of tables, almost like a coffee shop. I went out to get lunch at the nearby Fairway and came back and sat at one of those tables to eat. At the table next to me I overheard a discussion between a KIPP administrator and a teacher. Most of the KIPP administrators, like this woman was, are young and white, as are most of the teachers. This teacher was black and seemed to be in her late 40s. The conference was related to some sort of recommendation letter, maybe for some academic program, that the older teacher was writing for one of her former students. I’m not sure who initiated this discussion, but the administrator was explaining that the letter should be re-written. The issue was that this teacher had been a bit too ‘honest’ in the letter and it would hurt the chances of this student getting into the program. Now I’ve written many recommendation letters, and of course you want to put the student in the best possible light, so I’m not saying that the administrator was wrong in suggesting that this teacher change the letter. I’m just writing about this since some of the things said in the discussion were revealing.

Apparently this student had a bad attitude and failed the course. The teacher had written about this so the administrator explained to this teacher that, yes, the student had failed, but that a lot of students fail that course (I think it was Geometry). Also, it was important that the teacher understand that getting a 60 in that course at the KIPP school was like getting a 90 in most other schools since, I guess she felt like she knew, the other neighborhood schools have extreme grade inflation. The conference was resolved with the teacher agreeing to rewrite the letter keeping these things in mind. I found it interesting that a lot of students fail this course since the media would have us believe that after being in KIPP from 5th grade to 11th grade, students there wouldn’t be failing that much. Also, the assumption that the ‘other’ schools have such low expectations that a 90 there is like a 60 at KIPP, I don’t know if she how she can be so confident about that claim.

This anecdote is troubling on any number of levels, not the least of which is the fact that KIPP 60 = Other school 90 is highly debatable (there are a lot of open questions about how to interpret KIPP's numbers, but I doubt even the most favorable reading would support the assertion that a D- at KIPP is equivalent to an A- elsewhere). But even if we stipulate to that part, we are still left with all sorts of concerns.

This is, after all, a case of an administrator in a fairly public setting pressuring a teacher to give a student a more favorable evaluation. That's a dangerous line, particularly when you take into account the fact that getting more students accepted into prestigious programs generates good press for KIPP, helps the administrator's career track and may well figure into funding.

There's nothing new about incentives that encourage teachers to lower standards (or about having administrators play the devil on the shoulder), but the reform movement has greatly raised the stakes. More importantly, they've provided a belief system that makes it easier to justify cutting corners and ignore conflicts of interest. Minor lies are OK in a recommendation letter because your students are held to higher standards; mass dumping of students is OK because the better your school does the more schools will adopt your superior model; cooking the books to make your flagship school look good is OK because there must be something wrong with a metric that makes the school look bad.

Tuesday, July 30, 2013

The looting phase of education reform and the other Tony Bennett

[In response to Joseph's prod]

I realize regular readers must be getting tired of these stories (new readers can see why by searching this blog for "looting"), but it looks like we have to go over this one more time. When it comes to metric-based education reform:

1. There are numerous easy and effective ways of gaming the system;

2. There are huge financial and political incentives for gaming the system;

3. There are powerful advocates across the political spectrum (from David Brooks to Jonathan Chait and Matthew Yglesias) who can be relied upon to provide ample cover for those who game the system.

Under these circumstances, it would be shocking if we weren't seeing extensive cooking and out-and-out fraud. Still, even by the standards we've come to expect, this is really something.

From a truly impressive piece of investigative journalism by Tom LoBianco:

INDIANAPOLIS (AP) — Former Indiana and current Florida schools chief Tony Bennett built his national star by promising to hold “failing” schools accountable. But when it appeared an Indianapolis charter school run by a prominent Republican donor might receive a poor grade, Bennett’s education team frantically overhauled his signature “A-F” school grading system to improve the school’s marks.

Emails obtained by The Associated Press show Bennett and his staff scrambled last fall to ensure influential donor Christel DeHaan’s school received an “A,” despite poor test scores in algebra that initially earned it a “C.”

“They need to understand that anything less than an A for Christel House compromises all of our accountability work,” Bennett wrote in a Sept. 12 email to then-chief of staff Heather Neal, who is now Gov. Mike Pence’s chief lobbyist.

The emails, which also show Bennett discussed with staff the legality of changing just DeHaan’s grade, raise unsettling questions about the validity of a grading system that has broad implications. Indiana uses the A-F grades to determine which schools get taken over by the state and whether students seeking state-funded vouchers to attend private school need to first spend a year in public school. They also help determine how much state funding schools receive.

...

Bennett, who now is reworking Florida’s grading system as that state’s education commissioner, reviewed the emails Monday morning and denied that DeHaan’s school received special treatment. He said discovering that the charter would receive a low grade raised broader concerns with grades for other “combined” schools — those that included multiple grade levels — across the state.

“There was not a secret about this,” he said. “This wasn’t just to give Christel House an A. It was to make sure the system was right to make sure the system was face valid.”

However, the emails clearly show Bennett’s staff was intensely focused on Christel House, whose founder has given more than $2.8 million to Republicans since 1998, including $130,000 to Bennett and thousands more to state legislative leaders.

Bennett estimated that 12 or 13 schools benefited, not just Christel House, but the emails show DeHaan’s charter was the catalyst for any changes.

“The fact that anyone would say I would try to cook the books for Christel House is so wrong. It’s frankly so off base,” Bennett said in a telephone interview Monday evening.

Bennett rocketed to prominence with the help of former Indiana Gov. Mitch Daniels, former Florida Gov. Jeb Bush and a national network of Republican leaders and donors, such as DeHaan. Bennett is a co-founder of Bush’s Chiefs for Change, a group consisting mostly of Republican state school superintendents pushing school vouchers, teacher merit pay and many other policies enacted by Bennett in Indiana.

...

But trouble loomed when Indiana’s then-grading director, Jon Gubera, first alerted Bennett on Sept. 12 that the Christel House Academy had scored less than an A.

“This will be a HUGE problem for us,” Bennett wrote in a Sept. 12, 2012, email to [then-chief of staff Heather] Neal.

Neal fired back a few minutes later, “Oh, crap. We cannot release until this is resolved.”

By Sept. 13, Gubera unveiled it was a 2.9, or a “C.”

A weeklong behind-the-scenes scramble ensued among Bennett, assistant superintendent Dale Chu, Gubera, Neal and other top staff at the Indiana Department of Education. They examined ways to lift Christel House from a “C” to an “A,” including adjusting the presentation of color charts to make a high “B” look like an “A” and changing the grade just for Christel House.

It’s not clear from the emails exactly how Gubera changed the grading formula, but they do show DeHaan’s grade jumping twice.

...

Bennett said Monday he felt no special pressure to deliver an “A” for DeHaan. Instead, he argued, if he had paid more attention to politics he would have won re-election in Indiana.

Yet Bennett wrote to staff twice in four days, directly inquiring about DeHaan’s status. Gubera broke the news after the second note that “terrible” 10th grade algebra results had “dragged down their entire school.”

...

When Bennett requested a status update Sept. 14, his staff alerted him that the new school grade, a 3.50, was painfully close to an “A.” Then-deputy chief of staff Marcie Brown wrote that the state might not be able to “legally” change the cutoff for an “A.”

“We can revise the rule,” Bennett responded.

Over the next week, his top staff worked arduously to get Christel House its “A.” By Sept. 21, Christel House had jumped to a 3.75. Gubera resigned shortly afterward.

This is a big story for a number of reasons.

There's the scale of the thing.

There's the funding aspect; assuming something of a zero-sum arrangement, some schools had to be cheated out of some of the money that was coming to them.

There's the seemingly complete lack of integrity on the part of the Indiana Department of Education. Pressure to change a grading formula is one of the most common ethical challenges educators face. We all know the right thing to do in this situation, but it appears from the emails that no one in power seriously tried to hold the ethical line.

There's Bennett's position in the reform movement. Under his watch, Florida is pushing one of the most extreme reform agendas. Perhaps more troubling, even before the Indiana revelations came out, the Florida Department of Education had already been accused of cooking charter school results since he arrived.

Monday, July 29, 2013

Paging Mark P

Mark Thoma posted this today.

I think the whole idea of charter schools has some merit as a means of educational experimentation. But if this sort of cheating occurs, it makes it impossible to trust the data and that removes most of the benefit of being able to "let a thousand flowers bloom in the hopes that one will be especially amazing". It could be an isolated incident, but even a single case this egregious makes it much harder to trust that education reform is adhering to strict metrics.

[My response is here -- Mark P]