West Coast Stat Views (on Observational Epidemiology and more)

Thursday, February 10, 2011

I really ought to do a post on estimation and developing mathematical intuition one of these days

Till then, check out this decidedly cool offering from MIT's reliably cool Open Courseware (via DeLong):

18.098 / 6.099 Street-Fighting Mathematics

This course teaches the art of guessing results and solving problems without doing a proof or an exact calculation. Techniques include extreme-cases reasoning, dimensional analysis, successive approximation, discretization, generalization, and pictorial analysis. Applications include mental calculation, solid geometry, musical intervals, logarithms, integration, infinite series, solitaire, and differential equations. (No epsilons or deltas are harmed by taking this course.) This course is offered during the Independent Activities Period (IAP), which is a special 4-week term at MIT that runs from the first week of January until the end of the month.

The Principal Effect -- repost

[I'm working on a post about this New York Times story about not holding principals accountable for failing schools so I thought I'd use this to get the discussion started.]

When it comes to education reform, you can't just refer to the elephant in the room. It's pretty much elephants everywhere you look. There is hardly an aspect of the discussion where reformers don't have to ignore some obvious concern or objection.

The elephant of the moment is the effect that principals and other administrators have on the quality of schools. Anyone who has taught K through 12 can attest to the tremendous difference between teaching in a well-run and a badly-run school. In a well-run school, even the most experienced teacher will find it easier to manage classes, cover material, and keep students focused. All of those things help keep test scores up, as does the lower rate of burnout. For new teachers, the difference is even more dramatic.

On top of administrator quality, there is also the question of compatibility. In addition to facing all the normal managerial issues, teacher and principal have to have compatible educational philosophies.

As we've mentioned more than once on this site, educational data is a thicket of confounding and aliasing issues. That thicket is particularly dense when you start looking at teachers and principals and, given the concerns we have about the research measuring the impact of teachers on test scores, I very much doubt we will ever know where the teacher effect stops and the principal effect starts.

In the center, the National Review. On the right, the New Republic

Jim Manzi has an excellent column discussing proposed teacher evaluation metrics from a business perspective, a column that raises some of the same questions that teachers' unions have brought up. There's nothing particularly surprising about that -- Manzi is an intelligent man with a well-known independent streak. He's not going to disagree with a position just because he's a conservative.

Jonathan Chait dismisses Manzi's points with some sweeping generalities, completely ignores his point about fairness to the evaluated and ends up being significantly less sympathetic to the concerns of labor than Manzi. Sadly this is not surprising either. Chait is one of the most brilliant pundits we have but on the topic of education he combines intense feelings with an apparent lack of knowledge of the important research in the field. This has caused him to embrace certain popular narratives even when they lead him to conclusions that contradict his long-standing values.

But as unsurprising as the parts may be, when you put them together the strangeness of the current education debate just sweeps over you. Formerly right-wing positions like privatizing large numbers of schools or denying unions the right to protect workers from unfair termination are now dogma for much of the left. It has reached the point where when a writer for the National Review suggests, as part of a larger analysis, that teachers can have legitimate concerns about the reliability of the metrics used to evaluate them, the voice of the New Republic dismisses the possibility without even feeling the need to make an argument.

Even without the political role reversal, Chait's response is strange and oddly disengaged. Judge for yourself.

[I'm presenting these out of order for reasons that will be obvious]

1. You need some system for deciding how to compensate teachers. Merit pay may not be perfect, but tenure plus single-track longevity-based pay is really, really imperfect. Manzi doesn't say that better systems for measuring teachers are futile, but he's a little too fatalistic about their potential to improve upon a very badly designed status quo.

Argument by modifier with not one but two 'really's and a 'very' to sell the point. What he doesn't give is any kind of supporting evidence whatsoever. With millions of teachers and a small but thriving industry of think tanks digging up damning anecdotes, you can always find something negative to say, but Chait doesn't even bother coming up with a bad argument.

There's an odd, listless quality to the entire post. Chait is normally an energetic and relentless debater. Here he just goes through the motions. He doesn't even bother to proof his prose (I'm pretty sure he meant to say "the search...is futile"). He also makes a huge jump from the specific techniques Manzi is focusing on to "better systems." I'm pretty sure that Manzi believes better systems can improve the status quo; he just questions how big a role value-added metrics will play in those systems.

As for the case for longevity vs. value-added, I'll let Donald Rubin take it from here:

We do not think that their analyses are estimating causal quantities, except under extreme and unrealistic assumptions.

This is not to say that there isn't a case to be made for merit pay. I don't have any problem with rewarding teachers who do exceptional work, but the methods being discussed here are simply not the way to do it.

Chait's third point runs along similar lines:

3. In general, he's fitting this issue into his "progressives are too optimistic about the potential to rationalize policy" frame. I think that frame is useful -- indeed, of all the conservative perspectives on public policy, it's probably the one liberals should take most seriously. But when you combine the fact that the status quo system is demonstrably terrible, that nobody is trying to devise a formula to control the entire teacher evaluation process, and that nobody is promising the "silver bullet" he assures us doesn't exist, his argument has a bit of a straw man quality.

More argument by adverb and a strange double straw man (straw-straw man? straw straw man man?) continued from the soon-to-be-discussed point 2. The first 'nobody' is doubtful; Chait seems to jump from the fact that no state currently bases evaluations primarily on value-added metrics to the conclusion that no one is even looking into the possibility. The second 'nobody' is just plain wrong; many reform movement followers have so much faith in the silver bullet status of value-added metrics that they have seriously proposed firing more than half of our teachers based on that one number.

But the weirdest part came in point 2.

2. Manzi's description...
evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments.
..implies that quantitative measures are being used as the entire system to evaluate teachers. In fact, no state uses such measures for any more than half of the evaluation. The other half involves subjective human evaluations.

Argument by ellipses. Take a look at the whole paragraph:

Recently, Megan McArdle and Dana Goldstein had a very interesting Bloggingheads discussion that was mostly about teacher evaluations. They referenced some widely discussed attempts to evaluate teacher performance using what is called “value-added.” This is a very hot topic in education right now. Roughly speaking, it refers to evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments.

Manzi explicitly says "widely discussed attempts." Now, for the sake of comparison, check out the New York Times' similar wording:

A growing number of school districts have adopted a system called value-added modeling to answer that question, provoking battles from Washington to Los Angeles — with some saying it is an effective method for increasing teacher accountability, and others arguing that it can give an inaccurate picture of teachers’ work.
The system calculates the value teachers add to their students’ achievement, based on changes in test scores from year to year and how the students perform compared with others in their grade.

Manzi was perfectly clear with his wording and used language consistent with the New York Times' coverage. It was only by excerpting his paragraph mid-sentence that Chait was able to get even the suggestion of a distortion.

I have somewhat mixed feelings about Manzi's business-based approach. There are certain aspects of education that are, if not unique, then at least highly unusual and you have to be careful when drawing analogies (obviously the subject for another, much longer post). That said, all of his points about the way evaluations work are valid and useful.

This is not a bad place to start the debate.

[You can read Jim Manzi's somewhat bewildered reaction to Chait's column here.]

On the off chance that you ever wondered what "The Love Song of J Alfred Prufrock" would sound like if written by Rudyard Kipling...

Today would, amazingly enough, seem to be your lucky day.

Wednesday, February 9, 2011

Jim Manzi has some smart things to say about teacher evaluations

From the National Review (via Chait, but more on that later)

This seems like a broadly sensible idea as far as it goes, but consider that the real formula for calculating such a score in a typical teacher value-added evaluation system is not “Average math + reading score at end of year – average math reading score at beginning of year,” but rather a very involved regression equation. What this reflects is real complexity, which has a number of sources. First, at the most basic level, teaching is an inherently complex activity. Second, differences between students are not unvarying across time and subject matter. How do we know that Johnny, who was 20 percent better at learning math than Betty in 3rd grade is not relatively more or less advantaged in learning reading in fourth grade? Third, an individual person-year of classroom education is executed as part of a collective enterprise with shared contributions. Teacher X had special needs assistant 1 work with her class, and teacher Y had special needs assistant 2 working with his class — how do we disentangle the effects of the teacher versus the special ed assistant? Fourth, teaching has effects that continue beyond that school year. For example, how do we know if teacher X got a great gain in scores for students in third grade by using techniques that made them less prepared for fourth grade, or vice versa for teacher Y? The argument behind complicated evaluation scoring systems is that they untangle this complexity sufficiently to measure teacher performance with imperfect but tolerable accuracy.
Any successful company that I have ever seen employs some kind of a serious system for evaluating and rewarding / punishing employee performance. But if we think of teaching in these terms — as a job like many others, rather than some sui generis activity — then I think that the hopes put forward for such a system by its advocates are somewhat overblown.

There are some job categories that have a set of characteristics that lend themselves to these kinds of quantitative “value added” evaluations. Typically, they have hundreds or thousands of employees in a common job classification operating in separated local environments without moment-to-moment supervision; the differences in these environments make simple output comparisons unfair; the job is reasonably complex; and, often the performance of any one person will have some indirect, but material, influence on the performance of others over time. Think of trying to manage an industrial sales force of 2,000 salespeople, or the store managers for a chain of 1,000 retail outlets. There is a natural tendency in such situations for analytical headquarters types to say “Look, we need some way to measure performance in each store / territory / office, so let’s build a model that adjusts for inherent differences, and then do evaluations on these adjusted scores.”

I’ve seen a number of such analytically-driven evaluation efforts up close. They usually fail. By far the most common result that I have seen is that operational managers muscle through use of this tool in the first year of evaluations, and then give up on it by year two in the face of open revolt by the evaluated employees. This revolt is based partially on veiled self-interest (no matter what they say in response to surveys, most people resist being held objectively accountable for results), but is also partially based on the inability of the system designers to meet the legitimate challenges raised by the employees.

I found the point about techniques that hurt future performance particularly good. When I was teaching, how well a class would go was greatly influenced by how well previous teachers had done their jobs. Did the students understand the foundations? Did they have a good attitude to the material? Good work habits and study strategies?

Teachers want reliable evaluations not just because they want to be rewarded for good work but also because they want to see incompetent teachers identified so that those teachers can be encouraged to do better, given training to improve their performance or, should the first two fail, fired. What they object to is having their fates rest on a glorified roll of the dice.
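For readers who want to see the arithmetic, here is a minimal sketch of the simple gain formula that Manzi says is *not* what real value-added systems use (they use "a very involved regression equation"). The teacher names and scores are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean

def naive_gain_score(start_scores, end_scores):
    """Average end-of-year score minus average start-of-year score."""
    return mean(end_scores) - mean(start_scores)

# Hypothetical classes: (start-of-year scores, end-of-year scores).
classes = {
    "teacher_x": ([60, 70, 80], [66, 75, 87]),
    "teacher_y": ([40, 50, 60], [48, 57, 66]),
}

gains = {
    name: naive_gain_score(start, end)
    for name, (start, end) in classes.items()
}

for name, gain in gains.items():
    print(name, gain)
```

Nothing in this calculation knows about special needs assistants, last year's teacher, or next year's results, which is exactly the complexity the regression models are supposed to untangle.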

Michael Hiltzik on the Texas Miracle

Lots of good stuff in this comparison of the surprisingly similar fiscal woes of my native and adopted states. In particular, the following passage caught my eye:

Curiously, Texas' reputation as a low-tax, business-friendly state survives although its state and local business levies exceed California's as a percentage of each state's business activity (4.9% versus 4.7% in 2009, according to a report by the accounting firm Ernst & Young). What's different is that Texas business taxation relies more on property, sales and excise taxes and government fees than California, which relies on taxing corporate income.
Of course, one reason many business owners and executives favor Texas over California is that the Lone Star State doesn't have a personal income tax — a big deal when you're pulling in a Texas-size paycheck.
But self-interest aside, what's at stake from fiscal policy in both states is the same — the services and programs that really matter to business owners, such as functioning schools, high-caliber universities and serviceable transport infrastructure.
Even more important are the measures that point to public well-being. In many categories, California and Texas are closer together than either state's residents would probably find comforting.
But here are a few where they're not: Texas ranks 49th in the nation (that is, third worst) in teen births; California 22nd. In providing prenatal care to expectant mothers, Texas is dead last, California eighth. Texas ranks 34th in median family income, with $47,143; California 13th, at $56,852. This is the harvest of its "superior policies," and given the current budget crisis, it's bound to get worse. Miraculous.

What do you do when things are tight?

I was reading two different pieces today and I thought there was a really interesting link between them.

From Dana Goldstein:

While we're on the subject of Wisconsin, I find Scott Walker sort of terrifyingly simple-minded but charismatic. His education platform is basically Race to the Top plus vouchers while somehow massively cutting education budgets. (Huh?)

From Mark Thoma:

Local school districts have cut 154,000 education jobs since August 2008.

So my question is this: why is the push for excellence being connected with schemes to reduce manpower costs? If the argument is that education is a key priority then why are we not increasing funding for education? Instead we have the odd situation where the state wants education to improve while cutting expenses.

Usually when this contradiction shows up, the government is seeking cover for the decision to cut services. If cutting expenses also results in better outcomes then we are all better off, right? Or it could be an attempt to remove the more senior (and thus higher paid) teachers to minimize the impact of budgetary decisions that have already been made. But that is a different conversation, isn't it?

Now consider another area that the state runs that is in a similar position, namely the military. Is anybody seriously arguing that some soldiers do not pull their weight? That we could be more effective with a smaller force? After all, wasn't there a movie (Rambo, for example) where a single heroic special forces soldier was more effective than a brigade? But if the administration began talking about waste and cost effectiveness then you would be certain what they really wanted was cover for cuts. Now imagine they talked about those lazy soldiers who re-enlisted or who were only interested in rewards? Who needs a veterans administration when soldiers are fighting for principle and principle alone?

Would such cuts make sense? Either for the military or for education it is a matter of opportunity cost. But maybe the best conversation to have is one about the trade-off between the options. Taxes hurt economic growth but lack of education or defence can both lead to fairly bad long term outcomes. I am not sure where the balance is but I'd prefer to have the conversation openly. Pretending that test scores plus cuts will somehow improve education seems odd.

More efficient models of defense and education may both exist, but then the optimal path seems to be to show the efficiencies first and implement the cuts second.

New rule

Anyone who reprints one of Piet Hein's Grooks gets an automatic link.

An experiment in blogging -- the conclusion [reposted]

[I'm about finished with a longer post that refers to this topic so I decided to do a repost. I apologize to long time readers.]

When assessing a statement, sometimes it's useful to rephrase it in a more general way and see how well it holds up. I tried that with a passage I found in a popular blog (one of the very few I read every day). Where the author had referred to members of a specific profession I substituted in the word 'employees' except when talking about unions ('employees unions' seemed redundant). I also changed a couple of words for consistency, but other than that the passage was exactly the same.

The resulting paragraph (seen below) was much more extreme than I had expected and it got me to thinking, how would people react to this passage if they encountered it without all the baggage? I decided to post the generalized version with a brief explanatory note then give people a couple of days to think about it before filling in the details.

Here's the generalized passage:

If you concede that employers need to be able to fire bad employees, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help society. But most unions demonstrably make it very difficult to fire bad employees. That is currently a core function of unions, and something that must change. You're also going to need higher salaries to attract a better caliber employee into the workforce, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

And here is the passage Jonathan Chait (that's right, Jonathan Chait) originally posted in his blog:

If you concede that principals need to be able to fire bad teachers, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help education. But most teachers unions demonstrably make it very difficult to fire bad teachers. That is currently a core function of teachers unions, and something that must change. You're also going to need higher salaries to attract a better caliber teacher into the profession, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

There are obviously two possible responses Chait could make here (three if you count ignoring it entirely). He could say he agrees with the general statement or he could argue that teachers are a special case and should be granted less union protection than, say, policemen.*

Ironically, the more defensible position Chait can take here is the extreme one, namely that unions should not do anything to discourage employers from firing their members. It's not a position that most readers of the New Republic would embrace but, as a statement of personal belief, it is extraordinarily difficult to rebut.

If he tries to explain why teachers constitute a special case, he will have to deal with the data and in this particular debate, the numbers are not his friends. (It's worth remembering that Diane Ravitch started out on Chait's side. Her road to Damascus came when she realized she could no longer reconcile those views with what she was seeing in the research findings.)

Jonathan Chait can be a formidable debater but he has shown himself to be largely ignorant of the research behind these issues (no one at TNR even knew enough about PISA to catch the bait and switch in the intro to Waiting for Superman and in the education debate that's about as slow as the pitches get).

He'll be trying to punch holes in the findings of institutions like EPI and Rand and big guns in the field like Donald Rubin. He'll have to show precipitous educational decline without resorting to the aforementioned PISA (good test but absolutely meaningless in this context). He'll have to explain why schools that use his policies are more likely to underperform than to outperform unionized schools. He'll have to justify firing people based on metrics so volatile that a third of teachers in the top 20% could find themselves in firing range the next year, metrics based on data so confounded that "students’ fifth grade teachers were good predictors of their fourth grade test scores."

This is one time the smart money is on the other guys.

* Yes, we fire policemen. What we don't do is fire policemen based on unreliable metrics that are largely outside of the officers' control and are easily manipulated by their superiors.
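To get a feel for what "volatile" means here, the following is a toy simulation, not a reconstruction of any study cited above. It assumes each teacher has a fixed true effect and that each year's measured score is that effect plus independent noise; the noise level, the 20% and 40% cutoffs, and the function name are all assumptions chosen for illustration.

```python
import random

def share_of_top_falling_to_bottom(
    n_teachers=1000, noise_sd=2.0, top=0.2, bottom=0.4, seed=0
):
    """Fraction of year-1 top scorers who land in the year-2 bottom group."""
    rng = random.Random(seed)
    # Assumed model: a stable true effect plus fresh noise every year.
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    def rank_order(scores):
        # Teacher indices sorted from highest to lowest measured score.
        return sorted(range(n_teachers), key=lambda i: scores[i], reverse=True)

    order1 = rank_order(year1)
    order2 = rank_order(year2)
    top_year1 = set(order1[: int(top * n_teachers)])
    bottom_year2 = set(order2[n_teachers - int(bottom * n_teachers):])
    return len(top_year1 & bottom_year2) / len(top_year1)

noisy = share_of_top_falling_to_bottom(noise_sd=2.0)
noiseless = share_of_top_falling_to_bottom(noise_sd=0.0)
print(f"noisy metric: {noisy:.0%} of the top group falls to the bottom group")
print(f"noiseless metric: {noiseless:.0%}")
```

With no noise, nobody moves. As the noise grows relative to the true differences between teachers, more of last year's stars end up in this year's danger zone without anything about their teaching having changed.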

An experiment in blogging -- reposted

[I'm about finished with a longer post that refers to this topic so I decided to do a repost. I apologize to long time readers.]

This will just take a minute of your time.

What follows is a passage from a popular blog, rewritten slightly to make it more general but otherwise unchanged. I'll post the original quote with some comments Monday or Tuesday. [later today for the repost -- Mark]

I'd appreciate it if you would take a look at this and give some thought both to the arguments proposed and to the larger belief system they suggest, then come back in a few days and see what effect learning the context has had on your initial impressions.

Thanks.

If you concede that employers need to be able to fire bad employees, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help society. But most unions demonstrably make it very difficult to fire bad employees. That is currently a core function of unions, and something that must change. You're also going to need higher salaries to attract a better caliber employee into the workforce, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

I welcome comments but please don't include the source of the passage. Obviously that would undercut the point of the experiment.

Tuesday, February 8, 2011

Catching up with Dana Goldstein

I'm going to try to comment on each of these individually later. Feel free to beat me to the punch.

How Politically Astute is Michelle Rhee?

The Revival of the Private School Voucher Movement

Wisconsin Teachers' Union Prepares for Battle with GOP Gov

If I'm going to compare H-1Bs and serfdom, I should at least find a British guy for the video

[Embedded video: The Daily Show With Jon Stewart, "Olivers on the Strike" (www.thedailyshow.com)]

"If you don't do well in school, your descendants could grow up to be Morlocks"

OK, maybe it's not that bad, but Berkeley professor Claude Fischer still paints a grim picture. (via Thoma, of course.)

Degree inequality

It is now generally understood that economic inequality has expanded greatly since about 1970. (Well, there are exceptions. For a couple of decades, some commentators denied that economic inequality was growing, claiming that it was all a statistical illusion. A few holdouts against reality may remain.) Now the debate has shifted to what – if anything at all – should be done about inequality.

Most of that discussion has been about income inequality. Between 1979 and 2007, the one-fifth of American households with the highest income experienced a roughly 100% increase in their annual, inflation-adjusted, after-tax income (280% [!] for the highest one percent of households); the middle one-fifth got about 25% more income; and the poorest one-fifth got about 15% more (see pdf). For wealth – property, stocks, and the like – the gap is enormously greater and has also widened over the last few decades.

Less discussed is the widening college degree gap. Yet its implications go considerably beyond money, to widening differences in life experiences and ways of life. (I draw in particular on the work of my colleague, Michael Hout, notably here [pdf], and on two books we wrote together, here and here.)

Fischer follows this with a number of troubling statistics. I found this part particularly striking:

Even more is happening along the education gap: Increasingly, college graduates marry college graduates and live among college graduates. Increasingly, Americans group by education and their ways of life diverge by education.

Although the trends are complex (see here), Americans today are likelier to marry people of the same educational level as themselves than was true decades ago. Some of this development results from educated men increasingly marrying educated women; for example, the lawyer who married his secretary is now a lawyer who marries another lawyer. And some of this change is due to poorly-educated men becoming ineligible as spouses; drop-outs can no longer support families on brawn alone.

Then there is residential separation: A study by Thurston Domina (pdf) shows that college graduates are concentrating in some metropolitan areas (San Francisco and Raleigh-Durham, for example) and seem to be avoiding others (Indianapolis and Las Vegas, for example) and also that neighborhood segregation by college education grew substantially between 1970 and 2000. It grew faster than segregation by income, even as segregation by race declined. Another study documents how the highly-educated are concentrating in the downtowns of the most booming cities. And a recent story reported that these degree-holders are starting to raise their children in center cities — even in Manhattan. Thus, enclaves of the highly-educated are growing in chic, gentrified, non-smoking neighborhoods, while the less educated move to the scraggly, sprawling suburbs of stagnating cities.

What is less clear, although certainly plausible, is that this widening separation carries along with its economic and social divisions, a widening gap in values and ways of life: two different Americas, divided by educational attainment.

I would find the use of just desserts as a justification for policy more palatable if we weren't seeing an alarming decline in social mobility.

Krugman does stat 101

Nothing fancy, no big insights, just a couple of bell curves, but this post by Paul Krugman does a nice job presenting weather trends and extreme events in terms of probability. He makes it simple but not overly simplified. We need more of this.

Krugman also deserves bonus points for describing economists' practice of putting the independent variable on the wrong axis as a QWERTY problem.
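The bell-curve argument is easy to check for yourself. The sketch below uses made-up numbers, not figures from Krugman's post: it asks how the chance of exceeding a fixed "extreme" threshold changes when a normal distribution's mean shifts by half a standard deviation.

```python
import math

def upper_tail(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) for a normal distribution with the given mean and sd."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical setup: "extreme" means more than 2 sd above the old mean.
baseline = upper_tail(2.0)            # before the shift
shifted = upper_tail(2.0, mean=0.5)   # after the mean moves up by 0.5 sd
ratio = shifted / baseline

print(f"before: {baseline:.4f}  after: {shifted:.4f}  ratio: {ratio:.1f}x")
```

Under these assumptions the tail probability goes from roughly 2.3% to roughly 6.7%, so a modest shift in the average nearly triples the frequency of events past the old threshold.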

Semi-serfdom

What are we going to do with Paul Krugman? He writes clear and insightful articles on complex economic issues, then he gives them titles like "Serf's Up" (circa 2003):

Here's the puzzle. In Europe circa 1100, with population scarce, serfdom was useful to the ruling class. By 1300 it wasn't, and had been allowed to drift away. But after 1348 it should have been worthwhile again. Yet it wasn't effectively reimposed. There were attempts to restrain wages and limit labor mobility, as well as attempts to tax the peasants (Wat Tyler's rebellion fits into all this.) But all-out feudalism didn't return. Why?

And an even bigger question: why hasn't indentured servitude made a comeback in the modern era? Yes, I know, human rights and all that - but if it was profitable to have indentured servants in the modern world, I'm sure that Richard Scaife's think tanks would have no trouble finding justifications, and assorted Christian groups would explain why it's God's will.

Though the analogy's not perfect, we do have an institution that restrains wages and limits labor mobility. It's called an H-1B visa and though I know of no Christian groups trying to explain why it's God's will, it certainly has plenty of think tanks finding justifications for it.