Wednesday, February 16, 2011

Forget Jeopardy, show me a computer that can play Eleusis

The odd thing about the much-publicized Jeopardy match between humans and IBM's Watson is how differently the two sides are challenged by the game. Arguably the hardest part for the human players, acquiring and retaining information, is trivial for the computer, while what is certainly the hardest part for Watson, understanding everyday human language, is something almost all of us master as young children.

Natural language processing continues to chug along at a respectable pace. Things like Watson and even Google Translate represent remarkable advances. Still, they hardly seem like amazing advances in artificial intelligence. I'm not going to worry about the rise of the machines until they start beating us at games like Robert Abbott's Eleusis.

Abbott's game (old Eleusis -- you can buy a booklet of rules for the updated game from Mr. Abbott himself) made its national début in the Second Scientific American Book of Mathematical Puzzles and Diversions by Martin Gardner. It's easy to play but a bit complicated to score (not unnecessarily complicated -- there's a real flash of insight behind the process).

The dealer (sometimes referred to as 'Nature' or 'God' for what will be obvious reasons) writes a rule like "If the card on top is red, play a black card. If the card on top is black, then play a red card." on a piece of paper, then folds it and puts it away. The dealer then shuffles the deck, randomly selects a card, puts it face up in the center of the table, then deals out the rest evenly to the players (the dealer doesn't get a hand). If the number of cards isn't divisible by the number of players, the extra cards are put aside.

The first player selects any card from his or her hand and puts it on top of the starter card. Based on the hidden rule, the dealer says 'right' and the card stays on the pile or says 'wrong' and the card (called a mistake card) goes face up in front of the player. The players continue in turn.

The object for players is to have as few mistake cards as possible. The object for the dealer is to have the largest possible range in players' scores.

At the end of the first hand, the score is calculated for the dealer. The scoring method is clever but a bit complicated. For n players (excluding the dealer), have each player count his or her mistake cards, then multiply the smallest number by n-1 and subtract the product from the total number of mistake cards in front of the other players. For example, if there were four players with 7, 2, 9 and 8 mistake cards, you would multiply 2 (the lowest) times 3 (n-1) and subtract that from 24 (the sum of the rest), giving the dealer 18 points.
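As a quick check on the arithmetic, here is a minimal Python sketch of the dealer's score as described above. The function name `dealer_score` is only illustrative, and the handling of ties for the lowest count (one tied player is treated as the low scorer, the rest count among "the other players") is an assumption, since the description doesn't spell it out.

```python
def dealer_score(mistakes):
    """Dealer's score from each player's mistake-card count (dealer excluded)."""
    n = len(mistakes)
    lowest = min(mistakes)
    # Cards in front of the other players: everyone except one lowest-count player.
    # Assumption: if several players tie for lowest, only one is set aside.
    others_total = sum(mistakes) - lowest
    return others_total - lowest * (n - 1)


# The example from the text: 24 - 2 * 3
print(dealer_score([7, 2, 9, 8]))
```

Note what the formula rewards: if every player ends up with the same number of mistake cards, the dealer scores nothing, no matter how hard or easy the rule was.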

In the second stage, the players take turns selecting cards from their mistake pile (leaving them face up so that other players can see what has been rejected). Play continues until someone goes out or until the dealer sees that no more cards can be accepted. At that point the rule is revealed.

Players' scores are then calculated with a formula similar to the one used for the dealer: each player multiplies his or her mistake cards by n-1, then subtracts the product from the total of the other players' mistake cards. If the difference is negative, the score is zero. The player who goes out first, or who has the fewest cards if no one goes out, gets an additional six points.
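The players' formula can be sketched the same way. This is a rough Python illustration, not an official scorer: `player_scores` and `bonus_player` are made-up names, the counts are taken to be each player's mistake cards at the end of the second stage, and adding the six-point bonus after the zero floor is applied is my assumption.

```python
def player_scores(mistakes, bonus_player=None):
    """Score each player from the final mistake-card counts (dealer excluded).

    bonus_player is the index of the player who went out first (or who holds
    the fewest cards if nobody went out); the caller decides who that is.
    """
    n = len(mistakes)
    total = sum(mistakes)
    scores = []
    for own in mistakes:
        difference = (total - own) - own * (n - 1)
        scores.append(max(difference, 0))  # a negative difference scores zero
    if bonus_player is not None:
        scores[bonus_player] += 6  # assumption: bonus is added after the floor
    return scores


print(player_scores([7, 2, 9, 8], bonus_player=1))
```

With counts of 7, 2, 9 and 8, only the player holding two cards ends up with a positive difference, which shows how steeply the formula punishes anyone above the average.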


While most 'new' games are actually collections of old ideas with new packaging, Abbott managed to come up with two genuinely innovative ideas for Eleusis: the use of induction and the scoring of the dealer. As someone who has spent a lot of time studying games, I may be even more impressed with the second. One of the fundamental challenges of game design is coming up with rules that encourage strategies that make the game more enjoyable for all the players. In this case, that means giving the dealer an incentive to come up with rules that are both challenging enough to stump some of the players and simple enough that someone will spot the pattern.

Eleusis has often been used as a tool for teaching the scientific method. You recognize a pattern, form a hypothesis, and test it. Gardner discusses this analogy at length. At one point, he even brings William James and John Dewey into the conversation.

The New York Times said that Robert Abbott's games were "for lovers of the unfamiliar challenge." Any AI people out there up to that challenge?

Tuesday, February 15, 2011

Notice anything funny?

Felix Salmon makes the catch.

Hint: always check out the x-axis.

"Looking under the lamp post"

People are hard-wired for convergent behavior. We instinctively imitate, we constantly reset our social norms and we love to feel included. One of the best ways to achieve this feeling of inclusion is to talk about what everyone else is talking about.

When you combine this natural desire to join the conversation with the constant pressure on journalists to come up with an angle for every story, the result is a natural tendency to converge on a standard narrative, particularly if this narrative plays off another hot topic.

Case in point, the recent events in Egypt have been repeatedly described in terms of social networks, with references to Facebook, Wikipedia and Twitter. Searching for the phrase 'egypt "revolution 2.0"' produces almost ninety thousand results. (If you hear the term 'two point oh' used non-sarcastically to describe anything other than a new product release, you can be pretty sure the speaker is trying to feel included.)

But did revolutionaries friending each other and texting on their cell phones really contribute to the fall of the government? According to this excellent post over at Whimsley (via Thoma, of course), the answer is yes, but not as much as you might think.

The easiest people to talk to

Most obviously, it is much easier to talk to English speaking participants than non-English speakers. English speakers are far more likely to be part of the one-fifth or so of the country that has access to the Internet. (World Bank Development Indicators). And it is easy to contact people over the Internet, so we hear from people who are on the Internet. It is easy to follow Twitter feeds, so we hear Egyptian tweets.

The easiest story to tell

It isn't just the sources, though. The Facebook Revolution narrative is an interesting story to tell to a contemporary Western audience. For us, a story built around the familiar yet novel world of Facebook and social media is an easy way into the Egyptian rebellion. How many of us know much about the specifics of Egypt's history, its recent past, or the economic sources of discontent? It is a much quicker and lighter story to say "look at the Facebook page." We can even go and look at it ourselves (>>). Talking about strikes is more likely to lose an audience.

So every time prominent activist Wael Ghonim is mentioned, he is described as a "Google executive Wael Ghonim" even though he has explicitly said that "Google has nothing to do with this" (>>). Do we hear the employer of any of the other leaders? April 6 Movement founders Asmaa Mahfouz, Ahmed Maher and Ahmed Salah are commonly described as "activists". It is possible to track down Maher's occupation as a "civil engineer", but with no employer. The discrepancy is glaring, and so Google gets to be associated with the uprising, adding to the digital tone of the story.

Underreported players

As people look back for the roots of the rebellion, the April 6 Movement and the We Are Khaled Said Facebook page have received much of the attention. But there are other strands that fed into the protests. The April 6 Movement was created to commemorate an industrial strike, after all, at a textile factory. There have been more than 3,000 separate labour protests in Egypt since 2004 according to a report by the AFL-CIO. The Kefaya movement is considered by some experts to be a central organizer of the January 25 protests, along with Mohamed ElBaradei's organization (two-minute video with Samer Shehata).

An interesting perspective

Matt Yglesias makes a good point:

And on the politics, it’s a mess. Right now we have conservatives simultaneously calling for huge spending cuts and also getting the lion’s share of old people’s votes even while the vast majority of non-security spending is on old people. In essence, by first separating the domestic budget into “discretionary” and “entitlement” portions and then dividing the entitlement programs up into “what today’s old people get” versus “what tomorrow’s old people will get” the political class has created a large and vociferously right-wing class of people who are completely immune from the impact of their own calls for fiscal austerity. In my view, that reality is the biggest driver of our current political dysfunction.


I had not thought about things like this but it is a really good point. I dislike the idea of revising benefit levels because people plan their lives around these benefits and it seems unfair to change things mid-stream.

However, I had completely overlooked the political point involved. Social Security, Medicare and CHIP are 41% of the budget. Veterans and retirees are another 7%. That puts about half of the budget toward people over 55 or 60 years of age.

So I think I agree that this is a better point than the moral one. The conversation about the budget becomes a lot more sane if there is not a "protected class" of citizens. It's not a conclusion I like but I think it might be correct.

"Human see; human do."

There was a fascinating interview on NPR's Fresh Air earlier today. I particularly enjoyed this section:
If you're just joining us, we're speaking with V.S. Ramachandran. He is a behavioral neurologist and author of the new book "The Tell-Tale Brain: A Neural Scientist's Quest for What Makes Us Human."

You write a lot about mirror neurons and the role that they played on our evolution. You want to just tell us a little bit about that?

Dr. RAMACHANDRAN: Well, mirror neurons were not discovered by us, obviously. They were discovered by Giacomo Rizzolatti in Parma, Italy, and his colleagues. And what they refer to is in the front of the brain, the motor and pre-motor cortex, there are neurons that issue commands to your hands and other parts of your body to perform specific actions, semi-skilled actions, skilled actions or even non-skilled actions. So these are motor-command neurons which orchestrate specific sequence of muscle twitches for you to reach out and grab a peanut, for example, or put it in your mouth.

What Rizzolatti and his colleagues found was some of these neurons, as many as 20 percent or 30 percent, will fire not only when - let's say I'm measuring mirror neuron activity in your brain. So when you reach for a peanut, these neurons fire. But the astonishing thing is these neurons will also fire when you watch me reaching for a peanut so these are promptly dubbed mirror neurons for obvious reasons. So it's as though your brain is performing a virtual reality simulation of what's going on in my brain, saying, hey, the same neuron is firing now when he's doing that as would fire when I reach out and grab a peanut, therefore, that's what that guy's up to.

He's about to reach out and grab a peanut. So it's a mind-reading neuron. It's essential for you seeing other people as intentional beings who are about to perform certain specific intended actions.

DAVIES: And that might have helped us learn from one another and thereby advanced culturally far beyond our...

Dr. RAMACHANDRAN: That's correct. That's the stuff - that's kind of an obvious behind-site, but that's the claim I made, oh, about 10 years ago in a website run by Brockman called "Edge." And what I pointed out was - and others have pointed this out, too, is that mirror neurons obviously are required for imitation and emulation. So if I want to do something complicated that you're doing and I want to imitate it, I have to put myself in your shoes and view the world from your standpoint. And this is extremely important.

It seems like something trivial, you know, mimicry, but it's not. It's extremely important because imitation is vital for certain types of learning, rudimentary types of learning. These days you learn from books and other things, but in the early, early days when hominids were evolving, we learned largely from imitation. And there's a tremendous acceleration of the evolutionary process. What I'm saying is maybe there are some outliers in the population who are especially smart simply because of genetic variation, who have stumbled, say, accidentally on an invention, like fire or skinning a bear.

Without the mirror neuron system being sophisticated, it would have died out, fizzled out immediately. But with a sophisticated mirror neuron system, your offsprings can learn that technique by imitation so it spreads like wild fire horizontally across a population and vertically across generations. And that's the dawn of what we call culture and therefore, of civilization.

Monday, February 14, 2011

Google can make you disappear

The SEOs may have it coming but this is still creepy:

Interviewing a purveyor of black-hat services face-to-face was a considerable undertaking. They are a low-profile bunch. But a link-selling specialist named Mark Stevens — who says he had nothing to do with the Penney link effort — agreed to chat. He did so on the condition that his company not be named, a precaution he justified by recounting what happened when the company apparently angered Google a few months ago.

“It was my fault,” Mr. Stevens said. “I posted a job opening on a Stanford Engineering alumni mailing list, and mentioned the name of our company and a brief description of what we do. I think some Google employees saw it.”

In a matter of days, the company could not be found in a Google search.

“Literally, you typed the name of the company into the search box and we did not turn up. Anywhere. You’d find us if you knew our Web address. But in terms of search, we just disappeared.”

The company now operates under a new name and with a profile that is low even in the building where it claims to have an office. The landlord at the building, a gleaming, glassy midrise next to Route 101 in Redwood City, Calif., said she had never heard of the company.

USA Today has some bad graphs but at least it's not the New York Times

The following quote was included in one of Andrew Gelman's recent posts:
Is this the worst infographic ever to appear in NYT? USA Today is not something to aspire to.
This strikes me as deeply unfair to USA Today. The paper has certainly run its share of bad graphs but these take things to a new level. It is as if the NYT used illustrations from "How to Lie with Statistics" as a starting point and then tried to top them.

Here's the "View of the U.S." where the lower the icon is, the higher its approval.



And here's the "U.S. Pakistan Policy" where the scrolls are arranged so you can't really compare their sizes (I initially thought they were going for some depth effect).

And here's the "Greatest Threat" which takes Huff's height/volume examples to the next level by using images of different shapes and densities.

Finally there's this amazing piece of work:

Just glancing at this you would probably conclude that the amount of blue in the circles corresponds to percentage in agreement. For example, looking at the middle circle you'd assume that almost all of those surveyed were in disagreement. You'd be wrong. More agreed than disagreed. (This was also noted by one of the commenters on Gelman's site.)

While they don't quite match this, these graphs may be the worst we've seen from a major paper in recent memory.




[adapted in part from a comment I left on Andrew Gelman's site]

Great moments in metawork

As a footnote to this post, I once spent an entire meeting (at a corporation that shall remain nameless) writing a team mission statement based on the intro to Star Trek. It consisted of lines like this:

"To seek out new data and new analytic techniques."

The attendees were all experienced modellers and data miners, some fairly high ranking with commensurate salaries. Everyone in that room had something else they needed to be doing and, except for the senior manager present, I doubt that anyone present saw any real value in the exercise. Still, word had come down from the top that every distinct subgroup in the company needed its own mission statement so there we were, boldly splitting that famous infinitive one more time.

On the bright side, at least this was one time we didn't have to have a pre-meeting.

"The Economics of Blogging and The Huffington Post"

After the election season, my regular visits to FiveThirtyEight tapered off then simply came to a stop.

That might have been a mistake on my part.

(thanks again to Felix Salmon)

Concerns with data driven reform

Dean Dad has a post on Achieving the Dream, which is intended to improve outcomes at community colleges. Two of his commenters had really interesting insights. Consider mathguy:

Consider the effect of No Child Left Behind. I've seen a noticeable decline in basic math skills of students of all levels in the last 5 years. Every year, I will discovered a new deficiency that was not seen from the previous years (we are talking about Calculus students not able to add fractions). Yet NCLB was assumed to be "working" since the scores were going up. It seems that K-12 was devoting too much time preparing the students for tests, at the cost of killing students' interest in math, trading quality instruction for test-taking skills. Is NCLB a factor in the study? Are socio-economic factors examined in the study?


or CC Physicist who stated:

I look at what Asst Prof wrote as an indication that a Dean, chair, and mentor didn't do a good job of getting across the history of assessment. Do you know what "Quality Improvement" program was developed a decade earlier, and what the results were of the outcomes assessment required from that round of reaffirmation of accreditation? Probably not, since we have pretty good communication at our CC but all the negative results from our plan were swept under the rug. The only indication we had that they weren't working was the silent phase-out of parts of that plan. Similarly, data that drove what we did a decade ago were not updated to see what has changed.


I think these two statements capture, very nicely, the main issue I have with the current round of educational reform. One, if you make meeting a specific metric (as a measure of an underlying goal) a high enough priority, then people will focus on the metric and not the actual goal. After all, if you don’t, then your name could be posted in the LA Times along with your underperformance on the stated metric. So we’d better be sure that the metric that we are using is very robust in its relation to the underlying goal. In other words, that it is a very good representation of the curriculum that we want to see taught and measures the skills we want to see students acquire.

Two, trust in evidence-based reform requires people to be able to believe the data. This is one area where medical research is leaps and bounds ahead of educational research. A series of small experiments is attempted (often randomized controlled trials) while the standard of care continues to be used in routine patient care. Only when the intervention shows evidence of effectiveness in the trial environment is it translated into routine care.

In education, such trials are rare indeed. Let us exclude natural experiments for the moment; if we care enough to change the education policy of a country and to violate employment contracts then it’s fair to hold ourselves to a high standard of evidence. After all, the lotteries (for example) are not a true experiment and it’s hard to be sure that the lottery itself is completely randomized.

The problem is that educational reforms look like “doing something”. But what happens if the reforms are either counterproductive or ineffective? (Implementing an expensive reform that does nothing has a high opportunity cost.) The people implementing the reforms are often gone in five to ten years but the teachers (at least now while they have job security) remain to clean up the wreckage afterwards.

I think that this links well to Mark's point about meta-work: it's hard to evaluate the contributions of meta-work so it may look like an administrator is doing a lot when actually they are just draining resources away from the core functions of teaching.

So when Dean Dad notes: “Apparently, a national study has found that colleges that have signed on to ATD have not seen statistically significant gains in any of the measures used to gauge success.” Why can’t we use this evidence to decide that the current set of educational reform ideas isn’t necessarily working well? Why do we take weak evidence of the decline of American education at face value and ignore strong evidence of repeated failure in the current reform fads?

Or is evidence only useful when it confirms our pre-conceptions?

Metawork

A business analyst I used to work with had a theory about metawork. His definition of the term was work about work. He cited HR departments as the classic example.

As he liked to explain it, metawork is not, in and of itself, a bad thing. A certain amount is necessary for a well-functioning organization. It's not unusual for new companies to fail because of an overly rich work to metawork mixture.

But, my friend went on, metawork is like a gas -- it expands to fill all available space, both because it's easy to create metawork projects and because those projects can often be stretched to whatever time is available to them (you can always schedule an extra meeting). Furthermore, once it has established a foothold, it has a way of becoming part of the corporate culture.

There are also other reasons why companies tend to grow more metawork heavy as they mature and expand:

Major metawork initiatives tend to be top down (no customer ever said, "I like this company's products but I have a feeling they aren't having enough team-building seminars."). From a career standpoint, it is always a good idea to give a high priority to projects that people above you consider important;

Metawork projects almost always sound good. They have impressive sounding goals like improving efficiency, raising morale, making the company more nimble and responsive, or moving to data-driven strategies (more on that one in future posts). They suggest big-picture, forward-thinking approaches that make fixing problems like billing glitches seem prosaic, perhaps even trivial;

Metawork tends to be safer than the other kind. Let's say a company launches two big and badly-conceived initiatives, a new product launch and a 'data-driven' reworking of the project management process. The product sells badly and the new process eats up man hours without making projects run faster or smoother. Both end up costing the company about the same amount of money, but the product's failure is public and difficult to ignore while the process's failure is internal and can be denied with some goalpost moving and willing suspension of disbelief (something that's easy to generate for a VP's pet project).

As mentioned before, metawork isn't all pre-meetings and mission statements. Some kinds of metawork are essential (payroll comes to mind). Other kinds can help a company improve its profitability and stability (like employee morale studies in a labor-intensive industry with high turnover). Employees can be resistant to some of these good, important initiatives, but it's worth keeping a couple of facts in mind:

There is a lot of bad metawork out there;

The employees who most resent doing metawork are often the employees who are doing the most of the other kind of work.

Sunday, February 13, 2011

Tiger moms are nothing...



Click for the full strip.

One more simple game for your weekend

You have been invited to play a dice game with Pierre and Blaise. The game is played with three dice marked as follows:

Die A {2,2,4,4,9,9}
Die B {1,1,6,6,8,8}
Die C {3,3,5,5,7,7}

The game has three rounds. First you roll against Pierre, then you roll against Blaise, then Pierre and Blaise roll against each other. The winner of each round is the one who rolls the higher number. The overall winner is the player who wins the most rounds.

Which die should you choose?

[Here's the relevant link (try not to look at the status bar -- the address gives away too much). It's a fun, trivial oddity but it raises some interesting questions about how numbers we trust can do unexpected things. More on that later.]
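If you would rather compute than click (spoiler warning: running this gives the game away just as surely as the status bar does), here is a small Python sketch that counts, for each pair of dice, how often one beats the other over all 36 equally likely outcomes. The names `DICE` and `win_probability` are just illustrative.

```python
from fractions import Fraction
from itertools import product

DICE = {
    "A": (2, 2, 4, 4, 9, 9),
    "B": (1, 1, 6, 6, 8, 8),
    "C": (3, 3, 5, 5, 7, 7),
}


def win_probability(first, second):
    """Exact probability that a roll of `first` is higher than a roll of `second`."""
    wins = sum(1 for x, y in product(first, second) if x > y)
    return Fraction(wins, len(first) * len(second))


# No number appears on two different dice, so a round can never be tied.
for a, b in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(f"P({a} beats {b}) = {win_probability(DICE[a], DICE[b])}")
```

Exact fractions are used instead of floats so the three results can be compared directly.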

Saturday, February 12, 2011

Weekend Gaming -- Hexagonal Chess

I don't have any data to back this up, but I've always thought that the bulk of the benefits from learning a game -- improving problem solving, pattern recognition, strategic thinking -- come toward the end of the beginner stage, just after the rules are internalized. If that's true (and maybe even if it isn't), you might be able to extend that period of intense learning by modifying a game so that old rules are seen in new ways.

Case in point, Gliński's hexagonal chess.

Gliński's chess variant is hugely popular in Europe (more than 100,000 sets have been sold). You can get the rules at my Kruzno site, but you can probably figure most of them out for yourself. The only pieces that might give you trouble are the bishops and, to a lesser extent, the knights.


Bishops come in three colors, which points out an interesting topological feature of a hexagonal grid which I'm betting you can spot for yourself.
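If you'd like to check your guess (mild spoiler ahead), here is a short Python sketch. It assumes axial coordinates (q, r) for the cells, a convention of mine rather than anything in the rules, and it assumes the bishop travels along the hex-grid diagonals, meaning the lines that pass through a vertex between two neighboring cells. The `color` function is one possible three-coloring, not an official one.

```python
# Steps to the six cells sharing an edge with the current cell (axial coordinates).
ADJACENT = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
# Steps along the six diagonals, assumed here to be the bishop's directions.
DIAGONAL = [(2, -1), (1, 1), (-1, 2), (-2, 1), (-1, -1), (1, -2)]


def color(q, r):
    """One of three colors (0, 1 or 2) for the cell at axial coordinates (q, r)."""
    return (q - r) % 3


# Every step shifts q - r by a fixed amount, so checking from (0, 0) covers all cells.
start = (0, 0)
for name, steps in (("adjacent", ADJACENT), ("diagonal", DIAGONAL)):
    changes = [color(start[0] + dq, start[1] + dr) != color(*start) for dq, dr in steps]
    print(f"{name} steps change color: {changes}")
```

Under these assumptions, every edge-sharing neighbor has a different color while every diagonal step lands on the same color, so a bishop never leaves the color it starts on.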

It's a strange and intriguing game and yet another reason why every house should have a hexboard.

Friday, February 11, 2011

"Why the Efficient Market Hypothesis (Weak Version) Says Nothing about the Ability to Identify Bubbles"

I found this post by Peter Dorman interesting for a couple of reasons. First because it was, well, interesting -- it had something insightful to say about an important subject -- and second because it took a question that is normally framed in terms of arguing assumptions (are markets efficient?) and showed that the question had nothing to do with those assumptions.
Let’s put aside the possibility that even the weak EMH can be wrong from time to time. We don’t need to go there; the error is more basic than this.

Let’s put ourselves back in 2005. It is two years before the unraveling of the financial markets, but I don’t know this; all I know is what I can see in front of me, publicly available 2005 data. I can look at this and see that there is a housing bubble, that prices are rising far beyond historical experience or relative to rents. The “soft” warning signs are all around me, like the explosion of cheap credit, the popularity of credit terms predicated on ever-rising prices, and the talk of a new era in real estate. Based on my perceptions, I anticipate a collapse in this market. What can I do?

If I am an investor, I can short housing in some fashion. My problem is that I have no idea how long the bubble will go on, and if I take this position too soon I could lose a bundle. In fact, anyone who went short in 2005 and passed on the following two years of price frothery grossly underperformed relative to the market as a whole. Indeed, you might not have the liquidity to hold your position for two long years and could end up losing everything. Of course, it is also possible that the bubble could have burst a year or two early and your bets could have paid off. What the EMH tells us is that, as an investor, not even your prescient analysis of the fundamentals of the housing market would enable you to outperform more myopic investors or even a trading algorithm based on a random number generator.

The logical error lies in confusing the purposes of an investor with those of a policy analyst. Suppose I work for the Fed, and my goal is not to amass a personal stash but to formulate economic policies that will promote prosperity for the country as a whole. In that case, it doesn’t much matter whether the bubble bursts in 2006, 2007 or 2010. In fact, the longer the bubble goes on, the more damage will result from its deflation. At the policy level, the relevant question is whether trained analysts, assembling data and drawing on centuries of experience in financial manias, can outperform, say, tarot cards in identifying bubbles. The EMH does not defend tarot.

To profit from one’s knowledge of a market condition one needs to be able to outperform the mass of investors in predicting market turns, which the EMH says you can’t do. Good policy may have almost nothing to do with the timing of market turns, however.

The Good Principal Principle

This looks promising. While it is difficult to build a model for good management, this still strikes me as a simpler problem than modelling effective teaching. For one thing, K through 12 teaching includes a large management component. More importantly, the data on teachers runs into some nasty nesting issues. Things are much cleaner when you go up a level.

Yet each school year thousands of principals beat the odds and do excel, women and men who love their leadership positions, relish the challenges and take pride in running schools that perform well year after year. Who are these people? And what are they doing that so many others aren’t?

“We know that principals matter for a school’s success, but we don’t know much about why and how they matter,” says Jason Grissom, an assistant professor of public affairs in MU’s Truman School of Public Affairs. Grissom and Susanna Loeb, a professor of education at Stanford University, are working to provide answers, thanks in part to a $1 million grant from the Institute of Education Sciences, the self-described “research arm” of the U.S. Department of Education.

“Our goal at the end of this study is to be able to offer some tangible recommendations for making principals more effective in terms of improving student outcomes,” Grissom says. “We’re excited about the proposal because it’s pretty ambitious. The kind of study we’ve proposed is essentially the first of its kind.”

At least we've gotten more efficient at some things

From Wikipedia:

I Am Number Four is a young adult science fiction novel by Pittacus Lore, the pen name of authors James Frey and Jobie Hughes. The book was published by HarperCollins on August 3, 2010,[1] and has currently spent 6 weeks on the children's chapter of The New York Times Best Seller list.[2]

DreamWorks Pictures bought the rights to the film in June 2009; it will be released on February 18, 2011. The novel is the first of a proposed six-book series.[3]

Back in the old days, you used to have to publish one before they started the film series. By way of comparison, it took James Bond nine years, ten best sellers and the implied endorsement of JFK before it made it to the screen.

E-verify

This article linked by Thoreau is pretty frightening. The main issue seems to be the combination of error rate and lack of transparency. Two points of great interest:

And the results have been devastating. U.S. workers excited to start a new job are instead thrust into bureaucratic limbo as they try to sort out government mistakes. Their new employers hire, then fire them and never tell them why; or worse, they might never be hired in the first place and not know why . . . According to government reports, the program (even after years of work) has a stubbornly high error rate and well-documented problems in attempts to resolve those errors. According to the most conservative numbers, at least 80,000 American workers lost a new job last year because of a mistake in the system. If E-Verify were mandatory, that number would rise to 770,000


and

Ultimately the most brutal irony is that E-Verify doesn’t work. According to government-required audits, 54 percent of those not allowed to work in the U.S. were actually approved by the system.


Even worse:

The government mechanism to fix errors is a Kafkaesque tragedy. There is currently no court remedy to force Immigration and Customs Enforcement to fix an error. Many times those errors are as simple as an incorrect data entry or a name change, but in order to uncover the error, workers have to file letters with different parts of the agency seeking copies of their records


The part that is the most painful about this process is that being denied employment is a cost (especially in the current employment environment). What happens if we refuse a person employment (incorrectly) and their unemployment insurance is about to expire? Worse, if they do not know why they were flagged then it might be years before they find out what went wrong.

This doesn't seem like a good idea at all.

Administration

I was reading Mark's post today on principals and I thought it was interesting to see that Dean Dad has a post from the other side of the fence today as well. Some of the points in the Dean Dad post are extremely insightful. Consider the following question:

I’ll answer the question with another question. Good, strong, solid, peer-reviewed scientific data has made it abundantly clear that poor eating habits lead to obesity and all manner of negative health outcomes. There’s no serious dispute that obesity is a major public health issue in the US. And yet people still overeat. Despite reams of publicity and even Presidential support for good eating and exercise habits, obesity continues to increase. Why?


In other words, reform is hard to do even when you know where you want to go. In cases where the evidence is weak or where budgets are falling, the problem gets even worse. And, of course, right now we are experiencing a fall in most education budgets in the United States.

However, it was interesting how Dean Dad was unable to resist worrying about tenure as a barrier to reform:

There’s also a fundamental issue of control. Faculties as a group are intensely protective of their absolute control of the classroom. Many hold on to the premodern notion of teaching as a craft, to be practiced and judged solely by members the guild. As with the sabermetric revolution in baseball, old habits die hard, even when the evidence against them is clear and compelling. There’s a real fear among many faculty that moving from “because I say so” to “what the numbers say” will reduce their authority, and in a certain sense, that’s true. In my estimation, this is at the root of much of the resentment against outcomes assessment.

Even where there’s a will, sometimes there just isn’t the time. It’s one thing to reinvent your teaching when you have one class or even two; it’s quite another with five. And when so many of your professors divide their time among different employers, even getting folks into the same room for workshops is a logistical challenge.

Of course, accountability matters. Longtime readers know my position on the tenure system, so I won’t beat that horse again, but it’s an uphill battle to sell disruptive change when people have the option of saying ‘no’ without consequence. The enemy isn’t really direct opposition; it’s foot-dragging.


I think that this line of thinking may also be part of why there is such a huge push for reform of teacher job security. Administrators are under enormous pressure to reform the education system and teachers may be very resistant.

Of course, one element that may be left out is that the teachers may be resistant to change for good reasons. When you have been in an organization long enough, you realize that a lot of reform can be about trying "old ideas all over again". These reforms can be both time consuming and ineffective. They may even lower outcomes due to the friction of implementation.

Let us consider a business analogy. One way that corporations have tried to handle bad outcomes is through a series of "re-orgs". These changes in structure have two good properties. One, the people in charge seem to be doing something to address issues by making changes. Two, a series of re-organizations can make it very hard to track a long-term pattern of bad management as units break apart too often for performance to be easily tracked.

The ability of tenured people to resist cosmetic reforms is obviously very frustrating to administrators who have little ability to influence the organization but seemingly unlimited accountability. However, endless re-organizations did not, in the end, help corporations like General Motors. Instead, they may well have accelerated the decline by focusing on making changes that were more cosmetic than effective. So do we really need to import the worst practices of modern corporations into the educational system?

[This post is also relevant -- Mark]

The Principal Problem

One of the many odd things about the education reform debate is how little we hear about holding principals accountable. The subject does come up but it only gets a fraction of the press devoted to plans to punish or fire teachers.

This is particularly strange because, if you're looking for something to explain why a certain school is under-performing, you would obviously start by looking for a common factor, something that could explain why so many classes are bad and so many students are doing poorly. When you take out demographics and social factors, the only candidate left is administration, the people who hire and manage the teachers, who maintain overall campus discipline, who are responsible for how the school runs. Running a school is a tough job, but there are lots of great schools out there, both public and charter, so obviously it is possible to do it well.

I've always felt that every firing represents a failure of the hiring or management process, but if you have your heart set on reform by winnowing, it seems clear that administration should be the first to go. Unfortunately, administrators tend to be survivors (Ever been to a school board meeting? It's hard to avoid the cockfight analogy). Case in point...
U.S. Plan to Replace Principals Hits Snag: Who Will Step In?
By SAM DILLON

COLUMBUS, Ohio — The aggressive $4 billion program begun by the Obama administration in 2009 to radically transform the country’s worst schools included, as its centerpiece, a plan to install new principals to overhaul most of the failing schools.

That policy decision, though, ran into a difficult reality: there simply were not enough qualified principals-in-waiting to take over. Many school superintendents also complained that replacing principals could throw their schools into even more turmoil, hindering nascent turnaround efforts.

As a result, the Department of Education softened the hit-the-road plans for principals of underperforming schools laid out in the program rules. It issued guidelines allowing principals hired as part of local improvement efforts within the last two years to stay on, then interpreted that grandfather clause to mean three years.

Although the program created an expectation that most schools would get new leadership, new data from eight large states show that many principals’ offices in failing schools still bear the same nameplates. About 44 percent of schools receiving federal turnaround money in these states still have the same principals who were leading them last year.
When I mentioned this story to Joseph, he drew a parallel between this and the financial crisis. In both cases we were told that we couldn't get rid of the people who screwed up because we supposedly needed them to fix the problem. It was hard to swallow with AIG; it's even less credible here. But it's what you expect from survivors.

Being politically skilled is valuable to an administrator, as is being media-savvy. This is not a bad thing. These talents can help administrators serve their students and promote their vision (look at Geoffrey Canada), but, as an old superintendent told me when I first started teaching (in somewhat more blunt language), you have to be aware of these talents and be careful when dealing with people like him.

Not surprisingly, superintendents have done pretty well for themselves in the reform movement. They have brought in additional money. They have managed to shift most of the attention from the damage done by bad administrators to the damage done by bad teachers. When colleagues actually were blamed for failing schools, they have frequently managed to shield them from any real consequences.

The past couple of years have even seen the emergence of the superstar school administrator. With the rise of Michelle Rhee and Joel Klein, what was a good gig now has at least the possibility of significant fame and fortune.

Having said all of this, it's important to step back and remind ourselves of some basic truths:

Whether you're talking about administrators or teachers or researchers or reformers, virtually everyone involved with education is there out of a deep concern for the education and general welfare of children;

At the same time, all of these groups will also tend to look out for their own self-interests. There is no contradiction here. We expect the police to have the interests of the public and of the police department in mind. We expect the same of firemen, journalists, the military and many others.

Administrators are very good at this game. There's nothing wrong with that, as long as the people covering the game know the score.

Thursday, February 10, 2011

I originally had a mean headline but my better angels swooped in*

Not only is Diane Ravitch a knee-jerk liberal catspaw of the unions, she's also a right wing extremist:
Admittedly, I have only skimmed the book, but it is not hard to find evidence that Dr. Ravitch has not left all of her highly conservative views behind. She blames the familiar bogeymen of the religious right for many of the problems in American public education, notably constructivism and whole language with the selective citing of easily refuted research. Her naive understanding of learning theory or learner-centered pedagogy is like that of a teacher education student or mom who just returned home from a “Tea Party” rally.
In the education debate, you don't have a left/right divide; you have a Möbius strip.





*R.I.P. Gerry Rafferty

Do good teachers make difficult employees?

A few years ago I did a stint as an instructor at a large state school teaching, among other things, business calculus. The sections for that course tended to be good-sized, usually running from fifty to one fifty. At one time, I probably would have found the experience a bit intimidating but I was just coming off a couple of years as a TA for a professor who routinely taught sections of more than three hundred so I considered myself lucky to be able to make out individual faces.

With few exceptions, experienced teachers are comfortable addressing large groups and, with very few exceptions, effective teachers are comfortable demanding the full attention of those groups. Along with knowledge of the subject, strong communication skills, and commitment to the students, a "when I talk, you listen" attitude is a defining trait of an effective instructor.

That doesn't automatically translate to a room full of kids sitting quietly while the teacher drones on. Often the result is just the opposite. Teachers are more likely to have looser classes with more student participation if they feel in control. As a rule of thumb, you should never be more than ninety seconds away from having every student seated and reading quietly. For really good teachers, even the most adventurous lesson plans fall into that ninety second radius.

Put another way, it comes down to authority. A teacher's job is to teach, counsel and objectively evaluate his or her students. A sense of authority is an essential trait for all these tasks but it's an incredibly annoying one to find in a direct report.

Good principals (and I've met some excellent ones) are masters at the difficult art of managing managers. They can exercise their authority in a way that actually enhances the authority of those under them.

Even with the best administrators, however, there is always an element of tension and it only gets worse with less competent principals and superintendents. This is something to keep in mind when you hear about plans to improve education by giving principals more authority to get rid of bad teachers. Sometimes bad doesn't mean incompetent; it means inconvenient. (I don't have the book in front of me, but Diane Ravitch's Death and Life has some notable examples.)

I really ought to do a post on estimation and developing mathematical intuition one of these days

Till then, check out this decidedly cool offering from MIT's reliably cool Open Courseware (via DeLong):

18.098 / 6.099 Street-Fighting Mathematics

This course teaches the art of guessing results and solving problems without doing a proof or an exact calculation. Techniques include extreme-cases reasoning, dimensional analysis, successive approximation, discretization, generalization, and pictorial analysis. Applications include mental calculation, solid geometry, musical intervals, logarithms, integration, infinite series, solitaire, and differential equations. (No epsilons or deltas are harmed by taking this course.) This course is offered during the Independent Activities Period (IAP), which is a special 4-week term at MIT that runs from the first week of January until the end of the month.

The Principal Effect -- repost

[I'm working on a post about this New York Times story about not holding principals accountable for failing schools so I thought I'd use this to get the discussion started.]

When it comes to education reform, you can't just refer to the elephant in the room. It's pretty much elephants everywhere you look. There is hardly an aspect of the discussion where reformers don't have to ignore some obvious concern or objection.

The elephant of the moment is the effect that principals and other administrators have on the quality of schools. Anyone who has taught K through 12 can attest to the tremendous difference between teaching in a well-run and a badly-run school. In a well-run school, even the most experienced teacher will find it easier to manage classes, cover material, and keep students focused. All of those things help keep test scores up, as does the lower rate of burnout. For new teachers, the difference is even more dramatic.

On top of administrator quality, there is also the question of compatibility. In addition to facing all the normal managerial issues, teacher and principal have to have compatible educational philosophies.

As we've mentioned more than once on this site, educational data is a thicket of confounding and aliasing issues. That thicket is particularly dense when you start looking at teachers and principals and, given the concerns we have about the research measuring the impact of teachers on test scores, I very much doubt we will ever know where the teacher effect stops and the principal effect starts.

In the center, the National Review. On the right, the New Republic

Jim Manzi has an excellent column discussing proposed teacher evaluation metrics from a business perspective, a column that raises some of the same questions that teachers' unions have brought up. There's nothing particularly surprising about that -- Manzi is an intelligent man with a well-known independent streak. He's not going to disagree with a position just because he's a conservative.

Jonathan Chait dismisses Manzi's points with some sweeping generalities, completely ignores his point about fairness to the evaluated and ends up being significantly less sympathetic to the concerns of labor than Manzi. Sadly, this is not surprising either. Chait is one of the most brilliant pundits we have but on the topic of education he combines intense feelings with an apparent lack of knowledge of the important research in the field. This has caused him to embrace certain popular narratives even when they lead him to conclusions that contradict his long-standing values.

But as unsurprising as the parts may be, when you put them together the strangeness of the current education debate just sweeps over you. Formerly right-wing positions like privatizing large numbers of schools or denying unions the right to protect workers from unfair termination are now dogma for much of the left. It has reached the point where when a writer for the National Review suggests, as part of a larger analysis, that teachers can have legitimate concerns about the reliability of the metrics used to evaluate them, the voice of the New Republic dismisses the possibility without even feeling the need to make an argument.

Even without the political role reversal, Chait's response is strange and oddly disengaged. Judge for yourself.

[I'm presenting these out of order for reasons that will be obvious]

1. You need some system for deciding how to compensate teachers. Merit pay may not be perfect, but tenure plus single-track longevity-based pay is really, really imperfect. Manzi doesn't say that better systems for measuring teachers are futile, but he's a little too fatalistic about their potential to improve upon a very badly designed status quo.
Argument by modifier with not one but two 'really's and a 'very' to sell the point. What he doesn't give is any kind of supporting evidence whatsoever. With millions of teachers and a small but thriving industry of think tanks digging up damning anecdotes, you can always find something negative to say, but Chait doesn't even bother coming up with a bad argument.

There's an odd, listless quality to the entire post. Chait is normally an energetic and relentless debater. Here he just goes through the motions. He doesn't even bother to proof his prose (I'm pretty sure he meant to say "the search...is futile"). He also makes a huge jump from the specific techniques Manzi is focusing on to "better systems." I'm pretty sure that Manzi believes better systems can improve the status quo; he just questions how big a role value-added metrics will play in those systems.

As for the case for longevity vs. value-added, I'll let Donald Rubin take it from here:
We do not think that their analyses are estimating causal quantities, except under extreme and unrealistic assumptions.
This is not to say that there isn't a case to be made for merit pay. I don't have any problem with rewarding teachers who do exceptional work, but the methods being discussed here are simply not the way to do it.

Chait's third point runs along similar lines:
3. In general, he's fitting this issue into his "progressives are too optimistic about the potential to rationalize policy" frame. I think that frame is useful -- indeed, of all the conservative perspectives on public policy, it's probably the one liberals should take most seriously. But when you combine the fact that the status quo system is demonstrably terrible, that nobody is trying to devise a formula to control the entire teacher evaluation process, and that nobody is promising the "silver bullet" he assures us doesn't exist, his argument has a bit of a straw man quality.
More argument by adverb and a strange double straw man (straw-straw man? straw straw man man?) continued from the soon-to-be-discussed point 2. The first 'nobody' is doubtful; Chait seems to jump from the fact that no state currently bases evaluations primarily on value-added metrics to the conclusion that no one is even looking into the possibility. The second 'nobody' is just plain wrong; many reform movement followers have so much faith in the silver bullet status of value-added metrics that they have seriously proposed firing more than half of our teachers based on that one number.

But the weirdest part came in point 2.
2. Manzi's description...
evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments.
..implies that quantitative measures are being used as the entire system to evaluate teachers. In fact, no state uses such measures for any more than half of the evaluation. The other half involves subjective human evaluations.
Argument by ellipses. Take a look at the whole paragraph:
Recently, Megan McArdle and Dana Goldstein had a very interesting Bloggingheads discussion that was mostly about teacher evaluations. They referenced some widely discussed attempts to evaluate teacher performance using what is called “value-added.” This is a very hot topic in education right now. Roughly speaking, it refers to evaluating teacher performance by measuring the average change in standardized test scores for the students in a given teacher’s class from the beginning of the year to the end of the year, rather than simply measuring their scores. The rationale is that this is an effective way to adjust for different teachers being confronted with students of differing abilities and environments.
Manzi explicitly says "widely discussed attempts." Now, for the sake of comparison, check out the New York Times' similar wording:

A growing number of school districts have adopted a system called value-added modeling to answer that question, provoking battles from Washington to Los Angeles — with some saying it is an effective method for increasing teacher accountability, and others arguing that it can give an inaccurate picture of teachers’ work.

The system calculates the value teachers add to their students’ achievement, based on changes in test scores from year to year and how the students perform compared with others in their grade.

Manzi was perfectly clear with his wording and used language consistent with the New York Times' coverage. It was only by excerpting his paragraph mid-sentence that Chait was able to get even the suggestion of a distortion.


I have somewhat mixed feelings about Manzi's business-based approach. There are certain aspects of education that are, if not unique, then at least highly unusual and you have to be careful when drawing analogies (obviously the subject for another, much longer post). That said, all of his points about the way evaluations work are valid and useful.

This is not a bad place to start the debate.


[You can read Jim Manzi's somewhat bewildered reaction to Chait's column here.]

On the off chance that you ever wondered what "The Love Song of J Alfred Prufrock" would sound like if written by Rudyard Kipling...

Today would, amazingly enough, seem to be your lucky day.

Wednesday, February 9, 2011

Jim Manzi has some smart things to say about teacher evaluations

From the National Review (via Chait, but more on that later)
This seems like a broadly sensible idea as far as it goes, but consider that the real formula for calculating such a score in a typical teacher value-added evaluation system is not “Average math + reading score at end of year – average math reading score at beginning of year,” but rather a very involved regression equation. What this reflects is real complexity, which has a number of sources. First, at the most basic level, teaching is an inherently complex activity. Second, differences between students are not unvarying across time and subject matter. How do we know that Johnny, who was 20 percent better at learning math than Betty in 3rd grade is not relatively more or less advantaged in learning reading in fourth grade? Third, an individual person-year of classroom education is executed as part of a collective enterprise with shared contributions. Teacher X had special needs assistant 1 work with her class, and teacher Y had special needs assistant 2 working with his class — how do we disentangle the effects of the teacher versus the special ed assistant? Fourth, teaching has effects that continue beyond that school year. For example, how do we know if teacher X got a great gain in scores for students in third grade by using techniques that made them less prepared for fourth grade, or vice versa for teacher Y? The argument behind complicated evaluation scoring systems is that they untangle this complexity sufficiently to measure teacher performance with imperfect but tolerable accuracy.

Any successful company that I have ever seen employs some kind of a serious system for evaluating and rewarding / punishing employee performance. But if we think of teaching in these terms — as a job like many others, rather than some sui generis activity — then I think that the hopes put forward for such a system by its advocates are somewhat overblown.

There are some job categories that have a set of characteristics that lend themselves to these kinds of quantitative “value added” evaluations. Typically, they have hundreds or thousands of employees in a common job classification operating in separated local environments without moment-to-moment supervision; the differences in these environments make simple output comparisons unfair; the job is reasonably complex; and, often the performance of any one person will have some indirect, but material, influence on the performance of others over time. Think of trying to manage an industrial sales force of 2,000 salespeople, or the store managers for a chain of 1,000 retail outlets. There is a natural tendency in such situations for analytical headquarters types to say “Look, we need some way to measure performance in each store / territory / office, so let’s build a model that adjusts for inherent differences, and then do evaluations on these adjusted scores.”

I’ve seen a number of such analytically-driven evaluation efforts up close. They usually fail. By far the most common result that I have seen is that operational managers muscle through use of this tool in the first year of evaluations, and then give up on it by year two in the face of open revolt by the evaluated employees. This revolt is based partially on veiled self-interest (no matter what they say in response to surveys, most people resist being held objectively accountable for results), but is also partially based on the inability of the system designers to meet the legitimate challenges raised by the employees.

I found the point about techniques that hurt future performance particularly good. When I was teaching, how well a class would go was greatly influenced by how well previous teachers had done their jobs. Did the students understand the foundations? Did they have a good attitude to the material? Good work habits and study strategies?

Teachers want reliable evaluations not just because they want to be rewarded for good work but also because they want to see incompetent teachers identified so that those teachers can be encouraged to do better, given training to improve their performance or, should the first two fail, fired. What they object to is having their fates rest on a glorified roll of the dice.
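
To make the contrast in Manzi's first paragraph concrete, here is a minimal sketch of the naive gain score that he says is not what real systems use: average end-of-year score minus average start-of-year score. The function name and the scores are made up for illustration. Actual value-added systems use an involved regression with many adjustments, and this does not attempt to reproduce one.

```python
from statistics import mean


def naive_gain_score(start_scores, end_scores):
    """Average end-of-year score minus average start-of-year score.

    Hypothetical helper for illustration only; this is the simple formula
    Manzi says real value-added systems replace with a regression.
    """
    return mean(end_scores) - mean(start_scores)


# Made-up scores for one small class (same students, start vs. end of year).
start = [60, 70, 80]
end = [70, 80, 90]

gain = naive_gain_score(start, end)
print(f"Naive gain score: {gain}")
```

Nothing in this number adjusts for which students a teacher had, who else worked with the class, or what happens the following year, which is exactly the complexity the quoted passage describes.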

Michael Hiltzik on the Texas Miracle

Lots of good stuff in this comparison of the surprisingly similar fiscal woes of my native and adopted states. In particular, the following passage caught my eye:

Curiously, Texas' reputation as a low-tax, business-friendly state survives although its state and local business levies exceed California's as a percentage of each state's business activity (4.9% versus 4.7% in 2009, according to a report by the accounting firm Ernst & Young). What's different is that Texas business taxation relies more on property, sales and excise taxes and government fees than California, which relies on taxing corporate income.

Of course, one reason many business owners and executives favor Texas over California is that the Lone Star State doesn't have a personal income tax — a big deal when you're pulling in a Texas-size paycheck.

But self-interest aside, what's at stake from fiscal policy in both states is the same — the services and programs that really matter to business owners, such as functioning schools, high-caliber universities and serviceable transport infrastructure.

Even more important are the measures that point to public well-being. In many categories, California and Texas are closer together than either state's residents would probably find comforting.

But here are a few where they're not: Texas ranks 49th in the nation (that is, third worst) in teen births; California 22nd. In providing prenatal care to expectant mothers, Texas is dead last, California eighth. Texas ranks 34th in median family income, with $47,143; California 13th, at $56,852. This is the harvest of its "superior policies," and given the current budget crisis, it's bound to get worse. Miraculous.

What do you do when things are tight?

I was reading two different pieces today and I thought there was a really interesting link between them.

From Dana Goldstein:

While we're on the subject of Wisconsin, I find Scott Walker sort of terrifyingly simple-minded but charismatic. His education platform is basically Race to the Top plus vouchers while somehow massively cutting education budgets. (Huh?)


From Mark Thoma:

Local school districts have cut 154,000 education jobs since August 2008.


So my question is this: why is the push for excellence being connected with schemes to reduce manpower costs? If the argument is that education is a key priority then why are we not increasing funding for education? Instead we have the odd situation where the state wants education to improve while cutting expenses.

Usually when this contradiction shows up, the government is seeking cover for the decision to cut services. If cutting expenses also results in better outcomes then we are all better off, right? Or it could be an attempt to remove the more senior (and thus higher paid) teachers to minimize the impact of budgetary decisions that have already been made. But that is a different conversation, isn't it?

Now consider another area that the state runs that is in a similar position, namely the military. Is anybody seriously arguing that some soldiers do not pull their weight? That we could be more effective with a smaller force? After all, wasn't there a movie (Rambo, for example) where a single heroic special forces soldier was more effective than a brigade? But if the administration began talking about waste and cost effectiveness then you would be certain what they really wanted was cover for cuts. Now imagine they talked about those lazy soldiers who re-enlisted or who were only interested in rewards? Who needs a veterans administration when soldiers are fighting for principle and principle alone?

Would such cuts make sense? Either for the military or for education it is a matter of opportunity cost. But maybe the best conversation to have is one about the trade-off between the options. Taxes hurt economic growth but a lack of education or defense can each lead to fairly bad long term outcomes. I am not sure where the balance is but I'd prefer to have the conversation openly. Pretending that test scores plus cuts will somehow improve education seems odd.

More efficient models of defense and education may both exist, but then the optimal path seems to be to show the efficiencies first and implement the cuts second.

New rule

Anyone who reprints one of Piet Hein's Grooks gets an automatic link.

An experiment in blogging -- the conclusion [reposted]

[I'm about finished with a longer post that refers to this topic so I decided to do a repost. I apologize to long time readers.]

When assessing a statement, sometimes it's useful to rephrase it in a more general way and see how well it holds up. I tried that with a passage I found in a popular blog (one of the very few I read every day). Where the author had referred to members of a specific profession I substituted in the word 'employees' except when talking about unions ('employees unions' seemed redundant). I also changed a couple of words for consistency, but other than that the passage was exactly the same.

The resulting paragraph (seen below) was much more extreme than I had expected and it got me thinking: how would people react to this passage if they encountered it without all the baggage? I decided to post the generalized version with a brief explanatory note, then give people a couple of days to think about it before filling in the details.

Here's the generalized passage:
If you concede that employers need to be able to fire bad employees, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help society. But most unions demonstrably make it very difficult to fire bad employees. That is currently a core function of unions, and something that must change. You're also going to need higher salaries to attract a better caliber employee into the workforce, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.
And here is the passage Jonathan Chait (that's right, Jonathan Chait) originally posted in his blog:
If you concede that principals need to be able to fire bad teachers, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help education. But most teachers unions demonstrably make it very difficult to fire bad teachers. That is currently a core function of teachers unions, and something that must change. You're also going to need higher salaries to attract a better caliber teacher into the profession, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.
There are obviously two possible responses Chait could make here (three if you count ignoring it entirely). He could say he agrees with the general statement or he could argue that teachers are a special case and should be granted less union protection than, say, policemen.*

Ironically, the more defensible position Chait can take here is the extreme one, namely that unions should not do anything to discourage employers from firing their members. It's not a position that most readers of the New Republic would embrace but, as a statement of personal belief, it is extraordinarily difficult to rebut.

If he tries to explain why teachers constitute a special case, he will have to deal with the data and in this particular debate, the numbers are not his friends. (It's worth remembering that Diane Ravitch started out on Chait's side. Her road to Damascus came when she realized she could no longer reconcile those views with what she was seeing in the research findings.)

Jonathan Chait can be a formidable debater but he has shown himself to be largely ignorant of the research behind these issues (no one at TNR even knew enough about PISA to catch the bait and switch in the intro to Waiting for Superman and in the education debate that's about as slow as the pitches get).

He'll be trying to punch holes in the findings of institutions like EPI and Rand and big guns in the field like Donald Rubin. He'll have to show precipitous educational decline without resorting to the aforementioned PISA (good test but absolutely meaningless in this context). He'll have to explain why schools that use his policies are more likely to underperform than to outperform unionized schools. He'll have to justify firing people based on metrics so volatile that a third of teachers in the top 20% could find themselves in firing range the next year, metrics based on data so confounded that "students’ fifth grade teachers were good predictors of their fourth grade test scores."

This is one time the smart money is on the other guys.



* Yes, we fire policemen. What we don't do is fire policemen based on unreliable metrics that are largely outside of the officers' control and are easily manipulated by their superiors.

An experiment in blogging -- reposted

[I'm about finished with a longer post that refers to this topic so I decided to do a repost. I apologize to long time readers.]

This will just take a minute of your time.

What follows is a passage from a popular blog, rewritten slightly to make it more general but otherwise unchanged. I'll post the original quote with some comments Monday or Tuesday. [later today for the repost -- Mark]

I'd appreciate it if you would take a look at this and give some thought both to the arguments proposed and to the larger belief system they suggest, then come back in a few days and see what effect learning the context has had on your initial impressions.

Thanks.

If you concede that employers need to be able to fire bad employees, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help society. But most unions demonstrably make it very difficult to fire bad employees. That is currently a core function of unions, and something that must change. You're also going to need higher salaries to attract a better caliber employee into the workforce, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

I welcome comments but please don't include the source of the passage. Obviously that would undercut the point of the experiment.

Tuesday, February 8, 2011

Catching up with Dana Goldstein

I'm going to try to comment on each of these individually later. Feel free to beat me to the punch.



How Politically Astute is Michelle Rhee?



The Revival of the Private School Voucher Movement



Wisconsin Teachers' Union Prepares for Battle with GOP Gov



If I'm going to compare H-1Bs and serfdom, I should at least find a British guy for the video

[Embedded video: The Daily Show With Jon Stewart, "Olivers on the Strike" (www.thedailyshow.com)]

"If you don't do well in school, your descendants could grow up to be Morlocks"

OK, maybe it's not that bad, but Berkeley professor Claude Fischer still paints a grim picture. (via Thoma, of course.)

Degree inequality

It is now generally understood that economic inequality has expanded greatly since about 1970. (Well, there are exceptions. For a couple of decades, some commentators denied that economic inequality was growing, claiming that it was all a statistical illusion. A few holdouts against reality may remain.) Now the debate has shifted to what – if anything at all – should be done about inequality.

Most of that discussion has been about income inequality. Between 1979 and 2007, the one-fifth of American households with the highest income experienced a roughly 100% increase in their annual, inflation-adjusted, after-tax income (280% [!] for the highest one percent of households); the middle one-fifth got about 25% more income; and the poorest one-fifth got about 15% more (see pdf). For wealth – property, stocks, and the like – the gap is enormously greater and has also widened over the last few decades.

Less discussed is the widening college degree gap. Yet its implications go considerably beyond money, to widening differences in life experiences and ways of life. (I draw in particular on the work of my colleague, Michael Hout, notably here [pdf], and on two books we wrote together, here and here.)

Fischer follows this with a number of troubling statistics. I found this part particularly striking:
Even more is happening along the education gap: Increasingly, college graduates marry college graduates and live among college graduates. Increasingly, Americans group by education and their ways of life diverge by education.

Although the trends are complex (see here), Americans today are likelier to marry people of the same educational level as themselves than was true decades ago. Some of this development results from educated men increasingly marrying educated women; for example, the lawyer who married his secretary is now a lawyer who marries another lawyer. And some of this change is due to poorly-educated men becoming ineligible as spouses; drop-outs can no longer support families on brawn alone.

Then there is residential separation: A study by Thurston Domina (pdf) shows that college graduates are concentrating in some metropolitan areas (San Francisco and Raleigh-Durham, for example) and seem to be avoiding others (Indianapolis and Las Vegas, for example) and also that neighborhood segregation by college education grew substantially between 1970 and 2000. It grew faster than segregation by income, even as segregation by race declined. Another study documents how the highly-educated are concentrating in the downtowns of the most booming cities. And a recent story reported that these degree-holders are starting to raise their children in center cities — even in Manhattan. Thus, enclaves of the highly-educated are growing in chic, gentrified, non-smoking neighborhoods, while the less educated move to the scraggly, sprawling suburbs of stagnating cities.

What is less clear, although certainly plausible, is that this widening separation carries along with its economic and social divisions, a widening gap in values and ways of life: two different Americas, divided by educational attainment.
I would find the use of just deserts as a justification for policy more palatable if we weren't seeing an alarming decline in social mobility.

Krugman does stat 101

Nothing fancy, no big insights, just a couple of bell curves, but this post by Paul Krugman does a nice job presenting weather trends and extreme events in terms of probability. He makes it simple but not overly simplified. We need more of this.
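For readers who want to see the bell-curve point in numbers, here is a minimal sketch. It is not taken from Krugman's post; the threshold (two standard deviations above the old mean) and the size of the shift (half a standard deviation) are assumptions chosen only for illustration, and `tail_probability` is a hypothetical helper name.

```python
import math

def tail_probability(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) when X is normal with the given mean and sd."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumed numbers: an "extreme" event sits 2 sd above the old mean,
# and the whole curve then shifts up by half an sd.
before = tail_probability(2.0, mean=0.0)
after = tail_probability(2.0, mean=0.5)

print(f"before: {before:.4f}  after: {after:.4f}  ratio: {after / before:.1f}")
```

Under these assumptions a modest shift in the average makes the extreme event several times more likely, which is the general shape of the argument.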

Krugman also deserves bonus points for describing economists' practice of putting the independent variable on the wrong axis as a QWERTY problem.

Semi-serfdom

What are we going to do with Paul Krugman? He writes clear and insightful articles on complex economic issues, then he gives them titles like "Serf's Up" (circa 2003):
Here's the puzzle. In Europe circa 1100, with population scarce, serfdom was useful to the ruling class. By 1300 it wasn't, and had been allowed to drift away. But after 1348 it should have been worthwhile again. Yet it wasn't effectively reimposed. There were attempts to restrain wages and limit labor mobility, as well as attempts to tax the peasants (Wat Tyler's rebellion fits into all this.) But all-out feudalism didn't return. Why?

And an even bigger question: why hasn't indentured servitude made a comeback in the modern era? Yes, I know, human rights and all that - but if it was profitable to have indentured servants in the modern world, I'm sure that Richard Scaife's think tanks would have no trouble finding justifications, and assorted Christian groups would explain why it's God's will.
Though the analogy's not perfect, we do have an institution that restrains wages and limits labor mobility. It's called an H-1B visa and though I know of no Christian groups trying to explain why it's God's will, it certainly has plenty of think tanks finding justifications for it.

It has reached the point where not only do sitcoms portray Ponzi schemes...

They also use the term without explanation in their blurbs.

Monday, February 7, 2011

There might just be a lesson here somewhere.

Felix Salmon has an insightful and amusingly written take on AOL's acquisition of Huffington Post:
As for HuffPo, it gets lots of money, great tech content from Engadget and TechCrunch, hugely valuable video-production abilities, a local infrastructure in Patch, lots of money, a public stock-market listing with which to make fill-in acquisitions and incentivize employees with options, a massive leg up in terms of reaching the older and more conservative Web 1.0 audience and did I mention the lots of money?
I used to work for Earthlink. It was a great company, customer-centered, innovative, but it was trapped between AOL and Microsoft's Online, two companies with immeasurably deep pockets and no apparent concern about turning a profit.

AOL in particular was a mystery to us. It went from one bad business plan to another, all while maintaining spectacularly bad customer service. None of it mattered in the face of hundreds of millions of promotional disks.*







* According to former Chief Marketing Officer at AOL, Jan Brandt, "At one point, 50% of the CD's produced worldwide had an AOL logo on it."

Well-mannered spam

A few days ago I noticed a brief comment to one of my posts. It was from someone named Lily and all it said was "Thanks." I spent a minute wondering if the comment was meant to be sincere or sarcastic but, other than that, I didn't give it much thought.

Then yesterday Lily was back with another comment, or, more accurately, the same comment, this time on another post. There was no obvious common thread to the two posts, no apparent reason that anyone would single them out. All of this made me, in equal parts, suspicious and curious, so I clicked on the profile where I found a picture of a very attractive young Asian woman and a long list of blogs that, according to Google Translate, offer a wide range of questionable services.

Profile-based spam is a new one on me and I wonder how the filters are going to handle it. On the bright side, though, at least it's polite.

More on poverty and education

From Aleks Jakulin:

"I do know that I get anxious whenever a correlation analysis tries to look like a causal analysis. A frequent scenario introduces an outcome (test performance) with a highly correlated predictor (say poverty), and suggests that reducing poverty will improve the outcome. The problem is that poverty is correlated with a number of other predictors."

I think that the dense correlation is precisely the point that makes inference challenging here. I like to point out California as being a good example of a puzzling outlier in the original study. There is no a priori reason to expect that California would do as poorly as it does relative to other states. But different groups have different explanations.

One hypothesis is that "teacher tenure" leaves weak performers in the classroom for decades. This seems to be the position of people like Michelle Rhee.

Another hypothesis is low educational spending in the state. According to Wikipedia, "In teaching staff expenditure per pupil, California ranked 49th of 51".

A third hypothesis is that the composition of people in California just leads to lower educational performance (due to innate differences with, for example, the population of New York state).

I suspect the second hypothesis is the most likely, but given how inter-related things are, it is unclear how you would do inference here. And some forms of the first hypothesis would lead to policies that could make things worse if the second were true. The third would suggest no policy is likely to make an important difference.

Coming up with clever ways to distinguish between these hypotheses is not trivial.
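To make the worry about correlated predictors concrete, here is a toy simulation. It is a sketch under loud assumptions, not an analysis of California or of the original study: the data are synthetic, test scores are constructed to depend only on spending, and poverty is constructed to be correlated with spending. The variable names and coefficients are all hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def residuals(ys, xs):
    """What is left of ys after a simple least-squares fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumed world: spending drives scores; poverty merely tracks spending.
spending = [rng.gauss(0, 1) for _ in range(n)]
poverty = [-s + rng.gauss(0, 1) for s in spending]
scores = [2 * s + rng.gauss(0, 1) for s in spending]

raw = pearson(poverty, scores)
adjusted = pearson(residuals(poverty, spending), residuals(scores, spending))

print(f"poverty vs scores, raw: {raw:.2f}")
print(f"poverty vs scores, holding spending fixed: {adjusted:.2f}")
```

In this made-up world poverty looks like a strong predictor of scores until spending is accounted for, at which point the relationship largely disappears. Real data are harder because we do not get to know in advance which variable is doing the work, and the predictors are rarely this cleanly separable.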

Speaking of epicycles...

James Kwak has a beautiful stone-by-stone dismantling of this paper by Bryan Caplan and Scott Beaulier.
The paper argues that welfare programs expand the set of choices available to people; while that is all good according to traditional economics, if we think that people are inclined to make bad choices (“behavioral economics”), then welfare programs give people more opportunity to make bad choices and hurt themselves. This is particularly a problem because, they claim, “there are good empirical reasons to think that behavioral economics better describes the poor than it does the rest of the population” (p. 4). In other words, if poor people are more irrational, then giving them more choices will hurt them more than other people.
Kwak takes it all apart, from the odd and strangely uninformed take on behavioral economics to the lack of supporting data to the ready supply of superior alternative hypotheses to the generally poor quality of the authors' reasoning. If this zombie claws its way out of the grave after Kwak is through with it then there's just no killing the damned thing.

One thing that is obvious from this paper is that Caplan and Beaulier have reached the epicycle stage.* They are no longer focused on finding the best model to fit the data; instead they are trying to salvage an intellectual framework that is based on elegant principles and appealing ideas like just deserts and efficient markets.

The shift from serving the data to serving the theory is easy to make and difficult to catch. Every new theory occasionally runs into phenomena it has trouble explaining. Explaining them can be a good thing. The process of taking a theory into the counter-intuitive is an important, even necessary step in its evolution.

There is, however, one condition: the explanation you finally come up with has to be as good as or better than any of the alternative hypotheses. Otherwise, the theory isn't evolving; it's decaying.



* The Wikipedia article on the subject suggests I'm being unfair to Ptolemy. I hope not -- I'd hate to have to come up with a new analogy.