West Coast Stat Views (on Observational Epidemiology and more): August 2010

Tuesday, August 31, 2010

Measurement part 2

On a recent evaluation, her principal, Oliver Ramirez, checked off all the appropriate boxes, Tan said — then noted that she had been late to pick up her students from recess three times.

“I threw it away because I got upset,” Tan said. “Why don’t you focus on my teaching?! Why don’t you focus on where my students are?”

Matt argues that proponents of teacher effectiveness are misunderstanding their critics:

The idea has gotten out there that proponents of measuring and rewarding high-quality teaching are somehow engaged in “teacher-bashing.” I think that’s one part bad faith on the part of our antagonists, one part misunderstanding on the part of people who don’t follow the issue closely, and at least one part our own fault for focusing too much on the negative.

But I think his own example shows why skepticism persists. It's easy to measure the wrong things, incentivize the wrong behavior, and do a fair amount of damage to a system. What I would really like to see is an argument for incremental change and experimentation rather than radical reform driven by standardized tests. Or, if we must use some sort of standardized test approach, I’d like to have some better evidence that these tests are designed to measure teacher effectiveness and do not omit important elements. For example, I think clear and interesting writing is hard to do (as readers of this blog may notice when they try to follow my words) and it is very hard to score objectively. Multiple-choice questions on word definitions are much easier to score but may not measure the most important skills we want to teach.

Certainly something to ponder.

Are you measuring the right thing?

Statistics is a fantastic tool and capable of creating enormous advances. However, it remains the slave of the data that you have. The worst case scenario is when the "objective metric" is actually measuring the wrong thing. Consider customer service in Modern America:

Modern businesses do best at improving their performance when they can use scalable technologies that increase efficiency and drive down cost. But customer service isn’t scalable in the same way; it tends to require lots of time and one-on-one attention. Even when businesses try to improve service, they often fail. They carefully monitor call centers to see how long calls last, how long workers are sitting at their desks, and so on. But none of this has much to do with actually helping customers, so companies end up thinking that their efforts are adding up to a much better job than they really do. In a recent survey of more than three hundred big companies a few years ago, eighty per cent described themselves as delivering “superior” service, but consumers put that figure at just eight per cent.

Here, the core issue seems to be that measuring efficiency at delivering customer service is not the same thing as having good outcomes. Having done statistics for a call center, I can assure you that they are obsessive about everything that can be measured. But satisfaction is a hard thing to measure and it most assuredly matters.

This analogy is why I am concerned with the use of standardized tests for measuring educational achievement. It is possible that they are capturing only part of a complex process and that the result of focusing on them could be fairly poor. After all, companies have tried to deliver exceptional customer service via call center for a couple of decades now and the results do not appear to be a uniform consensus that customer service is a delightful experience.

It is not that these processes can't be evaluated. It's just that the success of education or customer service may depend on things that are hard to measure. If we only measure those features that are easy to measure we may end up wondering why education is in decline despite a steady improvement in the key metrics we use to evaluate it.

Joel Klein's Record

From Mark Gimein (via Felix Salmon), here's a well-timed story from New York Magazine:

New York City public-school kids may be dreading the end of summer, but schools chancellor Joel Klein is the one who’ll really be tested when classes begin again. Last spring, Klein was bragging about the extraordinary upswing in scores during his tenure: a 31-point rise in the percentage of students who passed state reading tests, a 41-point increase in math. That was before state authorities admitted that they’d been progressively more lenient in scoring the tests, and decided to grade more strictly.

The new stringency resulted in the elimination of most of the miraculous gains of the Bloomberg years, and an administration that had lived by the numbers is getting clobbered by them. Klein told parents that the state “now holds students to a considerably higher bar.” This would make sense only if the state hadn’t previously been lowering that bar.

Last year, NYU professor and Klein antagonist Diane Ravitch said exactly that in a Times op-ed, an assertion that Klein claimed was “without evidence.” But the fact that New York students’ scores on the National Assessment of Educational Progress had moved only marginally, even as state scores skyrocketed, was manifest then and is inescapable now.

As discussed here, we've seen Klein omitting relevant statistics before.

Monday, August 30, 2010

Sentences to ponder

We always talk about a model being "useful" but the concept is hard to quantify.

-- Andrew Gelman

This really does match my experience. We talk about the idea that "all models are wrong but some models are useful" all of the time in Epidemiology. But it's rather tricky to actually define this quantity of "useful" in a rigorous way.

Is Ray Fisman one of the best and the brightest?

Based on some of the feedback to my past few posts on Ray Fisman's recent Slate article, there's a point I should probably make explicit: given all available evidence including reliable first-hand accounts, Ray Fisman is an accomplished researcher and a good guy. Furthermore, I am working under the assumption that, like most people in the reform movement, Dr. Fisman is motivated by a deep concern about the state of education and a genuine desire to improve it. (I have also found this a safe assumption when dealing with the vast majority of the teachers Dr. Fisman would fire.)

I have singled out Dr. Fisman not because "Clean Out Your Desk" was exceptionally bad but because it was exceptionally representative. If this were an anomaly written by someone who was stupid or incompetent or had a grudge against teachers, it wouldn't be worth anyone's time to discuss, let alone exhaustively rebut it. This is something more disturbing.

David Warsh has drawn a relevant analogy between the reform movement and the run-up to Vietnam:

Remember the recipe for a policy disaster? Start with a handful of policy intellectuals confronting a stubborn problem, in love with a Big Idea. Fold in a bunch of ambitious Ivy League kids who don’t speak the local language. Churn up enthusiasm for the program in the gullible national press – and get ready for a decade of really bad news. Take a look at David Halberstam’s Vietnam classic The Best and the Brightest, if you need to refresh your memory. Or just think back on the run-up to the war in Iraq.

The education reform movement is filled with smart, well-intentioned people like Ray Fisman. Under the circumstances, that doesn't provide much comfort.

Zero Tolerance

A timely post from Matt Yglesias:

The commonplace scenario in the United States when people decide to “get tough” and implement a policy of “zero tolerance” for infractions of the rules is to in practice tolerate the majority of infractions by not catching perpetrators and then hit a minority of violators with extremely harsh sanctions. For years now, Mark Kleiman has been pushing the reverse approach—make sanctions relative mild, but make them swift and nearly certain.

The results were compelling:

Now the results are in: drunk–driving fatalities fell from twice the national average, 70, in 2006 to just 34 in 2008, the most recent year for which data are available

It is a key element of public health policy to try to find ways to handle behaviors that involve both a health issue (like addiction to alcohol) and a negative externality (like hitting people with cars). It is really interesting to see research being done on which approaches are actually the most effective. This type of research is important stuff and has some pretty interesting ramifications for improving public health in a wide range of circumstances.

Hazards of sweeping generalizations

Commenter Nat makes a good point about sensitivity versus specificity:

The other reason for having 100% sensitive tests at the cost of specificity is because of the clinical tradeoffs that occur because you have done that test.

If, for example, the treatment subsequent to a test is vitamin supplementation which should have next to zero complications then 100% sensitivity is the face of nasty complications caused by non-treatment makes quite a lot of sense.

In some of the areas in which I work, like pain, we are denied these elegant trade-offs. However, I also do work in coagulation and there are good examples of this type of trade-off there. For example, despite the limited evidence of clinical utility, it can make sense for people with Homocysteine and MTHFR mutations to take B vitamins. Similarly, a few false positives have very limited impact on the patients involved, as the risk of taking a B-vitamin supplement (in the first world where economic hardship is unlikely) is small.

So this is a good reminder that there are no sweeping generalizations in epidemiology.

Saturday, August 28, 2010

Megan's List

Blogger Megan McArdle has a list of things where her best guesses turned out to be incorrect in the face of evidence. We all have these cases and it is good to look back and see where foresight failed us. There is nothing like data to help us revise our internal prediction algorithms.

This point, though, was chilling:

I believed that over reasonably long time-frames, modest investments in equities would allow you to retire in comfort.

I am mildly interested (as a hobbyist) in personal investment. But I think the implications of this are far broader than one sentence really captures. It makes the whole idea of shifting Social Security (as an insurance program against poverty in old age) into personal accounts somehow less appealing. It also says very interesting things about the rate of retirement for those of us who started our careers late (due to mid-career changes). The more I think about this issue, the more profound its implications seem for the way our work force will evolve.

But, sadly, I also think that this point is the best reading of the current stagnation of equities. Even if the long run turns out better, it is hard to have the level of confidence that I once did.

Friday, August 27, 2010

Trade-offs between Type I and Type II error

I was reading this blog post by Andrew Gelman on a test that is 100% accurate for Alzheimer's disease. Following up on the initial post, it appears that the test has only 64% specificity.
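To see why 64% specificity matters, here is a rough sketch of the share of positive results that would be true cases. Only the specificity comes from the post; the 100% sensitivity and the 10% prevalence are my own assumptions for illustration, and the function name is made up.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Specificity is from the post; sensitivity and prevalence are assumed values.
ppv = positive_predictive_value(sensitivity=1.0, specificity=0.64, prevalence=0.10)
print(f"Share of positive results that are true cases: {ppv:.1%}")
```

Under these assumed inputs, most positive results would be false positives, even though no true case is missed. A different prevalence would change the number considerably.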

But the feature of this discussion that I find the most interesting is that the decision to choose between sensitivity and specificity is a judgment that people seem to be very poor at. Consider pain medication. If you want to make sure that everyone in serious pain gets appropriate pain control, then some fraudsters will get illicit narcotics. Alternatively, if you make the screen so tight that nobody is able to obtain narcotics via "fake pain," then some real cases will be undertreated.

We see the same thing with releasing people from prison. Even if former prisoners only committed crimes at the rate of the general population, at least some crimes could be prevented by a tougher release policy. Of course, this line of reasoning leads to absurd conclusions -- we could completely eliminate adult crime by jailing everyone for life on their eighteenth birthday. We see the same thing in the Ray Fisman argument for getting rid of 80% of teachers during probation – it is so important not to make a mistake and keep an inferior teacher that we should fail to hire many good teachers just to make sure we have no sub-standard ones.

But people don't seem to like to make these trade-offs. In the case of the test for Alzheimer's disease, the authors could have been a lot more specific if they were willing to give up sensitivity. But, for some reason, people seem to prefer to end up at one extreme of a scale rather than the middle (where the value of the test is maximized).
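The trade-off can be sketched with a toy example. The scores and labels below are invented for illustration (they are not from the Alzheimer's study), and the helper name is hypothetical; the point is only that moving one cutoff raises one quantity at the expense of the other.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call score >= threshold positive; return (sensitivity, specificity)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical test scores: label 1 = has the disease, 0 = does not.
scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.4, 0.65, 0.85):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With this made-up data, the lowest cutoff catches every case but flags half of the healthy people, while the highest cutoff flags no healthy people but misses most cases. The middle cutoff gives up a little sensitivity for a large gain in specificity.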

It's a phenomenon that I wish I understood better.

Thursday, August 26, 2010

Badly needed break from Fisman

With two brilliant clips from the Daily Show

The Daily Show With Jon Stewart: Extremist Makeover - Homeland Edition

The Daily Show With Jon Stewart: The Parent Company Trap

US Mobility

In a nice article, the mobility myth, it is pointed out that only 2.7% of Americans change states each year. My experience is different from that, but that is likely because I am in academia. Still, I am definitely aware of the risks that changing states brings (as you can never know in advance if a state will work out or not).

On the research side, however, this fact is good news for database research as the "lost to migration" rate is low enough to make it unlikely that we will get serious bias in state-level medical claims data. That is really useful to know when evaluating Medicaid studies.
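As a back-of-the-envelope sketch, here is how a 2.7% annual rate would compound over the length of a follow-up study. It assumes a constant rate and that each year's moves are independent, which is my simplification and not something the article claims; if the same people tend to move repeatedly, the true loss would be smaller. The function name is hypothetical.

```python
def fraction_remaining(annual_move_rate, years):
    """Share of a cohort still in the state after `years`, assuming a
    constant, independent annual probability of moving out of state."""
    return (1 - annual_move_rate) ** years

# 2.7% is the annual interstate move rate quoted from the article.
for years in (1, 5, 10):
    share = fraction_remaining(0.027, years)
    print(f"After {years:2d} years: {share:.1%} of the cohort remains in state")
```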

Genetic Epidemiology

John Cook has a post on predicting height using genes. He quotes:

A 2009 study came up with a technique for predicting the height of a person based on looking at the 54 genes found to be correlated with height in 5,748 people — and discovered the results were one-tenth as accurate as the 125–year-old technique of averaging the heights of both parents and adjusting for sex.

I suspect that this issue is the central one facing genetic epidemiology. While it is possible that the approach of averaging the height of the parents includes some environmental information, it is a pretty strong comment on the predictive power of genes if that is the actual answer.

More likely, I think, is the idea that complex and important characteristics are due to many, many genes (all of which have a modest influence). This makes sense from a selection point of view (characteristics like height need to be stable) but makes the project of prediction using genes extraordinarily complicated. I don't know if there is a simple answer or not but it definitely provides some challenges for the paradigm of the classic epidemiological study.

Wednesday, August 25, 2010

Mystery (Education Question)

One thing that amazes me in the education debate is that people of all political stripes seem to agree that education is in a crisis. Consider Jonathan Chait (who I think is clearly a liberal), who seems to agree that teacher firings make sense. Yet, as Mark notes, America leads in elementary education.

So why are the two so often conflated?

It could be the "big lie" where a falsehood is said so often that the other side starts to believe it. But people are usually more sophisticated than that.

Another possibility is that we have lost perspective on the alternatives. We worry about reluctance to fire teachers but forget that private alternatives are not inexpensive. From Marginal Revolution:

A New York City charter school set to open in 2009 in Washington Heights will test one of the most fundamental questions in education: Whether significantly higher pay for teachers is the key to improving schools.

The school, which will run from fifth to eighth grades, is promising to pay teachers $125,000, plus a potential bonus based on schoolwide performance. That is nearly twice as much as the average New York City public school teacher earns, roughly two and a half times the national average teacher salary and higher than the base salary of all but the most senior teachers in the most generous districts nationwide.

However, this still doesn't explain the odd consensus of left and right as it seems improbable that many people are fighting for a serious increase in education costs.

Most likely, I suspect, is the current American focus on short-term results. When we do annual ratings of employees, we do not consider issues of long-term dedication -- we know people are simply going to move on anyway. Consider this report:

Among jobs started by workers when they were ages 38 to 42, 31 percent ended in less than a year, and 65 percent ended in fewer than 5 years.

Is it possible that, with 65% of middle-aged workers holding a job for less than 5 years, we have simply lost the sense of how to build long-term loyalty and dedication?

The Old Shell Game -- Why you have to keep your eye on Ray Fisman (and no, we're not quite through with the second paragraph)

Ray Fisman's Slate article "Clean Out Your Desk" is something of a greatest hits of the education reform movement, covering most of the standard arguments and carefully selected statistics that are brought up again and again by the advocates. It is, therefore, fitting that Fisman gets no further than the second paragraph before getting to this:

New York City Schools Chancellor Joel Klein often quotes the commission before discussing how U.S. schools have fared since it issued its report. Despite nearly doubling per capita spending on education over the past few decades, American 15-year olds fared dismally in standardized math tests given in 2000, placing 18th out of 27 member countries in the Organization for Economic Co-operation and Development. Six years later, the U.S. had slipped to 25th out of 30. If we've been fighting against mediocrity in education since 1983, it's been a losing battle.*

The OECD tests are the Book of Revelation of the education reform movement, the great ominous portent to be invoked in the presence of critics and non-believers. Putting aside questions of the validity and utility of this test (perhaps for another post if my stamina holds out), we would certainly like to be in the top ten rather than the bottom.

But before we concede this one, let's pull out our well-thumbed copy of Huff and take one more look. Whenever one side in a complex debate keeps pulling out one particular statistic, you should always take a moment and check for cherry-picking.

Is Fisman distorting the data by being overly selective when picking statistics to bolster his case? Yes, and he's doing it in an egregious way.

Take a look at the Trends in International Mathematics and Science Study. Here's a passage from the executive summary from the National Center for Education Statistics:

In 2007, the average mathematics scores of both U.S. fourth-graders (529) and eighth-graders (508) were higher than the TIMSS scale average (500 at both grades). The average U.S. fourth-grade mathematics score was higher than those of students in 23 of the 35 other countries, lower than those in 8 countries (all located in Asia or Europe), and not measurably different from those in the remaining 4 countries. At eighth grade, the average U.S. mathematics score was higher than those of students in 37 of the 47 other countries, lower than those in 5 countries (all of them located in Asia), and not measurably different from those in the other 5 countries.

We could spend some time in the statistical weeds and talk about the methodology of TIMSS vs. OECD's PISA. TIMSS is the better established and arguably better credentialed, but both are serious efforts mounted by major international organizations and it would be difficult to justify leaving either out of the discussion.

If Fisman had limited his focus to the education of high school students and simply ignored the data involving earlier grades, we would have ordinary misdemeanor-level cherry-picking. Not the most ethical of behavior, but the sort of thing most of us do from time to time.

But Fisman does something far more dishonest; he quietly shifts the subject to teachers in general and often to elementary teachers in particular (take a good look at the study that's at the center of Fisman's article).

This means that, when you strip away the obfuscation, you get the following argument.

1. The best metrics for tracking American education are international rankings on math tests;

2. The best way of improving America's education system is to fire massive numbers of teachers, including those in areas where we are doing well on international rankings on math tests.

The bad news here is that we have a long way to go to make it through Fisman's article and it doesn't get much better, but the good news is that we're through with the second paragraph.

* Fought, for the most part, with Klein and Fisman's battle plan, but we've already covered that.

More on Avandia

Here is an interesting post on Avandia. It points out the discrepancy between the number needed to treat (to show benefits on the key endpoints) and the number needed to harm (with a myocardial endpoint). Now, the author is neglecting the uncertainty in these two numbers.

But it's pretty clear that if the number needed to harm is 50 and the number needed to treat to prevent a serious outcome is 1000, then the medication is likely not favorable on the cost-benefit analysis.
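The arithmetic behind that comparison can be sketched as follows. The two numbers are the illustrative ones from the paragraph above; the cohort size and function name are my own choices, and the sketch ignores the uncertainty in both estimates as well as any difference in severity between the harm and the outcome prevented.

```python
def expected_events(number_needed, cohort_size):
    """Expected count of events in a treated cohort, given a number
    needed to treat (or harm): one event per `number_needed` patients."""
    return cohort_size / number_needed

cohort = 1000          # hypothetical number of patients treated
nnh = 50               # number needed to harm, from the example above
nnt = 1000             # number needed to treat, from the example above

harmed = expected_events(nnh, cohort)
helped = expected_events(nnt, cohort)
print(f"Per {cohort} patients treated: about {harmed:.0f} harmed, {helped:.0f} helped")
```

On these numbers, treating 1000 patients would be expected to cause about 20 harmful events for each serious outcome prevented.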

There are cases where the risk-benefit calculation is a subtle problem and it is always tricky to withdraw a drug that showed actual benefits in the original clinical trials. But it is looking increasingly like Avandia may carry more risks than benefits, making it an exception to the rule.

Monday, August 23, 2010

Starting from the beginning -- Ray Fisman's sins of omission

The centerpiece of Fisman's recent Slate article, "Clean Out Your Desk," is a deeply flawed analysis proposing that four out of five probationary teachers should be fired, but the problems with the article aren't limited to that one piece of research; they permeate the article, starting with the very first two paragraphs:

In 1983, a presidential commission issued the landmark report "A Nation at Risk: The Imperative for Educational Reform." The report warned that despite an increase in spending, the public education system was at risk of failure "If an unfriendly foreign power had attempted to impose on America the mediocre educational performance that exists today," the report declared, "we might well have viewed it as an act of war."

New York City Schools Chancellor Joel Klein often quotes the commission before discussing how U.S. schools have fared since it issued its report. Despite nearly doubling per capita spending on education over the past few decades, American 15-year olds fared dismally in standardized math tests given in 2000, placing 18th out of 27 member countries in the Organization for Economic Co-operation and Development. Six years later, the U.S. had slipped to 25th out of 30. If we've been fighting against mediocrity in education since 1983, it's been a losing battle.

Notice that strange gap of more than a quarter century? Other than mentioning increased spending per capita* and citing a couple of ominous sounding statistics, Fisman doesn't say a word about what happened since. There is no mention of what the response was to "A Nation at Risk." You could easily come away with the impression that there was no response, that educators had simply gone on with business as usual.

This is a common rhetorical trick in the educational reform movement: to point out various facts suggesting a dangerous decline over the past two or three decades, then quickly change the subject (sometimes citing "A Nation at Risk" to add a note of the Cassandra Syndrome). If only we had done something then, we wouldn't be on the precipice now.

The primary flaw in this narrative is that there was a response to the report, it was swift and sweeping, and mostly it consisted of the types of change reformers like Fisman, Klein, and Ben Wildavsky continue to push for to this day: importing techniques and philosophies from the private sector; encouraging privatization and entrepreneurs; basing the evaluation of schools on objective metrics (particularly standardized tests of student performance).

By the late Eighties, when I went into teaching, it was difficult to find a school without a mission statement. Staff development by then consisted almost entirely of the kind of training/motivation seminars that I would encounter a few years later working for Fortune 500 companies. Business jargon was all the rage. My first encounter with the school of education was when the dean gave us a talk on how Tom Peters and In Search of Excellence were going to revolutionize education.

For a while, I taught in a high school where the principal was known to say that he didn't like to base teacher renewal or promotion decisions on standardized test scores. He was careful not to say he wouldn't. That would have been a blatant lie. We knew he would make our lives miserable if we didn't teach to the test. He knew we knew it. But he maintained at least a veneer of plausible deniability.

One particularly spineless history teacher spent about a month doing nothing but drilling facts that were likely to be on the test. No discussions. No writing assignments. No additional reading. No attempt to put the material into any kind of meaningful context. But his scores were good.

By the early Nineties, within less than a decade of the "Nation at Risk" report, states were starting to pass charter school laws. Pushes for merit pay and weakening tenure intensified. Faith in business as a source of answers for schools continued.

Education reform has proceeded in more or less a straight line for more than a quarter century without much that can be held up as a clear success. This isn't necessarily a damning criticism. Reformers like Fisman and Klein could honestly argue that the current state of education is mixed, not as good as it could be but not as bad as it would have been if steps had not been taken, or they could argue we have a classic case of half measures, that these reforms would solve our problems if fully implemented but in their watered down form they can do no good.

Both of these arguments are honest and defensible.

What they can't honestly do is imply that we are where we are because we didn't listen to them.

*Why per capita and not per student?

Thursday, August 19, 2010

Ray Fisman and the Tierney Ratio

The Tierney Ratio (sometimes called the Tierney Test because people love alliteration) is a measure of journalistic mediocrity named for its frequent subject, John Tierney. You find the Tierney Ratio of an article by counting the number of words it takes to address all of the significant problems in the article, then dividing that by the article's word count.
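For the literal-minded, the definition amounts to a single division. The sketch below uses a made-up function name and made-up word counts; it is only an illustration of the definition, not a tool anyone actually uses.

```python
def tierney_ratio(rebuttal_words, article_words):
    """Words needed to address an article's significant problems,
    divided by the article's own word count."""
    if article_words <= 0:
        raise ValueError("article_words must be positive")
    return rebuttal_words / article_words

# Hypothetical counts: a 1,500-word article that takes 3,000 words to rebut.
print(tierney_ratio(3000, 1500))
```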

As you might expect, Tierney Ratios vary greatly from author to author. The sorely-missed Olivia Judson maintained a TR of virtually zero while writing for the New York Times, whereas John Tierney, a science writer with no appreciable background in or aptitude for science, routinely had observed TRs in excess of one or two. (It is possible that Judson was kept in the Op-Ed rather than Science section out of concern that she would unfairly lower the latter's average.)

The value of the Tierney Ratio is somewhat limited by its serious data censoring problem (analogous to this well-known example). Faced with articles and essays of sufficiently low quality, researchers are almost always forced to leave significant mistakes, distortions and fallacies unaddressed.

Which brings us back to Ray Fisman's recent column in Slate, which reaches an almost Hellmanesque level of inaccuracy. Getting a true TR on something like this is an extraordinarily tedious job so the readers who aren't into hardcore education wonkery might want to skip the next few posts. You'll know it's safe to come back when we start posting Daily Show clips again.

Data censoring and Tootsie Pops

There is reason to suspect undercounting.

Wednesday, August 18, 2010

Desperately Seeking Suzanne (Null)

In Life in Hell, Matt Groening once asked if there was anything scarier than an open-mic poetry night. As a general rule, I have the same reaction to comment sections. There are exceptions (Andrew Gelman's site comes to mind) but most of the time your chances of happening upon an intelligent and insightful conversation are better when you sit down between two strange drunks in an unfamiliar bar.

So you can understand why I initially skipped over the comments to Ray Fisman's recent post (if comments are usually less intelligent and well-written than the articles they accompany, just imagine the Stygian depths these would have to sink to in order to follow Fisman). Fortunately Joseph did brave the bottom of the web page and discovered that the comments here were actually better than the piece that inspired them.

The best of that very good crop were the entries by Suzanne Null, who is (I believe) an education professor in the Northeast. [update, strike that last part. It looks like Suzanne is a fellow Westerner.] In this series of comments she takes down Fisman brick by brick:

Didn't Fisman's teachers ever teach him to conduct some research and check the validity of his sources (there is better and more recent information than the 1997 research he cited) before he publishes something? Virtually all of the information in this article has already been debunked. See Ravitch's copiously-researcherd "The Death and Life of the Great American School System" (particularly Chapter 9) and practically everything by Stanford's Linda Darling-Hammond. For example:

1) There is a great deal of evidence that better training helps teachers improve instruction (see research by Darling-Hammond and on the "Research" section of the www.nwp.org site). Teachers are "made" (not born) through training, professionally supportive school environments, and supportive communities. Experience makes a difference in teacher effectiveness (Ravitch 190) and one of the most major problems with the teaching profession is its high rate of attrition; many teachers leave the profession by their fifth year.

2) Despite what this article says about identifying "bad" teachers, we haven't yet found a reliable way to identify who the "bad" teachers are. Test scores are one-dimensional and subject to numerous validity and reliability issues (Ravitch 152-154). In addition, despite the claims made in this article, test scores can vary significantly by teacher from year to year because there is so much variation among the students in the teachers' classes. (Ravitch 185-186). A teacher who the tests identify as "high performing" one year might appear to be "low performing" the next.

3) The article insinuated that schools can "close the gap" simply by hiring the top quintile of teachers. This research comes from Gordon, Kane, & Stagler 2006; Hanushek & Rivkin (2004), and Sanders (2000), all cited by Ravitch (183-184). This has also been debunked because the learning gains cited in these articles don't persist over time (Jacob, Lefgren, & Simms 2008) and because of the general unreliability of the tests, particularly when used for the purpose of evaluating teachers, which was not the primary goals when most of these tests were designed.

4) The effectiveness of Teach for America (TFA) has been inconclusive (Rativich 188-191). For example, an extensive study by Darling-Hammond's found that TFA teachers "had a negative or non-significant effect on student achievement" (2005, cited by Ravitch 189). Thus "degrees from prestigious colleges" are also NOT a predictor of effective teaching. In any case, it is delusional to believe that the entire country can sustain the constant turnover of teachers that has characterized TFA (particularly given schools' current budgets for teacher pay) or that this level of turnover would be desirable for our students (Ravitch 190).

5) The research on the "cumulative effects" of attending NYC charter schools has been proven to be invalid. Charter schools are not all successful -- some post higher test scores than their comparative "public" schools and others post lower scores. When they have higher scores, it is usually because they take the students who chose to enroll or enter the lottery system. These students and their families tend to be more engaged with their education in general, and thus tend to perform better, no matter what the teacher does. Charter schools also take fewer students with special needs, such as Special Education students or second-language learners. When researchers have adjusted for these differences in their data pools, they have found no significant differences between charter schools and public schools (Ravitch 140-143).

...

What's particularly worrisome and insidious about the author's arguments is that they will further harm students within our school system. If the "thought experiment" of abandoning teacher selection based on qualifications and teacher training is ever carried out in favor of allowing anyone to try to teach so that the "data" can winnow out the top 20%, it will mean that our students will bear the brunt of training and selecting teachers. They will be subjected to a revolving door of completely untrained teachers, and they will lose educational time and opportunities as they experience the steep learning curve that is present for teachers in their first two years. Our students deserve trained and experienced teachers; they don't deserve to be the guinea pigs that have to test out anyone who walks in off the street.
...

If we really want to overcome "mediocrity" in schools, we should focus on retaining the best teachers, giving them the professional freedom and support to do their jobs, and incentives for high performance, not just on tests, but on other measures of teacher success. Since I began teaching in 2000, I've watched many of the hardest working, most committed, and most motivated teachers leave. Those who stay are a few of the truly exceptional ones, and many of the ones who are happy to administer lectures and scan-tron tests and then go home. Teachers have few opportunities for professional advancement (aside from maybe becoming a principal), and few incentives to go "over and above" in their jobs. If we really want to improve education, our school SYSTEMS' tendency to support the mediocre and discourage (or even fire) the best is what will need to change. This change will require better working conditions, better support, more resources, smaller classes, and even better pay incentives for our hardest working and best-performing teachers.
...

I would add that the whole "martyr" or indolent "loser" dichotomy presented in the media's portrayals of teachers allows our society to evade responsibility for actually improving schools. If the best teachers are great because they CARE so much about their students and are willing to sacrifice so much, and if as the article says they are "born great," then they won't require smaller classes, better materials, more manageable work responsibilities, or higher pay. And if the worst teachers are indolent, then more money isn't going to help them anyway. The entire construction allows our culture to continue to alternately lionize and blame teachers while doing nothing that would actually help support teachers in their endeavors to help students learn.
...

Actually, the one strategy that's been proven to raise test scores is to winnow out the low-scoring students. This can be accomplished by re-drawing school attendance boundaries, creating "choice" or charter schools (which of course don't have the "resources" for Special Education students, second-language learners, or students with behavioral issues), or by "encouraging" the low-performing students to drop out or leave. The schools that have done this have been able to tout the "excellence" of their school management and teacher training, and their principals and superintendents have often gotten promotions and large pay raises.

So maybe our schools should all try that.
...
Just to clarify, this last suggestion was facetious. If our only way to "improve" our schools is to stop serving all of our students, is that a form of "success" that's worth having?
...

EB, I've particularly heard stories about nepotism and favoritism from teachers in rural schools, so I know it happens. But many teachers' major fear about "performance based" pay is that it will be subject to the same dynamics. Even if teachers are evaluated solely on "data" such as test scores (which isn't a good idea for other reasons), it is very easy for principals to stack the deck against teachers they don't like by giving them the lower-performing students, making them change grades or subjects, giving them unfavorable schedules, etc. I've already heard from some teachers I know in a rural area that principals will "drive a teacher out" by, say, transferring them from fourth grade to first, only to then blame the teacher when test scores dip because the teacher hasn't had time to accumulate the practice and materials for the new grade level.

Many teachers are supportive of standards, accountability, and even incentive pay, but they want to be evaluated in fair and equitable ways.

Suzanne, if you've got a place you're posting on a regular basis, let us know and we'll add you to our blogroll. What you have to say deserves the widest possible audience.

Tuesday, August 17, 2010

A Partial N-space of Education Reform -- another preblogged footnote

If we are going to have an intelligent conversation about education (which at this point would be a refreshing change of pace), we have to start by thinking about the n-space. There are multiple dimensions that have to be considered here. As long as the debate fails to acknowledge them or approaches them in a sloppy way, the analyses will continue to be fatally flawed.

We could look at this on the level of classes or individual students, but in this case it probably makes the most sense to think of each school as representing a point in this multidimensional space. We assume these points are more or less fixed with respect to some of these dimensions (grade level, population density [rural/suburban/urban], demographics, region, etc.) but we like to believe that we can change the position of these points with respect to other dimensions (retention, discipline, standardized test scores, etc.).

Why is it so important to think in terms of this multidimensional space? Because there are few meaningful statements that are valid across these various axes. When Doug Staiger and Jonah Rockoff (here via Ray Fisman) made radical suggestions about teacher hiring policies, they based them on a study of arguably the two least representative school districts in the country. Even if the rest of the study were sound (rather than being a train wreck, but more on that later), the findings would be worthless for most of the country.

Worse yet, when you make a substantial change in educational policy, there is a wide range of relationships between the effects you see along different dimensions, including possible inverse relationships between retention and other measures of school performance (the fastest and most reliable way for a school to improve its performance is to get rid of the students it can't handle).

Even with the most careful of reasoning, the most clearly stated questions and the most closely examined assumptions, this kind of complex, multidimensional system can react to new conditions in dramatic, counterintuitive ways. If you approach it with the kind of sloppy thinking that has dominated the education debate, you are asking fate to do some very bad things.

[thanks to Wikipedia for the hypercube]

A dark, swirling, mammoth wall of wrong

Late last night (or more accurately early this morning) I had the TV on as background noise as I debugged some text mining code. The late show was airing Hidalgo and I happened to tune in shortly before the sandstorm scene.

Today, as I read this post by Ray Fisman, I had the sensation of being engulfed, much as the unlikely riders were, in an enormous, violent impenetrable cloud of bad arguments, flawed reasoning, shoddy research and statistical errors.

I'll try to make some sense of this tomorrow but in the meantime, check out Joseph's comments here.

Monday, August 16, 2010

When people don't understand regression equations

This article seems to contain one of the worst misunderstandings of regression that has been posted in a while. Let us consider the heart of the argument:

When they ran the numbers, the answer their computer spat out had them reviewing their work looking for programming errors. The optimal rate of firing produced by the simulation simply seemed too high: Maximizing teacher performance required that 80 percent of new teachers be fired after two years' probation.

After checking and rechecking their analyses, Staiger and Rockoff came to understand why a thick stack of pink slips are needed to improve schools. There are enormous costs to having mediocre teachers burdening the school system, and once they get their union cards, we're stuck with them for decades. The benefits of keeping only the superstars is enormous, such that it's better to risk accidentally losing some of the good ones than to have deadwood sticking around forever.

The regression equation is assuming that all things remain equal. Presuming that there are 3 million teaching jobs in the United States (which was true in 1999 with 3.1 million), that would require filling 1.2 million vacancies per year. It's hard to get a good number for the total number of college graduates per year, but in 2004 there were 2.6 million freshmen; so one would assume, given a 100% graduation rate, that nearly 50% of college graduates would spend two years teaching (before being fired). Remember, in the long run this is sampling without replacement as we don’t rehire people who have already been fired in previous years.
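The back-of-envelope arithmetic above can be sketched in a few lines of Python. This is only one reading of the calculation: the formula (jobs times dismissal rate, divided by the length of probation) and the function names are assumptions chosen to reproduce the figures quoted in the post, not anything taken from the Staiger and Rockoff study.

```python
# Figures quoted in the post; the formula below is an assumed reading of them.
TEACHING_JOBS = 3_000_000       # approximate number of US teaching jobs
DISMISSAL_RATE = 0.80           # share of new teachers fired after probation
PROBATION_YEARS = 2             # length of the probation period
FRESHMEN_PER_YEAR = 2_600_000   # 2004 freshmen, an upper bound on graduates


def annual_vacancies(jobs, dismissal_rate, probation_years):
    """Vacancies per year if the dismissed share turns over every probation cycle."""
    return jobs * dismissal_rate / probation_years


def share_of_graduates(vacancies, graduates):
    """Fraction of each graduating class needed to fill the vacancies."""
    return vacancies / graduates


vacancies = annual_vacancies(TEACHING_JOBS, DISMISSAL_RATE, PROBATION_YEARS)
share = share_of_graduates(vacancies, FRESHMEN_PER_YEAR)

print(f"Vacancies to fill each year: {vacancies:,.0f}")
print(f"Share of graduates required: {share:.0%}")
```

Because the 2.6 million figure assumes a 100% graduation rate, the real share of graduates needed would be higher than what this prints.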

Two comments come to mind. One, you have to have a powerful incentive to make the majority of college students do this. Either a social expectation (as in a teacher draft) that encourages potential teachers to give two years of service or some sort of extremely lucrative remuneration scheme would have to be developed.

Two, can we really believe that a cross section of 50% of college graduates would have better teaching ability than the median teacher currently does?

Furthermore, a school with a constant staff flux may have different characteristics than the current system. Teachers may be more willing to quit mid-year for another opportunity. Every year nearly 50% of teachers are learning the basics of school operations, administration and the material being taught. How do we get teachers to invest in long term outcomes and how do we handle mentoring new teachers given how few established teachers there are?

Which makes the decision to focus on this particular practical difficulty almost surreal:

And, of course, another issue is politics. It's hard to reconcile an 80 percent dismissal rate with the existence of teachers' unions: Pushback from unions and the government leaders who rely on their support have largely managed to prevent any breach of teacher job security thus far.

I think the bigger concern is to look at how we would overcome structural staffing issues. Or to wonder if the 80% of temporary teachers could possibly be superior to the teachers they replace. Seriously, I think that the existence of unions is far down on the list of concerns here.

Heck, if we reject a "draft based system" and presume that "social shaming" is unlikely to work, one might wonder if there is a way to find the additional resources that we'd need to put into salary to make HALF of college graduates delay their career plans to teach in K-12 school systems.

All of this is based on a simulation study, which means the authors have failed to account for the degradation in the teaching pool as they increased the rate of rejection of teachers. By holding the job pool constant (i.e. holding the quality of the marginal recruits constant as you decrease the retention rate) they have made one of the classic mistakes of regression analysis.
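To see why the constant-pool assumption matters, here is a toy sketch. It is not the authors' simulation: it assumes recruit quality is normally distributed, that evaluation is perfect, and that the top 20% are retained, and it treats the drop in the recruit pool's mean as a purely hypothetical parameter. The function name and all numbers other than the 20% retention rate are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)


def mean_of_retained(pool_mean, pool_sd, keep_fraction):
    """Mean quality of the top `keep_fraction` of a normally distributed pool.

    Uses the mean of a truncated normal: mu + sigma * pdf(z) / keep_fraction,
    where z is the standard-normal cutoff for the retained share.
    """
    z = STANDARD_NORMAL.inv_cdf(1.0 - keep_fraction)
    return pool_mean + pool_sd * STANDARD_NORMAL.pdf(z) / keep_fraction


KEEP_FRACTION = 0.20  # keep one new teacher in five (80% dismissal rate)

# Hypothetical shifts in the mean quality of recruits, in standard deviations
# relative to today's recruits. A shift of 0.0 is the constant-pool assumption.
for shift in (0.0, -0.5, -1.0, -1.5):
    retained = mean_of_retained(shift, 1.0, KEEP_FRACTION)
    print(f"recruit pool mean {shift:+.1f} -> retained mean {retained:+.2f}")
```

Under these assumptions the entire benefit of selection comes from the first row. If having to recruit several times as many people lowers the pool mean by about 1.4 standard deviations, the retained teachers are no better than today's average recruit, and that is before counting the students taught by the 80% who are let go.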

Sunday, August 15, 2010

Light Posting

It is conference season (two conferences in the next two weeks) and I have no fewer than five talks to give. So I may be more silent than usual between now and September (especially as I may not have reliable internet access).

Apologies in advance.

Saturday, August 14, 2010

Bad collaborators

I have said it before and I will say it again: good collaborators are the best gift an early career scientist can have. That makes cases like this one all the more stark!

It's pretty clear to me that nobody will come out of this particular mess happy. The main researcher will have a hard time publishing their paper. The collaborator has lost an important paper. Confusion has been created in the scientific literature. Nobody comes out ahead.

So far I've been very lucky on this front. Here is hoping that this continues.

Friday, August 13, 2010

Bad teachers, thought experiments and anecdotal data

Statisticians often have to come up with a first draft of metrics, filters, winnowing processes, etc. without having a sample of the data they'll be using. One approach to the problem is to take some anecdotal cases and ask ourselves how the system we've proposed would handle them. Would it have trouble classifying, leaving them in some 'other' box, or worse yet, would it mis-classify them, putting something that's clearly bad into the good or even excellent category?

Here's a thought experiment. Many years ago, when teaching at a medium-sized suburban school, I had a classroom across the hall from a football coach who taught history. For the record, some of the best teachers and administrators I have ever dealt with came from coaching. They were gifted motivators who brought to the classroom the same belief in excellence and "giving 110%" that they brought to the field or the court.

This was not one of those coaches.

Not only did he make no effort to motivate his students; I'm not sure he interacted with them in any way. His desk was set up at the back of the room, not a bad arrangement for a study hall but it effectively precluded addressing the class or answering questions or leading a discussion. As far as I could tell, the issue never came up. Students spent their hour filling out worksheets that he had Xeroxed out of a workbook. He spent the hour grading them.

I have never seen a more mind-numbing, soul-crushing approach to education but that didn't stop the principal from holding up this teacher as a role model for the rest of us. His classes were quiet, he never sent a student to the principal's office, and though the students' grasp of the material seldom extended beyond the rote level, that was sufficient for pretty good standardized-test scores (at least for knowledge-based rather than process-based courses).

This was almost two decades ago. Significant chunks of the current reform movement were already in place but No Child Left Behind was still years away. The teacher in question retired the year before I entered graduate school, but assuming he was still around, how well would he do under the proposed teacher evaluation system?

Presumably, most teacher evaluation metrics will largely be based on some combination of three factors: student test scores; classroom management; and supervisor evaluations. Our worksheet-dispensing educator would normally do well on the first and would max out the other two. I said 'normally' because (as mentioned before) these metrics are easy to game and the principal could easily arrange things to bump the test scores for his favorite teacher while screwing over a trouble-making teacher he would like to get rid of (someone like me, for instance).

Even if we assume that the principal didn't play favorites (and that's not an assumption I would have made with this administrator), this teacher would unquestionably be looking at generous bonuses. The question is, is this how we want to define excellence in education?

Chart of the day

Without getting into tax policy (which is way out of my area of expertise), I think this chart (from Ezra Klein) does a good job of putting the debate into perspective.

FemaleScienceProfessor on Tenure

This article is well worth the read. My favorite part:

Would a system of renewable contracts really allow professors to break out of the "publish or perish" mania? Methinks it might have even the opposite effect. If there were no tenure, the rat race would never end. And, since academia is apparently equivalent to a customer service industry, consider what renewable contracts for advisers would do to their graduate students and postdocs, not to mention the research infrastructure that we build in part from grants and in part from our institutions, and use to train our advisees.

The more one thinks about the whole tenure issue, the clearer it becomes that things are not as simple as "removing tenure would improve the academy". I had not even considered the issue that professors would be training their competitors in a rotating contract system, which would definitely make the sort of long term investment strategy that we currently have hard to incent.

FSP has a more sympathetic view of Cathy Trower's piece; I'll grant that the ideas at the end of the Trower piece are an improvement over the part I like to quote. I'm not anti-reform but I would prefer that reform not consist entirely of massive changes to employment contracts introduced from above.

But it's a very well thought out piece and definitely worth a read.

Epidemiology Data

In principle, I am highly supportive of the free release of data. But the issue is very tricky in epidemiology for two reasons.

One, data can take years (or even decades) to collect. Making a study with many objectives instantly publicly available would make it hard for the original research team to be properly credited for the work. There are solutions that might work, but so long as the primary method of credit for grant and data collection is via the papers produced this will be tricky.

Two, epidemiological data often has a lot of tricky analysis issues. It's not implausible that taking shortcuts with the data analysis and taking an overly simple approach might result in a publication being ready more quickly. That's good for neither the main team (which now has to rush) nor the reading public (which has a higher risk of scientific errors).

So the principle is good but the implementation is much harder than it looks. It really is an area waiting for a good idea.

Thursday, August 12, 2010

Mentioned in passing

I've been overusing the "Save and Quit" option on Firefox of late, holding onto things that seemed to merit blog posts I didn't have time for. So in the interest of good desktop hygiene, here's a quick summary of some items I'd like to say more about later:

If you can manage it, there's a good reason to sleep in tomorrow.

Here's a list of the worst paying college degrees. Guess who lands at #2?

The LA Times blog has an interesting post on the relationship of toy sales and film-making.

I just learned that the LA Times lost one of its best writers. It's still a better paper than that other Times but the lead is shrinking.

America's finest fake news source remains our best source of real news analysis:

[Video: The Daily Show With Jon Stewart, "Deductible Me" (www.thedailyshow.com)]

Educational Reform and resources

This was an interesting suggestion on educational reform proposed by Dana Goldstein (h/t Matt Yglesias). She comments:

Rather, I'm imagining something like what the best public, private, and charter schools are already doing: a mix of additional instructional time and mealtimes with small group break-out activities like reading clubs, sports, board games, supervised computer time, library browsing time, and art and music lessons.

As a practical matter, to make this happen schools need extra labor: more hours from teachers, as well as specialized, perhaps part-time instructors in the arts and athletics.

Now I don't want to guarantee that this is a good idea. However, in a world of two working parents, a longer school day could be welfare enhancing and it's not impossible that it would have the effects on childhood obesity that are suggested by Ms Goldstein.

But there is one feature of this plan that I think it really makes sense to consider -- Ms Goldstein is discussing increasing the resources directed at schools (via meal subsidies, extra staff and additional funding) in order to improve outcomes. Could it fail? No question. But it differs from a lot of education discussions by not trying to link a reduction of resources (e.g. removing tenure as a form of compensation) to improved outcomes. Instead, it argues that putting more resources into schools could result in a net public good.

That is a much better starting point for discussion (i.e. is this the best use of scarce public resources) than the argument that removing resources will improve outcomes (so we can pay less and have a better school system).

In terms of the obesity argument, I suspect that much of this will hinge on the ability of the school to control eating and activity patterns. If students have snacks and/or school meals are not healthy then this seems less likely to work. In the same sense, putting together an activity program that succeeds in getting students to become active is not necessarily trivial.

But it certainly is worth an open discussion.

Replication

I have the opposite problem from the one Candid Engineer has with "fishy results". I have long advocated the central role of replication in science. This is especially important in Epidemiology where experiments are (by their nature) rare and so one needs to do most of their inference from observational research.

But how do you make a paper that has a near perfect replication seem interesting?

I mean it's good for science but it rather deadens the discussion section to have not all that much new to add except "that association is also observed in different populations".

Sigh!

Wednesday, August 11, 2010

Notes from a once and future reformer

This will shock those who know me personally, but when it comes to education I have always been a bit of a malcontent. As a student, I generally found grades 7 through 12 boring and, with the exception of a few really good classes, largely a waste of time. My senior year I took advantage of a program where I could go to college half the day (a much better option than AP if you can manage it). At that point high school became a brief obligation I had to take care of every morning before getting on with my life.

When I started taking education classes, I still had an attitude problem. I questioned the value of much of what was presented. I chafed at the endless buzzwords. I was suspicious of the research. I wondered (sometimes out loud) about the competence of education professors who had spent little time in a non-college classroom. Once I actually made it to the classroom, things were better but there were still plenty of bad administrators, questionable standards, mind-numbing staff-development seminars and wasted potential to keep me bitching.

You might think that, given decades of accumulated dissatisfaction, I would be all for reform.

You'd be right.

The trouble is that almost none of the people using the term 'reform' are actually suggesting any reforms. Most of the proposals that have been put forward are simply continuations or extensions of the same failed policies and questionable theories that have been coming out of schools of education for years, if not decades.

Here's a small but telling example. When I was taking education classes back in the late Eighties, the professors spent endless hours discussing the proper way to write a lesson objective. Much was made of the importance of explicitly stating things in terms of student behaviors (objectives where student behavior was implicit were verboten, no matter how obvious that behavior might be). It was taken as an article of faith that subtle changes in wording could determine the outcome of a class.

Now take a look at these paragraphs from a recent reform puff piece from the Wall Street Journal editorial page:

Earlier this year, TFA released "Teaching as Leadership: The Highly Effective Teacher's Guide to Closing the Achievement Gap," which shares the practices of teachers who have made significant gains with students. One chart explains why teachers should choose an objective like "The student will be able to order fractions with different denominators," rather than "The teacher will present a lesson on ordering fractions with different denominations."

Objectives, says the guide, should be "student-achievement based, measurable and rigorous." Seem obvious? Well, as Ms. Kopp says, successful teaching is "nothing magic. It's nothing elusive. It's about talent and leadership and accountability."

Imagine you're in a bad pulp novel, captured by a cult that celebrates every full moon by sacrificing a half dozen virgins to a giant cabbage. You escape, free a group of dissidents and lead them to safety, only to find out they want to substitute a rutabaga for the cabbage and go for a full dozen virgins.

For me, following the reform debate has been one long concatenation of rutabaga moments.

Distributions

John Cook has another insightful post today on data distributions. It is an area that I know that I could stand to develop a fuller intuition about.

One point that I think comes out of his post is the idea that no real data ever fully fits a theoretical distribution. Ever. After all, all real data has noise in it. While this seems to be an obvious point, it has a practical consequence: it is possible to get reasonable patterns of residuals when fitting the same data with different distributions, which could mean the data follow either distribution, or possibly neither.

Even worse, real data is sometimes a mixture of distributions based on latent (or non-latent) variables. Andrew Gelman has an example with height -- male and female adult heights are both approximately normally distributed but the combination of the two is not (there is a nice picture on page 14 of "Data Analysis Using Regression and Multilevel/Hierarchical Models").

Even worse, there may be factors (e.g. genetic) that result in different mean adult heights. So you can get "fat tails". This is not a small problem as it means that your models will wildly underestimate the probability of an extreme result. I was not a huge fan of The Black Swan, but this point was correct (and, to be fair, it was the central theme of the book).

All of which is to say that I am definitely going to have to think more about this issue and, hopefully, see cases where I am not making the correct assumptions.
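A rough numerical illustration of the fat-tail point, using only Python's standard library. The mixture is hypothetical: 99% of the population is drawn from one normal distribution and 1% from another with a much higher mean, in standardized units rather than real height data. The sketch compares the true tail probability of the mixture with what a single normal distribution, matched to the mixture's mean and variance, would predict.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical mixture in standardized units: (weight, component distribution).
components = [
    (0.99, NormalDist(0.0, 1.0)),  # the bulk of the population
    (0.01, NormalDist(4.0, 1.0)),  # a rare subgroup with a higher mean
]


def mixture_tail(components, threshold):
    """P(X > threshold) under the mixture."""
    return sum(w * (1.0 - dist.cdf(threshold)) for w, dist in components)


def moment_matched_normal(components):
    """Single normal with the same mean and variance as the mixture."""
    mean = sum(w * dist.mean for w, dist in components)
    variance = sum(
        w * (dist.variance + (dist.mean - mean) ** 2) for w, dist in components
    )
    return NormalDist(mean, sqrt(variance))


fitted = moment_matched_normal(components)
threshold = 5.0

true_tail = mixture_tail(components, threshold)
fitted_tail = 1.0 - fitted.cdf(threshold)

print(f"P(X > {threshold}) under the mixture:       {true_tail:.2e}")
print(f"P(X > {threshold}) under the fitted normal: {fitted_tail:.2e}")
print(f"Underestimate factor: {true_tail / fitted_tail:.0f}x")
```

With these made-up parameters the single-normal model understates the chance of an extreme value by a factor in the hundreds, even though it matches the mean and variance exactly.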

Tuesday, August 10, 2010

Complex processes are hard to simplify

One thing that I have noticed is that simple narratives about employment often gloss over important details. Mark has an example with teacher employment. Yves Smith has another one, where she points out that employers who are having trouble finding employees often manage this only by offering very poor terms of employment:

This does not add up. If the company can afford to spend ten weeks training people (and the additional cost of setting up a course), that suggests it could have offered more than $13 an hour, particularly given the opportunity cost of the orders it could have filled if it had had people on board sooner. The article later notes that Mechanical Devices “hire[s] through staffing agencies to help control health-care costs and maintain flexibility.” Um, that means they fire people as soon as orders fall.

I think that this issue is an old idea that we often forget. The jobs that are difficult to fill (in a functional economy) either require complex skills/credentials or have a known downside. I would worry that Yves Smith understates the case with her example: the worst of jobs are the ones where you never actually become unemployed but for which you don’t have a predictable income. Economists back to Adam Smith have noted that a lack of job security tends to raise wages.

Are we surprised that experiments with unpredictable income and low wages don't always work out? After all, there is a minimum income threshold below which basics like food, shelter and hygiene may start to become difficult to maintain.

In the same sense, why are we worried that it is hard to break into education? Many professions are hard to break into without, for example, credentials. Neither lawyers nor medical doctors can practice without the specific credential. Why is the lack of a graduate degree making it hard to become a teacher in competitive markets an issue?

I suppose we could re-imagine the economy from the ground up. But top down approaches to comprehensive social and economic reform don't always result in wild successes and may generate adverse effects in the process.

This is not to say that there are not real issues to be resolved that can occur due to technological change. I just worry when the proposed solutions to a complex problem are extremely simple. That suggests either an agenda or a failure to appreciate all of the moving parts involved.

Now, before I get to be labeled an arch-conservative, I tend to think that slow and incremental reform is the best choice in most cases. There can be specific cases where the slow reform approach has failed completely and where the options are limited, but I'd rather be cautious than aggressive in finding them.

A better-late-than-never edition of reasons not to trust Naomi Schaefer Riley

As we used to say back in the Ozarks, I've been busier than a one-legged man at an ass-kicking contest, so I still don't have time to go through the logical and factual problems with this piece in the Wall Street Journal Op-Ed section, but I have to take a moment to point out that Riley defines 'impossible' in much the same way that Newman defines 'rarely.'

In the spring of 1989 Wendy Kopp was a senior at Princeton University who had her sights set on being a New York City school teacher. But without a graduate degree in education or a traditional teacher certification, it was nearly impossible to break into the system. So she applied for a job at Morgan Stanley instead.

'Nearly impossible' in this case means 'mildly inconvenient.' It was no big deal. Lots of people did it. I know this because at around the same time Ms. Kopp was cranking out spreadsheets in Lotus 1-2-3, I was teaching high school despite the fact that I had neither an education degree nor traditional teacher certification.

I'm not saying Ms. Riley is woefully ignorant in her area of supposed expertise. I'm not saying that she's a liar. I'm just saying we've got the possibilities narrowed down.

Monday, August 9, 2010

Social Security Thought

I am not an economist and I don’t necessarily have a strong understanding of personal finance. Mark brought up the issue of wealth transfer as a part of the whole social security debate. But I will admit to one additional element of puzzlement when it comes to privatizing social security.

Namely, precisely what would one invest in? If one is investing in government bonds (the safest instruments) then isn't the government just writing an IOU to the bondholders? The only difference I can see between a personal account investing in US treasury bonds and social security is that social security can be amended, like in 1983, without a default on United States bonds.

Or we could invest in stocks, but then we have risk due to issues like decades with a net negative return. I'd always naively assumed that the real role of programs like social security was to smooth out variability in returns (as the government can afford to plan for the real long run), and that would seem to be defeated by personal accounts.

Hiltzik puts SS in plain English

Michael Hiltzik is one of the reasons I still feel that, despite cruel budget cuts and bad management, the LA Times is a better paper than the NYT. The latter gives us stories like this. Hiltzik gives us this:

What trips up many people about the trust fund is the notion that redeeming the bonds in the fund to produce cash for Social Security is the equivalent of "the government" paying money to "the government." Superficially, this resembles transferring a dollar from your brown pants to your gray pants — you're no more or less flush than you were before changing pants.

But that assumes every one of us contributes equally to "the government," and by equal methods — you, me and the chairman of Goldman Sachs.

The truth is that there are two separate tax programs at work here — the payroll tax and the income tax — and they affect Americans in different ways. The first pays for Social Security and the second for the rest of the federal budget.

Most Americans pay more payroll tax than income tax. Not until you pull in $200,000 or more, which puts you among roughly the top 5% of income-earners, are you likely to pay more in income tax than payroll tax. One reason is that the income taxed for Social Security is capped — this year, at $106,800. (My payroll and income tax figures come from the Brookings Institution, and the income distribution statistics come from the U.S. Census Bureau.)

Since 1983, the money from all payroll taxpayers has been building up the Social Security surplus, swelling the trust fund. What's happened to the money? It's been borrowed by the federal government and spent on federal programs — housing, stimulus, war and a big income tax cut for the richest Americans, enacted under President George W. Bush in 2001.

In other words, money from the taxpayers at the lower end of the income scale has been spent to help out those at the higher end. That transfer — that loan, to characterize it accurately — is represented by the Treasury bonds held by the trust fund.

The interest on those bonds, and the eventual redemption of the principal, should have to be paid for by income taxpayers, who reaped the direct benefits from borrowing the money.

So all the whining you hear about how redeeming the trust fund will require a tax hike we can't afford is simply the sound of wealthy taxpayers trying to skip out on a bill about to come due. The next time someone tells you the trust fund is full of worthless IOUs, try to guess what tax bracket he's in.

Sunday, August 8, 2010

Quote of the day

At Unqualified Offerings, Thoreau made the following observation about reform in higher education:

In any other line of business, a person who says “How hard can it be to do this efficiently?” is usually a clueless idiot. Ditto for many aspects of higher education.

It gives one pause.

Saturday, August 7, 2010

I am not certain that word means what they think it does

From the New York Times comes the news that Once a Leader, U.S. Lags in College Degrees. Let us see what is meant by the word "lags":

The United States used to lead the world in the number of 25- to 34-year-olds with college degrees. Now it ranks 12th among 36 developed nations.

I suppose that it is true that the United States is lagging behind the number one country. Who is that?

Canada now leads the world in educational attainment, with about 56 percent of its young adults having earned at least associate’s degrees in 2007, compared with only 40 percent of those in the United States. (The United States’ rate has since risen slightly.)

So a small country with a highly credential-based society and a large network of universities and colleges with financial incentives to expand is actually able to beat the United States in proportion of degrees?

I am curious as to whether this change is due to Canadian improvements or US decline. But, either way, being in the top half of developed countries is an odd definition of "lags".

Finally, let's look at the proposed remedy:

The group’s first five recommendations all concern K-12 education, calling for more state-financed preschool programs, better high school and middle school college counseling, dropout prevention programs, an alignment with international curricular standards and improved teacher quality.

I am unclear exactly how these things directly link with the proportion of college graduates in the general population. One might think that a better place to start would be to look at what the Canadians are actually doing. After all, they are only a short drive north . . .

Friday, August 6, 2010

OT: Paizo

I think it has been a year since I did an off-topic post on role-playing games. One year ago (August 2009) the company Paizo released a new role-playing game based on the d20 system. In that time they have managed to produce a pretty amazing set of high-quality products. Nice art, high production values and a very well written set of books. It reminds me of the high-quality work that Iron Crown Enterprises did in the 1980s in their Middle-earth books.

Well worth checking out!

Too many talks

I have four talks this year at the conference I am attending. I now officially promise that this is a high water mark. In future years, I plan to actually relax at the conference and plan to spend at least some of the time networking.

It's unclear what I was thinking (or if thought was part of the process).

Thursday, August 5, 2010

The role of luck in science

Candid Engineer has a nice question about the role of luck in science. It's a good question and it is quite true that careers tend to be sensitive to initial conditions. I don't think that this is true only of science; it applies to many fields. Michael Lewis describes this phenomenon (being in the right place at the right time leading to better career opportunities) happening to him, mostly due to a single major trade early in his career as a bond trader, in "Liar's Poker". Heck, he describes being hired entirely due to a lucky meeting with the wife of a senior member of the corporation.

There is no way to eliminate luck in science. Even the assignment of reviewers to one's grants and papers can have an important influence on one's ultimate success.

When I was a physicist, I got some very odd readings in some of my experiments. We tried varying key parameters, replicated them and always got the same fascinating results. However, other labs had trouble finding the same thing and, as a result, I never published these results. It's a good thing -- the error was finally detected in the analysis software! Was it bad luck that the software had a bug in it for the exact type of data we had but worked well on the standard testing samples? Of course, with enough work or a flash of insight the bug might have been detected sooner. But ruling things out starts from most likely to least likely and, in this case, we guessed wrong.

So the best one can do is to work hard, focus on the best bets and roll with the punches. I'll leave you with a quote from physioprof (that I thought was the best take on the issue so far):

I hate that saying as well, because it doesn’t capture the most important aspect of “attracting good luck”. In order to maximize your chances of getting lucky, you need to maximize your exposure to beneficial risk. This has much more to do with the judicious application of hard work in wisely chosen directions than it does the sheer volume of hard work.

More on Trower

The tenure round-table articles that Mark posted really do have the best of quotes. Let us return to the quote by Cathy A. Trower:

Research shows that Generation X values qualities that are in conflict with this system: collaboration, not competition; transparency, not secrecy; community, not autonomy; flexibility, not uniformity; diversity, not homogeneity; interdisciplinary structures, not disciplinary silos; and family-work life balance, not “publish or perish” careers.

There was so much wrong with this quote that I decided it deserved a second post. By setting her argument up this way, Cathy seems to be associating the academy with: competition, secrecy, autonomy, uniformity, homogeneity, disciplinary silos and poor work-life balance. Now it is possible that some of these issues could be altered (in a positive direction) by the removal of tenure.

But some of them are likely to move in the opposite direction if tenure is abolished. I am unclear about the conditions in the Gamma quadrant, but are we sure that work-life balance is going to be improved by reducing job security? Do people feel less pressure to work hard when they do not have job security? Or what about competition -- does this mean that if private industry wanted to create a more competitive environment they would offer more job security?

Now I don't want to construct a straw person argument here, but there are three points that I think are really important. One, if one wants to re-envision the academy it is necessary to fully spell out the alternatives and not just attack one feature that people find unfair. Ideally, this would be done with a clear understanding of how successful top-down reforms typically are. Do note that this is a critique that begins with the idea of competition as being bad!

Two, it is odd to attack a key institution of the American academy at the point when it is at the peak of its success. Look carefully at the Academic Ranking of World Universities (a Chinese project, by the way) and see how many American schools are ranked in the top 100 (in 2009, it was 8 of the top 10 universities with the other two being British). Sure, the methodology can be criticized but it's not a sign of complete failure to rank so strongly in international rankings.

Three, I find it odd how people simplify the academic environment as if there were a single system across the entire United States. Working in an NIH-funded biomedical research shop is very different than teaching English at a community college. The stresses and solutions are very different. Recognition of this complexity would do a lot to refine the arguments being presented.

But, to be fair, I am not sure that the virtues of competition, autonomy and disciplinary focus are things that we want to get rid of (and I am sure that work-life balance is unlikely to improve with less job security). That the list of problems, itself, contains virtues is a rather interesting dilemma and it does rather make me wonder what the end state looks like.

Wednesday, August 4, 2010

University Towns

One interesting feature of American life is the University Town. These have survived and thrived, even as towns based around corporations have tended to be troubled (See: Michigan, Flint). However, unlike company towns, university towns rely on the ability to convince people to move there (as universities are historically reluctant to recruit their own graduates as faculty).

This is one of the key problems with statements like this one, made by Mark C. Taylor:

A middle ground will address most of the problems. After a trial period of three to five years, faculty members who merit promotion should be given seven-year renewable contracts. For this system to work effectively, these reviews must be rigorous and responsible.

I am not saying that you could not get this to work. But it is going to change the dynamics of these positions considerably. People are often willing to move cross country for the chance of permanent employment. But who is willing to do a major relocation to a small town for a series of rotating contracts? Even more important, these are people who have spent a decade (or more) in school and who are ill positioned to take major risks. And, under these conditions, buying a house counts as a major risk, as financial circumstances may make decisions not to renew contracts happen in clusters.

It is true that, ideally, there would be no global financial considerations placed on the decision to renew a contract. But that seems like an idealistic assumption.

Now, it is true that universities in big cities don't have this issue. However, in the United States and Canada (where I have experience), it is the universities in large (and diverse) cities that can afford to not have tenure now. You can decide you like Toronto as a city and be willing to find another job if the contracts at the University of Toronto don't work out. It's hardly the same at Lakehead University (in the small and very blue collar city of Thunder Bay) where relocation is likely to be hundreds of miles should employment fall through and alternative employment options are thin.

So one item that is being missed in the tenure debate is the heterogeneity of universities, themselves, and how much local conditions matter.

Tuesday, August 3, 2010

Academia and Diversity

Mark had a link to a tenure round-table that was a good reflection of the current argument about tenure that seems to be making the rounds. It's a rich area for discussion but I wanted to focus on an argument by Cathy A. Trower:

Research shows that Generation X values qualities that are in conflict with this system: collaboration, not competition; transparency, not secrecy; community, not autonomy; flexibility, not uniformity; diversity, not homogeneity; interdisciplinary structures, not disciplinary silos; and family-work life balance, not “publish or perish” careers.

Curiously enough, the argument that the institution of academia is in conflict with the current values of Generation X might be one of the best arguments to leave well enough alone. Consider: the principle of diversity is that trying a lot of different approaches to problems, and being inclusive of a wide range of viewpoints, allows us to find solutions more efficiently. The idea that a long-standing institution is taking a wildly different approach from the current default actually suggests that keeping the system in place is a way to enhance diversity.

This is not to make me an apologist for the academy. There are many issues with the modern academic system that could use resolution. But the argument that it is in conflict with the values of a particular generation seems to be one of the weaker arguments for doing away with the whole enterprise.

Sunday, August 1, 2010

I'm too busy to comment on this at length...

but the NYT has a roundtable on tenure. Pretty much the same old lines. With the exception of Kezar (who does, at least, ask some good, tough questions), no one adds anything original or insightful to the discussion, but there is, I suppose, some value in seeing where the debate has come to a dead stop.

At Stake: Freedom and Learning
Cary Nelson, University of Illinois at Urbana-Champaign

Unsustainable and Indefensible
Mark C. Taylor, Columbia University

No Tenure, No Nothing
Adrianna Kezar, University of Southern California

Reducing Intellectual Diversity
Richard Vedder, economist, Ohio University

How to Start Over
Cathy A. Trower, Harvard School of Education
