West Coast Stat Views (on Observational Epidemiology and more)

Tuesday, May 13, 2014

Another one for the West Coast Stat Views lexicon: Jethro Models

The Jethro Model is a formal or informal model that leaves out a large number of necessary parts. The allusion was explained in a previous post.

From Anti-orthogonality at Freakonomics

In one of the many recurring gags on the Beverly Hillbillies, whenever Jethro finished fixing the old flatbed truck, Jed would notice a small pile of engine parts on the ground next to the truck and Jethro would nonchalantly explain that those were the parts that were left over. I always liked that gag and the part that really sold it was the fact that the character saw this as a natural part of auto repair: when you took an engine apart then reassembled it you would always have parts left over.

Sometimes I find myself having a Jed moment when I read certain pop econ pieces.

"What's that pile next to your argument?"

"Oh, that's just some non-linear relationships, interactions, data quality issues and metrics that won't reduce to a scalar. We always have a bunch of stuff like that left over when we put together an argument."

For a recent example, consider this quote from George Mason University economist Robin Hanson (via Andrew Gelman):

If your main reason for talking is to socialize, you’ll want to talk about whatever everyone else is talking about. Like say the missing Malaysia Airlines plane. But if instead your purpose is to gain and spread useful insight, so that we can all understand more about things that matter, you’ll want to look for relatively neglected topics. . . .

Obviously, this is intended more as an observation than even an informal model, but we're still looking at a level of simplification that makes this rule pretty much meaningless; as soon as you add any of the complexity of actual conversations, either with respect to why we converse or how we decide what to talk about, the whole argument just collapses. We converse for a long list of reasons. Sometimes we simply want company. Other times it's something more specific: to propagate our ideas, to amuse, to impress, to be liked, to establish individual and group identity, to get laid or, far more frequently, to convince ourselves that we could get laid if we wanted to. We could make a similar list of reasons for picking conversational topics, but I think you get the point.

To reduce this down to social vs. informative motives and common vs. neglected topics, you either have to leave out important options or group together things so diverse as to make the definitions meaningless. What's more, by equating neglected topics with informative conversations, the model suggests some strange implications, such as that the person who just wants to be sociable will talk about racism and climate change, while the person who wants to be informative is more likely to discuss obscure distinctions between Phish bootlegs.

That's not to say that there's no extra value to bringing up neglected topics; it's just that Hanson's observation doesn't capture the fundamental relationships. I've been writing quite a bit recently on the importance of orthogonality and there's certainly a relationship between unique information and how much a topic has been discussed. Unfortunately there's also a great deal of collinearity. Lots of topics are relatively neglected because they don't contain that much interesting information.

To further complicate matters, under the right circumstances, you can gain considerable social cachet by knowing interesting facts about little known topics. The "interesting" part can be a bit of a hurdle, but I know people who do it, which puts yet another hole in the model. So do the people who bring up obscure topics for the primarily social purpose of making themselves seem distinctive or erudite.

Another problem with Jethro models is the way that their oversimplified, overgeneralized approach can enable self-serving hero/villain narratives. Andrew Gelman made a related point about many popular economics books and articles -- "What strikes me about this discussion is the mix of descriptive and normative that seems so characteristic of pop-microeconomics." You don't have to look hard to see that mix here -- you can almost hear the inspirational music in the background while reading this "if instead your purpose is to gain and spread useful insight, so that we can all understand more about things that matter, you’ll want to look for relatively neglected topics."

It should be noted that Robin Hanson spends a great deal of time on out-of-the-mainstream ideas. Without putting too fine a point on it, when someone who "has elected to have his head cryonically preserved in the event of medical death" depicts in such glowing terms people who discuss neglected topics, I can't help but suspect bias.

And given Hanson's tendency to portray himself as being above this sort of thing...

Monday, May 12, 2014

Yes, the House is still capable of bipartisan action...

...but what's interesting is where they choose to do it. Charter schools certainly aren't the least controversial issue facing Congress and, if anything, they've become more so as stories have poured in about waste, huge payouts, discrimination, draconian discipline policies, and community protests. I don't want to demonize charters here -- there are a lot of good ones out there and I think they have an important role to play -- but they don't seem to be the sort of issue that could manage a 360 to 45 vote.

The answer lies, I think, in two factors: first, that it's easy to get a charter school bill in under the radar; and second, that charters have wide support on both the left and the right where it counts, in the media and among wealthy donors.

From the Hill:

The House on Friday passed bipartisan legislation to expand access to charter school funding.

Passed 360-45, the vote came in sharp contrast to the bitterly partisan debates this week over creating a select committee to investigate the 2012 Benghazi attack and holding former Internal Revenue Service official Lois Lerner in contempt of Congress.

A majority of Democrats — 158 in favor and 34 against — joined all but 11 Republicans in support of the measure.

The bill authored by House Education and the Workforce Committee Chairman John Kline (R-Minn.) and the panel's top Democrat, Rep. George Miller (Calif.), would consolidate the two existing federal charter school programs into one to award grants to state entities.

The measure would also authorize the secretary of Education to maintain a federal grant competition for charter schools that did not win state grants.

Republicans have touted the issue of school choice and access to charter schools as a way of limiting the federal government's role in education policy. Charter schools receive public funding, but operate independently and therefore are not subject to federal regulations.

"Expanding education opportunity for all students everywhere is the civil rights issue of our time," House Majority Leader Eric Cantor (R-Va.) said. "I say we help those students by expanding those slots so they can get off the waiting lists and into the classrooms."

Saturday, May 10, 2014

Weekend blogging -- brought to you by the Good Wife

One of the small but ubiquitous changes the internet has brought is the end of the lost song. Before the mid-Nineties, the main ways to learn the names of songs were from the DJs who sometimes remembered to tell you who you were listening to or from the captions on videos that had a way of fading just as you remembered to look up. Songs you heard on TV shows and movies were generally lost causes. The irritating feeling that came from not being able to find or forget a song (specifically "Anna" but not the Beatles cover) was the basis of at least one sitcom episode. From Wikipedia:

In the Married... with Children episode "Oldies But Young 'Uns" (Season 5, Episode 17; airdate March 17, 1991), Al Bundy becomes obsessed with finding out the name of this song which has become his earworm (originally he can only tell people the nondescript misheard lyric "hmm hmm him").

It is still possible not to be able to find a song, but it doesn't happen often. If you can remember a fragment of a lyric or pin down where you heard it, you can usually be listening to it on YouTube in a couple of minutes.

On last week's The Good Wife, a distinct and very catchy beat kept running through the episode. As soon as it was over I went online and learned that the beat came from the equally catchy song "High On the Ceiling."

Once I got on the subject, I remembered an obscure song from Malcolm in the Middle. Googling the show's title and the word 'hockey' was enough to bring up the song.

"Little Buster" from the beloved coming of age anime FLCL was another potential earworm that proved easy to find.

I have to admit, I could never get into that show. The only anime I ever really connected with was Cowboy Bebop but that one won me over completely. It also had one of the great late 60s/early 70s opening titles. Even Lalo Schifrin would have been jealous.

Technically, this last one doesn't exactly belong on this list -- I was already a big fan of the song -- but I like this version a lot and, like all good covers, it reveals something interesting that you probably missed in the original.

Friday, May 9, 2014

A musical introduction to the old "new math"

The commonalities between the current education reform movement and the Post-Sputnik era have been mentioned before. Among other similarities, both movements prided themselves on taking a rigorously scientific approach to education and yet some of their sharpest critics were the very scientists and mathematicians they were trying to emulate.

We've already talked about Richard P. Feynman's criticism of Post-Sputnik era math and science textbooks, particularly their attempts to be rigorous and realistic. Along similar lines, Tom Lehrer, who was either teaching math at Harvard or political science at MIT (depending on exactly when the song came out), had a great deal of fun with the topic in the song "New Math."

Thursday, May 8, 2014

Two more for the West Coast Stat Views lexicon: The Jar Jar Binks Paradox and Mathematical Anosognosia

The Jar Jar Binks Paradox

Improving the reputation of something bad by adding an additional element that's even worse. The effect works by focusing criticism on one point, making the other elements look better by comparison, and by creating a more favorable narrative (____ would have been good if not for ______).

You could argue that the fatalities-per-mile metric was the Jar Jar Binks of Freakonomics' shoddy analysis of the risks of walking drunk vs. driving drunk. Just to be clear, walking drunk is very dangerous. It might even be more dangerous than driving inebriated, but Levitt's analysis was a collection of comically oversimplified assumptions and numbers pulled out of the air. (See here, here and here for critiques). By addressing criticisms of the fatalities-per-mile metric, Levitt was able to create the impression that the rest of the work was solid.

Mathematical Anosognosia

A condition that causes the false impression of comprehension when a concept is accompanied by familiar mathematical symbols and methods. This is often accompanied by a heightened sense of self-confidence and a diminished sense of judgement and restraint. Those prone to this condition are often observed making sweeping pronouncements in fields they have no relevant background in. Though almost anyone working in a math-based field can suffer from Mathematical Anosognosia, physicists and economists seem most susceptible. Extreme cases have been known to produce NYT best-sellers.

Wednesday, May 7, 2014

More futures past -- Highway edition

The standard post to accompany this sort of clip either oohs and aahs over the more prophetic aspects or laughs at the less realistic, but I don't feel like either of those fits my reaction. What strikes me watching this is how fast-moving and, more importantly, ambitious people expected the future to be. This was particularly notable in the section describing road construction (living in LA no doubt contributes to my reaction).

When I look at the Post-War era, I almost always get this incredible sense of pent-up energy, as if the country couldn't wait to make up for all those years lost to the Depression and the War. People wanted to do big things. What's more, they wanted to do them as soon as possible and they were willing to pay whatever was required.

It would be interesting to try to attach some numbers to the attitudes, but just anecdotally it seems clear that when it comes to progress, we're now more tolerant of delay and less tolerant of cost. When an ambitious proposal (manned space exploration, hypersonic trains) does make the news, it almost invariably comes with a laughably low-balled cost, usually one or more orders of magnitude below reasonable.

We'd still like magic highways, just not enough to foot the bill.

Disney's Magic Highway - 1958

Tuesday, May 6, 2014

A distributional question

I have been under the weather for a bit (thus no posts) but I wanted to share a thought I have been having in reaction to the minimum wage discussion on the west coast. People have tended to be worried about these increases, with much concern about economic damage.

However, income and wealth in the United States are distributed with an extremely heavy tail, especially in terms of growth in the last 30 years. This sort of growth, presuming a perfect market, is quite odd as I had always presumed human ability has a normal distribution. The normal distribution is continuous and naturally presents several people of nearly the same ability behind the exceptional person (at least at the top end). We can ignore the odd outlier -- if there was only one billionaire in the United States and they had risen from poverty then we'd exclude them from any realistic analysis.

But if we want to argue that current economic trends represent fairness, it turns out that we have some steep assumptions and observations to explain. These turn out to be crucial.

For example, when top wages are so high, why don't the companies hire the "next best executive" and split the surplus? If ability is innate (as opposed to learned), why don't we outsource all CEO jobs immediately to China (which would have more top performers as a function of a larger population)? This sort of wealth distribution seems like an odd way to end up given a normal distribution of ability, presuming one is talking about some sort of meritocratic environment.
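To put rough numbers on the "next best executive" point, here is a minimal sketch. It assumes ability is normal with mean 100 and standard deviation 15 and that pay follows a Pareto tail with shape 1.5. Both sets of parameters, and the function names, are illustrative assumptions rather than estimates from any data.

```python
from statistics import NormalDist

def normal_rank_value(rank, n, mean=100.0, sd=15.0):
    """Approximate ability of the person ranked `rank` out of n (1 = best),
    using the normal quantile at 1 - rank/n. Parameters are assumptions."""
    return NormalDist(mean, sd).inv_cdf(1 - rank / n)

def pareto_rank_value(rank, n, alpha=1.5, scale=1.0):
    """Approximate pay of the person ranked `rank` out of n under a Pareto
    tail: the quantile at 1 - rank/n is scale * (rank/n) ** (-1/alpha)."""
    return scale * (rank / n) ** (-1 / alpha)

n = 1_000_000  # hypothetical pool of candidates
ability_ratio = normal_rank_value(1, n) / normal_rank_value(2, n)
pay_ratio = pareto_rank_value(1, n) / pareto_rank_value(2, n)
print(f"best vs. next best, ability: {ability_ratio:.3f}")
print(f"best vs. next best, pay:     {pay_ratio:.3f}")
```

Under these assumptions the top candidate is only marginally more able than the runner-up, while the heavy-tailed pay figure is more than half again as large. That gap is the surplus the question is asking about.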

Or how do we know we have an ideal marketplace now? There have been a lot of commercial structures over history. What makes us different than: a) Plato's Athens, b) the Roman Empire, c) Medieval England (say the anarchy period), or d) the Song dynasty in China? They had markets too -- were the results of such markets just? If the difference is due to corruption, government interference, and rent seeking, do we have a better balance now? And how would we know, without making a consequentialist argument?

It is actually a pretty deep question.

It may be a good thing that I missed this New York Times SAT article when it first came out...

If I had read all of their coverage at once, I'm afraid my head would have exploded.

From A New SAT Aims to Realign With Schoolwork
By TAMAR LEWIN

"The guessing penalty, in which points are deducted for incorrect answers, will be eliminated."

We've been through this before

The SAT and the penalty for NOT guessing

On SAT changes, The New York Times gets the effect right but the direction wrong

but saying 'points' instead of 'fractions of points' is just inexcusable. I realize that the concept of expected value can throw people but even a NYT reporter should be able to distinguish between one and one fourth.
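The difference between a full point and a quarter point is easy to check with a little expected-value arithmetic. This is a minimal sketch that assumes five answer choices per question and one point for a correct answer; the function name and the variable names are made up for illustration.

```python
from fractions import Fraction

def expected_guess_score(choices=5, penalty=Fraction(1, 4)):
    """Expected score from a blind guess: +1 with probability 1/choices,
    -penalty otherwise. Five choices is an assumption about the format."""
    p_right = Fraction(1, choices)
    return p_right * 1 - (1 - p_right) * penalty

blind = expected_guess_score()                            # quarter-point deduction
as_reported = expected_guess_score(penalty=Fraction(1))   # a full point per miss
one_eliminated = expected_guess_score(choices=4)          # one wrong answer ruled out
print(blind, as_reported, one_eliminated)
```

With a quarter-point deduction a blind guess breaks even, and ruling out a single answer makes guessing a winning bet. With a full-point deduction, which is what 'points' implies, guessing would be a losing proposition.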

Monday, May 5, 2014

A Star Wars Day experiment

I know I'm mixing franchises here, but the recent coverage of Star Wars Day has left me with something of a Twilight Zone feeling. It's almost like waking up in a world where people have always celebrated an unofficial holiday commemorating some pretty good, if dated science fiction films of the Seventies and Eighties.

So I did some data collection, running Google searches (Web and News) over different custom time ranges, and I found that, though the origins of the holiday date back to the late Seventies, the vast majority of the coverage seems to have started about the time Disney started seriously promoting the upcoming sequel.

Try your own data gathering at home. You may get slightly different results but I think you'll find an exceptionally large jump this year. Wikipedia says "Observance of the holiday spread quickly due to Internet, social media, and grassroots celebrations," and I'm sure that interest in the upcoming film accelerated the process, but I have trouble believing that these factors alone could drive the increase we've seen. It's almost like major media conglomerates like Disney had some mysterious force that could cause journalists to promote their product.

Saturday, May 3, 2014

Weekend blogging -- perhaps the strangest Donald Sterling tie-in you'll see this week

Well, that worked out nicely. A few days ago, we ran a post about the similarities between the controversy over the NAACP accepting money from Donald Sterling and the moral dilemma at the heart of Shaw's Major Barbara. This morning I check out Hulu for the free selections from the Criterion Collection and I discover that the theme of the week is stage to screen and one of the selections is the 1941 adaptation of Shaw's play.

While I was at it, I also embedded a few other films from the collection, including one that I've always had a special connection to, Olivier's take on Richard III. I came across the film one night when I was ten or eleven. I had no idea what or whom I was watching, but I was fascinated nonetheless. I'm a big fan of Ian McKellen, but if you can only see one...

Friday, May 2, 2014

"The Heart of Algebra"

I'm working on a couple of bigger pieces on the SAT and one of the things that I've been looking at as part of the background work is this statement from the College Board discussing the changes in the math section of the test. Board president David Coleman quotes extensively from this and I'd be very much surprised if he hadn't been extensively involved in its writing. (The press releases very much have Coleman's voice.)

Reading these official statements after closely reviewing the old SAT test produces a couple of strange reactions. The first is a disconnect that comes from a list of changes that, with one or two exceptions, seem to describe the test we already have (work with systems of equations, analyze data, use percentages and ratios) and/or contradict other proposed changes (reduce the scope and add "trigonometric concepts").

The second is a strange lost-in-translation feeling, as if the passages were almost saying something meaningful, but some key words had been omitted or put out of order. Perhaps the best example is this discussion of linear equations and functions as "the heart of algebra." Coleman seems particularly enamored with this phrase -- he uses it frequently in interviews about the SAT -- but when I read through the press statement, I didn't see anything that made linear functions more important or fundamental than other polynomial functions (or rational functions or logarithmic or exponential functions for that matter).

Here's a little experiment. Read the passage below extolling the importance of equations and functions based on linear expressions. Then read it again but mentally strike out every occurrence of 'linear' except for the parenthetical phrase. I think you'll find it actually makes as much sense.

Heart of Algebra: A strong emphasis on linear equations and functions
Algebra is the language of much of high school mathematics, and it is also an important prerequisite for advanced mathematics and postsecondary education in many subjects. Mastering linear equations and functions has clear benefits to students. The ability to use linear equations to model scenarios and to represent unknown quantities is powerful across the curriculum in the postsecondary classroom as well as in the workplace. Further, linear equations and functions remain the bedrock upon which much of advanced mathematics is built. (Consider, for example, the way differentiation in calculus is used to determine the best linear approximation of nonlinear functions at a certain input value.) Without a strong foundation in the core of algebra, much of this advanced work remains inaccessible.

You might make a pretty good case for the central importance of polynomials (particularly if you want to get nerdy and bring in Taylor). You can make a great case for the central importance of functions. You can even make a crawl-before-you-walk case for focusing on linear expressions. But you have to make some sort of coherent argument.

Even the part about finding the slope of the tangent at a given point (that is what they're talking about, right? or am I missing something?) has an odd quality. It's difficult to see how using a derivative to help find the equation of a line makes linear equations the 'bedrock' of more advanced math. There are certainly examples where linear equations are used to find formulas and prove theorems in calculus and other more advanced fields, but the example in the parenthesis actually goes the other way. To me, the passage as a whole and the parenthesis in particular read as if the author had asked someone knowledgeable "where do we use linear equations and functions?" and had paraphrased the answer with only minimal comprehension.
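For anyone who wants to see the direction of that dependence spelled out, here is a minimal sketch of the tangent-line calculation. The cubic, the point of tangency, and the helper names are all arbitrary choices for illustration, and the derivative is estimated numerically rather than symbolically.

```python
def derivative(f, a, h=1e-6):
    """Central-difference estimate of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)

def tangent_line(f, a):
    """Return L(x) = f(a) + f'(a) * (x - a), the best linear approximation
    of f near a. The derivative is the input; the line is the output."""
    slope = derivative(f, a)
    return lambda x: f(a) + slope * (x - a)

f = lambda x: x ** 3          # a nonlinear function, chosen for illustration
L = tangent_line(f, 2.0)
for x in (2.0, 2.1, 2.5):
    # The approximation is exact at x = 2 and drifts as x moves away.
    print(x, f(x), round(L(x), 4))
```

Note the order of operations: the calculus happens first and the linear equation falls out at the end, which is the opposite of linear equations serving as the foundation.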

What's so strange and somewhat sad about that possibility is the extraordinary pool of mathematical talent that was hanging around the halls when this was written. If you take a tests and measurements class, you soon realize that most of the good examples come from the SAT. The people who put the exam together are exceptionally good in a highly demanding field of statistics.

Not listening to people with experience and expertise is a noted characteristic of and perhaps even a point of pride with Coleman, who came into the field as a McKinsey & Company consultant and had no relevant experience in education or statistics.

When Coleman attended Stuyvesant High in Manhattan, he was a member of the championship debate team, and the urge to overpower with evidence — and his unwillingness to suffer fools — is right there on the surface when you talk with him. (Debate, he said, is one of the few activities in which you can be “needlessly argumentative and it advances you.”) He offended an audience of teachers and administrators while promoting the Common Core at a conference organized by the New York State Education Department in April 2011: Bemoaning the emphasis on personal-narrative writing in high school, he said about the reality of adulthood, “People really don’t give a [expletive] about what you feel or what you think.” After the video of that moment went viral, he apologized and explained that he was trying to advocate on behalf of analytical, evidence-based writing, an indisputably useful skill in college and career. His words, though, cemented his reputation among some as both insensitive and radical, the sort of self-righteous know-it-all who claimed to see something no one else did.

Coleman obliquely referenced the episode — and his habit for candor and colorful language — at the annual meeting of the College Board in October 2012 in Miami, joking that there were people in the crowd from the board who “are terrified.”

Given some of the changes we've seen in the test the College Board worked so hard to get right (the loss of orthogonality, the shoehorning in of "real-world" data), we may have some idea what they were scared of.

Thursday, May 1, 2014

Symmetries and asymmetries of the fringes

I've already referred to this excellent Rick Perlstein essay ("I didn’t like Nixon until Watergate"), but I never got around to writing anything about the main point of the piece, which was the role of lies and cons in the modern conservative movement. I had largely forgotten the topic until I came across an article in the LA Weekly.

Here's a memorable and representative excerpt from Perlstein:

There’s a kind of mystic wingnut great-circle-of-life aura to this stuff. Mark Skousen, a Mormon, is the nephew of W. Cleon Skousen, author of the legendarily bizarre Birchite tract The Naked Communist, which claimed to have exposed the secret forty-five-point plan by which the Soviet Union hoped to take over the United States government. (Among the sinister aims laid out in the document: gain control of all student newspapers; “eliminate all good sculpture from parks and buildings, substitute shapeless, awkward and meaningless forms.”) Upon its publication in 1958 (it was republished in 2007 as an ebook), the president of the Church of Latter-day Saints, David O. McKay, recommended that all members read it. Mark Skousen is also author of a book called Investing in One Lesson, which cribs its title from the libertarian tract Economics in One Lesson, distributed free by conservative organizations in the millions in the fifties, sixties, and seventies (Reagan was a fan). He founded an annual Las Vegas convention called “FreedomFest”—2012 keynoters: Steve Forbes, Grover Norquist, Charles Murray, Whole Foods CEO John Mackey—which advertises itself as “the world’s largest gathering of right-wing minds.” This event points to another signal facet of the conservative movement’s long con: convincing its acolytes that they are the true intellectuals, that anyone to their left is the merest cognitive pretender. (“Will this 3 Minute Video Change Your Life?” you can read on FreedomFest’s website. Because three-minute videos are how intellectuals roll. Click here to learn more.)

The oilfield in the placenta is another perfect mélange of right-wing ideology and a right-wing money con. It begins with a signal ideological lie: that stem-cell research represents an outrage against the right to life (but the cultivation of embryos for in vitro fertilization does not). It then pulls the mark along with the right-wing fantasy that energy independence is only one miraculous technological breakthrough away (but the development of already existing alternative energy sources doesn’t count as one of those breakthroughs). It all makes its own sort of internally coherent sense when you consider the salesman: James Dale Davidson is a founder of the National Taxpayers Union, a Richard Mellon Scaife–funded enterprise that gave Grover Norquist his start as a professional conservative. Davidson himself is a producer of Unanswered: The Death of Vincent Foster. “There is overwhelming evidence that Foster was murdered,” he told the Washington Post. “They obviously have reasons they don’t want this to come out . . . obviously there’s something big they’re trying to protect.”

Of course, the childlike appeals won’t work their full magic without the invocation of the conservative movement’s childlike heroes. The Gipper appears in another splendid specimen received by Human Events readers—which is appropriate, because Human Events is where Reagan himself got a lot of the made-up stuff he spouted across his entire political career. “When President Ronald Reagan got cancer during his presidency,” this one begins, “the great German doctor Hans Nieper, M.D., treated him. It would have been frontpage news if it hadn’t been hushed up at the time.” (“German doctors ‘cook’ cancer out of your body while you nap!”) “Many American cancer patients lose their hair and their vitality. But Reagan kept his famous pompadour hairstyle. He also kept his warm smile and vigorous style.” (“CLICK HERE to request German Cancer Breathrough: A Guide to Top German Alternative Clinics.”) “Reagan lived for another 19 years. He died at age 93, and not from cancer.” (“Fortunately, as a journalist I’m protected by the First Amendment. I can tell you the truth without having to risk persecution from the authorities.”)

That last passage came back to me when I read this article on the implosion of Pacifica.

A National Public Radio fund drive, such as those heard in Los Angeles on much bigger KCRW and KPCC, is a mix of cloying boosterism, promises of tote bags and begging. A Pacifica fund drive, meanwhile, sounds like a never-ending infomercial for products created by a street-corner lunatic.

Take, for example, a five-DVD set titled "The Great Lies of History," which includes five documentaries by Italian filmmaker Massimo Mazzucco: The Second Dallas; The New American Century; UFOs and the Military Elite; The True History of Marijuana; and Cancer: The Forbidden Cures. Cancer features Dr. Tullio Simoncini, an Italian doctor who claims to treat cancer, which he says originates with a fungus, with sodium bicarbonate, or baking soda.

"There was a woman [diagnosed with] cancer of the uterus," Mazzucco recently explained to KPFK producer Christine Blosdale on air. "She tried the Simoncini method. She healed by herself by simply doing douches, washing with sodium bicarbonate. The cancer's gone, and now she can have babies. Of course, that's one less patient the cancer industry had to milk from."
...
Blosdale then informed the listener, "If you got all the DVDs individually, yes, it would cost $500, but you get all five together for a $250 pledge." (A quick search on Amazon shows "The Great Lies of History" multi-DVD package selling for $49.90.)
...
Much of the money raised in a recent WBAI fund drive came from Gary Null and Monique Guild, a so-called "business intuitive and wealth builder," who was hawking "prosperity workshops." Various sources estimate that Guild and Null take between 30 and 50 percent of the money paid for these "premiums" — the gifts and items they sell to listener-supporters. Many suggest this may actually be illegal, since Pacifica is a 501(c)3 nonprofit.

The similarities are obvious but because they are so obvious, they raise certain questions. If people on the far left are susceptible to virtually the same scams as those on the far right, why don't we see comparable direct marketing models on a comparable level on the left? It's easy to think of prominent conservatives who have parlayed their standing into lucrative marketing partnerships (Gingrich, Beck and Huckabee come to mind. Perlstein has a longer list) and who have kept their day jobs.

It's possible that there are more "high responders" on the right than on the left but it's hard to believe that the difference is big enough to explain the disparity in marketing. These industries are highly competitive and are good at spotting underserved markets. Unless there is a great deal of activity going unnoticed, it would appear that Pacifica and Mother Jones for some reason don't generate the kind of valuable mailing lists that Human Events does.

Actually, I shouldn't have said 'reason' -- no monocausalists, here. At least not on social science questions -- but if I had to speculate on primary reasons, these would be my top two:

The media of the far right is much larger, better organized and better run than the media of the far left. This is conducive both to creating mailing lists and to building (or, in the case of former politicians, maintaining) personal brands;

The role of the far right in the GOP is different than the role of the far left in the Democratic Party. Democrats have largely come to view their extreme as an impediment to election; Republicans have come to see them as an absolute necessity. As a result, Democratic candidates are much more reluctant to be associated with far-left ideas like, for example, negative income tax (despite some decidedly not-so-liberal support). There does not appear to be a comparable perceived cost on the right for association with ideas like the gold standard. I suspect that this disparity holds even for cases where the ideas in question appeal to both the far left and the far right such as "the government and the medical establishment are withholding cures for cancer."

Does anyone have any other thoughts?

Wednesday, April 30, 2014

I'm amazed that no one seems to have quoted George Bernard Shaw on Donald Sterling and the NAACP

Not that I necessarily agree with Shaw (I'm not entirely certain that Shaw agrees with Shaw), but given the discussion over whether the NAACP should give back Sterling's money, it is surprising that (as far as I can tell) no one has brought Major Barbara into the discussion.

From the preface:

On the point that the [Salvation] Army ought not to take such money, its justification is obvious. It must take the money because it cannot exist without money, and there is no other money to be had. Practically all the spare money in the country consists of a mass of rent, interest, and profit, every penny of which is bound up with crime, drink, prostitution, disease, and all the evil fruits of poverty, as inextricably as with enterprise, wealth, commercial probity, and national prosperity. The notion that you can earmark certain coins as tainted is an unpractical individualist superstition. None the less the fact that all our money is tainted gives a very severe shock to earnest young souls when some dramatic instance of the taint first makes them conscious of it. When an enthusiastic young clergyman of the Established Church first realizes that the Ecclesiastical Commissioners receive the rents of sporting public houses, brothels, and sweating dens; or that the most generous contributor at his last charity sermon was an employer trading in female labor cheapened by prostitution as unscrupulously as a hotel keeper trades in waiters' labor cheapened by tips, or commissionaire's labor cheapened by pensions; or that the only patron who can afford to rebuild his church or his schools or give his boys' brigade a gymnasium or a library is the son-in-law of a Chicago meat King, that young clergyman has, like Barbara, a very bad quarter hour. But he cannot help himself by refusing to accept money from anybody except sweet old ladies with independent incomes and gentle and lovely ways of life. He has only to follow up the income of the sweet ladies to its industrial source, and there he will find Mrs Warren's profession and the poisonous canned meat and all the rest of it. His own stipend has the same root. He must either share the world's guilt or go to another planet. He must save the world's honor if he is to save his own. 
This is what all the Churches find just as the Salvation Army and Barbara find it in the play. Her discovery that she is her father's accomplice; that the Salvation Army is the accomplice of the distiller and the dynamite maker; that they can no more escape one another than they can escape the air they breathe; that there is no salvation for them through personal righteousness, but only through the redemption of the whole nation from its vicious, lazy, competitive anarchy: this discovery has been made by everyone except the Pharisees and (apparently) the professional playgoers, who still wear their Tom Hood shirts and underpay their washerwomen without the slightest misgiving as to the elevation of their private characters, the purity of their private atmospheres, and their right to repudiate as foreign to themselves the coarse depravity of the garret and the slum. Not that they mean any harm: they only desire to be, in their little private way, what they call gentlemen. They do not understand Barbara's lesson because they have not, like her, learnt it by taking their part in the larger life of the nation.

Tuesday, April 29, 2014

Problems that (nearly) rich people have -- college edition

Yet another one of those posts that I started weeks ago as part of the big SAT thread then didn't get around to posting.

What are the major concerns of high school students applying for college? It's a long list but based on having worked with high school kids (primarily in urban and rural areas including Watts and the Mississippi Delta), I'd probably say:

Finding the money to pay for it;

Being able to finish in four years;

Avoiding remedial courses.

If, on the other hand, I was going to make my list based on what I read in the New York Times, the number one concern would clearly be not getting into the college of your choice.

[The SAT] was one of the biggest barriers to entry to the colleges [students] dreamed of attending.

I don't want to whitewash the issues with the SAT and its role in college selection. The test has a history of being misused and there are real concerns about cultural biases in the verbal section, but even with these problems, the NYT's assertion simply isn't true for most students. For kids hoping to find a way to cover rent and groceries while attending local community colleges or four-year schools, fear of a bad SAT simply isn't a high-priority concern.

It is, however, for one segment of the population, namely the well-off.

I'm not talking about the rich. For people with serious money, there really aren't big barriers to getting kids into an elite school. I'm talking about roughly the top ten percent minus the top one half of one percent, people who have the money to cover a pricey tuition and to get their kids into the schools and settings where Ivy League admissions are fairly common. In other words, these are families with the resources to get their kids in range of prestigious schools.

The coverage of the SAT in major publications has been written almost entirely from the viewpoint of that nine and a half percent. This is, of course, not the first time we've seen the press (particularly the NYT) write from this perspective. A few years ago, we heard a great deal about how difficult it could be for a family to get by on between $250,000 and $350,000 in taxable income.

We could speculate on the underlying causes for this slant, but I think the important part is that the people writing and editing these stories seem completely unaware of how the world looks to the bottom 90%.

Monday, April 28, 2014

More on understanding the math but not the statistics

[One of the standard rebuttals to criticisms of popular STEM writing is that certain compromises have to be made when putting things in 'layman's terms.' To head off that particular charge, I'm going to use as little technical language as possible in this post.]

Before I post something, I usually do one final search on the subject, just to avoid any surprises. As a result, I often discover better examples than the ones I used in the post. Case in point, after writing a post looking at the pre-538 work of Walt Hickey (and concluding that the editors at 538 appeared to be doing a better job than those at Business Insider), I found this article by Hickey from the Atlantic:

5 Statistics Problems That Will Change The Way You See The World

It was a fairly standard piece (the kind that invariably includes the Monty Hall paradox) and I skimmed through it quickly until the final section, which I found myself reading repeatedly to make sure it actually said what I thought it said:

(5) SIMPSON'S PARADOX
A kidney study is looking at how well two different drug treatments (A and B) work on small and large kidney stones. Here is the success rate that was found:
Small Stones, Treatment A: 93%, 81 out of 87 trials successful
Small Stones, Treatment B: 87%, 234 out of 270 trials successful
Large Stones, Treatment A: 73%, 192 out of 263 trials successful
Large Stones, Treatment B: 69%, 55 out of 80 trials successful.

Which is the better treatment, A or B?

ANSWER: TREATMENT B

Even though Treatment A had higher success rates in both small and large stones, when the whole trial is viewed as a sample space Treatment B is actually more successful:

Small Stones, Treatment A: 93%, 81 out of 87 trials successful
Small Stones, Treatment B: 87%, 234 out of 270 trials successful
Large Stones, Treatment A: 73%, 192 out of 263 trials successful
Large Stones, Treatment B: 69%, 55 out of 80 trials successful.
All stones, Treatment A: 78%, 273 of 350 trials successful
All stones, Treatment B: 83%, 289 of 350 trials successful.

This is an excellent example of Simpson's Paradox, where correlation in separate groups doesn't necessarily translate to the whole sample set.

In short, just because there correlation in smaller groups hides the real story taking place in the largest of groups.

This is an almost perfect example of what I mean by understanding the math but not the statistics. The math, though somewhat counterintuitive (as you would expect from a 'paradox'), is straightforward: in certain situations it is possible to have observations of a data set distributed in such a way that, if you cut the set up along certain lines, two variables will have a positive correlation in each subsection but will have a negative correlation when you put them together. It's an interesting result -- cut things one way and you see one thing, cut them another and you see the opposite -- but it doesn't seem particularly meaningful and it certainly doesn't suggest that one view is right and the other is wrong. The result is just ambiguous. ("This is an excellent example of Simpson's Paradox, where correlation in separate groups doesn't necessarily translate to the whole sample set, causing ambiguity.")

When, however, you start thinking not just mathematically but statistically (and more importantly, causally), one view is very much better than the other. Let's look at the kidney stone example again. What we see here is a lot more patients with large stones being given treatment A and a lot more patients with small stones being given treatment B. This is something we see all the time in observational data, more powerful treatments being given to more extreme cases.

This is one of the first things a competent statistician checks for because that relationship we see in the undivided data set is usually covering up the relationship we're looking for. In this case, the difference we see in the partitioned data is probably due to the greater effectiveness of treatment A while the difference we see in the unpartitioned data is almost certainly due to the greater difficulty in treating large kidney stones. Though there are certainly exceptions, statisticians generally combine data when they want larger samples and break it apart when they want a clearer picture.
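For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the success rates from the counts quoted above and shows how unevenly the two treatments were assigned. The names `counts`, `rate` and `pooled` are only illustrative; nothing here comes from the original study beyond the eight numbers in the quoted example.

```python
# Counts from the kidney stone example quoted above:
# (successes, trials) for each stone size and treatment.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


def pooled(treatment):
    """Total (successes, trials) for a treatment, ignoring stone size."""
    s = sum(counts[size][treatment][0] for size in counts)
    n = sum(counts[size][treatment][1] for size in counts)
    return s, n


# Partitioned view: compare treatments within each stone size.
for size in counts:
    for t in ("A", "B"):
        s, n = counts[size][t]
        print(f"{size} stones, treatment {t}: {s}/{n} = {rate(s, n):.0%}")

# Unpartitioned view, plus the share of each treatment's cases
# that were the harder-to-treat large stones.
for t in ("A", "B"):
    s, n = pooled(t)
    share_large = counts["large"][t][1] / n
    print(f"all stones, treatment {t}: {s}/{n} = {rate(s, n):.0%} "
          f"({share_large:.0%} of its cases were large stones)")
```

Roughly three quarters of treatment A's cases were large stones, against less than a quarter of treatment B's, which is why the combined rates mostly reflect stone size rather than the treatments themselves.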

The version posted at Business Insider with a later timestamp has a different conclusion ("Answer: Treatment A, once you focus on the subsets"). This appears to be a corrected version, possibly in response to this comment:
KSC on Nov 13, 12:33 PM said:

After reading the wikipedia article I believe your answer in the Simpson's paradox example is incorrect.
Treatment B is not better. Treatment A is better.
As pointed out in the article Treatment B appears better when looking at the whole sample because the treatments were not randomly assigned to small and large stone cases.
The better treatment (A) tended to be used on the more difficult cases (large stones) and the weaker treatment (B) tended to be used on the simpler cases (small stones).

Even in the corrected version, though, Hickey still closes his badly garbled conclusion with "correlation in smaller groups hides the real story taking place in the largest of groups." Between that and the odd wording of the unacknowledged correction (A is better, period. When we "focus on the subsets," we control for another factor that obscured the results), it seems that Hickey didn't understand his mistake even after having it explained to him.

Though I've had some rather critical things to say about 538 recently, there's no question that its publisher and editors do understand statistics. These days, that's enough to put them ahead of the pack.