West Coast Stat Views (on Observational Epidemiology and more)

Wednesday, October 15, 2014

Assuming I didn't lose you at "TED Talk"

I need to do more research before I wade into this (or convince Joseph to do it for me), but even with the 10 to 50 year wiggle room, talk of having absolutely total confidence makes me nervous.

[GUY] RAZ: Which they did, an amazing scientific feat. They mapped the code that makes up all human DNA. Now they're still trying to figure out what it means, but they already know what it could mean for the future.

(SOUNDBITE OF TED TALK)

RESNICK: The world has completely changed and none of you know about it.

RAZ: So how is it going to change the world?

RESNICK: In a bunch of ways. The good news is it's going to help us immensely in treating cancer 'cause cancer is nothing more than a disease of the genome. It's a disease where one cell has certain changes, which cause it to get a little bit worse and then it reproduces. And by the time you've got a solid tumor, you've got this really heterogeneous population of cancerous cells. And if you sequence their genomes, they're a mess. And so right now, prior to genome sequencing, we're taking wild guesses at what the molecular basis of one's cancer is. And now going forward, what we're going to do is say, forget all of that, what is happening at the molecular level because this drug can target only those cancers that have the BRAF mutation, as an example.

RAZ: So where is it headed? What can you imagine in 10 or 20 years or beyond?

RESNICK: I think we will cure cancer. Genomics and sequencing at large will ultimately cure cancer. Whether that happens in 10 years or 50 years or more is difficult to say.

RAZ: That's incredible. I mean, you can say that with total confidence?

RESNICK: Absolutely. At some point, we'll snuff it out. I mean, people will still develop cancer, certainly, unless we get into genetic engineering of humans, which is something we ought to talk about, but it will be curable.

Two Quotes

From Salon recently:

“It’s not really about asking for a raise, but knowing and having faith that the system will give you the right raise,” [Microsoft CEO Satya] Nadella said in conversation with Dr. Maria Klawe, a member of the Microsoft Board, Harvey Mudd College president and computer scientist.

“That might be one of the initial ‘super powers,’ that quite frankly, women (who) don’t ask for a raise have,” [he] stated to Klawe. “It’s good karma. It will come back.”

And from Marketplace last year:

Sarah Lacy, founder of tech news site Pando Daily* ... said the BART strike exacerbated what she sees as a philosophical divide in the Bay Area. “People in the tech industry feel like life is a meritocracy. You work really hard, you build something and you create something, which is sort of directly opposite to unions.”

Both the tech and financial sectors have embraced the idea that economic rewards are directly correlated to work and worth. It's a strange mixture of efficient market theorem and social Darwinism, often with more than a bit of Randianism. I suspect that Nadella and Lacy have so internalized this worldview that they no longer have any idea how they sound to the general public.

* To those of you following the pension scandals: yeah, that Pando.

Tuesday, October 14, 2014

Effect sizes: an often overlooked issue

This is a post by Joseph

Brad DeLong makes an argument that fits very well with a long-running discussion that Mark and I have had. Just because there is a known relation doesn't mean that the effect size of the elements can be ignored. So, the existence of the Laffer curve is pretty much certain, but the exact turning point where the curve shifts from more revenue to less revenue is very, very important.

Brad DeLong compares current arguments for infrastructure to the Laffer curve:

In a world where the real rate at which the U.S. Treasury can borrow for ten years is 0.3%/year and in which the tax rate t is about 30%, infrastructure investment fails to be self-financing only when the comprehensive rate of return is less than 1%/year.

Now you can make that argument that properly-understood the comprehensive rate of return is less than 1%/year. Indeed, Ludger Schuknecht made such arguments last Saturday. He did so eloquently and thoughtfully in the deep windowless basements of the Marriott Marquis Hotel in Washington DC at a panel I was on.

But Mankiw doesn’t make that argument.

And because he doesn’t, he doesn’t let his readers see that there is a huge and asymmetric difference between:

my argument that tax-rate cuts are not (usually) self financing, which at a tax rate t=30% requires only that α < 2.33; and:

his argument that infrastructure investment is not self-financing, which at a tax rate t=30% requires that ρ < 1%/year.

To argue that α < 2.33 is very easy. To argue that ρ < 1%/year is very hard. So how does Mankiw pretend to his readers that the two arguments are equivalent? By offering his readers no numbers at all.
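To put rough numbers on that asymmetry, here is a minimal sketch of the arithmetic. The only figures taken from DeLong are r = 0.3%, t = 30% and the two thresholds; the function names are mine, and the functional form behind the α threshold (a tax base proportional to (1 - t)^α) is my assumption about where 2.33 comes from, not something stated in the quote.

```python
# Hypothetical helpers; only r = 0.3%, t = 30% and the two thresholds
# (1%/year and 2.33) come from the quoted passage.

def breakeven_infrastructure_return(real_borrowing_rate, tax_rate):
    """Rate of return rho at which the extra tax revenue (t * rho)
    just covers the real cost of borrowing (r)."""
    return real_borrowing_rate / tax_rate


def breakeven_elasticity(tax_rate):
    """Assumed model: the tax base is proportional to (1 - t) ** alpha,
    so revenue t * (1 - t) ** alpha peaks at t = 1 / (1 + alpha).
    A tax cut raises revenue only if alpha exceeds (1 - t) / t."""
    return (1 - tax_rate) / tax_rate


rho_star = breakeven_infrastructure_return(0.003, 0.30)
alpha_star = breakeven_elasticity(0.30)

print(f"rho*   = {rho_star:.2%} per year")
print(f"alpha* = {alpha_star:.2f}")
```

Under those assumptions the script prints 1.00% per year and 2.33, matching the quoted thresholds. The point of writing it down is that anyone who disagrees can change r, t or the model and see what moves.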

This principle is broadly applicable to all sorts of arguments that come up on this blog. For example, getting rid of a marginal bad teacher is probably efficient. But constantly churning teachers might shift the efficiency function to a different place on the curve.

So realistic estimates of parameters are critical, but they can also be hard to obtain. How do you really tell the comprehensive rate of return of infrastructure? Is it different in Detroit versus San Francisco? Can it be reliably estimated in advance or only known historically?

But it does lead to better arguments when transparent estimates (that can be discussed or tested) are placed out where they can be evaluated.

Selection on Spinach*

[I have the nagging feeling that I'm not using the proper terminology with the following but the underlying concepts should be clear enough. At least for a blog post.]

Let's talk about three levels of selection effects:

The first is initial selection. At this level, certain traits of potential subjects influence the likelihood of their being included in the study. If you ask for volunteers in person, you will end up underrepresenting shy people. If you use mail surveys, you will underrepresent the homeless.

The second level comes after a study starts. You will frequently lose subjects over time. This type of selection is particularly dangerous because you cannot assume that the likelihood of dropping out is independent of the target variable. The issue comes up all the time in medical studies. For serious conditions, a turn for the worse can make it extremely difficult to continue treatment. The result is that the people who stick around till the end of the study are far more likely to be those who were getting better.

(Up until now, the types of selection bias we have discussed, though potentially serious, are generally not deliberate. Their consequences are unpredictable and they happen to even the best and most conscientious of researchers. That is no longer the case with level three.)

The third level concerns attempts to manipulate attrition so as to affect the results of a study. In these cases, researchers will attempt to get rid of those subjects who are likely to drag down the average. This is blatant data cooking and it can be remarkably effective. In school administration, the term of art is "counseling out." It is shockingly widespread, particularly among the "no excuses" charter schools.

The effect of this practice on kids can be brutal but that is a topic for another post. What interests us here are the statistical concerns; what are the analytic implications of this policy? In terms of direction, the answer is simple: schools that engage in these policies will see their test scores artificially inflated. In terms of magnitude, there is really no telling. The potential for distortion here is huge, particularly when you take into account the possibility of peer effects.

Put bluntly, in cases like this one ("The first Success graduating class, for example, had just 32 students. When they started first grade in August 2006, those pupils were among 73 enrolled at the school"), data showing above-average results are almost meaningless.
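To get a feel for the direction of the effect, here is a minimal simulation sketch. Only the cohort sizes (73 starters, 32 graduates) come from the quote above; the scores are randomly generated, and the assumption that the students who leave are exactly the lowest scorers is a deliberately extreme case chosen for illustration. Nothing here is data from any real school.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical test scores for a starting cohort of 73 students.
starting_cohort = [random.gauss(50, 10) for _ in range(73)]

# Extreme case: the 41 students who leave are exactly the 41 lowest
# scorers, leaving a graduating class of 32.
graduates = sorted(starting_cohort)[-32:]

mean_all = statistics.mean(starting_cohort)
mean_graduates = statistics.mean(graduates)

print(f"mean of all 73 starters:   {mean_all:.1f}")
print(f"mean of the 32 who remain: {mean_graduates:.1f}")
print(f"apparent gain from attrition alone: {mean_graduates - mean_all:.1f}")
```

No student's score changes in this sketch; the average goes up purely because of who is left to be counted. Real attrition will be less clean than this, which is exactly why the magnitude is so hard to pin down.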

[A few weeks ago, I put out a collection of our early posts on education (Things I Saw at the Counter-Reformation). The impact of attrition is one of the big running themes.]

*Spinach being, in this case, a substance that greatly increases the power of a given effect.

Monday, October 13, 2014

XKCD -- write your own damned post

I've got at least two pieces I'd like to write around this: one discussing the way we approach AI research (and the innate limitations in that favored approach); the other a rant about how ddulite journalists fail to catch the important subtleties in technology.

I'm sure there are more angles here so I'll throw this one out to the room. What are the examples of a slight change taking a problem from easy to nearly impossible?

Friday, October 10, 2014

Checking in with Cracked.com -- the website that's better than it has any right to be

Even more than Mental Floss, Cracked.com has taken the worst genre in journalism (the unfortunately named listicle) and made it something entertaining, informative and intelligent. I don't drop by that often because it's such a time sink, but when I do I always come away with something worth sharing.

For instance, 5 Dirty Tricks Apple Uses to Get You to Buy a New iPhone opens with this nice example of a deceptive graphic:

The problem is that the old version (on the left) is misleadingly shot in a different light: it doesn't have any shadowed black edge and is a completely silver shade, whereas the iPhone 6 and 6 Plus are cleverly shaded at the sides to make them appear skinnier than they actually are. Here's a handy GIF to show what we mean:

I'm not crazy about the animation, but still.

The article goes where so few technology writers dare and actually discusses the functionality from a common sense perspective.

Think about what you do with your phone -- send texts, make calls, check social media, play terrible games, and send immediately regrettable photographs to people you just met. Unless you're a professional photographer, you're not going to care about how much the camera has improved on the iPhone 6 (and if you are a professional photographer, you probably take pictures on something better than a goddamn iPhone). And for those of you who game -- nothing playable on the iPhone really needs a huge upgrade in power. Just look what happened when they tried to sell Angry Birds on actual gaming systems. So what do we need the better specs for? To have more apps? Not according to the hard numbers.

In 6 BS News Stories That Went Viral: The Girl With Three Boobs, they gleefully point out how gullible journalists can be when there's a deadline.

That's Telegraph, The Hollywood Reporter, E! Online, Huffington Post, and International Business Times reminding us that, like the ocean, the Internet is a vast chilly abyss that cradles unspeakable wonder as well as waking nightmares. We'll leave you to decide which category triple boobs fall under, because we honestly have no idea.

For those of you wondering if this means Martian mind-vacations are just around the corner, it shockingly turns out there are a few things off about this story. Like the fact that the woman has refused to name any of the doctors involved, won't show her new gift to the world for more than a quick few seconds up close, or that she once filed a missing baggage claim listing "3 breast prosthesis" as one of the stolen items. Also relevant? She once apparently described herself as a "provider of Internet hoaxes since 2014."

4 Reasons Movie Special FX Are Actually Getting Worse has an excellent discussion of the paradoxical economics of CGI:

It turns out that making the most visually spectacular images that the human brain can comprehend requires a good bit of scratch. That's why huge-budget blockbusters have been becoming the norm (33 of the 50 most expensive movies of all time have been made in the last four years); studios are so preoccupied pouring hundreds of millions of dollars into CGI for schlock like Battleship because they could, that they didn't bother to stop to think if they should.

And, as CGI continues to improve, movies only become more reliant on it. We've mentioned before how Rhythm & Hues, the visual effects company most famous for bringing to life all the Oscar-winning, pants-shitting fear of sharing a Tunnel of Love rowboat with a 400-pound marvel of evisceration and death in Life of Pi, went bankrupt because they did their job too well.

Meanwhile, the studios are pumping more and more money into already-bloated special effects budgets (it sure as shit isn't going toward better screenplays). For Transformers: Thing of Whatever, Industrial Light & Magic spent about 15 weeks per Transformer just getting the basic model ready, and each model has about 10,000 parts -- that's not a joke, that's seriously how many individual pieces there are in Michael Bay's idea of a talking truck. The company had to start making models six months before filming even started, just to meet the production schedule. And remember, ILM is like the GE of special effects studios, so if they're balls-to-the-wall to make their effects look good in a profitable fashion, what chance does a scrappy, upstart VFX company have?

Finally, 3 Artists Who Got Screwed for Creating Iconic Characters is a perfect complement to the Kirby thread, reminding us that, like many industries based on creativity, little of the money from comics goes to those who do the actual creating.

Thursday, October 9, 2014

Step-back SAT/GRE problems -- trying something new at "You Do the Math"

I've been thinking about the problem of adapting lessons for different media in general and for video in particular. There is a popular but wildly misguided impression that you can create an effective video by just sticking a camera in front of a live presentation. Teaching live is an interactive process. Even when the students don't say a word, the good teacher is alert to the class's reactions. You speed up, slow down, offer words of encouragement, come up with new examples and occasionally stop what you're doing and go back and reteach a previous section.

With a video lesson you set the course then you leave the room. What's worse, it's a really big room and many if not most of the kids are there because the standard methods of instruction have not served them well.

One idea I'm playing with is thinking of the problems in terms of a graph (as in graph theory, not data visualization) where the path is determined by how well the student is doing. As a start in that direction I'm playing around with paired problems -- if you are confused by the first (more difficult) problem there is an easier one to try -- and I've got the first couple up at the teaching blog.

Here's the medium problem:

Circle 1

The radius of circle 1 is 5. Both line segments pass through the center of the circle. Find the area of the shaded region.

You can find the answer and explanation at You Do the Math. Feedback is always appreciated.

The New York Times' regularly scheduled sackcloth and ashes show

From Talking Points Memo:

When New York Times columnist David Brooks revealed last month that his son is serving in the Israeli military, plenty of questions followed: Should Brooks have been more open about that fact? Should it preclude him from writing about Israel? Is it any different from a columnist with a child serving in the U.S. military?

We learned Wednesday that the revelation has even brought about a minor disagreement between two Times editors.

The paper's public editor Margaret Sullivan wrote Wednesday that while she "strongly" disagrees with the suggestion that Brooks "should no longer write about Israel," she also believes that "a one-time acknowledgement of this situation in print (not in an interview with another publication) is completely reasonable."

"This information is germane; and readers deserve to learn about it in the same place that his columns appear," Sullivan wrote.

That's not how Times editorial page editor Andrew Rosenthal sees it though. Rosenthal told Sullivan that the columnist shouldn't have been required to note that his 23-year-old son enlisted in the Israel Defense Forces.

"I do not think he ever had an obligation to say that his son made this choice, any more than if his son had joined the U.S. Air Force (although I recognize that Israel is more controversial in some people’s minds)," Rosenthal said.

Just to be clear, we're talking about David Brooks. You know the guy, quotes discredited studies, makes stuff up. Over the years, he has given critics a steady stream of material, truly unambiguous examples of factual mistakes and substantial omissions in service of the narrative of the moment. His editors have been remarkably quiet on these errors (which is about par for the NYT course).

The New York Times does frequently engage in very public displays of repentance and self-examination. They admit to professional and ethical lapses. They debate in very serious tones the finer points of journalistic conduct. Almost invariably, however, they pick the most minor of lapses to focus on. It is almost as if they wanted to appear conscientious about their profession without actually doing the hard work or accepting the consequences.

Wednesday, October 8, 2014

XKCD Marriage

Lots of interesting implications here, but they'll have to wait till later.

“I can no longer accept cash in bags in a Pizza Hut parking lot” -- time to add Pennsylvania to the list

In an article entitled READING, WRITING, RANSACKING, Charles P. Pierce makes me think that I haven't been spending nearly enough time looking at education reform in the Keystone State. The quote from the title comes from Pierce's account of the federal investigation of former Pennsylvania Cyber Charter School leader Nick Trombetta:

The bags of cash, a private plane bought by Avanti but used mostly by Trombetta, a Florida vacation home and a home in Mingo Junction, Ohio, for Trombetta’s former girlfriend all were described as perks enjoyed by Trombetta as part of a scheme to siphon money from taxpayers’ funds sent to PA Cyber for more than four years.

The case is actually small time compared to the other scandals going on in the state, but you have to admit it's a great quote.

A bigger and much more familiar scandal is the lack of accountability:

For reasons that aren't clear, millions of dollars have moved between the network of charter schools, their parent nonprofit and two property-management entities. The School District is charged with overseeing city charters, but "does not have the power or access to the financial records of the parent organization," according to District spokesperson Fernando Gallard. "We cannot conduct even limited financial audits of the parent organization." That's despite the fact that charters account for 30 percent of the District's 2013-'14 budget. Aspira declined to comment. The $3.3 million that the four brick-and-mortar charters apparently have loaned to Aspira are in addition to $1.5 million in lease payments to Aspira and Aspira-controlled property-management entities ACE and ACE/Dougherty, and $6.3 million in administrative fees paid to Aspira in 2012.

Add to that some extraordinarily nasty state politics involving approval-challenged Pennsylvania governor Tom Corbett, the state-run Philadelphia School Reform Commission (which has a history of making teachers' lives difficult basically for the fun of it) and a rather suspicious poll:

"With Governor Corbett's weak job approval, re-elect and ballot support numbers, the current Philadelphia school crisis presents an opportunity for the Governor to wedge the electorate on an issue that is favorable to him," the poll concludes. "Staging this battle presents Corbett with an opportunity to coalesce his base, focus on a key emerging issue in the state, and campaign against an 'enemy' that's going to aggressively oppose him in '14 in any case."

I don't know enough about Pennsylvania politics to competently summarize this, let alone intelligently comment on it, but it's difficult to imagine an interpretation that makes things look good.

Tuesday, October 7, 2014

Now they've got me defending the efficient market theorem...

I know it's trivial, but this one has always annoyed me.

There are cases where the conventional wisdom is so screwed up that the market reads bad news as good news and rewards stupidity, but otherwise, in a reasonably efficient market, stocks only go up when bad news beats expectations if they had already gone down as the expectations had rolled in. They are, in other words, making up some of the lost ground. Financial reporters love the "went up on bad news" story but they almost invariably fail to mention how the stock had been doing before.

Don't get me wrong. I'm still not a fan of the EMT, but on this one, at least, I'm willing to give them a pass.
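A quick sketch of what the "went up on bad news" story leaves out. All three prices are made up; the only thing the example is meant to show is the difference between the move on the day of the news and the move over the whole episode.

```python
# All prices are invented for illustration.
price_before_rumors = 100.0  # before any bad news was expected
price_before_report = 85.0   # after expectations of bad news had rolled in
price_after_report = 90.0    # the news is bad, but less bad than feared

move_on_report_day = price_after_report / price_before_report - 1
move_over_whole_episode = price_after_report / price_before_rumors - 1

print(f"report day:    {move_on_report_day:+.1%}")
print(f"whole episode: {move_over_whole_episode:+.1%}")
```

The headline reports the first number. The second is the one that tells you how the market actually took the bad news.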

Monday, October 6, 2014

I'm going to let someone else bitch about the New York Times for a while

Besides, when it comes to take-downs of bad financial journalism, there's no one sharper than Felix Salmon.

In "Annals of NYT innumeracy, Bank Rossiya edition," Salmon takes apart a recent article entitled “It Pays to be Putin’s Friend.” No doubt the basic premise is true, but the examples described by the NYT don't support the point at all. Salmon points out lots of sloppiness in the piece but this is arguably the money shot.

So [Sergei P.] Roldugin took out a loan, of unknown size, to buy a stake of 3.2% in Bank Rossiya. How on earth does that make him worth anywhere near $350 million?

And here the light slowly dawns — the NYT has taken the sum total of Bank Rossiya’s assets, and used that number as the value of the bank itself. ($350 million, you see, is 3.2% of $11 billion.)

Of course you can’t value a bank by just looking at its assets, you first need to subtract its liabilities. The NYT story leads with “State corporations, local governments and even the Black Sea Fleet in Crimea” moving their bank accounts to Bank Rossiya — all of those deposits are liabilities of the bank, which need to be subtracted from its assets before you can even begin to arrive at an overall valuation for the bank itself. Just looking at the assets, without looking at the liabilities, is a bit like scoring a sports game by looking only at the points scored by one team.

Probably, most of the value in Bank Rossiya is to be found in the commodity and media assets which it seems to have been able to acquire on the cheap. (The bank itself, qua bank, might well be worth nothing at all.) And no one’s going to find out the true value of those assets by looking at the official size of Bank Rossiya’s balance sheet. It seems to me, indeed, that Bank Rossiya is in large part being used as a holding company, a reasonably safe place where Vladimir Putin’s billionaire friends can keep some of the valuable assets they’ve managed to acquire over the years. I’m just guessing here, but I doubt they have any particular desire to share 3.2% of those assets with some random cellist [Roldugin]. To simply take the official size of Rossiya’s balance sheet, and declare it to be the value of the bank: that’s just bonkers.
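Salmon's arithmetic is easy to reproduce. In the sketch below, the 3.2% stake and the roughly $11 billion in assets come from the quoted passage; the liabilities figure is purely hypothetical (the passage gives none), and the function name is mine.

```python
total_assets = 11e9  # the roughly $11 billion balance sheet from the quote
stake = 0.032        # Roldugin's 3.2% stake

# What you get if you treat total assets as the value of the bank.
naive_stake_value = stake * total_assets


def stake_value(assets, liabilities, stake):
    """Value the stake on book equity: assets minus liabilities."""
    return stake * (assets - liabilities)


# Hypothetical liabilities; the quoted passage gives no figure.
hypothetical_liabilities = 10e9
equity_stake_value = stake_value(total_assets, hypothetical_liabilities, stake)

print(f"naive:  ${naive_stake_value / 1e6:.0f} million")
print(f"equity: ${equity_stake_value / 1e6:.0f} million")
```

The naive line comes out at about $352 million, which is where the NYT's figure evidently came from. The second line depends entirely on the made-up liabilities, and that is the point: without that number you cannot say what the stake is worth.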

Friday, October 3, 2014

Examining the rope -- Rotten Tomatoes edition

[You can find the origin of the metaphor here]

Our last Rotten Tomatoes post may have come out a little harsher than I intended. I probably focused too much on the specific glitch and not enough on the larger point, namely that metrics almost never entirely capture what they claim to. Identifying and fixing problems is important, but we also have to acknowledge our limitations.

If we are stuck with imperfections then we will just have to learn to live with them. A big part of that is trying to figure out when our metrics can be relied upon and when they are likely to blow up in our faces.

Let's take Rotten Tomatoes for example. In many ways, the website provides an excellent tool for quantitatively measuring the critical reaction to a movie. It is broad-based, consistent, and as objective as we can reasonably hope for.

But is it the best possible measure in all conceivable circumstances? If not, when does it break down?

When you see a 60% fresh rating that means that 60% of the reviews examined were considered positive. You will notice that is a binary variable. The most enthusiastic of reviews is put in the same category as the mildly favorable. The inevitable result is that sometimes a film will rank lower on this binary average than it would have on a straight average of star rankings.

Just to be clear, there are some definite advantages to this yes/no approach. As anyone who has dealt with satisfaction scales knows, you can get into all sorts of trouble making interval assumptions about that one through five.

Can knowing their binary foundation help us make better use of the Rotten Tomatoes scores?

If we can make certain assumptions about the distribution of scores, we can tell a lot about which films are likely to be favored. Keep in mind that a good review counts the same as a great one. Therefore a film that is liked by everybody will do better than a film that is loved by most but leaves a few indifferent or hostile.
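Here is a minimal sketch of that effect with two invented films. The review scores are made up, and the rule that three stars out of five counts as a positive review is my assumption for the sake of the example, not Rotten Tomatoes' actual procedure.

```python
from statistics import mean

POSITIVE_CUTOFF = 3.0  # assumed: 3+ stars (out of 5) counts as "fresh"


def fresh_share(star_ratings):
    """Share of reviews at or above the cutoff."""
    return mean(1 if stars >= POSITIVE_CUTOFF else 0 for stars in star_ratings)


# Made-up review sets of ten critics each.
liked_by_everybody = [3.5] * 10
loved_by_most = [5.0] * 8 + [2.0] * 2

for name, ratings in [("liked by everybody", liked_by_everybody),
                      ("loved by most", loved_by_most)]:
    print(f"{name}: {fresh_share(ratings):.0%} fresh, "
          f"{mean(ratings):.1f} average stars")
```

The first film gets a perfect fresh score with a 3.5-star average; the second averages 4.4 stars but lands at 80% fresh. The two measures rank the films in opposite order.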

Without getting into relative merits (all are great films), consider Philadelphia Story and the big three from Martin Scorsese, Taxi Driver/Raging Bull/Goodfellas. By many measures, such as the influential Sight & Sound poll (according to Ebert "by far the most respected of the countless polls of great movies--the only one most serious movie people take seriously."), all three Scorsese pictures are among the most critically hailed movies ever. All three have very good scores on the "Tomatometer" but none have a perfect score. The same goes for films like Bonnie and Clyde, The Magnificent Ambersons, and Bicycle Thieves.

Philadelphia Story, on the other hand, is much less likely to get nominated as greatest film ever, but it is a movie that virtually everyone likes. It's an excellent film, skillfully directed, starring three of the most charming actors ever to come out of Hollywood. Not surprisingly, it has a perfect score on Rotten Tomatoes.

This is not to say that Sight & Sound is better than Rotten Tomatoes. Every scoring system is arbitrary, sometimes plays favorites and never exactly captures what we expect it to measure. The lesson here is that, if you want to use a metric in an argument, you need to know how that metric was derived and what its strengths and weaknesses are. You can't find a perfect metric but you can have a pretty good idea where the imperfections are.

Thursday, October 2, 2014

Understanding Common Core-aligned math homework

I volunteer a couple of times a week with a group that does after school tutoring for urban students in LA. My role is "math floater." I walk around the room and help the kids, and sometimes the tutors, with math problems. When the kids ask for help, it's usually just your basic math question, but when the tutors ask for help it's often less about the math and more about the unfamiliar approach the assignment takes to solving a familiar problem.

This is perhaps most exasperating for those tutors with math backgrounds. You can imagine what it must be like to have a degree in engineering and yet be stumped by an eighth-grader's pre-algebra homework. Of course, it's not the math that's throwing them; it's all the weird and arbitrary steps that have been layered onto the math.

After struggling a bit myself, I realized that the key was to approach these problems as bad translations of unknown texts. If I looked hard enough, I could usually find an antecedent, a good lesson (something I had read in Pólya or seen demonstrated by a master teacher or used with success in one of my classes) that had somehow devolved into the misshapen thing sitting in front of the student.

Recently, I ducked into the tutoring center when I wasn't scheduled to work. I just stepped in to use the bathroom but before I got across the room, I heard a couple of tutors calling my name. They were struggling with a third or fourth grade problem where the student had to perform a number of steps including filling out a three by three grid in order to find the product of two three-digit numbers. The answer kept coming out wrong and none of the tutors could figure out why since none of them were sure how the process was supposed to go.
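For readers who have not seen this format, here is a minimal sketch of how a three-by-three grid multiplication is presumably meant to work, assuming the grid holds the partial products of the hundreds, tens and ones of each factor. The numbers and function names are made up; I don't have the actual worksheet in front of me.

```python
def place_value_parts(n):
    """Split a three-digit number into hundreds, tens and ones: 347 -> [300, 40, 7]."""
    return [(n // 100) * 100, (n // 10 % 10) * 10, n % 10]


def grid_multiply(a, b):
    """Fill a 3x3 grid with partial products and add up the nine cells."""
    grid = [[x * y for y in place_value_parts(b)] for x in place_value_parts(a)]
    return grid, sum(sum(row) for row in grid)


grid, total = grid_multiply(347, 256)  # made-up example numbers
for row in grid:
    print(row)
print(total, total == 347 * 256)
```

Nine cells means nine places to slip, which goes some way toward explaining how a room full of adults kept getting the wrong answer.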

The point of the question was to illustrate the distributive property. Handled properly, the general format could have made for a pretty good problem. As it was, it was a disaster. Developmentally inappropriate, badly explained, overly long (two-digit numbers would have made the point just as well), devoid of relevant context. Like a bad translation of a bad translation of a good problem. That got me wondering if perhaps the process for coming up with these problems worked something like this...

Wednesday, October 1, 2014

Two ways of looking at the achievement gap and how the reform debate often misses them both

The following came out of a phone conversation I had this weekend with Joseph. I'll need to get back to this later but for now here's a thumbnail version just to have something on the record.

When we talk about the achievement gap in education, there are two distinct but valid ways of approaching the question:

The first is in terms of variability. The people in the bottom quartile are, by most measures, getting a much worse education than the remaining three quarters of the population;

The second involves correlation. People in that bottom quartile are disproportionately likely to be poor, to be black or Hispanic, or to speak English as a second language.

You address the first by raising scores for those at the bottom. You address the second by changing the order. Reducing the gap is still desirable regardless of the definition used -- we don't want any of our schools to be bad nor do we want an education system that entrenches the class system -- and there are many things we can do that will improve both, but it is important to remember that we are talking about two distinct objectives.
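A toy example may help keep the two definitions apart. Everything below is invented (eight students, arbitrary scores, a single yes/no flag standing in for disadvantage), and the two summary measures are just one simple way of operationalizing each idea, not a standard from the literature.

```python
from statistics import mean

# Made-up students as (score, is_disadvantaged) pairs. Not real data.
students = [(40, True), (45, True), (50, False), (55, True),
            (70, False), (75, False), (80, True), (85, False)]


def split_at_bottom_quartile(students):
    """Return (bottom quartile, everyone else), ranked by score."""
    ranked = sorted(students, key=lambda s: s[0])
    k = len(ranked) // 4
    return ranked[:k], ranked[k:]


def variability_gap(students):
    """First definition: how far the bottom quartile trails the rest."""
    bottom, rest = split_at_bottom_quartile(students)
    return mean(score for score, _ in rest) - mean(score for score, _ in bottom)


def disadvantaged_share_of_bottom(students):
    """Second definition: how much of the bottom quartile is disadvantaged."""
    bottom, _ = split_at_bottom_quartile(students)
    return mean(1 if flag else 0 for _, flag in bottom)


# Intervention A: raise scores at the bottom, leave the ordering alone.
raised = [(score + 4 if score < 50 else score, flag) for score, flag in students]

# Intervention B: same set of scores, but two students trade places.
reordered = list(students)
reordered[1] = (45, False)
reordered[7] = (85, True)

for label, group in [("baseline", students), ("raised", raised),
                     ("reordered", reordered)]:
    print(f"{label:9s} gap = {variability_gap(group):5.2f}, "
          f"disadvantaged share of bottom quartile = "
          f"{disadvantaged_share_of_bottom(group):.0%}")
```

Raising the bottom scores shrinks the first gap and leaves the second untouched; reordering does the reverse. A policy can succeed completely on one definition while doing nothing on the other.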

To further complicate the picture, proposals that are meant to improve educational outcomes in general are often pitched as ways to address the achievement gap.

All three goals (improving overall outcomes, reducing variability and breaking the correlation) are important -- I'd argue the third one is absolutely vital -- but we need to be clear about what we are trying to do.