West Coast Stat Views (on Observational Epidemiology and more)

Monday, October 20, 2014

The myth of a data-driven Netflix

There's been a lot of talk about Netflix stock this week (usually with words like "plummet"), but a big part of the story has largely gone unnoticed, probably in part because it involves statistics.

As mentioned before, some aspects of the Netflix narrative, such as the company building an HBO-type content library, are simply, factually incorrect. Others, while not blatantly wrong, are difficult to reconcile with the facts.

One of the accepted truths of the Netflix narrative is that CEO Reed Hastings is obsessed with data and everything the company does is data driven (for example "What little Netflix has also shared about its programming strategy is that its every decision is guided by data."). The evidence in support of this belief is largely limited to a model that Netflix crowd sourced a few years ago and to endless assertions from executives at the company that they do know what they are doing despite evidence to the contrary.

Of course, all 21st century corporations are relatively data-driven. The fact that Netflix has large data sets on customer behavior does not set it apart, nor does the fact that it has occasionally made use of that data. Furthermore, we have extensive evidence that the company often makes less use of certain data than do most of its competitors.

One pertinent case in point, particularly for the SEC, is churn rates.

But Netflix disagrees. “With respect to various operational metrics, management has evolved its use of these metrics as the business has evolved,” it wrote the SEC in response. Because it is so easy to quit and then restart a Netflix subscription, it said, “the churn metric is a less reliable measure of business performance, specifically consumer acceptance of the service.”

This is problematic on any number of levels. In terms of marketing, pricing and long-term corporate strategy, having a complete picture of how long people stay and why they leave is huge. The only excuse for not reporting churn would be if you had such a detailed picture of who was leaving and why that this additional metric was redundant.

In other words, Hastings should have a good, data-supported explanation for a recent sudden loss of subscribers.

Netflix CEO Reed Hastings blamed the subscriber drop-off on a $1 price increase the company instituted back this spring.

"Our best sense is it's an effect of our price increase back in May," Hastings said Wednesday night in an interview with CNBC. "With a little bit higher prices, you get a little bit fewer subscribers. So that's our sense of it. But we can't be 100% sure. We had so much benefit from Orange in Q2 and the early Q3, but that's what we think."

Phrases you don't want to hear in these circumstances include "our best sense" and "that's what we think." They convey the impression of a CEO who was blindsided by a bad day at the NASDAQ.

When contemplating a price increase, well-run companies look at the impact on retention and on acquisition. When Netflix management said

[M]anagement believes that in a largely fixed-cost streaming world with ease of cancellation and subsequent rejoin, net additions provides the most meaningful insight into our business performance and consumer acceptance of our service. The churn metric is a less relevant and reliable measure of business performance, and does not accurately reflect consumer acceptance of our service.

they were basically saying that losing one customer and gaining another is the same as keeping the same customer. That's a dangerous approach under the best of circumstances, but it can be deadly when trying to gauge the impact of a pricing change.
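To put numbers on that, here is a minimal sketch with invented figures (none of them are Netflix data). It compares two hypothetical quarters that report identical net additions while having very different churn. The `Quarter` class and the simple churn definition (cancellations divided by starting subscribers) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Quarter:
    """Hypothetical subscriber counts for one quarter (illustrative only)."""
    starting_subs: int
    gross_adds: int      # new or returning sign-ups during the quarter
    cancellations: int   # subscribers who left during the quarter

    @property
    def net_additions(self) -> int:
        return self.gross_adds - self.cancellations

    @property
    def churn_rate(self) -> float:
        # Assumed definition: cancellations / subscribers at start of quarter.
        return self.cancellations / self.starting_subs


# Invented numbers: same net additions, very different retention.
stable = Quarter(starting_subs=1_000_000, gross_adds=50_000, cancellations=20_000)
leaky = Quarter(starting_subs=1_000_000, gross_adds=230_000, cancellations=200_000)

for name, q in [("stable", stable), ("leaky", leaky)]:
    print(f"{name}: net additions = {q.net_additions:,}, churn = {q.churn_rate:.1%}")
```

Both quarters look the same if you only track net additions, but in the second one a price increase that nudged cancellations up even slightly would be acting on a much larger flow of departing customers.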

Just to be clear, for years analysts and the SEC have been asking for more data, or at least more detailed statistics, and Netflix has been saying "trust us, the aggregate numbers are good enough." Now the company appears to have screwed up badly, and they've done it in pretty much exactly the way you would expect a company to screw up when it doesn't drill down into the data.

p.s. I'm considering putting out a collection of business posts (something similar to the education reform e-book Things I Saw at the Counter-Reformation). Any thoughts or suggestions would be appreciated.

Friday, October 17, 2014

Hexagonal Reversi

Over at You Do the Math, I've got a post on a Reversi variant played on a 6x6x6 hexboard.

If you're feeling particularly nerdy, check it out.

Dick Cavett from 2013

I've been catching up on Cavett's NYT column. I have no idea how he pulls it off but he manages to constantly talk about his encounters with the iconic and yet he avoids being boastful or sycophantic. Here are some favorites:

Burton being Burton.

Laurel not being Stan.

and John Wayne being... hell, you wouldn't believe me anyway.

But it was a section near the end of this piece on Carson's rough start on the Tonight Show (Cavett got his start as a writer for the show) that caught my eye.

From Tonight, Tonight, Its World Is Full of Blight
By DICK CAVETT MARCH 29, 2013 9:00 PM

If my friend Dave Letterman should decide next contract time that he’s sat through one too many starlet guests who come on to plug their movies, exhibit seemingly a yard of bare gam, pepper their speech with “likes” and “I’m like” and “awesome” and “oh, wow” and “amazing,” and list at least seven things they are “excited” about despite the evidence, from who knows what cause, of their half-mast eyelids, I’ll regret his going.

And speaking of Dave’s presumably stepping aside some sad day, if CBS is smart, there is in full view a self-evident successor to The Big L. of Indiana.

The man I’m thinking of has pulled off a miraculous, sustained feat, against all predictions — descendants of those same wise heads who foresaw a truncated run for the Carson boy — of making a smashing success while conducting his show for years with a dual personality. And I don’t mean Rush Limbaugh (success without personality).

I can testify, as can anyone who’s met him and seen him as himself, how much more there is to Stephen Colbert than the genius job he does in his “role” on “The Colbert Report.” Everything about him — as himself — qualifies him for that chair at the Ed Sullivan Theater that Letterman has so deftly and expertly warmed for so long. Colbert is, among other virtues, endowed with a first-rate mind, a great ad-lib wit, skilled comic movement and gesture, fine education, seemingly unlimited knowledge of affairs and events and, from delightful occasional evidence, those things called The Liberal Arts — I’ll bet you he could name the author of “Peregrine Pickle.” And on top of that largess of qualities (and I hope he won’t take me the wrong way here), good looks.

Should such a day come, don’t blow it, CBS.

"Velocitas Eradico," Nazi mad scientists and other cool things a little research would have uncovered

I know I'm being picky...

This article by Sam Biddle on the testing of a new weaponized railgun isn't bad. It does a good job explaining the physics and keeping the gee-whiz factor in check while acknowledging the genuinely exciting potential of the technology.

But I do have a complaint and it's one of the few times I'd actually favor a bit more gee-whiz. My problem with the article is that it is very much the story of one specific development -- albeit a cool and possibly important one -- and it doesn't show much interest in the history of the technology or its larger potential.

Here are some of the highlights from the Wikipedia entry:

[The Navy] gave the project the Latin motto "Velocitas Eradico", Latin for "I, [who am] speed, eradicate".

...

In 1944, during World War II, Joachim Hänsler of Germany's Ordnance Office built the first working railgun, and an electric anti-aircraft gun was proposed. By late 1944 enough theory had been worked out to allow the Luftwaffe's Flak Command to issue a specification, which demanded a muzzle velocity of 2,000 m/s (6,600 ft/s) and a projectile containing 0.5 kg (1.1 lb) of explosive. The guns were to be mounted in batteries of six firing twelve rounds per minute, and it was to fit existing 12.8 cm FlaK 40 mounts. It was never built. When details were discovered after the war it aroused much interest and a more detailed study was done, culminating with a 1947 report which concluded that it was theoretically feasible, but that each gun would need enough power to illuminate half of Chicago.

...

In 2003, Ian McNab outlined a plan to turn this idea into a realized technology. The accelerations involved are significantly stronger than human beings can handle. This system would be used only to launch sturdy materials, such as food, water, and fuel. Note that escape velocity under ideal circumstances (equator, mountain, heading east) is 10.735 km/s. The system would cost $528/kg, compared with $20,000/kg on the space shuttle.

I realize I complain about tech reporters getting carried away, but when something's this cool, it's OK to get a little more excited.

Sneetch class

James Kwak has a sharp and funny post on the economics and branding of first-class air travel.

Ultimately, what Suites Class is selling, along with every other “luxury” first-class cabin in the air, is a feeling of distinction. Air travel is a miserable experience for everyone involved, mitigated only by the immense convenience of being able to show up in another part of the world in a matter of hours. The glamour of high-end air travel, as with any other luxury good, is a function of exclusivity. Now that, thanks to Southwest Airlines, most people can afford to get on a plane, there have to be ways to pay more and more to get on the same plane. To get people to pay more, you have to give them something: an emotion, a brand, a story, something. And that’s why we have Suites Class.

Thursday, October 16, 2014

Education reform thoughts

This is Joseph.

I sent this article to Mark, but thought the blog might appreciate a few comments on it as well. In particular, I was struck by the following passage:

We could double teachers' salaries. I'm not joking about that. The standard way that you make a profession a prestigious, desirable profession, is you pay people enough to make it attractive. The fact that that doesn't even enter the conversation tells you something about what's wrong with the conversation around these topics. I could see an argument that says it's just not worth it, that it would cost too much. The fact that nobody even asks the question tells me that people are only willing to consider cheap solutions. They're looking for easy answers, not hard answers.

In a really important way, this is the most compelling counter-point to the cries of urgency that accompany reform. It's not that salary doubling is necessarily the solution, but that reformers appear to want cost-neutral or cost-saving improvements in quality.

Now it is possible that such options exist. It seems difficult to imagine extremely large effect sizes, as international comparisons don't seem to find obvious (cost neutral) options. Do note that a longer school day without increasing salaries is an effective pay cut.

Now, if there really is a crisis, why isn't the "pay more" option next to ideas like removing tenure? Is that not the way we often respond to crises (think of the World Wars)? Would increasing taxes to double salaries make it easier to remove (for example) tenure?

Who knows. But the Overton window here is pretty revealing.

Non-compete agreements

This is Joseph.

Jimmy John's has a noncompete agreement. Kevin Drum wonders what the point is, given how unlikely it is to be enforced. Alison Griswold notes that:

That said, an unenforceable clause is still problematic if it's scaring employees who don't know any better into thinking they can't work at another sandwich shop—or another restaurant of any sort with a trade in sandwiches—for the next two years.

I don't even think that "scaring" is the right word. Imagine the tight budgets somebody who works in fast food preparation likely has. Yes, there will be exceptions. But is it obvious that there will be lawsuits, even in a place like California?

Do you know what no company wants to do with the new sandwich person? Litigate to keep them. Nor does an employee want to spend time and money in court to defend themselves if the court should grant an injunction while they hear the case. Courts are slow, sessions happen during paid workdays, and it's not at all clear that anybody wants to deal with that while no longer making money.

So I think the chilling effect might be more than one realizes.

Wednesday, October 15, 2014

Assuming I didn't lose you at "TED Talk"

I need to do more research before I wade into this (or convince Joseph to do it for me), but even with the 10 to 50 year wiggle room, talk of having absolutely total confidence makes me nervous.

[GUY] RAZ: Which they did, an amazing scientific feat. They mapped the code that makes up all human DNA. Now they're still trying to figure out what it means, but they already know what it could mean for the future.

(SOUNDBITE OF TED TALK)

RESNICK: The world has completely changed and none of you know about it.

RAZ: So how is it going to change the world?

RESNICK: In a bunch of ways. The good news is it's going to help us immensely in treating cancer 'cause cancer is nothing more than a disease of the genome. It's a disease where one cell has certain changes, which cause it to get a little bit worse and then it reproduces. And by the time you've got a solid tumor, you've got this really heterogeneous population of cancerous cells. And if you sequence their genomes, they're a mess. And so right now, prior to genome sequencing, we're taking wild guesses at what the molecular basis of one's cancer is. And now going forward, what we're going to do is say, forget all of that, what is happening at the molecular level because this drug can target only those cancers that have the BRAF mutation, as an example.

RAZ: So where is it headed? What can you imagine in 10 or 20 years or beyond?

RESNICK: I think we will cure cancer. Genomics and sequencing at large will ultimately cure cancer. Whether that happens in 10 years or 50 years or more is difficult to say.

RAZ: That's incredible. I mean, you can say that with total confidence?

RESNICK: Absolutely. At some point, we'll snuff it out. I mean, people will still develop cancer, certainly, unless we get into genetic engineering of humans, which is something we ought to talk about, but it will be curable.

Two Quotes

From Salon recently:

“It’s not really about asking for a raise, but knowing and having faith that the system will give you the right raise,” [Microsoft CEO Satya] Nadella said in conversation with Dr. Maria Klawe, a member of the Microsoft Board, Harvey Mudd College president and computer scientist.

“That might be one of the initial ‘super powers,’ that quite frankly, women (who) don’t ask for a raise have,” he stated to Klawe. “It’s good karma. It will come back.”

And from Marketplace last year:

Sarah Lacy, founder of tech news site Pando Daily* ... said the BART strike exacerbated what she sees as a philosophical divide in the Bay Area. “People in the tech industry feel like life is a meritocracy. You work really hard, you build something and you create something, which is sort of directly opposite to unions.”

Both the tech and financial sectors have embraced the idea that economic rewards are directly correlated to work and worth. It's a strange mixture of the efficient market hypothesis and social Darwinism, often with more than a bit of Randianism. I suspect that Nadella and Lacy have so internalized this worldview that they no longer have any idea how they sound to the general public.

* To those of you following the pension scandals: yeah, that Pando.

Tuesday, October 14, 2014

Effect sizes: an often overlooked issue

This is a post by Joseph

Brad DeLong makes an argument that fits very well with a long-running discussion that Mark and I have had. Just because a relation is known to exist doesn't mean that its effect size can be ignored. So, the existence of the Laffer curve is pretty much certain, but the exact point where the curve shifts from more revenue to less revenue is very, very important.

Brad DeLong compares current arguments for infrastructure to the Laffer curve:

In a world where the real rate at which the U.S. Treasury can borrow for ten years is 0.3%/year and in which the tax rate t is about 30%, infrastructure investment fails to be self-financing only when the comprehensive rate of return is less than 1%/year.

Now you can make that argument that properly-understood the comprehensive rate of return is less than 1%/year. Indeed, Ludger Schuknecht made such arguments last Saturday. He did so eloquently and thoughtfully in the deep windowless basements of the Marriott Marquis Hotel in Washington DC at a panel I was on.

But Mankiw doesn’t make that argument.

And because he doesn’t, he doesn’t let his readers see that there is a huge and asymmetric difference between:

my argument that tax-rate cuts are not (usually) self financing, which at a tax rate t=30% requires only that α < 2.33; and:

his argument that infrastructure investment is not self-financing, which at a tax rate t=30% requires that ρ < 1%/year.

To argue that α < 2.33 is very easy. To argue that ρ < 1%/year is very hard. So how does Mankiw pretend to his readers that the two arguments are equivalent? By offering his readers no numbers at all.
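The two thresholds in the quoted passage can be reproduced with back-of-the-envelope arithmetic. The sketch below is one reading of the numbers, not DeLong's own derivation. It assumes that infrastructure pays for itself when the extra tax revenue t·ρ covers the real borrowing cost r, and that α is the elasticity of the tax base with respect to the after-tax share (1 − t), so that revenue t·(1 − t)^α still rises with t whenever α < (1 − t)/t. The function names are made up for illustration.

```python
def breakeven_return(real_borrowing_rate: float, tax_rate: float) -> float:
    """Smallest rate of return rho at which t * rho covers the borrowing cost r."""
    return real_borrowing_rate / tax_rate


def laffer_elasticity_threshold(tax_rate: float) -> float:
    """Elasticity alpha above which revenue t * (1 - t)**alpha falls as t rises.

    Assumed model, not taken from the quoted passage.
    """
    return (1 - tax_rate) / tax_rate


t = 0.30    # tax rate from the quoted passage
r = 0.003   # 0.3%/year real ten-year borrowing rate from the quoted passage

rho_star = breakeven_return(r, t)
alpha_star = laffer_elasticity_threshold(t)

print(f"break-even rho = {rho_star:.2%} per year")
print(f"alpha threshold = {alpha_star:.2f}")
```

Under these assumptions the asymmetry is easy to see: the infrastructure skeptic has to defend a return below about 1% a year, while the tax-cut skeptic only has to defend an elasticity below about 2.33.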

This principle is broadly applicable to all sorts of arguments that come up on this blog. For example, getting rid of a marginal bad teacher is probably efficient. But constantly churning teachers might shift the efficiency function to a different place on the curve.

So realistic estimates of parameters are critical, but they can also be hard to come by. How do you really tell the comprehensive rate of return of infrastructure? Is it different in Detroit versus San Francisco? Can it be reliably estimated in advance or only known historically?

But it does lead to better arguments when transparent estimates (that can be discussed or tested) are placed out where they can be evaluated.

Selection on Spinach*

[I have the nagging feeling that I'm not using the proper terminology with the following but the underlying concepts should be clear enough. At least for a blog post.]

Let's talk about three levels of selection effects:

The first is initial selection. At this level, certain traits of potential subjects influence the likelihood of their being included in the study. If you ask for volunteers in person, you will end up underrepresenting shy people. If you use mail surveys, you will underrepresent the homeless.

The second level comes after a study starts. You will frequently lose subjects over time. This type of selection is particularly dangerous because you cannot assume that the likelihood of dropping out is independent of the target variable. The issue comes up all the time in medical studies. For serious conditions, a turn for the worse can make it extremely difficult to continue treatment. The result is that the people who stick around till the end of the study are far more likely to be those who were getting better.

(Up until now, the types of selection bias we have discussed, though potentially serious, are generally not deliberate. Their consequences are unpredictable and they happen to even the best and most conscientious of researchers. That is no longer the case with level three.)

The third level concerns attempts to manipulate attrition so as to affect the results of a study. In these cases, researchers will attempt to get rid of those subjects who are likely to drag down the average. This is blatant data cooking and it can be remarkably effective. In school administration, the term of art is "counseling out." It is shockingly widespread, particularly among the "no excuses" charter schools.

The effect of this practice on kids can be brutal but that is a topic for another post. What interests us here are the statistical concerns; what are the analytic implications of this policy? In terms of direction, the answer is simple: schools that engage in these policies will see their test scores artificially inflated. In terms of magnitude, there is really no telling. The potential for distortion here is huge, particularly when you take into account the possibility of peer effects.

Put bluntly, in cases like this one ("The first Success graduating class, for example, had just 32 students. When they started first grade in August 2006, those pupils were among 73 enrolled at the school"), data showing above-average results are almost meaningless.
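For a sense of how large the distortion can get, here is a minimal sketch. Only the two head counts (73 and 32) come from the quoted passage. The scores are invented, and the example makes the deliberately extreme assumption that the students who leave are exactly the lowest scorers and that nobody who stays improves at all.

```python
from statistics import mean

ENROLLED = 73    # first-grade enrollment, from the quoted passage
GRADUATED = 32   # graduating class size, from the quoted passage

# Invented scores: one student at each integer score from 28 to 100.
scores = list(range(28, 28 + ENROLLED))

# Extreme assumption for illustration: attrition removes the lowest scorers,
# and the remaining students' scores do not change.
survivors = sorted(scores)[-GRADUATED:]

print(f"mean of all {ENROLLED} starters: {mean(scores):.1f}")
print(f"mean of the {GRADUATED} who remain: {mean(survivors):.1f}")
```

In this toy setup the class average jumps by more than twenty points without a single student learning anything, which is why results from a cohort that shrank this much say very little on their own.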

[A few weeks ago, I put out a collection of our early posts on education (Things I Saw at the Counter-Reformation). The impact of attrition is one of the big running themes.]

*Spinach being, in this case, a substance that greatly increases the power of a given effect.

Monday, October 13, 2014

XKCD -- write your own damned post

I've got at least two pieces I'd like to write around this: one discussing the way we approach AI research (and the innate limitations in that favored approach); the other a rant about how ddulite journalists fail to catch the important subtleties in technology.

I'm sure there are more angles here so I'll throw this one out to the room. What are the examples of a slight change taking a problem from easy to nearly impossible?

Friday, October 10, 2014

Checking in with Cracked.com -- the website that's better than it has any right to be

Even more than Mental Floss, Cracked.com has taken the worst genre in journalism (the unfortunately named listicle) and made it something entertaining, informative and intelligent. I don't drop by that often because it's such a time sink, but when I do I always come away with something worth sharing.

For instance, 5 Dirty Tricks Apple Uses to Get You to Buy a New iPhone opens with this nice example of a deceptive graphic:

The problem is that the old version (on the left) is misleadingly shot in a different light: it doesn't have any shadowed black edge and is a completely silver shade, whereas the iPhone 6 and 6 Plus are cleverly shaded at the sides to make them appear skinnier than they actually are. Here's a handy GIF to show what we mean:

I'm not crazy about the animation, but still.

The article goes where so few technology writers dare and actually discusses the functionality from a common sense perspective.

Think about what you do with your phone -- send texts, make calls, check social media, play terrible games, and send immediately regrettable photographs to people you just met. Unless you're a professional photographer, you're not going to care about how much the camera has improved on the iPhone 6 (and if you are a professional photographer, you probably take pictures on something better than a goddamn iPhone). And for those of you who game -- nothing playable on the iPhone really needs a huge upgrade in power. Just look what happened when they tried to sell Angry Birds on actual gaming systems. So what do we need the better specs for? To have more apps? Not according to the hard numbers.

In 6 BS News Stories That Went Viral: The Girl With Three Boobs, they gleefully point out how gullible journalists can be when there's a deadline.

That's Telegraph, The Hollywood Reporter, E! Online, Huffington Post, and International Business Times reminding us that, like the ocean, the Internet is a vast chilly abyss that cradles unspeakable wonder as well as waking nightmares. We'll leave you to decide which category triple boobs fall under, because we honestly have no idea.

For those of you wondering if this means Martian mind-vacations are just around the corner, it shockingly turns out there are a few things off about this story. Like the fact that the woman has refused to name any of the doctors involved, won't show her new gift to the world for more than a quick few seconds up close, or that she once filed a missing baggage claim listing "3 breast prosthesis" as one of the stolen items. Also relevant? She once apparently described herself as a "provider of Internet hoaxes since 2014."

4 Reasons Movie Special FX Are Actually Getting Worse has an excellent discussion of the paradoxical economics of CGI:

It turns out that making the most visually spectacular images that the human brain can comprehend requires a good bit of scratch. That's why huge-budget blockbusters have been becoming the norm (33 of the 50 most expensive movies of all time have been made in the last four years); studios are so preoccupied pouring hundreds of millions of dollars into CGI for schlock like Battleship because they could, that they didn't bother to stop to think if they should.

And, as CGI continues to improve, movies only become more reliant on it. We've mentioned before how Rhythm & Hues, the visual effects company most famous for bringing to life all the Oscar-winning, pants-shitting fear of sharing a Tunnel of Love rowboat with a 400-pound marvel of evisceration and death in Life of Pi, went bankrupt because they did their job too well.

Meanwhile, the studios are pumping more and more money into already-bloated special effects budgets (it sure as shit isn't going toward better screenplays). For Transformers: Thing of Whatever, Industrial Light & Magic spent about 15 weeks per Transformer just getting the basic model ready, and each model has about 10,000 parts -- that's not a joke, that's seriously how many individual pieces there are in Michael Bay's idea of a talking truck. The company had to start making models six months before filming even started, just to meet the production schedule. And remember, ILM is like the GE of special effects studios, so if they're balls-to-the-wall to make their effects look good in a profitable fashion, what chance does a scrappy, upstart VFX company have?

Finally, 3 Artists Who Got Screwed for Creating Iconic Characters is a perfect complement to the Kirby thread, reminding us that, like many industries based on creativity, little of the money from comics goes to those who do the actual creating.

Thursday, October 9, 2014

Step-back SAT/GRE problems -- trying something new at "You Do the Math"

I've been thinking about the problem of adapting lessons for different media in general and for video in particular. There is a popular but wildly misguided impression that you can create an effective video by just sticking a camera in front of a live presentation. Teaching live is an interactive process. Even when the students don't say a word, the good teacher is alert to the class's reactions. You speed up, slow down, offer words of encouragement, come up with new examples and occasionally stop what you're doing and go back and reteach a previous section.

With a video lesson you set the course then you leave the room. What's worse, it's a really big room and many if not most of the kids are there because the standard methods of instruction have not served them well.

One idea I'm playing with is thinking of the problems in terms of a graph (as in graph theory, not data visualization) where the path is determined by how well the student is doing. As a start in that direction I'm playing around with paired problems -- if you are confused by the first (more difficult) problem, there is an easier one to try -- and I've got the first couple up at the teaching blog.
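As a rough illustration of the graph idea (a hypothetical sketch, not how the lessons at the teaching blog are actually built), each problem can be a node with one edge to follow after a correct answer and another, the step-back, after a miss. The problem names, the `Problem` class and the `walk` function are all made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Problem:
    """A node in a hypothetical lesson graph."""
    name: str
    on_correct: Optional[str] = None    # next node after a right answer
    on_incorrect: Optional[str] = None  # easier "step-back" node after a miss


# Made-up problem names; only the paired hard/easy idea comes from the post.
LESSON = {
    "circle_medium": Problem("circle_medium", on_correct="circle_hard",
                             on_incorrect="circle_easy"),
    "circle_easy": Problem("circle_easy", on_correct="circle_medium"),
    "circle_hard": Problem("circle_hard"),
}


def walk(start: str, answers: list[bool]) -> list[str]:
    """Return the problems visited, in order, for a list of right/wrong answers."""
    path = [start]
    current = LESSON[start]
    for was_correct in answers:
        next_name = current.on_correct if was_correct else current.on_incorrect
        if next_name is None:  # no edge to follow, so the walk ends here
            break
        path.append(next_name)
        current = LESSON[next_name]
    return path


# A student who misses the medium problem, then gets the next two right.
print(walk("circle_medium", [False, True, True]))
```

Keeping the edges as plain names in a dictionary makes it easy to add more step-back levels later without changing the traversal.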

Here's the medium problem:

Circle 1

The radius of circle 1 is 5. Both line segments pass through the center of the circle. Find the area of the shaded region.

You can find the answer and explanation at You Do the Math. Feedback is always appreciated.

The New York Times' regularly scheduled sackcloth and ashes show

From Talking Points Memo:

When New York Times columnist David Brooks revealed last month that his son is serving in the Israeli military, plenty of questions followed: Should Brooks have been more open about that fact? Should it preclude him from writing about Israel? Is it any different from a columnist with a child serving in the U.S. military?

We learned Wednesday that the revelation has even brought about a minor disagreement between two Times editors.

The paper's public editor Margaret Sullivan wrote Wednesday that while she "strongly" disagrees with the suggestion that Brooks "should no longer write about Israel," she also believes that "a one-time acknowledgement of this situation in print (not in an interview with another publication) is completely reasonable."

"This information is germane; and readers deserve to learn about it in the same place that his columns appear," Sullivan wrote.

That's not how Times editorial page editor Andrew Rosenthal sees it though. Rosenthal told Sullivan that the columnist shouldn't have been required to note that his 23-year-old son enlisted in the Israel Defense Forces.

"I do not think he ever had an obligation to say that his son made this choice, any more than if his son had joined the U.S. Air Force (although I recognize that Israel is more controversial in some people’s minds)," Rosenthal said.

Just to be clear, we're talking about David Brooks. You know the guy: quotes discredited studies, makes stuff up. Over the years, he has given critics a steady stream of material, truly unambiguous examples of factual mistakes and substantial omissions in service of the narrative of the moment. His editors have been remarkably quiet on these errors (which is about par for the NYT course).

The New York Times does frequently engage in very public displays of repentance and self-examination. They admit to professional and ethical lapses. They debate in very serious tones the finer points of journalistic conduct. Almost invariably, however, they pick the most minor of lapses to focus on. It is almost as if they wanted to appear conscientious about their profession without actually doing the hard work or accepting the consequences.