Tuesday, October 14, 2014

Selection on Spinach*






[I have the nagging feeling that I'm not using the proper terminology with the following but the underlying concepts should be clear enough. At least for a blog post.]

Let's talk about three levels of selection effects:

The first is initial selection. At this level, certain traits of potential subjects influence the likelihood of their being included in the study. If you ask for volunteers in person, you will end up underrepresenting shy people. If you use mail surveys, you will underrepresent the homeless.

The second level comes after a study starts. You will frequently lose subjects over time. This type of selection is particularly dangerous because you cannot assume that the likelihood of dropping out is independent of the target variable. The issue comes up all the time in medical studies. For serious conditions, a turn for the worse can make it extremely difficult to continue treatment. The result is that the people who stick around till the end of the study are far more likely to be those who were getting better.

(Up until now, the types of selection bias we have discussed, though potentially serious, are generally not deliberate. Their consequences are unpredictable and they happen to even the best and most conscientious of researchers. That is no longer the case with level three.)

The third level concerns attempts to manipulate attrition so as to affect the results of a study. In these cases, researchers will attempt to get rid of those subjects who are likely to drag down the average. This is blatant data cooking and it can be remarkably effective. In school administration, the term of art is "counseling out." It is shockingly widespread, particularly among the "no excuses" charter schools.

The effect of this practice on kids can be brutal but that is a topic for another post. What interests us here are the statistical concerns; what are the analytic implications of this policy? In terms of direction, the answer is simple: schools that engage in these policies will see their test scores artificially inflated. In terms of magnitude, there is really no telling. The potential for distortion here is huge, particularly when you take into account the possibility of peer effects.

Put bluntly, in cases like this ("The first Success graduating class, for example, had just 32 students. When they started first grade in August 2006, those pupils were among 73 enrolled at the school"), data showing above-average results are almost meaningless.
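To get a rough feel for how much selective attrition can move an average, here is a minimal sketch in Python. Only the 73 and the 32 come from the quote; the score distribution is made up, and the "lowest scorers leave" rule is a deliberate worst-case assumption, since nothing in the quote says which students left.

```python
import random
from statistics import mean

random.seed(2014)  # fixed seed so the sketch is repeatable

# Hypothetical test scores for a starting cohort of 73 (mean 50, sd 10).
cohort = [random.gauss(50, 10) for _ in range(73)]

# Worst-case assumption: the 41 lowest scorers are the ones who leave,
# so the 32 who remain are the 32 highest scorers.
survivors_selective = sorted(cohort)[-32:]

# Comparison case: 41 students leave for reasons unrelated to their scores.
survivors_random = random.sample(cohort, 32)

print(f"all 73 starters:           {mean(cohort):.1f}")
print(f"32 left, lowest dropped:   {mean(survivors_selective):.1f}")
print(f"32 left, random attrition: {mean(survivors_random):.1f}")
```

The selective average has to come out above the starting average even though no individual score changed. Real attrition is presumably somewhere between the two cases, and this toy version ignores peer effects entirely.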

[A few weeks ago, I put out a collection of our early posts on education (Things I Saw at the Counter-Reformation).  The impact of attrition is one of the big running themes.]



*Spinach being, in this case, a substance that greatly increases the power of a given effect.

Monday, October 13, 2014

XKCD -- write your own damned post

I've got at least two pieces I'd like to write around this: one discussing the way we approach AI research (and the innate limitations in that favored approach); the other a rant about how ddulite journalists fail to catch the important subtleties in technology.

I'm sure there are more angles here so I'll throw this one out to the room. What are the examples of a slight change taking a problem from easy to nearly impossible?








Friday, October 10, 2014

Checking in with Cracked.com -- the website that's better than it has any right to be

Even more than Mental Floss, Cracked.com has taken the worst genre in journalism (the unfortunately named listicle) and made it something entertaining, informative and intelligent. I don't drop by that often because it's such a time sink, but when I do I always come away with something worth sharing.

For instance, 5 Dirty Tricks Apple Uses to Get You to Buy a New iPhone opens with this nice example of a deceptive graphic:


The problem is that the old version (on the left) is misleadingly shot in a different light: it doesn't have any shadowed black edge and is a completely silver shade, whereas the iPhone 6 and 6 Plus are cleverly shaded at the sides to make them appear skinnier than they actually are. Here's a handy GIF to show what we mean:


I'm not crazy about the animation, but still.

The article goes where so few technology writers dare and actually discusses the functionality from a common sense perspective.
Think about what you do with your phone -- send texts, make calls, check social media, play terrible games, and send immediately regrettable photographs to people you just met. Unless you're a professional photographer, you're not going to care about how much the camera has improved on the iPhone 6 (and if you are a professional photographer, you probably take pictures on something better than a goddamn iPhone). And for those of you who game -- nothing playable on the iPhone really needs a huge upgrade in power. Just look what happened when they tried to sell Angry Birds on actual gaming systems. So what do we need the better specs for? To have more apps? Not according to the hard numbers.





In 6 BS News Stories That Went Viral: The Girl With Three Boobs, they gleefully point out how gullible journalists can be when there's a deadline.
That's Telegraph, The Hollywood Reporter, E! Online, Huffington Post, and International Business Times reminding us that, like the ocean, the Internet is a vast chilly abyss that cradles unspeakable wonder as well as waking nightmares. We'll leave you to decide which category triple boobs fall under, because we honestly have no idea.

For those of you wondering if this means Martian mind-vacations are just around the corner, it shockingly turns out there are a few things off about this story. Like the fact that the woman has refused to name any of the doctors involved, won't show her new gift to the world for more than a quick few seconds up close, or that she once filed a missing baggage claim listing "3 breast prosthesis" as one of the stolen items. Also relevant? She once apparently described herself as a "provider of Internet hoaxes since 2014."



4 Reasons Movie Special FX Are Actually Getting Worse has an excellent discussion of the paradoxical economics of CGI:

It turns out that making the most visually spectacular images that the human brain can comprehend requires a good bit of scratch. That's why huge-budget blockbusters have been becoming the norm (33 of the 50 most expensive movies of all time have been made in the last four years); studios are so preoccupied pouring hundreds of millions of dollars into CGI for schlock like Battleship because they could, that they didn't bother to stop to think if they should.

And, as CGI continues to improve, movies only become more reliant on it. We've mentioned before how Rhythm & Hues, the visual effects company most famous for bringing to life all the Oscar-winning, pants-shitting fear of sharing a Tunnel of Love rowboat with a 400-pound marvel of evisceration and death in Life of Pi, went bankrupt because they did their job too well.

Meanwhile, the studios are pumping more and more money into already-bloated special effects budgets (it sure as shit isn't going toward better screenplays). For Transformers: Thing of Whatever, Industrial Light & Magic spent about 15 weeks per Transformer just getting the basic model ready, and each model has about 10,000 parts -- that's not a joke, that's seriously how many individual pieces there are in Michael Bay's idea of a talking truck. The company had to start making models six months before filming even started, just to meet the production schedule. And remember, ILM is like the GE of special effects studios, so if they're balls-to-the-wall to make their effects look good in a profitable fashion, what chance does a scrappy, upstart VFX company have?


Finally, 3 Artists Who Got Screwed for Creating Iconic Characters is a perfect complement to the Kirby thread, reminding us that, like many industries based on creativity, little of the money from comics goes to those who do the actual creating.

Thursday, October 9, 2014

Step-back SAT/GRE problems -- trying something new at "You Do the Math"

I've been thinking about the problem of adapting lessons for different media in general and for video in particular. There is a popular but wildly misguided impression that you can create an effective video by just sticking a camera in front of a live presentation. Teaching live is an interactive process. Even when the students don't say a word, the good teacher is alert to the class's reactions. You speed up, slow down, offer words of encouragement, come up with new examples and occasionally stop what you're doing and go back and reteach a previous section.

With a video lesson you set the course then you leave the room. What's worse, it's a really big room and many if not most of the kids are there because the standard methods of instruction have not served them well.

One idea I'm playing with is thinking of the problems in terms of a graph (as in graph theory, not data visualization) where the path is determined by how well the student is doing. As a start in that direction I'm playing around with paired problems -- if you are confused by the first (more difficult) problem there's an easier one to try -- and I've got the first couple up at the teaching blog.

Here's the medium problem:

Circle 1


The radius of circle 1 is 5. Both line segments pass through the center of the circle. Find the area of the shaded region.


You can find the answer and explanation at You Do the Math. Feedback is always appreciated.







The New York Times' regularly scheduled sackcloth and ashes show

From Talking Points Memo:
When New York Times columnist David Brooks revealed last month that his son is serving in the Israeli military, plenty of questions followed: Should Brooks have been more open about that fact? Should it preclude him from writing about Israel? Is it any different from a columnist with a child serving in the U.S. military?

We learned Wednesday that the revelation has even brought about a minor disagreement between two Times editors.

The paper's public editor Margaret Sullivan wrote Wednesday that while she "strongly" disagrees with the suggestion that Brooks "should no longer write about Israel," she also believes that "a one-time acknowledgement of this situation in print (not in an interview with another publication) is completely reasonable."

"This information is germane; and readers deserve to learn about it in the same place that his columns appear," Sullivan wrote.

That's not how Times editorial page editor Andrew Rosenthal sees it though. Rosenthal told Sullivan that the columnist shouldn't have been required to note that his 23-year-old son enlisted in the Israel Defense Forces.

"I do not think he ever had an obligation to say that his son made this choice, any more than if his son had joined the U.S. Air Force (although I recognize that Israel is more controversial in some people’s minds)," Rosenthal said.
Just to be clear, we're talking about David Brooks. You know the guy: quotes discredited studies, makes stuff up. Over the years, he has given critics a steady stream of material, truly unambiguous examples of factual mistakes and substantial omissions in service of the narrative of the moment. His editors have been remarkably quiet on these errors (which is about par for the NYT course).

The New York Times does frequently engage in very public displays of repentance and self-examination. They admit to professional and ethical lapses. They debate in very serious tones the finer points of journalistic conduct.  Almost invariably, however, they pick the most minor of lapses to focus on. It is almost as if they wanted to appear conscientious about their profession without actually doing the hard work or accepting the consequences.

Wednesday, October 8, 2014

XKCD Marriage





Lots of interesting implications here, but they'll have to wait till later.

“I can no longer accept cash in bags in a Pizza Hut parking lot” -- time to add Pennsylvania to the list

In an article entitled READING, WRITING, RANSACKING, Charles P. Pierce makes me think that I haven't been spending nearly enough time looking at education reform in the Keystone State. The quote from the title comes from Pierce's account of the federal investigation of former Pennsylvania Cyber Charter School leader Nick Trombetta:
The bags of cash, a private plane bought by Avanti but used mostly by Trombetta, a Florida vacation home and a home in Mingo Junction, Ohio, for Trombetta’s former girlfriend all were described as perks enjoyed by Trombetta as part of a scheme to siphon money from taxpayers’ funds sent to PA Cyber for more than four years.
The case is actually small time compared to the other scandals going on in the state, but you have to admit it's a great quote.

A bigger and much more familiar scandal is the lack of accountability:
For reasons that aren't clear, millions of dollars have moved between the network of charter schools, their parent nonprofit and two property-management entities. The School District is charged with overseeing city charters, but "does not have the power or access to the financial records of the parent organization," according to District spokesperson Fernando Gallard. "We cannot conduct even limited financial audits of the parent organization." That's despite the fact that charters account for 30 percent of the District's 2013-'14 budget. Aspira declined to comment. The $3.3 million that the four brick-and-mortar charters apparently have loaned to Aspira are in addition to $1.5 million in lease payments to Aspira and Aspira-controlled property-management entities ACE and ACE/Dougherty, and $6.3 million in administrative fees paid to Aspira in 2012. 
Add to that some extraordinarily nasty state politics involving approval-challenged Pennsylvania governor Tom Corbett, the state-run Philadelphia School Reform Commission (which has a history of making teachers' lives difficult basically for the fun of it) and a rather suspicious poll:
"With Governor Corbett's weak job approval, re-elect and ballot support numbers, the current Philadelphia school crisis presents an opportunity for the Governor to wedge the electorate on an issue that is favorable to him," the poll concludes. "Staging this battle presents Corbett with an opportunity to coalesce his base, focus on a key emerging issue in the state, and campaign against an 'enemy' that's going to aggressively oppose him in '14 in any case."
I don't know enough about Pennsylvania politics to competently summarize this, let alone intelligently comment on it, but it's difficult to imagine an interpretation that makes things look good.

Tuesday, October 7, 2014

Now they've got me defending the efficient market theorem...

I know it's trivial, but this one has always annoyed me.


There are cases where the conventional wisdom is so screwed up that the market reads bad news as good news and rewards stupidity, but otherwise, in a reasonably efficient market, stocks only go up when bad news beats expectations if they had already gone down as the expectations had rolled in. They are, in other words, making up some of the lost ground. Financial reporters love the "went up on bad news" story but they almost invariably fail to mention how the stock had been doing before.

Don't get me wrong. I'm still not a fan of the EMT, but on this one, at least, I'm willing to give them a pass.


Monday, October 6, 2014

I'm going to let someone else bitch about the New York Times for a while

Besides, when it comes to take-downs of bad financial journalism, there's no one sharper than Felix Salmon.

In "Annals of NYT innumeracy, Bank Rossiya edition," Salmon takes apart a recent article entitled “It Pays to be Putin’s Friend.” No doubt the basic premise is true, but the examples described by the NYT don't support the point at all. Salmon points out lots of sloppiness in the piece but this is arguably the money shot.
So [Sergei P.] Roldugin took out a loan, of unknown size, to buy a stake of 3.2% in Bank Rossiya. How on earth does that make him worth anywhere near $350 million?

And here the light slowly dawns -- the NYT has taken the sum total of Bank Rossiya’s assets, and used that number as the value of the bank itself. ($350 million, you see, is 3.2% of $11 billion.)

Of course you can’t value a bank by just looking at its assets, you first need to subtract its liabilities. The NYT story leads with “State corporations, local governments and even the Black Sea Fleet in Crimea” moving their bank accounts to Bank Rossiya — all of those deposits are liabilities of the bank, which need to be subtracted from its assets before you can even begin to arrive at an overall valuation for the bank itself. Just looking at the assets, without looking at the liabilities, is a bit like scoring a sports game by looking only at the points scored by one team.

Probably, most of the value in Bank Rossiya is to be found in the commodity and media assets which it seems to have been able to acquire on the cheap. (The bank itself, qua bank, might well be worth nothing at all.) And no one’s going to find out the true value of those assets by looking at the official size of Bank Rossiya’s balance sheet. It seems to me, indeed, that Bank Rossiya is in large part being used as a holding company, a reasonably safe place where Vladimir Putin’s billionaire friends can keep some of the valuable assets they’ve managed to acquire over the years. I’m just guessing here, but I doubt they have any particular desire to share 3.2% of those assets with some random cellist [Roldugin]. To simply take the official size of Rossiya’s balance sheet, and declare it to be the value of the bank: that’s just bonkers.
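
Salmon's arithmetic is easy to check. In the rough sketch below, the $11 billion in assets and the 3.2% stake come from the passage above; the liabilities figure is purely hypothetical (the passage doesn't give one) and is only there to show how much the answer depends on it.

```python
# Figures taken from the quoted passage.
total_assets = 11_000_000_000  # Bank Rossiya's balance sheet, in dollars
stake = 0.032                  # Roldugin's reported share of the bank

# Hypothetical number, for illustration only: the passage gives no
# liabilities figure, and the real one may be very different.
assumed_liabilities = 10_500_000_000

# What you get if you treat total assets as the value of the bank.
naive_value = stake * total_assets

# What you get if you subtract liabilities first.
equity = total_assets - assumed_liabilities
equity_value = stake * equity

print(f"stake valued on assets alone:  ${naive_value:,.0f}")
print(f"stake valued on assumed equity: ${equity_value:,.0f}")
```

The first number is the roughly $350 million figure. The second can land almost anywhere depending on the liabilities you plug in, which is the point: the size of the balance sheet by itself tells you very little about what a stake is worth.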


Friday, October 3, 2014

Examining the rope -- Rotten Tomatoes edition

[You can find the origin of the metaphor here]

Our last Rotten Tomatoes post may have come out a little harsher than I intended. I probably focused too much on the specific glitch and not enough on the larger point, namely that metrics almost never entirely capture what they claim to. Identifying and fixing problems is important, but we also have to acknowledge our limitations.

If we are stuck with imperfections then we will just have to learn to live with them. A big part of that is trying to figure out when our metrics can be relied upon and when they are likely to blow up in our faces.

Let's take Rotten Tomatoes for example. In many ways, the website provides an excellent tool for quantitatively measuring the critical reaction to a movie. It is broad-based, consistent, and as objective as we can reasonably hope for.

But is it the best possible measure in all conceivable circumstances? If not, when does it break down?

When you see a 60% fresh rating, that means that 60% of the reviews examined were considered positive. You will notice that this is a binary variable. The most enthusiastic of reviews is put in the same category as the mildly favorable. The inevitable result is that sometimes a film will rank lower on this binary average than it would have on a straight average of star rankings.

Just to be clear, there are some definite advantages to this yes/no approach. As anyone who has dealt with satisfaction scales knows, you can get into all sorts of trouble making interval assumptions about that one through five.

 Can knowing their binary foundation help us make better use of the Rotten Tomatoes scores?

If we can make certain assumptions about the distribution of scores, we can tell a lot about which films are likely to be favored. Keep in mind that a good review counts the same as a great one. Therefore a film that is liked by everybody will do better than a film that is loved by most but leaves a few indifferent or hostile.
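
Here's a toy illustration of that point in Python. Everything in it is an assumption made for the example: the two films and their star ratings are invented, and the "three stars or better counts as positive" rule is a stand-in for however each review actually gets classified.

```python
from statistics import mean

FRESH_CUTOFF = 3.0  # assumed rule: 3+ stars (out of 5) counts as positive

def percent_positive(star_ratings):
    """Share of reviews at or above the cutoff, as a percentage."""
    positive = sum(1 for stars in star_ratings if stars >= FRESH_CUTOFF)
    return 100 * positive / len(star_ratings)

# Hypothetical film liked by everybody: ten solid but unspectacular reviews.
liked_by_all = [3.5] * 10

# Hypothetical film loved by most: eight raves, two pans.
loved_by_most = [5.0] * 8 + [2.0] * 2

for name, ratings in [("liked by all", liked_by_all),
                      ("loved by most", loved_by_most)]:
    print(f"{name}: {percent_positive(ratings):.0f}% positive, "
          f"{mean(ratings):.1f} average stars")
```

The first film gets the perfect binary score with the lower star average; the second gets the higher star average with the lower binary score. Neither number is wrong. They are answering different questions.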

Without getting into relative merits (all are great films), consider Philadelphia Story and the big three from Martin Scorsese: Taxi Driver, Raging Bull, and Goodfellas. By many measures, such as the influential Sight & Sound poll (according to Ebert "by far the most respected of the countless polls of great movies--the only one most serious movie people take seriously."), all three Scorsese pictures are among the most critically hailed movies ever. All three have very good scores on the "Tomatometer" but none has a perfect score. The same goes for films like Bonnie and Clyde, The Magnificent Ambersons, and Bicycle Thieves.

Philadelphia Story, on the other hand, is much less likely to get nominated as greatest film ever, but it is a movie that virtually everyone likes. It's an excellent film, skillfully directed, starring three of the most charming actors ever to come out of Hollywood. Not surprisingly, it has a perfect score on Rotten Tomatoes.

This is not to say that Sight & Sound is better than Rotten Tomatoes. Every scoring system is arbitrary, sometimes plays favorites and never exactly captures what we expect it to measure. The lesson here is that, if you want to use a metric in an argument, you need to know how that metric was derived and what its strengths and weaknesses are. You can't find a perfect metric but you can have a pretty good idea where the imperfections are.

Thursday, October 2, 2014

Understanding Common Core-aligned math homework

I volunteer a couple of times a week with a group that does after school tutoring for urban students in LA. My role is "math floater." I walk around the room and help the kids, and sometimes the tutors, with math problems. When the kids ask for help, it's usually just your basic math question, but when the tutors ask for help it's often less about the math and more about the unfamiliar approach the assignment takes to solving a familiar problem.

This is perhaps most exasperating for those tutors with math backgrounds. You can imagine what it must be like to have a degree in engineering and yet be stumped by an eighth-grader's pre-algebra homework. Of course, it's not the math that's throwing them; it's all the weird and arbitrary steps that have been layered onto the math.

After struggling a bit myself, I realized that the key was to approach these problems as bad translations of unknown texts. If I looked hard enough, I could usually find an antecedent, a good lesson (something I had read in Pólya or seen demonstrated by a master teacher or used with success in one of my classes) that had somehow devolved into the misshapen thing sitting in front of the student.

Recently, I ducked into the tutoring center when I wasn't scheduled to work. I just stepped in to use the bathroom but before I got across the room, I heard a couple of tutors calling my name. They were struggling with a third or fourth grade problem where the student had to perform a number of steps including filling out a three by three grid in order to find the product of two three-digit numbers. The answer kept coming out wrong and none of the tutors could figure out why since none of them were sure how the process was supposed to go.
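
For anyone who hasn't run into the grid format, here is my best reconstruction of the underlying idea as a short Python sketch. I'm assuming the standard place-value version (split each factor into hundreds, tens and ones, fill in the nine partial products, add them up); the actual worksheet may have layered other steps on top, and the numbers 123 and 456 are just examples.

```python
def place_value_parts(n):
    """Split a positive integer by place value, e.g. 123 -> [100, 20, 3]."""
    digits = str(n)
    return [int(d) * 10 ** (len(digits) - 1 - i) for i, d in enumerate(digits)]

def grid_multiply(a, b):
    """Return the grid of partial products and their sum."""
    rows = place_value_parts(a)  # one row per part of the first factor
    cols = place_value_parts(b)  # one column per part of the second factor
    grid = [[r * c for c in cols] for r in rows]
    total = sum(sum(row) for row in grid)
    return grid, total

grid, total = grid_multiply(123, 456)
for row in grid:
    print(row)
print("sum of all cells:", total)
```

Each cell is one term of (100 + 20 + 3) x (400 + 50 + 6) multiplied out, so the sum of the cells has to equal the ordinary product.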

The point of the question was to illustrate the distributive property. Handled properly, the general format could have made for a pretty good problem. As it was, it was a disaster: developmentally inappropriate, badly explained, overly long (two-digit numbers would have made the point just as well), devoid of relevant context. Like a bad translation of a bad translation of a good problem. That got me wondering if perhaps the process for coming up with these problems worked something like this...











Wednesday, October 1, 2014

Two ways of looking at the achievement gap and how the reform debate often misses them both

The following came out of a phone conversation I had this weekend with Joseph. I'll need to get back to this later but for now here's a thumbnail version just to have something on the record.

When we talk about the achievement gap in education, there are two distinct but valid ways of approaching the question:

The first is in terms of variability. The people in the bottom quartile are, by most measures, getting a much worse education than the remaining three quarters of the population;

The second involves correlation. People in that bottom quartile are disproportionately likely to be poor, to be black or Hispanic, or to speak English as a second language.

You address the first by raising scores for those at the bottom. You address the second by changing the order. Reducing the gap is still desirable regardless of the definition used -- we don't want any of our schools to be bad nor do we want an education system that entrenches the class system -- and there are many things we can do that will improve both, but it is important to remember that we are talking about two distinct objectives.
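
A tiny made-up example may help keep the two objectives apart. In the Python sketch below, the eight students, their scores and the poverty flags are all invented; "gap" is my shorthand for the difference between the average score of the top three quarters and the average of the bottom quartile, and "poor share" is the fraction of the bottom quartile who are poor.

```python
from statistics import mean

# Hypothetical students as (score, is_poor) pairs.
baseline = [(40, True), (45, True), (60, False), (65, True),
            (70, False), (75, False), (80, False), (85, True)]

def split(students):
    """Return (bottom quartile, everyone else), ranked by score."""
    ranked = sorted(students, key=lambda s: s[0])
    cut = len(ranked) // 4
    return ranked[:cut], ranked[cut:]

def gap(students):
    bottom, rest = split(students)
    return mean([s for s, _ in rest]) - mean([s for s, _ in bottom])

def poor_share(students):
    bottom, _ = split(students)
    return sum(1 for _, poor in bottom if poor) / len(bottom)

# Objective one: raise scores at the bottom, leave the order alone.
bottom, rest = split(baseline)
raised = [(s + 10, poor) for s, poor in bottom] + rest

# Objective two: change the order, leave the set of scores alone
# (here the students scoring 45 and 60 trade places).
reordered = [(40, True), (60, True), (45, False), (65, True),
             (70, False), (75, False), (80, False), (85, True)]

for name, data in [("baseline", baseline), ("raised", raised),
                   ("reordered", reordered)]:
    print(f"{name}: gap = {gap(data):.1f}, poor share = {poor_share(data):.2f}")
```

Raising the bottom shrinks the gap without touching who is at the bottom. Changing the order changes who is at the bottom without touching the gap. A real policy would usually move both numbers, but they can move independently.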

To further complicate the picture, proposals that are meant to improve educational outcomes in general are often pitched as ways to address the achievement gap.

All three goals (improving overall outcomes, reducing variability and breaking the correlation) are important -- I'd argue the third one is absolutely vital -- but we need to be clear about what we are trying to do.

Limits of Market Forces: a never-ending saga

This is a post by Joseph.

I was reading this article and was struck by this passage:

Poole lamented in his blueprint that the country was still not ready in 1980, and he warned his policymaker readers to expect resistance at the local level if they tried to push through programs transferring the costs for criminal justice (and policing) from general taxpayers to “users.” But one thing that Poole and Reason are very proud of is how they brought ideas from the fringes to the mainstream — and Ferguson is a prime example of how Poole’s neoliberal blueprints on privatizing criminal justice were eventually adopted in cities across the country

In Ferguson’s offender-fee system, city revenues from traffic fines make up 21% of the city budget and continue soaring. Those revenues are squeezed mostly from black drivers — 86% of motorists stopped in Ferguson are African-American, well above their 63% portion of the town’s population.
There are two pieces that I think need to be thought about very carefully. First, as a matter of history, making criminal charges a means of raising revenue has been associated with the worst excesses of tyranny. Think of the issues of High Treason and attainder during the Wars of the Roses and the Tudor era in England. Does anybody think the ability to seize people's property made these excesses better by reducing taxes (for example)? This is not an inevitable property of these systems, but it is worth thinking about carefully when implementation is being considered.

Second, market forces work best when the costs are borne by those to whom the service is provided. Here we need to be very careful -- policing and trials are not usually services that criminals want provided to them -- instead they may be a cost of doing business. Nor do criminals have much influence in setting costs or process. Instead, the service is provided to all of the non-criminals, who are made safer by the policing.

So if we fund the system by charging criminals, we inherently break a key feedback loop of market forces. Criminals cannot, for example, pick their judge or arresting officer. Nor do we seek to compensate "users" who are incarcerated by mistake, whereas in other venues billing errors are routinely addressed.

Instead, I would argue a fair justice system has market value. A predictable legal environment and a good set of laws make it easier for business to function efficiently and to invest in the future. That is a public good, as much as clean air or automobile-capable roads are.

The Thirty Million Words Initiative

This is interesting [if you get a chance, listen to the audio at the link]

“By the end of the age of three, children who are born into poverty will have heard 30 million fewer words than their more affluent peers,” says Dr. Dana Suskind [A pediatric otolaryngologist at University of Chicago -- MP].

Dr. Dana Suskind is the director of the Thirty Million Words initiative – an education and research program out of Chicago.

...

The moment a baby is born their brain is already beginning to develop. That is why these early language interactions are so crucial. Scientists can actually measure the word gap or the number of words spoken at home. They use a little device called the LENA which stands for language environment analysis.

The LENA is about the size of a credit card. Babies wear it at chest level. Not only does it count or record words, it can also analyze what it records. The LENA is able to differentiate all the different kinds of sounds that are heard in a baby’s environment. One way to think of the LENA is that it’s like a language pedometer.

“So just like a regular pedometer counts the number of steps you take in a day the LENA counts the number of words a child is exposed to and how many conversations they have with their caregiver or parent,” Suskind says.

It’s not just the number of words spoken to babies but the quality of words spoken. Dr. Adriana Weisleder, a developmental psychologist who is an associate project director and co-investigator at the BELLE project says that, “in some families a lot of the speech to children is what they called business talk. The function of the speech is to get the child to do something right so they’re commands or imperatives. That happens in all families. It has to happen, right? Parents have to get their kids to do things. But when a high proportion of the speech that children hear is composed of those kinds of business talk or imperatives then that means they’re not getting a lot of the other rich talk and conversation.”

Still a device like the LENA can’t close the word gap all by itself.

“Just like a pedometer will not change the obesity and health crisis in the country, we can’t put everything on a piece of technology,” Suskind says.

One way Dr. Suskind’s Thirty Million Words initiative tries to close the gap is by actually going into homes. On top of going over the results from LENA recordings, Thirty Million Words has created a curriculum for parents. It includes videos modeling ways caregivers should talk to their baby.

Families that speak more than one language at home can face a special challenge: what language should they speak to their kids in?

“It’s not just a moral and right thing, but the science is clear that parents and caregivers should be talking and interacting with their children in their native language. It does no good to be speaking in a language you don’t feel comfortable with,” Suskind says.

“Having a higher vocabulary even if it’s in Spanish still makes kids be more prepared for school,” Weisleder says.

Why is that? Talking a lot to your child is about more than just teaching them words – it’s helping them understand basic concepts.

“If you know in Spanish the words for horse and dog and house and barn. You know those words in Spanish but you also know a lot of relationships between those things. You know that dogs and horses are animals and that a lot of dogs live in houses and horses might live in barns, lots of the different things.”

Bridging the word gap is not about getting babies ready to read Don Quixote by the age of four – it’s about setting up the building blocks so that children can be ready to learn more easily once they get to school.

Tuesday, September 30, 2014

Why a predictably breakable rope is better than an unbreakable rope

Short answer: there is no such thing as an unbreakable rope.


There's an old story about an isolated monastery located high on the side of an unclimbable cliff. The only access to the monastery was by way of a basket that was hauled up the side of the cliff on a single rope. One day a pilgrim who was climbing into the basket noticed that the rope looked old and frayed. He asked the monk, "When do you replace the rope?"

The monk replied "when it breaks."

If we generalize a bit, this becomes a useful analogy. We have a case where there is great cost associated with avoidable failure, but where there are also nontrivial costs associated with caution.

One common but probably misguided response to the situation is to buy a better rope, i.e., come up with a system that is less likely to fail. If you have a shoddy system with lots of room for cheap and easy improvement, this approach makes a great deal of sense. If, on the other hand, you have already made all of the obvious and inexpensive upgrades, it probably makes more sense from a cost-benefit perspective to start focusing on the question of when you replace the rope.

You frequently see this question coming up in connection to proxy variables. Particularly in the social sciences, researchers are constantly required to substitute an easily measured variable for the actual factor of interest. If we start with a "good rope" (a well-chosen proxy) then it will, under most circumstances, correlate strongly with the thing we are actually interested in.

There are plenty of "bad ropes" out there, proxies that have only weak relationships with the variables of interest even under the best circumstances, but that is a topic for another post. The disagreement here is with the otherwise responsible statisticians who make an effort to find the best possible proxy but who then do not spend enough time thinking about what happens when the rope breaks.

A few years ago, while I was doing risk models for a large bank, I found myself caught in a heated debate. We had a very good direct measure of how close people were to maxing out their line of credit. Unfortunately, this was also an expensive variable, so it was proposed that we substitute another, less direct measure. The argument for the substitution was that there was an extremely high correlation between the two variables. The counterargument put forward by most of the more experienced statisticians was that while this was true, that correlation tended to break down in extreme cases, particularly those where a person was about to go bad on all of their debts. Since the purpose of the model was to predict when people were about to default on their loans, this was a really unfortunate time for the relationship to fall apart.
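
To make the counterargument concrete, here is a small simulation in Python. None of it is the bank's data or model: the two variables, the 0.9 threshold and the noise levels are all invented, and the breakdown in the tail is built in by assumption, purely to show how a proxy can look excellent overall and still be useless exactly where you need it.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical "direct measure": how close each account is to its limit.
n = 2000
direct = [i / n for i in range(n)]

def proxy_for(u):
    # Assumed behavior: the cheap proxy tracks the direct measure closely
    # for ordinary accounts but is unrelated to it near the limit.
    if u < 0.9:
        return u + random.gauss(0, 0.03)
    return random.uniform(0.6, 1.0)

proxy = [proxy_for(u) for u in direct]

overall_corr = pearson(direct, proxy)
tail = [(u, p) for u, p in zip(direct, proxy) if u >= 0.9]
tail_corr = pearson([u for u, _ in tail], [p for _, p in tail])

print(f"correlation, all accounts:    {overall_corr:.2f}")
print(f"correlation, near-limit only: {tail_corr:.2f}")
```

The overall correlation is the number that shows up in the pitch for the cheaper variable. The tail correlation is the one that matters for a default model, and you only see it if you go looking for the place where the rope breaks.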