## Monday, October 18, 2010

### In praise of orifice-derived numbers

HCZ founder Geoffrey Canada in an interview with City Limits Magazine:
You have set a 65 percent "tipping point" as a universal goal for your programs, after which you think success becomes inevitable. How did you determine that 65 percent was the tipping point?

Why that number? Why that number and not 70, 80 percent? There's no science there. You don't go look up, find the tipping point of a poor community—there's no science there. You take your best educated guess...

I'll tell you what my belief is, and what's my underlying logic. Kids do what their friends do. If your friends smoke, you smoke. If your friends drink, you drink. That's just the way things are. Kids do what they're around. If you're around kids who fight, you better learn how to fight. If you get a whole bunch of kids doing positive things instead of negative things, should you expect that to have an impact on other kids? Absolutely. But there's no science.

If the question is, is there science, has someone done a randomly controlled double-blind study? No, no. But ask anybody. You want the Harlem Children's Zone on your block, working with the kids on your corner, or not? You don't need a random study to decide what the answer to that is. You ask people, when a shooting happens, do you want folks from the Harlem Children's Zone to go in there and make sure no one else gets shot or not? You live in Harlem, guess who you're calling? You're calling me.

As a statistician, I am frequently asked to quantify difficult-to-measure variables and estimate parameters for decision processes. In these situations I can always find some rational-sounding process for deriving a metric, but I can't always vouch for its robustness or even its relevance, and there's a good chance all it will do is give the recipient a false sense of security. Some things just don't readily lend themselves to these approaches.

People like the notion of processes being data-driven. They feel safer knowing that there are methodologies and statistics behind the decisions that affect them. Unfortunately, a suspect methodology and a statistic that doesn't measure what it's meant to can do tremendous damage. If I'm asked to produce metrics when I don't have faith in the analyses or in our ability to measure what's relevant, then as a statistician, the best and most responsible answer I can give is often, "Hell, I don't know. What sounds reasonable to you?"*

There's a lot to be said for an arbitrary number that's reasonable to the people who actually work in the field, particularly if that number is also a good and useful rule of thumb. Of course, it could turn out to be a poor estimate of what you're after, but the same holds for the results of bad data and analyses, and it's a great deal easier to drop a number when it doesn't come with pages of impressive-looking but meaningless tables.

* I'm not going to wander off into a discussion of informative priors here, but these woods are filled with Bayesians so there's a good chance someone will pick up the scent.