Comments, observations and thoughts from two bloggers on applied statistics, higher education and epidemiology. Joseph is an associate professor. Mark is a professional statistician and former math teacher.
Wednesday, February 16, 2011
Forget Jeopardy, show me a computer that can play Eleusis
Natural language processing continues to chug along at a respectable pace. Things like Watson and even Google Translate represent remarkable advances. Still, they hardly seem like amazing advances in artificial intelligence. I'm not going to worry about the rise of the machines until they start beating us at games like Robert Abbott's Eleusis.
Abbott's game (old Eleusis -- you can buy a booklet of rules for the updated game from Mr. Abbott himself) made its national début in the Second Scientific American Book of Mathematical Puzzles and Diversions by Martin Gardner. It's easy to play but a bit complicated to score (not unnecessarily complicated -- there's a real flash of insight behind the process).
The dealer (sometimes referred to as 'Nature' or 'God' for what will be obvious reasons) writes a rule like "If the card on top is red, play a black card. If the card on top is black, then play a red card." on a piece of paper then folds it and puts it away. The dealer then shuffles the deck, randomly selects a card, puts it face up in the center of the table then deals out the rest evenly to the players (the dealer doesn't get a hand). If the number of cards isn't divisible by the number of players the extra cards are put aside.
The first player selects any card from his or her hand and puts it on top of the starter card. Based on the hidden rule, the dealer says 'right' and the card stays on the pile or says 'wrong' and the card (called a mistake card) goes face up in front of the player. The players continue in turn.
The object for players is to have as few mistake cards as possible. The object for the dealer is to have the largest possible range in players' scores.
At the end of the first hand, the score is calculated for the dealer. The scoring method is clever but a bit complicated. For n players (excluding the dealer), have each player count his or her mistake cards, then multiply the smallest number by n-1 and subtract the product from the total number of mistake cards in front of the other players. For example, if there were four players with 7, 2, 9 and 8 mistake cards, you would multiply 2 (the lowest) times 3 (n-1) and subtract that from 24 (the sum of the rest), giving the dealer a score of 18.
In the second stage, the players take turns selecting cards from their mistake pile (leaving them face up so that other players can see what has been rejected). Play continues until someone goes out or until the dealer sees that no more cards can be accepted. At that point the rule is revealed.
Players' scores are then calculated with a formula similar to the one used for the dealer: each player multiplies his or her number of mistake cards by n-1 then subtracts the product from the total of the other players' mistake cards. If the difference is negative, the score is zero. The player who goes out first (or who has the fewest cards if no one goes out) gets an additional six points.
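The two scoring formulas can be sketched in a few lines of Python (the function names are mine, not part of Abbott's rules, and the six-point bonus for going out is left off):

```python
def dealer_score(mistakes):
    """Dealer's score: the lowest mistake count times (n - 1),
    subtracted from the sum of the other players' mistake counts."""
    n = len(mistakes)
    low = min(mistakes)
    rest = sum(mistakes) - low
    return rest - low * (n - 1)

def player_score(mistakes, i):
    """Player i's score: the other players' mistake total minus
    i's own count times (n - 1), floored at zero."""
    n = len(mistakes)
    others = sum(mistakes) - mistakes[i]
    return max(0, others - mistakes[i] * (n - 1))

# The example from the post: four players with 7, 2, 9 and 8 mistake cards.
counts = [7, 2, 9, 8]
print(dealer_score(counts))                         # 18
print([player_score(counts, i) for i in range(4)])  # [0, 18, 0, 0]
```

Notice that the dealer's score equals the best player's score, which is part of what pushes the dealer toward rules that are hard for some players and crackable by at least one.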
While most 'new' games are actually collections of old ideas with new packaging, Abbott managed to come up with two genuinely innovative ideas for Eleusis: the use of induction and the scoring of the dealer. As someone who has spent a lot of time studying games, I may be even more impressed with the second. One of the fundamental challenges of game design is coming up with rules that encourage strategies that make the game more enjoyable for all the players. In this case, that means giving the dealer an incentive to come up with rules that are both challenging enough to stump some of the players and simple enough that someone will spot the pattern.
Eleusis has often been used as a tool for teaching the scientific method. You recognize a pattern, form a hypothesis, and test it. Gardner discusses this analogy at length. At one point, he even brings William James and John Dewey into the conversation.
The New York Times said that Robert Abbott's games were "for lovers of the unfamiliar challenge." Any AI people out there up to that challenge?
Tuesday, February 15, 2011
"Looking under the lamp post"
When you combine this natural desire to join the conversation with the constant pressure on journalists to come up with an angle for every story, the result is a natural tendency to converge on a standard narrative, particularly if this narrative plays off another hot topic.
Case in point, the recent events in Egypt have been repeatedly described in terms of social networks, with references to Facebook, Wikipedia and Twitter. The phrase 'egypt "revolution 2.0"' produces almost ninety thousand results. (If you hear the term 'two point oh' used non-sarcastically to describe anything other than a new product release, you can be pretty sure the speaker is trying to feel included.)
But did revolutionaries friending each other and texting on their cell phones really contribute to the fall of the government? According to this excellent post over at Whimsley (via Thoma, of course), the answer is yes, but not as much as you might think.
The easiest people to talk to
Most obviously, it is much easier to talk to English speaking participants than non-English speakers. English speakers are far more likely to be part of the one-fifth or so of the country that has access to the Internet. (World Bank Development Indicators). And it is easy to contact people over the Internet, so we hear from people who are on the Internet. It is easy to follow Twitter feeds, so we hear Egyptian tweets.
The easiest story to tell
It isn't just the sources, though. The Facebook Revolution narrative is an interesting story to tell to a contemporary Western audience. For us, a story built around the familiar yet novel world of Facebook and social media is an easy way into the Egyptian rebellion. How many of us know much about the specifics of Egypt's history, its recent past, or the economic sources of discontent? It is a much quicker and lighter story to say "look at the Facebook page." We can even go and look at it ourselves (>>). Talking about strikes is more likely to lose an audience.
So every time prominent activist Wael Ghonim is mentioned, he is described as a "Google executive Wael Ghonim" even though he has explicitly said that "Google has nothing to do with this" (>>). Do we hear the employer of any of the other leaders? April 6 Movement founders Asmaa Mahfouz, Ahmed Maher and Ahmed Salah are commonly described as "activists". It is possible to track down Maher's occupation as a "civil engineer", but with no employer. The discrepancy is glaring, and so Google gets to be associated with the uprising, adding to the digital tone of the story.
Underreported players
As people look back for the roots of the rebellion, the April 6 Movement and the We Are Khaled Said Facebook page have received much of the attention. But there are other strands that fed into the protests. The April 6 Movement was created to commemorate an industrial strike, after all, at a textile factory. There have been more than 3,000 separate labour protests in Egypt since 2004 according to a report by the AFL-CIO. The Kefaya movement is considered by some experts to be a central organizer of the January 25 protests, along with Mohamed ElBaradei's organization (two-minute video with Samer Shehata).
An interesting perspective
And on the politics, it’s a mess. Right now we have conservatives simultaneously calling for huge spending cuts and also getting the lion’s share of old people’s votes even while the vast majority of non-security spending is on old people. In essence, by first separating the domestic budget into “discretionary” and “entitlement” portions and then dividing the entitlement programs up into “what today’s old people get” versus “what tomorrow’s old people will get” the political class has created a large and vociferously right-wing class of people who are completely immune from the impact of their own calls for fiscal austerity. In my view, that reality is the biggest driver of our current political dysfunction.
I had not thought about things like this but it is a really good point. I dislike the idea of revising benefit levels because people plan their lives around these benefits and it seems unfair to change things mid-stream.
However, I had completely overlooked the political point involved. Social Security, Medicare and CHIP are 41% of the budget. Veterans and retirees are another 7%. That puts about half of the budget focused on people over 55 or 60 years of age.
So I think I agree that this is a better point than the moral one. The conversation about the budget becomes a lot more sane if there is not a "protected class" of citizens. It's not a conclusion I like but I think it might be correct.
"Human see; human do."
If you're just joining us, we're speaking with V.S. Ramachandran. He is a behavioral neurologist and author of the new book "The Tell-Tale Brain: A Neural Scientist's Quest for What Makes Us Human."
You write a lot about mirror neurons and the role that they played in our evolution. You want to just tell us a little bit about that?
Dr. RAMACHANDRAN: Well, mirror neurons were not discovered by us, obviously. They were discovered by Giacomo Rizzolatti in Parma, Italy, and his colleagues. And what they refer to is in the front of the brain, the motor and pre-motor cortex, there are neurons that issue commands to your hands and other parts of your body to perform specific actions, semi-skilled actions, skilled actions or even non-skilled actions. So these are motor-command neurons which orchestrate specific sequence of muscle twitches for you to reach out and grab a peanut, for example, or put it in your mouth.
What Rizzolatti and his colleagues found was some of these neurons, as many as 20 percent or 30 percent, will fire not only when - let's say I'm measuring mirror neuron activity in your brain. So when you reach for a peanut, these neurons fire. But the astonishing thing is these neurons will also fire when you watch me reaching for a peanut so these are promptly dubbed mirror neurons for obvious reasons. So it's as though your brain is performing a virtual reality simulation of what's going on in my brain, saying, hey, the same neuron is firing now when he's doing that as would fire when I reach out and grab a peanut, therefore, that's what that guy's up to.
He's about to reach out and grab a peanut. So it's a mind-reading neuron. It's essential for you seeing other people as intentional beings who are about to perform certain specific intended actions.
DAVIES: And that might have helped us learn from one another and thereby advanced culturally far beyond our...
Dr. RAMACHANDRAN: That's correct. That's the stuff - that's kind of obvious in hindsight, but that's the claim I made, oh, about 10 years ago in a website run by Brockman called "Edge." And what I pointed out was - and others have pointed this out, too - is that mirror neurons obviously are required for imitation and emulation. So if I want to do something complicated that you're doing and I want to imitate it, I have to put myself in your shoes and view the world from your standpoint. And this is extremely important.
It seems like something trivial, you know, mimicry, but it's not. It's extremely important because imitation is vital for certain types of learning, rudimentary types of learning. These days you learn from books and other things, but in the early, early days when hominids were evolving, we learned largely from imitation. And there's a tremendous acceleration of the evolutionary process. What I'm saying is maybe there are some outliers in the population who are especially smart simply because of genetic variation, who have stumbled, say, accidentally on an invention, like fire or skinning a bear.
Without the mirror neuron system being sophisticated, it would have died out, fizzled out immediately. But with a sophisticated mirror neuron system, your offspring can learn that technique by imitation so it spreads like wildfire horizontally across a population and vertically across generations. And that's the dawn of what we call culture and therefore, of civilization.
Monday, February 14, 2011
Google can make you disappear
Interviewing a purveyor of black-hat services face-to-face was a considerable undertaking. They are a low-profile bunch. But a link-selling specialist named Mark Stevens — who says he had nothing to do with the Penney link effort — agreed to chat. He did so on the condition that his company not be named, a precaution he justified by recounting what happened when the company apparently angered Google a few months ago.
“It was my fault,” Mr. Stevens said. “I posted a job opening on a Stanford Engineering alumni mailing list, and mentioned the name of our company and a brief description of what we do. I think some Google employees saw it.”
In a matter of days, the company could not be found in a Google search.
“Literally, you typed the name of the company into the search box and we did not turn up. Anywhere. You’d find us if you knew our Web address. But in terms of search, we just disappeared.”
The company now operates under a new name and with a profile that is low even in the building where it claims to have an office. The landlord at the building, a gleaming, glassy midrise next to Route 101 in Redwood City, Calif., said she had never heard of the company.
USA Today has some bad graphs but at least it's not the New York Times
Is this the worst infographic ever to appear in NYT? USA Today is not something to aspire to.

This strikes me as deeply unfair to USA Today. The paper has certainly run its share of bad graphs but these take things to a new level. It is as if the NYT used illustrations from "How to Lie with Statistics" as a starting point and then tried to top them.
Here's the "View of the U.S." where the lower the icon is, the higher its approval.

And here's the "U.S. Pakistan Policy" where the scrolls are arranged so you can't really compare their sizes (I initially thought they were going for some depth effect).
And here's the "Greatest Threat" which takes Huff's height/volume examples to the next level by using images of different shapes and densities.
Finally there's this amazing piece of work:
Just glancing at this you would probably conclude that the amount of blue in the circles corresponds to percentage in agreement. For example, looking at the middle circle you'd assume that almost all of those surveyed were in disagreement. You'd be wrong. More agreed than disagreed. (This was also noted by one of the commenters on Gelman's site.)

While they don't quite match this, these graphs may be the worst we've seen from a major paper in recent memory.
[adapted in part from a comment I left on Andrew Gelman's site]
Great moments in metawork
"To seek out new data and new analytic techniques."
The attendees were all experienced modellers and data miners, some fairly high ranking with commensurate salaries. Everyone in that room had something else they needed to be doing and, except for the senior manager present, I doubt that anyone present saw any real value in the exercise. Still, word had come down from the top that every distinct subgroup in the company needed its own mission statement so there we were, boldly splitting that famous infinitive one more time.
On the bright side, at least this was one time we didn't have to have a pre-meeting.
"The Economics of Blogging and The Huffington Post"
That might have been a mistake on my part.
(thanks again to Felix Salmon)
Concerns with data driven reform
Consider the effect of No Child Left Behind. I've seen a noticeable decline in basic math skills of students of all levels in the last 5 years. Every year, I discover a new deficiency that was not seen in previous years (we are talking about Calculus students not able to add fractions). Yet NCLB was assumed to be "working" since the scores were going up. It seems that K-12 was devoting too much time preparing the students for tests, at the cost of killing students' interest in math, trading quality instruction for test-taking skills. Is NCLB a factor in the study? Are socio-economic factors examined in the study?
or CC Physicist who stated:
I look at what Asst Prof wrote as an indication that a Dean, chair, and mentor didn't do a good job of getting across the history of assessment. Do you know what "Quality Improvement" program was developed a decade earlier, and what the results were of the outcomes assessment required from that round of reaffirmation of accreditation? Probably not, since we have pretty good communication at our CC but all the negative results from our plan were swept under the rug. The only indication we had that they weren't working was the silent phase-out of parts of that plan. Similarly, data that drove what we did a decade ago were not updated to see what has changed.
I think these two statements capture, very nicely, the main issue I have with the current round of educational reform. One, if you make meeting a specific metric (as a measure of an underlying goal) a high enough priority, then people will focus on the metric and not the actual goal. After all, if you don’t, then your name could be posted in the LA Times along with your underperformance on the stated metric. So we’d better be sure that the metric we are using is very robust in its relation to the underlying goal. In other words, that it is a very good representation of the curriculum that we want to see taught and measures the skills we want to see students acquire.
Two, trust in evidence based reform requires people to be able to believe the data. This is one area where medical research is leaps and bounds ahead of educational research. A series of small experiments are attempted (often randomized controlled trials) while the standard of care continues to be used in routine patient care. Only when the intervention shows evidence of effectiveness in the trial environment is it translated into routine care.
In education, such trials are rare indeed. Let us exclude natural experiments for the moment; if we care enough to change the education policy of a country and to violate employment contracts then it’s fair to hold ourselves to a high standard of evidence. After all, the lotteries (for example) are not a true experiment and it’s hard to be sure that the lottery itself is completely randomized.
The problem is that educational reforms look like “doing something”. But what happens if the reforms are either counterproductive or ineffective? (And implementing an expensive reform that does nothing has a high opportunity cost.) The people implementing the reforms are often gone in five to ten years but the teachers (at least now, while they have job security) remain to clean up the wreckage afterwards.
I think that this links well to Mark's point about meta-work: it's hard to evaluate the contributions of meta-work so it may look like an administrator is doing a lot when actually they are just draining resources away from the core functions of teaching.
So when Dean Dad notes, “Apparently, a national study has found that colleges that have signed on to ATD have not seen statistically significant gains in any of the measures used to gauge success,” why can’t we use this evidence to decide that the current set of educational reform ideas isn’t necessarily working well? Why do we take weak evidence of the decline of American education at face value and ignore strong evidence of repeated failure in the current reform fads?
Or is evidence only useful when it confirms our pre-conceptions?
Metawork
As he liked to explain it, metawork is not, in and of itself, a bad thing. A certain amount is necessary for a well-functioning organization. It's not unusual for new companies to fail because of an overly rich work-to-metawork mixture.
But, my friend went on, metawork is like a gas -- it expands to fill all available space, both because it's easy to create metawork projects and because those projects can often be stretched to whatever time is available to them (you can always schedule an extra meeting). Furthermore, once it has established a foothold, it has a way of becoming part of the corporate culture.
There are also other reasons why companies tend to grow more metawork heavy as they mature and expand:
Major metawork initiatives tend to be top down (no customer ever said, "I like this company's products but I have a feeling they aren't having enough team-building seminars."). From a career standpoint, it is always a good idea to give a high priority to projects that people above you consider important;
Metawork projects almost always sound good. They have impressive sounding goals like improving efficiency, raising morale, making the company more nimble and responsive, or moving to data-driven strategies (more on that one in future posts). They suggest big-picture, forward-thinking approaches that make fixing problems like billing glitches seem prosaic, perhaps even trivial;
Metawork tends to be safer than the other kind. Let's say a company launches two big and badly-conceived initiatives, a new product launch and a 'data-driven' reworking of the project management process. The product sells badly and the new process eats up man hours without making projects run faster or smoother. Both end up costing the company about the same amount of money, but the product's failure is public and difficult to ignore while the process's failure is internal and can be denied with some goalpost moving and willing suspension of disbelief (something that's easy to generate for a VP's pet project);
As mentioned before, metawork isn't all pre-meetings and mission statements. Some kinds of metawork are essential (payroll comes to mind). Other kinds can help a company improve its profitability and stability (like employee morale studies in a labor-intensive industry with high turnover). Employees can be resistant to some of these good, important initiatives, but it's worth keeping a couple of facts in mind:
There is a lot of bad metawork out there;
The employees who most resent doing metawork are often the employees who are doing the most of the other kind of work.
Sunday, February 13, 2011
One more simple game for your weekend
Dice A {2,2,4,4,9,9}
Dice B {1,1,6,6,8,8}
Dice C {3,3,5,5,7,7}
The game has three rounds. First you roll against Pierre, then you roll against Blaise, then Pierre and Blaise roll against each other. The winner of each round is the one who rolls the higher number. The overall winner is the player who wins the most rounds.
Which die should you choose?
[Here's the relevant link (try not to look at the status bar -- the address gives away too much). It's a fun, trivial oddity but it raises some interesting questions about how numbers we trust can do unexpected things. More on that later.]
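For readers who would rather compute than click through, the matchups can be checked by brute force with a short Python sketch (the die names follow the post; fair warning, the output gives away the punch line):

```python
from itertools import product
from fractions import Fraction

A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]

def p_win(x, y):
    """Probability that die x rolls strictly higher than die y,
    counting all 36 equally likely face pairings."""
    wins = sum(1 for a, b in product(x, y) if a > b)
    return Fraction(wins, len(x) * len(y))

print(p_win(A, B))  # 5/9 -- A beats B
print(p_win(B, C))  # 5/9 -- B beats C
print(p_win(C, A))  # 5/9 -- C beats A
```

Each die beats the next one in the cycle with probability 5/9, so there is no single best die, which is exactly what makes the choice of who picks first (and against whom) interesting.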
Saturday, February 12, 2011
Weekend Gaming -- Hexagonal Chess
Case in point, Gliński's hexagonal chess.
Gliński's chess variant is hugely popular in Europe (more than 100,000 sets have been sold). You can get the rules at my Kruzno site, but you can probably figure most of them out for yourself. The only pieces that might give you trouble are the bishops and, to a lesser extent, the knights.
Bishops come in three colors, which points out an interesting topological feature of a hexagonal grid which I'm betting you can spot for yourself.
It's a strange and intriguing game and yet another reason why every house should have a hexboard.
Friday, February 11, 2011
"Why the Efficient Market Hypothesis (Weak Version) Says Nothing about the Ability to Identify Bubbles"
Let’s put aside the possibility that even the weak EMH can be wrong from time to time. We don’t need to go there; the error is more basic than this.
Let’s put ourselves back in 2005. It is two years before the unraveling of the financial markets, but I don’t know this; all I know is what I can see in front of me, publicly available 2005 data. I can look at this and see that there is a housing bubble, that prices are rising far beyond historical experience or relative to rents. The “soft” warning signs are all around me, like the explosion of cheap credit, the popularity of credit terms predicated on ever-rising prices, and the talk of a new era in real estate. Based on my perceptions, I anticipate a collapse in this market. What can I do?
If I am an investor, I can short housing in some fashion. My problem is that I have no idea how long the bubble will go on, and if I take this position too soon I could lose a bundle. In fact, anyone who went short in 2005 and sat through the following two years of price frothery grossly underperformed relative to the market as a whole. Indeed, you might not have the liquidity to hold your position for two long years and could end up losing everything. Of course, it is also possible that the bubble could have burst a year or two early and your bets could have paid off. What the EMH tells us is that, as an investor, not even your prescient analysis of the fundamentals of the housing market would enable you to outperform more myopic investors or even a trading algorithm based on a random number generator.
The logical error lies in confusing the purposes of an investor with those of a policy analyst. Suppose I work for the Fed, and my goal is not to amass a personal stash but to formulate economic policies that will promote prosperity for the country as a whole. In that case, it doesn’t much matter whether the bubble bursts in 2006, 2007 or 2010. In fact, the longer the bubble goes on, the more damage will result from its deflation. At the policy level, the relevant question is whether trained analysts, assembling data and drawing on centuries of experience in financial manias, can outperform, say, tarot cards in identifying bubbles. The EMH does not defend tarot.
To profit from one’s knowledge of a market condition one needs to be able to outperform the mass of investors in predicting market turns, which the EMH says you can’t do. Good policy may have almost nothing to do with the timing of market turns, however.

