Comments, observations and thoughts from two bloggers on applied statistics, higher education and epidemiology. Joseph is an associate professor. Mark is a professional statistician and former math teacher.
Wednesday, October 8, 2014
“I can no longer accept cash in bags in a Pizza Hut parking lot” -- time to add Pennsylvania to the list
In an article entitled READING, WRITING, RANSACKING, Charles P. Pierce makes me think that I haven't been spending nearly enough time looking at education reform in the Keystone State. The quote from the title comes from Pierce's account of the federal investigation of former Pennsylvania Cyber Charter School leader Nick Trombetta:
The bags of cash, a private plane bought by Avanti but used mostly by Trombetta, a Florida vacation home and a home in Mingo Junction, Ohio, for Trombetta’s former girlfriend all were described as perks enjoyed by Trombetta as part of a scheme to siphon money from taxpayers’ funds sent to PA Cyber for more than four years.
The case is actually small time compared to the other scandals going on in the state, but you have to admit it's a great quote.
A bigger and much more familiar scandal is the lack of accountability:
For reasons that aren't clear, millions of dollars have moved between the network of charter schools, their parent nonprofit and two property-management entities. The School District is charged with overseeing city charters, but "does not have the power or access to the financial records of the parent organization," according to District spokesperson Fernando Gallard. "We cannot conduct even limited financial audits of the parent organization." That's despite the fact that charters account for 30 percent of the District's 2013-'14 budget. Aspira declined to comment. The $3.3 million that the four brick-and-mortar charters apparently have loaned to Aspira are in addition to $1.5 million in lease payments to Aspira and Aspira-controlled property-management entities ACE and ACE/Dougherty, and $6.3 million in administrative fees paid to Aspira in 2012.
Add to that some extraordinarily nasty state politics involving approval-challenged Pennsylvania governor Tom Corbett, the state-run Philadelphia School Reform Commission (which has a history of making teachers' lives difficult basically for the fun of it) and a rather suspicious poll:
"With Governor Corbett's weak job approval, re-elect and ballot support numbers, the current Philadelphia school crisis presents an opportunity for the Governor to wedge the electorate on an issue that is favorable to him," the poll concludes. "Staging this battle presents Corbett with an opportunity to coalesce his base, focus on a key emerging issue in the state, and campaign against an 'enemy' that's going to aggressively oppose him in '14 in any case."I don't know enough about Pennsylvania politics to competently summarize this, let alone intelligently comment on it but it's difficult to imagine an interpretation that makes things looks good.
Tuesday, October 7, 2014
Now they've got me defending the efficient market theorem...
I know it's trivial, but this one has always annoyed me.
There are cases where the conventional wisdom is so screwed up that the market reads bad news as good news and rewards stupidity, but otherwise, in a reasonably efficient market, stocks only go up on bad news when that news beats expectations and the price had already gone down as those expectations rolled in. They are, in other words, making up some of the lost ground. Financial reporters love the "went up on bad news" story but they almost invariably fail to mention how the stock had been doing before.
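To put toy numbers on that (a sketch with invented figures, not a model of any real stock): the price falls as lowered expectations roll in, so results that are bad in absolute terms but better than the final forecast recover only part of the lost ground.

```python
# Toy numbers: a stock "goes up on bad news" only because the price had
# already fallen as expectations were revised down.

price = 100.0          # price when the consensus expected EPS of $1.00
expected_eps = 1.00

# Bad rumors roll in; the consensus drifts down and the price falls with it.
for revised_eps in (0.90, 0.80, 0.70):
    price *= revised_eps / expected_eps   # price tracks expectations
    expected_eps = revised_eps

print(f"price going into earnings: {price:.2f}")    # 70.00

# Earnings come in bad ($0.80) but beat the final expectation ($0.70),
# so the stock makes up some lost ground while staying below its start.
actual_eps = 0.80
price *= actual_eps / expected_eps
print(f"price after the bad news:  {price:.2f}")    # 80.00, up on the day
```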
Don't get me wrong. I'm still not a fan of the EMT, but on this one, at least, I'm willing to give them a pass.
Monday, October 6, 2014
I'm going to let someone else bitch about the New York Times for a while
Besides, when it comes to take-downs of bad financial journalism, there's no one sharper than Felix Salmon.
In "Annals of NYT innumeracy, Bank Rossiya edition," Salmon takes apart a recent article entitled “It Pays to be Putin’s Friend.” No doubt the basic premise is true, but the examples described by the NYT don't support the point at all. Salmon points out lots of sloppiness in the piece but this is arguably the money shot.
In "Annals of NYT innumeracy, Bank Rossiya edition," Salmon takes apart a recent article entitled “It Pays to be Putin’s Friend.” No doubt the basic premise is true, but the examples described by the NYT don't support the point at all. Salmon points out lots of sloppiness in the piece but this is arguably the money shot.
So [Sergei P.] Roldugin took out a loan, of unknown size, to buy a stake of 3.2% in Bank Rossiya. How on earth does that make him worth anywhere near $350 million?
And here the light slowly dawns — the NYT has taken the sum total of Bank Rossiya’s assets, and used that number as the value of the bank itself. ($350 million, you see, is 3.2% of $11 billion.)
Of course you can’t value a bank by just looking at its assets, you first need to subtract its liabilities. The NYT story leads with “State corporations, local governments and even the Black Sea Fleet in Crimea” moving their bank accounts to Bank Rossiya — all of those deposits are liabilities of the bank, which need to be subtracted from its assets before you can even begin to arrive at an overall valuation for the bank itself. Just looking at the assets, without looking at the liabilities, is a bit like scoring a sports game by looking only at the points scored by one team.
Probably, most of the value in Bank Rossiya is to be found in the commodity and media assets which it seems to have been able to acquire on the cheap. (The bank itself, qua bank, might well be worth nothing at all.) And no one’s going to find out the true value of those assets by looking at the official size of Bank Rossiya’s balance sheet. It seems to me, indeed, that Bank Rossiya is in large part being used as a holding company, a reasonably safe place where Vladimir Putin’s billionaire friends can keep some of the valuable assets they’ve managed to acquire over the years. I’m just guessing here, but I doubt they have any particular desire to share 3.2% of those assets with some random cellist [Roldugin]. To simply take the official size of Rossiya’s balance sheet, and declare it to be the value of the bank: that’s just bonkers.
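To make the arithmetic concrete, here's a minimal sketch. The $11 billion asset figure and the 3.2% stake come from the story; the liabilities number is a pure placeholder, since nobody outside the bank knows the real figure. The point is just how much the answer moves once you subtract what the bank owes.

```python
# Valuing a 3.2% stake off total assets (the NYT's method) vs. off equity
# (assets minus liabilities). Assets are from the article; the liabilities
# figure below is a made-up placeholder.

assets = 11_000_000_000
liabilities = 10_500_000_000     # hypothetical: deposits etc. the bank owes
stake = 0.032

naive_value = stake * assets                    # ~$352 million, the NYT figure
equity_value = stake * (assets - liabilities)   # ~$16 million under our guess

print(f"assets-only valuation:  ${naive_value:,.0f}")
print(f"equity-based valuation: ${equity_value:,.0f}")
```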
Friday, October 3, 2014
Examining the rope -- Rotten Tomatoes edition
[You can find the origin of the metaphor here]
Our last Rotten Tomatoes post may have come out a little harsher than I intended. I probably focused too much on the specific glitch and not enough on the larger point, namely that metrics almost never entirely capture what they claim to. Identifying and fixing problems is important, but we also have to acknowledge our limitations.
If we are stuck with imperfections then we will just have to learn to live with them. A big part of that is trying to figure out when our metrics can be relied upon and when they are likely to blow up in our faces.
Let's take Rotten Tomatoes for example. In many ways, the website provides an excellent tool for quantitatively measuring the critical reaction to a movie. It is broad-based, consistent, and as objective as we can reasonably hope for.
But is it the best possible measure in all conceivable circumstances? If not, when does it break down?
When you see a 60% fresh rating, that means that 60% of the reviews examined were considered positive. You will notice that is a binary variable. The most enthusiastic of reviews is put in the same category as the mildly favorable. The inevitable result is that sometimes a film will rank lower on this binary average than it would have on a straight average of star rankings.
Just to be clear, there are some definite advantages to this yes/no approach. As anyone who has dealt with satisfaction scales knows, you can get into all sorts of trouble making interval assumptions about that one-through-five scale.
Can knowing their binary foundation help us make better use of the Rotten Tomatoes scores?
If we can make certain assumptions about the distribution of scores, we can tell a lot about which films are likely to be favored. Keep in mind that a good review counts the same as a great one. Therefore a film that is liked by everybody will do better than a film that is loved by most but leaves a few indifferent or hostile.
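Here's a quick sketch of that effect with invented star ratings, assuming (roughly in line with how Rotten Tomatoes describes its process) that a review counts as "fresh" at three stars or better:

```python
# Invented star ratings for two hypothetical films. A review counts as
# "fresh" at 3.0 stars or better (a rough stand-in for RT's binary call).

def tomatometer(stars, fresh_at=3.0):
    return sum(s >= fresh_at for s in stars) / len(stars)

def mean_stars(stars):
    return sum(stars) / len(stars)

liked_by_all = [3.5, 3.0, 4.0, 3.5, 3.0, 3.5]    # nobody's masterpiece
loved_by_most = [5.0, 5.0, 5.0, 5.0, 5.0, 2.0]   # one hostile review

print(tomatometer(liked_by_all), round(mean_stars(liked_by_all), 2))
# 1.0 3.42  -- a perfect "fresh" score
print(tomatometer(loved_by_most), round(mean_stars(loved_by_most), 2))
# 0.83 4.5  -- lower Tomatometer despite far higher average stars
```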
Without getting into relative merits (all are great films), consider Philadelphia Story and the big three from Martin Scorsese, Taxi Driver/Raging Bull/Goodfellas. By many measures, such as the influential Sight & Sound poll (according to Ebert "by far the most respected of the countless polls of great movies--the only one most serious movie people take seriously."), all three Scorsese pictures are among the most critically hailed movies ever. All three have very good scores on the "Tomatometer" but none have a perfect score. The same goes for films like Bonnie and Clyde, The Magnificent Ambersons, and Bicycle Thieves.
Philadelphia Story, on the other hand, is much less likely to get nominated as greatest film ever, but it is a movie that virtually everyone likes. It's an excellent film, skillfully directed, starring three of the most charming actors ever to come out of Hollywood. Not surprisingly, it has a perfect score on Rotten Tomatoes.
This is not to say that Sight & Sound is better than Rotten Tomatoes. Every scoring system is arbitrary, sometimes plays favorites and never exactly captures what we expect it to measure. The lesson here is that, if you want to use a metric in an argument, you need to know how that metric was derived and what its strengths and weaknesses. You can't find a perfect metric but you can have a pretty good idea where the imperfections are.
Thursday, October 2, 2014
Understanding Common Core-aligned math homework
I volunteer a couple of times a week with a group that does after school tutoring for urban students in LA. My role is "math floater." I walk around the room and help the kids, and sometimes the tutors, with math problems. When the kids ask for help, it's usually just your basic math question, but when the tutors ask for help it's often less about the math and more about the unfamiliar approach the assignment takes to solving a familiar problem.
This is perhaps most exasperating for those tutors with math backgrounds. You can imagine what it must be like to have a degree in engineering and yet be stumped by an eighth-grader's pre-algebra homework. Of course, it's not the math that's throwing them; it's all the weird and arbitrary steps that have been layered onto the math.
After struggling a bit myself, I realized that the key was to approach these problems as bad translations of unknown texts. If I looked hard enough, I could usually find an antecedent, a good lesson (something I had read in Pólya or seen demonstrated by a master teacher or used with success in one of my classes) that had somehow devolved into the misshapen thing sitting in front of the student.
Recently, I ducked into the tutoring center when I wasn't scheduled to work. I just stepped in to use the bathroom but before I got across the room, I heard a couple of tutors calling my name. They were struggling with a third or fourth grade problem where the student had to perform a number of steps including filling out a three by three grid in order to find the product of two three-digit numbers. The answer kept coming out wrong and none of the tutors could figure out why since none of them were sure how the process was supposed to go.
The point of the question was to illustrate the distributive property. Handled properly, the general format could have made for a pretty good problem. As it was, it was a disaster: developmentally inappropriate, badly explained, overly long (two-digit numbers would have made the point just as well), devoid of relevant context. Like a bad translation of a bad translation of a good problem. That got me wondering if perhaps the process for coming up with these problems worked something like this...
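For the record, here's a cleaned-up sketch of what the worksheet was presumably driving at: the distributive property laid out as a three-by-three grid of partial products. The numbers are my own, not the worksheet's.

```python
# Multiplying two three-digit numbers with the distributive property,
# laid out as the 3x3 grid of partial products the worksheet was
# (presumably) aiming for. Example numbers are my own.

def place_value_parts(n):
    """234 -> [200, 30, 4]"""
    digits = str(n)
    return [int(d) * 10 ** i for d, i in zip(digits, range(len(digits) - 1, -1, -1))]

a, b = 234, 567
rows, cols = place_value_parts(a), place_value_parts(b)

grid = [[r * c for c in cols] for r in rows]
for r, row in zip(rows, grid):
    print(f"{r:>4}: {row}")
#  200: [100000, 12000, 1400]
#   30: [15000, 1800, 210]
#    4: [2000, 240, 28]

total = sum(sum(row) for row in grid)
assert total == a * b      # the nine partial products sum to the product
print(total)               # 132678
```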
Wednesday, October 1, 2014
Two ways of looking at the achievement gap and how the reform debate often misses them both
The following came out of a phone conversation I had this weekend with Joseph. I'll need to get back to this later but for now here's a thumbnail version just to have something on the record.
When we talk about the achievement gap in education, there are two distinct but valid ways of approaching the question:
The first is in terms of variability. The people in the bottom quartile are, by most measures, getting a much worse education than the remaining three quarters of the population;
The second involves correlation. People in that bottom quartile are disproportionately likely to be poor, to be black or Hispanic, or to speak English as a second language.
You address the first by raising scores for those at the bottom. You address the second by changing the order. Reducing the gap is still desirable regardless of the definition used -- we don't want any of our schools to be bad nor do we want an education system that entrenches the class system -- and there are many things we can do that will improve both, but it is important to remember that we are talking about two distinct objectives.
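A toy simulation may make the distinction concrete. All the numbers below are invented; the point is only that the two gaps are measured differently and can move independently:

```python
# Simulated data, invented numbers. "Gap 1" is the variability gap: how
# far the bottom quartile sits below everyone else. "Gap 2" is the
# correlation gap: how strongly bottom-quartile membership tracks a
# demographic marker (here, a crude "poor" flag).

import random

random.seed(1)

students = []
for _ in range(10_000):
    poor = random.random() < 0.25
    score = random.gauss(65 if poor else 75, 10)  # poverty drags the score
    students.append((poor, score))

cutoff = sorted(s for _, s in students)[len(students) // 4]  # 25th percentile
bottom = [(p, s) for p, s in students if s <= cutoff]
rest = [(p, s) for p, s in students if s > cutoff]

def mean(xs):
    return sum(xs) / len(xs)

gap1 = mean([s for _, s in rest]) - mean([s for _, s in bottom])
print(f"gap 1 (score gap): {gap1:.1f} points")

share_bottom = mean([p for p, _ in bottom])  # poor share of bottom quartile
share_rest = mean([p for p, _ in rest])
print(f"gap 2 (who is at the bottom): {share_bottom:.0%} poor vs {share_rest:.0%}")
```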
To further complicate the picture, proposals that are meant to improve educational outcomes in general are often pitched as ways to address the achievement gap.
All three goals (improving overall outcomes, reducing variability and breaking the correlation) are important -- I'd argue the third one is absolutely vital -- but whatever we do, we need to be clear about what we are trying to do.
Limits of Market Forces: a never-ending saga
This is a post by Joseph.
I was reading this article and was struck by this passage:
Poole lamented in his blueprint that the country was still not ready in 1980, and he warned his policymaker readers to expect resistance at the local level if they tried to push through programs transferring the costs for criminal justice (and policing) from general taxpayers to “users.” But one thing that Poole and Reason are very proud of is how they brought ideas from the fringes to the mainstream — and Ferguson is a prime example of how Poole’s neoliberal blueprints on privatizing criminal justice were eventually adopted in cities across the country
In Ferguson’s offender-fee system, city revenues from traffic fines make up 21% of the city budget and continue soaring. Those revenues are squeezed mostly from black drivers — 86% of motorists stopped in Ferguson are African-American, well above their 63% portion of the town’s population.
There are two pieces that I think need to be very carefully thought about. First, as a matter of history, making criminal charges a means of raising revenue has been associated with the worst excesses of tyranny. Think of the issues of High Treason and attainder during the Wars of the Roses and Tudor era in England. Does anybody think the ability to seize people's property made these excesses better by, for example, reducing taxes? So this is not an inevitable property of these systems, but it is worth thinking about carefully when implementation is being considered.
Second, market forces work best when the costs are borne by those for whom the service is provided. Here things get very tricky -- policing and trials are not usually services that criminals want provided to them -- instead they may be a cost of doing business. Nor do they have much influence in setting costs or process. Instead, the service is provided to all of the non-criminals, who are made safer by the policing.
So if we fund the system by charging criminals, we inherently break a key feedback loop of market forces. Criminals cannot, for example, pick their judge or arresting officer. Nor do we seek to compensate "users" who are incarcerated by mistake, though in other venues billing errors are routinely addressed.
Instead, I would argue a fair justice system has market value. A predictable legal environment and a good set of laws makes it easier for business to function efficiently and to invest in the future. That is a public good, as much as clean air or automobile-capable roads are.
The Thirty Million Words Initiative
This is interesting [if you get a chance, listen to the audio at the link]
“By the end of the age of three, children who are born into poverty will have heard 30 million fewer words than their more affluent peers,” says Dr. Dana Suskind [A pediatric otolaryngologist at University of Chicago -- MP].
Dr. Dana Suskind is the director of the Thirty Million Words initiative – an education and research program out of Chicago.
...
The moment a baby is born their brain is already beginning to develop. That is why these early language interactions are so crucial. Scientists can actually measure the word gap or the number of words spoken at home. They use a little device called the LENA which stands for language environment analysis.
The LENA is about the size of a credit card. Babies wear it at chest level. Not only does it count or record words, it can also analyze what it records. The LENA is able to differentiate all the different kinds of sounds that are heard in a baby’s environment. One way to think of the LENA is that it’s like a language pedometer.
“So just like a regular pedometer counts the number of steps you take in a day the LENA counts the number of words a child is exposed to and how many conversations they have with their caregiver or parent,” Suskind says.
It’s not just the number of words spoken to babies but the quality of words spoken. Dr. Adriana Weisleder, a developmental psychologist who is an associate project director and co-investigator at the BELLE project says that, “in some families a lot of the speech to children is what they called business talk. The function of the speech is to get the child to do something right so they’re commands or imperatives. That happens in all families. It has to happen, right? Parents have to get their kids to do things. But when a high proportion of the speech that children hear is composed of those kinds of business talk or imperatives then that means they’re not getting a lot of the other rich talk and conversation.”
Still, a device like the LENA can’t close the word gap all by itself.
“Just like a pedometer will not change the obesity and health crisis in the country, we can’t put everything on a piece of technology,” Suskind says.
One way Dr. Suskind’s Thirty Million Words initiative tries to close the gap is by actually going into homes. On top of going over the results from LENA recordings, Thirty Million Words has created a curriculum for parents. It includes videos modeling ways caregivers should talk to their baby.
Families that speak more than one language at home can face a special challenge: what language should they speak to their kids in?
“It’s not just a moral and right thing, but the science is clear that parents and caregivers should be talking and interacting with their children in their native language. It does no good to be speaking in a language you don’t feel comfortable with,” Suskind says.
“Having a higher vocabulary, even if it’s in Spanish, still makes kids be more prepared for school,” Weisleder says.
Why is that? Talking a lot to your child is about more than just teaching them words – it’s helping them understand basic concepts.
“If you know in Spanish the words for horse and dog and house and barn, you know those words in Spanish but you also know a lot of relationships between those things. You know that dogs and horses are animals and that a lot of dogs live in houses and horses might live in barns, lots of the different things.”
Bridging the word gap is not about getting babies ready to read Don Quixote by the age of four – it’s about setting up the building blocks so that children can be ready to learn more easily once they get to school.
Tuesday, September 30, 2014
Why a predictably breakable rope is better than an unbreakable rope
Short answer: there is no such thing as an unbreakable rope.
There's an old story about an isolated monastery located high on the side of an unclimbable cliff. The only access to the monastery was by way of a basket that was hauled up the side of the cliff on a single rope. One day a pilgrim who was climbing into the basket noticed that the rope looked old and frayed. He asked the monk, "When do you replace the rope?"
The monk replied "when it breaks."
If we generalize a bit, this becomes a useful analogy. We have a case where there is great cost associated with avoidable failure, but where there are also nontrivial costs associated with caution.
One common but probably misguided response to the situation is to buy a better rope, i.e., come up with a system that is less likely to fail. If you have a shoddy system with lots of room for cheap and easy improvement, this approach makes a great deal of sense. If, on the other hand, you have already made all of the obvious and inexpensive upgrades, it probably makes more sense from a cost-benefit perspective to start focusing on the question of when you replace the rope.
You frequently see this question coming up in connection to proxy variables. Particularly in the social sciences, researchers are constantly required to substitute an easily measured variable for the actual factor of interest. If we start with a "good rope" (a well-chosen proxy) then it will, under most circumstances, correlate strongly with the thing we are actually interested in.
There are plenty of "bad ropes" out there, proxies that have only weak relationships with the variables of interest even under the best circumstances, but that is a topic for another post. The disagreement here is with the otherwise responsible statisticians who make an effort to find the best possible proxy but who then do not spend enough time thinking about what happens when the rope breaks.
A few years ago, while I was doing risk models for a large bank, I found myself caught in a heated debate. We had a very good direct measure of how close people were to maxing out their line of credit. Unfortunately, this was also an expensive variable, so it was proposed that we substitute another, less direct measure. The argument for the substitution was that there was an extremely high correlation between the two variables. The counter argument put forward by most of the more experienced statisticians was that while this was true, that correlation tended to break down in extreme cases, particularly those where a person was about to go bad on all of their debts. Since the purpose of the model was to predict when people were about to default on their loans, this was a really unfortunate time for the relationship to fall apart.
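You can watch this kind of rope fray in a simple simulation (invented numbers, not the bank's data): a proxy that tracks the true variable closely over most of its range but decouples near the maximum shows a high overall correlation and a nearly useless one in exactly the region a default model cares about.

```python
# A proxy ("rope") that holds over most of the range but frays at the
# extreme: high correlation overall, nearly none in the tail where a
# default model actually lives. All numbers are invented.

import random

random.seed(0)

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

true_util, proxy = [], []
for _ in range(10_000):
    u = random.random()                   # true credit-line utilization, 0-1
    noise_sd = 0.02 if u < 0.9 else 0.40  # the proxy decouples near the max
    true_util.append(u)
    proxy.append(u + random.gauss(0, noise_sd))

print(f"overall correlation: {corr(true_util, proxy):.2f}")   # ~0.9, looks fine

tail = [(u, p) for u, p in zip(true_util, proxy) if u >= 0.9]
tail_corr = corr([u for u, _ in tail], [p for _, p in tail])
print(f"correlation when nearly maxed out: {tail_corr:.2f}")  # close to zero
```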
Monday, September 29, 2014
The great buried lede of the Common Core debate
Lee Fang has another solid piece of investigative journalism at the Nation. It covers a lot of important ground (I'd recommend reading it for yourself), but I did want to single out a couple of paragraphs that hit on a previously mentioned point.
For all the controversy, there are some details on the Common Core story that we should all be able to agree on: it has been produced and implemented with remarkable speed; some of the major stakeholders (particularly teachers) feel they were left out of much of the process; the initiative has become one of the most hotly debated aspects of education reform; a great deal of money is at stake here.
The Department of Education under Obama has seen a flow of revolving door hires from the education investment community. In May of this year, the Senate confirmed Ted Mitchell, the chief executive of the NewSchools Venture Fund, as the Under Secretary for the US Department of Education. Prior to his government position, Mitchell, a personal investor in an array of education start-ups, forged a partnership last year with the creators of Facebook app FarmVille to create new education game products. James Shelton, the Deputy Secretary, is a longtime education investor and the former co-founder of LearnNow, a charter chain that was sold to Edison Learning, a for-profit charter management company.
In an interview with EdSurge, a trade outlet, Shelton explained that the Common Core standards will allow education companies to produce products that “can scale across many markets,” overcoming the “fragmented procurement market” that has plagued investors seeking to enter the K-12 sector. Moreover, Shelton and his team manage an education innovation budget, awarding grants to charter schools and research centers to advance the next breakthrough in education technology. Increased research and development in education innovation, Shelton wrote in testimony to Congress, will spark the next “equivalent of Google or Microsoft to lead the global learning technology market.” He added, “I want it to be a US company.”
Intentionally or not, the speed of the implementation greatly increases the costs. In terms of both materials and training, a more gradual phase-in would save a lot of money (it would also allow for field testing and fine-tuning but that's a topic for another post). We should and will have a discussion about the pedagogical issues with the Common Core (you can get a head start on the debate here and here), but when we are talking about public policy proposals, proponents always need to show that their plans are cost effective.
Sunday, September 28, 2014
"Marvel, Jack Kirby Heirs Settle Dispute Over Superhero Rights"
From Variety
“Marvel and the family of Jack Kirby have amicably resolved their legal disputes, and are looking forward to advancing their shared goal of honoring Mr. Kirby’s significant role in Marvel’s history,” the litigants announced in a joint statement on Friday.
I suspect Disney pretty much had to settle this and hopefully the Kirby heirs negotiated with this in mind. As mentioned before (Do copyright extensions drive innovation? -- Hollywood blockbuster edition), the entertainment industry's current model is based on accumulating huge content libraries then lobbying for an endless series of copyright extensions. Even with a corporate-friendly court, I can't imagine the major players would want to risk disturbing the status quo.
You can find the rest of the thread here:
Intellectual property and Marvel
An IP post for the Fourth of July
A bit more background on the Jack Kirby IP case
More on the Jack Kirby copyright case
Friday, September 26, 2014
Lots of red flags on this one
This is another one of those education stories where it's difficult to figure out what's actually going on but easy to see that the standard narrative has some pretty big plot holes.
I came across this narrative in an article by Ben Wieder that opens with the following:
Former British Prime Minister Tony Blair and U.S. Secretary of Education Arne Duncan were both on hand Monday morning to crown the school districts in Gwinnett County, Georgia, and Orange County, Florida, as the first dual winners of the Broad Prize for Urban Education. They will split the $1 million prize, which comes in the form of scholarships worth up to $20,000 for graduating seniors.
Lots of things here to make a fellow cautious -- impressive claims about a fuzzy target coming from a well-financed advocacy group -- and if there's a field that demands heightened caution, it's education reform. Perhaps even more than Bill Gates, Broad is the money man most associated with the Taylorist agenda in the education reform movement. Wieder does mention this concern, but he doesn't dig.
The prize, described variously as the Nobel or the Oscar or the Pulitzer of the education reform movement and sponsored by billionaire Eli Broad and his wife, Edythe, aims to “regain the American public’s confidence in public schools by spotlighting districts making significant gains in student achievement.” Both districts were cited for above-average academic performance for low-income and minority students relative to other districts in their states.
Skeptics question whether the foundation’s choice is influenced more by districts’ alignment with its policy goals than by student performance. “The Broad operation is so inherently ideological,” said Gary Orfield, a professor in the graduate school of education at UCLA.
But Nancy Que, director of the prize, says a “purposeful wall” is maintained between the foundation’s “reform” agenda and the prize selection. Each year, a review board looks at academic markers for the 75 largest urban school districts, including performance on state tests, graduation rates, and participation and performance on the SAT, ACT and Advanced Placement exams to determine finalists. The winner is selected after site visits to all of the finalist districts, taking into account their leadership and governance policies.
That "purposeful wall" quote is really troubling, particularly when you follow the link Wieder provides. As best I can make the process out, while the finalists appear to be selected through fairly standard academic metrics, getting the big prize seems to depend on meeting a set of standards that very closely line up with the foundation's reform movement agenda.
A team of experienced researchers and practitioners led by RMC Research Corporation, an education consulting company, then conducts site visits to each finalist district to gather additional quantitative and qualitative data. District policies and practices affecting teaching and learning are qualitatively analyzed according to a rubric for evaluating the quality of district-wide policies and practices. The criteria are grounded in research-based school and district practices found to be effective in three key areas: teaching and learning, district leadership, and operations and support systems.
The framework consists largely of reform dog-whistles like standards-based curriculum and rigorous evidence-based instruction (because we all know the importance of rigor). Other parts are arbitrary and raise some interesting questions.
Consider the section on Financial Resources.
INDICATOR FR-1. The district is financially sound, implements prudent financial planning processes, and displays strong fiscal accountability.
• The district is financially sound, having adequate fiscal reserves to meet current obligations and state-required minimums for reserves.
• The district budgeting process includes prudent financial planning and forecasting to anticipate fluctuations in funding sources and balance budgets without sacrificing educational quality.
• The district displays strong fiscal accountability, promoting cost effectiveness, employing effective internal controls over expenditures, and forecasting so there is little need to reconcile differences between anticipated and actual expenditures during the fiscal year.
The first obvious question, how exactly this relates to the stated goals of improving student performance and closing the achievement gap, pales next to the question of what the Broad Foundation considers "fiscal accountability." Looking at the rather short list of previous winners, a couple of familiar names pop out, names associated with spending most of us would consider extravagant and wasteful. Gwinnett County Public Schools (already an odd choice for the award given its relatively upscale demographics, particularly compared to nearby Fulton) compensates its superintendent at the rate of nearly $400K. Even worse, Miami-Dade County Public Schools is in the midst of an ongoing scandal, as are most Florida school districts, due to a state policy of handing large checks to any con artist with a charter school application.
This is hardly surprising. The Broad Foundation comes out of a culture that embraces both the power of privatization and the great-man theory of executive leadership. It is difficult to shock them with spending in areas they like (you can add technology to that list, John Deasy of the billion dollar iPad fiasco is another product of Broad). When the framework talks about financial soundness, the authors are more likely thinking about reductions in class sizes and pay raises for teachers who earn graduate degrees (Broad appears to be more comfortable with the idea of paying administrators for questionable degrees).
Both in what they look for and in what they overlook, it appears that the people at the Broad Foundation are using this prize to reward politicians and administrators who conform to their agenda. There is nothing wrong with this -- it is, after all, their money -- but journalists covering this story have a professional obligation to go beyond the press releases.
Thursday, September 25, 2014
Checking back in on "Netflix and the big swinging check syndrome"
Whenever possible it's good to follow up.
A few weeks ago, the news broke that:
Netflix Acquires ‘The Blacklist’ For $2 Million An Episode
Except, of course, they didn't. As I noted in a post (with a title I should be a little less proud of):
For starters, you will notice that the headline is somewhat misleading. Netflix did not "acquire" the Black List in the sense that, say, ABC would have. The show will still be running on NBC next year. Nor did it acquire the rights to stream the episodes during the regular season; those will presumably stay with Hulu. What Netflix did acquire was the right to stream the previous year's episodes.
I was in the middle of a thread on how Netflix was yet another example of business journalists taking an appealing narrative -- visionary CEO using big data to transform his industry and make his company the next HBO -- and selectively ignoring the facts that contradicted it while more or less inventing others to support it.
But the "presumably stay with Hulu" part bothered me quite a bit. The point was left fairly vague in the news story I linked to and, if Netflix actually had managed to block Hulu from streaming the show, that would change the picture considerably.
So the day after the debut I checked the show's status.
Wednesday, September 24, 2014
The Con(firmation) Artists of the New York Times
I was gathering notes for yet another post on the sad state of fact-checking at the New York Times, this time concerning Alessandra Stanley, when I came across this from then-executive editor Bill Keller:
Q: The NYT is taking considerable criticism for Ms. Stanley's piece, with many folks learning about the error via the Public Editor's column.
A: Just to be clear (and I'm sure you know this) we published a fulsome correction* on July 22. Many folks may have learned about this episode from Clark's column, but many (including Clark) learned about it because we published a correction, which is also appended in perpetuity to the archived article. The evidence for what I'm about to say is purely anecdotal, but I think a lot of readers check the Corrections column with the same avidity they apply to the obits. On a good day they will come across something like our March 11 correction of a 1906 article that inaccurately cited the text of an inscription inside Abraham Lincoln's pocket watch. On a REALLY good day they may come across something like this one, from October, 2000: "An article in The Times Magazine last Sunday about Ivana Trump and her spending habits misstated the number of bras she buys. It is two dozen black, two dozen beige and two dozen white, not two thousand of each."
But I digress.
While I'm telling you what you obviously already know: One thing that sets a serious newspaper apart from most other institutions in our society is that we own up to our mistakes with corrections, editor's notes and other accountability devices, including the public editor's column. We hate getting stuff wrong and we work hard to avoid mistakes. But when we make them, we try to set the record straight.
...
Q: Specifically, some people inside the paper believe that Alessandra has been allowed to continue as a critic, without sufficient punishment, because she is close with Jill Abramson. Your response?
A: We love a conspiracy theory, but the truth is simple: Alessandra has been allowed to continue as a critic because she is -- in my opinion, among others -- a brilliant critic.
It was an almost perfect example of why I have such problems with the New York Times: arrogant and dismissive of critics. Perhaps more importantly, it demonstrated Keller's terrible journalistic taste and judgment. I went back and looked over the Shonda Rhimes piece again to confirm my first impression of Stanley's talents. It was, if anything, worse on second reading. It read like Stanley doing a bad job impersonating Maureen Dowd doing a bad job impersonating Pauline Kael. (I am a huge fan of Kael. However, as with Bob Dylan, there are things she can do brilliantly which you probably shouldn't try.)
I also read the Cronkite piece that prompted Keller to describe Stanley as brilliant. It too was awful, consisting almost entirely of threadbare cliches ("that his outsize tenure bracketed a bygone era when America was, if not a more confident nation, certainly a more trusting one").
Thinking about the Dowd analogy as I went through the tired and badly thought-out memes of Stanley's essays, it struck me that, like David Brooks, David Carr, and her friend Dowd, Stanley was yet another of the New York Times' con(firmation) artists.
What makes a con(firmation) artist? First and foremost, of course, is the desire to confirm the beliefs and narratives held by their colleagues. All of these journalists have poor track records when it comes to factual accuracy but they largely escape the consequences of these lapses because they are saying things that other journalists believe to be true (or perhaps more accurately want to be true).
Con(firmation) artists also rely on a veneer of "new journalism" to conceal the cracks in their work. When you read the flashy prose, the big analogies, the constant editorial asides, you can almost imagine them saying "it worked in The Electric Kool-Aid Acid Test."
There are at least two major problems with this use of new journalism. The first is that the original generation of new journalists were extraordinarily hard working and were held to demanding standards by editors like Clay Felker. The second, and more important, is the fact that the original new journalists and the con(firmation) artists had opposite objectives. The goal then was to be original and unexpected. When Tom Wolfe discussed the fashions of the radical left, he came to new and surprising conclusions. When David Brooks talks about the Home Shopping Network or David Carr talks about Netflix, they get their facts wrong but they reach conclusions that agree with the conventional wisdom of their peers.
This combination of pretension and pandering has given these writers extraordinary standing in their communities. It has also allowed them to do considerable damage to their professions.
* With the caveat that Keller may not know what the word 'fulsome' means, here is the correction in all of its epic glory:
This article has been revised to reflect the following correction:
Correction: July 22, 2009
An appraisal on Saturday about Walter Cronkite’s career included a number of errors. In some copies, it misstated the date that the Rev. Dr. Martin Luther King Jr. was killed and referred incorrectly to Mr. Cronkite’s coverage of D-Day. Dr. King was killed on April 4, 1968, not April 30. Mr. Cronkite covered the D-Day landing from a warplane; he did not storm the beaches. In addition, Neil Armstrong set foot on the moon on July 20, 1969, not July 26. “The CBS Evening News” overtook “The Huntley-Brinkley Report” on NBC in the ratings during the 1967-68 television season, not after Chet Huntley retired in 1970. A communications satellite used to relay correspondents’ reports from around the world was Telstar, not Telestar. Howard K. Smith was not one of the CBS correspondents Mr. Cronkite would turn to for reports from the field after he became anchor of “The CBS Evening News” in 1962; he left CBS before Mr. Cronkite was the anchor. Because of an editing error, the appraisal also misstated the name of the news agency for which Mr. Cronkite was Moscow bureau chief after World War II. At that time it was United Press, not United Press International.
This article has been revised to reflect the following correction:
Correction: August 1, 2009
An appraisal on July 18 about Walter Cronkite’s career misstated the name of the ABC evening news broadcast. While the program was called “World News Tonight” when Charles Gibson became anchor in May 2006, it is now “World News With Charles Gibson,” not “World News Tonight With Charles Gibson.”
If that's not enough, Gawker and CJR have more.