Showing posts with label fitness landscapes. Show all posts

Wednesday, March 20, 2013

Apple, J.C. Penney* and fitness landscapes in everything

James Kwak has an excellent piece on Ron Johnson's unfortunate run as CEO of J.C. Penney.
According to today’s Wall Street Journal article, Johnson quickly eliminated coupons and most sales at J.C. Penney.

“Johnson bristled when a colleague suggested that he test his new no-discounts strategy at a few stores. . . . ‘We didn’t test at Apple,’ the executive recalled Mr. Johnson . . . saying.”

Well, yeah. Apple doesn’t discount because they sell stuff that people really, really want and that they can’t get anyplace else. And they don’t test because Steve Jobs refused to. At Penney? Sales have fallen by about 30 percent.

This doesn’t mean Johnson is stupid, or that he’s going to fail as CEO. Apparently he has partially reversed his early decision, which is a good sign. But it brings up a common feature of external CEO hires. Companies in a perceived crisis often look outside for a new leader, hoping for a superman (or -woman) who can singlehandedly turn around the organization. Not completely illogically, they tend to look for people at successful companies. “Make us more like X,” they pray. In Penney’s case, X = Apple.

There are two important questions they tend not to ask, however. First, was Apple successful because of Johnson, or was he just along for the ride? Yes, he was the main man behind the Apple Store (although, according to Walter Isaacson’s book, Steve Jobs was really the genius behind everything). But was the success of the Apple Store just a consequence of the success of the iPhone?

Second, even if Johnson was a major contributor to Apple’s success, how much of his abilities are transferable to and relevant to J.C. Penney? There’s a big difference between selling the most lusted-after products on the planet and selling commodities in second-rate malls. When someone has been successful in one context, how much information does that really give you about how he will perform in a new environment?
The obvious interpretation here is as a cautionary tale of executive hubris, but you can also look at it in terms of fitness landscapes (the following will be fairly straightforward, but if the concept doesn't ring a bell you might want to check here, here, and of course, here).

Let's try thinking in terms of the retail fitness landscape (presented with the usual caveat that I'm way out of my field). Just how distant is the Apple Store from J.C.P.?

Apple Stores are a relatively small boutique chain (400 stores total, 250 in the U.S.) concentrated heavily in prime commercial urban and upscale suburban areas. Their customer demographics tend toward upper income, fashion-conscious early adopters with a demonstrated willingness to pay a premium for quality. Inventories consist of a few heavily-branded, high-quality, high mark-up items, all of which come from one very visible supplier with an excellent reputation. This allows an unusual (perhaps unique -- there's not another Apple) symbiotic relationship. The stores give the supplier a presence and a profit center while the stores benefit from the supplier's powerful brand, large advertising budget and unparalleled PR operation.

In terms of customers, products, brand, retail space, vendor relations, logistics, scale and business model, moving from the Apple Store to JCP was a shift to a distant part of the retail landscape. What Johnson did, in essence, was say "these coordinates are associated with an extremely high point on the landscape (the Apple Store). Even though we've made large shifts in many of these dimensions, we can keep the same coordinates for the other dimensions and we'll find another optimum."

To put this in context, here's a useful example from T. Grandon Gill:
Suppose, for example, you had a fitness function that mapped the list of ingredients to an objectively determined measure of “taste fitness” for all the recipes in a cookbook. If you were to do a regression on taste (dependent variable) using the ingredients (independent variables), you might find—for instance—that garlic shows a high positive significance. What would that tell you (other than, possibly, that the individuals rating the recipes enjoyed garlic)? What it would definitely not tell you is that you could improve your recipe for angel cake by adding garlic to it. Indeed, the whole notion of applying a technique that assumes linear decomposability to a fitness landscape that is so obviously not decomposable is preposterous.
Substitute a low level of coupons for a high level of garlic and you have a pretty good picture of the JCP strategy.
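
Gill's point is easy to reproduce with a toy example. The sketch below is mine, not his: the taste scores and the little cookbook are made up, and the only real fact it leans on is that with a single yes/no ingredient the regression slope is just the difference between the two group means. Garlic comes out with a positive coefficient across the whole cookbook even though adding it to the dessert is a disaster.

```python
def taste(garlic, sweet):
    """Made-up taste score with an interaction between garlic and sweetness."""
    if sweet:
        return 8.0 - 6.0 * garlic   # garlic ruins a dessert
    return 5.0 + 3.0 * garlic       # garlic improves a savory dish


# Hypothetical cookbook of (garlic, sweet) recipes: mostly savory dishes with garlic.
cookbook = [(1, 0)] * 6 + [(0, 0)] * 2 + [(0, 1)] * 3 + [(1, 1)] * 1


def garlic_slope(recipes):
    """Slope of taste on a yes/no garlic variable (difference in group means)."""
    with_garlic = [taste(g, s) for g, s in recipes if g == 1]
    without_garlic = [taste(g, s) for g, s in recipes if g == 0]
    return (sum(with_garlic) / len(with_garlic)
            - sum(without_garlic) / len(without_garlic))


# Positive on average across the cookbook...
print(garlic_slope(cookbook) > 0)        # True
# ...but strongly negative for the angel cake.
print(taste(1, 1) - taste(0, 1))         # -6.0
```

The average effect says nothing about what happens at any particular spot on the landscape, which is the whole problem with carrying one coordinate over from Apple to JCP.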

How do we know the retail landscape is rugged? We don't, but we do have considerable evidence that certain approaches work better in some circumstances than they do in others (i.e. there are multiple local optima). More to the point, Johnson's entire strategy pretty much assumed that the many small and large players in the department store area (including Macy's, Sears, Dillard's, Kohl's, the pre-Johnson JCP and countless smaller chains and individual stores) were trapped in one or more low-quality optima. When you have this many diverse companies in a market this competitive and this mature, you expect to see a fair amount of something analogous to gradient searching ("That worked; let's do more of it."). If they haven't settled on your optimum point, it's almost certainly because they settled on another.

The lessons -- when you move into an established market you should probably assume the established players know the field and you should probably not assume that what worked somewhere else will work here -- could be (and were) reached without referring to fitness landscapes, but they do make a good framework for approaching a wide variety of problems.

Johnson moved to an unfamiliar region of a probably rugged landscape and refused to explore the surrounding area for higher points despite the fact that numerous other players that had explored the region had settled on completely different points. When you phrase it this way, it doesn't sound good (of course, Johnson's approach doesn't sound good when you phrase it most ways).


* The 'C' stands for 'Cash' -- no, really.

Tuesday, March 12, 2013

Landscapes in everything

SLIGHTLY UPDATED

One of the issues I have with economics exceptionalism is the word 'everything,' as in "markets in everything" or "the hidden side of everything." Not that there's anything wrong with applying economic concepts to a wide variety of questions (I do it myself), but at some point they become overused and start crowding out ideas that are better in a given context.

Think about all the times you heard phrases like the 'marriage market' often followed by the implicit or explicit suggestion that the tools of economics hold the key to understanding all sorts of human behavior even in cases where the underlying assumptions of those tools probably don't apply. Now, for example, compare that to the number of times you've recently heard someone describe something as a fitness landscape when they weren't talking about evolution or physics (OK, that's not the term physicists generally use but the concept is basically the same).

Landscapes are a powerful and widely applicable concept, arguably more so than markets (they are also a long-time fascination of mine). Ideas like gradient searches, perturbation, annealing and, most of all, local optimization are tremendously useful, both to explain complex problems and to suggest approaches for solving them. Once you start thinking in those terms you can see landscapes about as often as Tyler Cowen sees markets.

You can even find researchers coming up with the kind of unexpected, everyday examples that you might expect in a Steven Levitt column.

My favorite recent example (at least recent to me) is T. Grandon Gill's observation that recipes in a cookbook are essentially the coordinates of local optima on a culinary fitness landscape where the amounts of each ingredient are the dimensions and taste is the fitness function (technically we should add some dimensions for preparation and make some allowance for the subjectivity of taste, but I'm keeping things simple).

This is a great example of a rugged landscape that everyone can relate to. You can find any number of delicious recipes made with the same half dozen or so ingredients. As you start deviating from one recipe (moving away from a local optimum), the results tend to get worse initially, even if you're moving toward a better recipe.

Approaching something as a rugged landscape can provide powerful insights and very useful tools, which leads to another concern about economic exceptionalism -- economics as a field tends to make little use of these models and many economists routinely make modeling assumptions that simply make no sense if the surface being modeled really is rugged.

I asked Noah Smith* about this and as part of his reply he explained:
But for analyzing the equilibrium state of the economy - prices and quantities - economists tend to try as hard as they can to exclude multiple equilibria. Often this involves inventing arbitrary equilibrium criteria with zero theoretical justification. This is done routinely in micro (game theory) as well as in macro. An alternative procedure, commonly used in macro by DSGE practitioners, is to linearize all their equations, thus assuring "uniqueness". Some researchers are averse to this practice, and they go ahead and publish models that have multiple equilibria; however, there is a strong publication bias against models that have multiple equilibria, so many economists are afraid to do this. An exception is that some models with two equilibria (a "good" equilibrium and a "bad" or "trap" equilibrium) do get published and respected. Models with a bunch of equilibria, or where the economy is unstable and tends to shift between equilibria on its own at a high frequency, are pretty frowned upon.
This doesn't mean that economists can't work with these concepts, but it does mean that as economists increasingly dominate the social sciences, approaches that don't fit with the culture and preferred techniques of economics are likely to be underused.

And some of those techniques are damned useful.

* now with source.

Friday, March 16, 2012

How our inability to distinguish between independence and contrarianism encourages Steve Landsburg to be, let's just say, a less effective pundit

[I decided that the tone was getting a bit sharp in this debate so I'm dialing things down a bit. This entailed some very slight rewriting but none of these changes the substance of the post]

Before getting to the main thesis, let's confirm just how bad this incident was. A radio personality with millions of listeners grossly misrepresented the comments of a private citizen speaking out on an issue then used those distortions to make offensive and badly-reasoned attacks on the woman. The situation at that point was bad enough but we didn't really achieve horrible until Landsburg jumped in. Not only did Landsburg throw his reputation behind Limbaugh's illogical and factually challenged comments, he actually added additional [poor] arguments to the abuse this woman has had to put up with.

Noah Smith, Scott Lemieux, my co-blogger and others have done an excellent job addressing the lies and idiocy of this affair (check out how this blogger dismembers the I'm-mocking-the-position-not-the-person defense). The question for now is how this happened. How did a mid-level economist manage to reach such national prominence by writing a series of painfully sophomoric books and articles?

Part of the answer, I'd argue, lies in the way journalists and editors now treat the counterintuitive. Publications like Slate give us a steady diet of pieces that take some claim that seems obviously true and argue the opposite. These publications would have us believe that this practice is a sign of intellectual independence and healthy diversity of opinion. It's not.

Contrarianism is closer to the opposite of independence, a point that's easiest to explain if we think in the idealized terms of a simplified fitness landscape and draw an analogy between the defensibility of an argument associated with a certain position and the fitness of a phenotype associated with a certain genotype (more on landscapes here).

Of course, it would take a lot of variables to realistically describe this landscape but the basic concepts still hold even if we simplify it to a bare-bones x, y and v(x,y). For every position (x,y) you can take, there's a resulting viability (v). Some positions are easy to defend (v is high). Some are difficult (v is low). Pundits and news analysts who try to find the best positions to argue are therefore performing an optimization algorithm (though most probably never thought about it in those terms).

For the most part, we can place this commentary and analysis in three general categories:

Neighborhood

Independent/semi-independent

Contrarian


The neighborhood searcher tries to find the most defensible position within the neighborhood of a starting point. The best example I can think of here is the work David Frum specialized in until fairly recently. Frum was not being independent with his pieces in the Wall Street Journal or public radio (the terminal point of his searches was almost always within the neighborhood of the established conservative consensus) but he was arguably doing something as important or more so, thoroughly exploring the landscape of the region and encouraging evolutionary shifts to sounder, more defensible positions.

The independent searcher, by contrast, goes where the search leads regardless of the starting position. The semi-independent searcher adds the condition that the terminal point has to be original (in other words, you can't end up on a point that someone else has already argued). Technically, originality and independence are in opposition here but in practice, they tend to complement each other.

And the two categories tend to complement each other as well. To grossly oversimplify, one group searches x+1 to x-1 and y+1 to y-1; the other group searches everywhere else. Given the fact that consensuses originally form around what seem at the time to be good ideas, it makes sense to explore their neighborhoods (if it helps, you could think of this in terms of Bayesian priors), but it also makes sense to keep exploring new territory. David Brooks and Frank Rich refine and improve their respective corners of the political landscape while writers like Jonathan Chait or William Safire range further and are more likely to reach unexpected conclusions.

The contrarian approach is to start with a position (x,y) that seems obviously true (often because it is true) then jump to either (-x,y) or (x,-y) and argue from there. It can, at first glance, look like the result of an independent search, but it is actually far more constrained than the neighborhood searches of Frum and Rich. Both of those writers would shift positions based on their reasoning and would insist on finding a defensible point before sitting down to the keyboard.

The typical contrarian piece hews so closely to its initial (-x,y) that there's no indication of a search at all. By all appearances, the writer simply jumps to the contrarian position and starts typing.
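
If it helps to see the difference spelled out, here is a bare-bones sketch of the two behaviors. Everything in it is an assumption made for illustration: the viability surface v(x, y) is invented (a single peak at (3, 2)), and the function names are mine. The neighborhood searcher evaluates v at every nearby point and moves to the best one; the contrarian flips a sign and never looks at v at all.

```python
def v(x, y):
    """Made-up defensibility surface with a single peak at (3, 2)."""
    return -((x - 3) ** 2 + (y - 2) ** 2)


def neighborhood_search(x, y):
    """Most defensible point among (x, y) and its neighbors one step away."""
    candidates = [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return max(candidates, key=lambda p: v(*p))


def contrarian_jump(x, y):
    """Jump from (x, y) to (-x, y) without evaluating v anywhere."""
    return (-x, y)


print(neighborhood_search(2, 1))   # (3, 2)
print(contrarian_jump(3, 2))       # (-3, 2)
print(v(3, 2), v(-3, 2))           # 0 -36
```

The first function is constrained in where it can end up, but it at least checks the terrain. The second is constrained to exactly one destination and checks nothing.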

Contrarian writing crowds out good journalism and pumps misinformation and faulty arguments into the discourse. This would be bad at any time, but in the current state of journalism, it's disastrous. Here's a list of dangerous trends in journalism from an earlier post (with a link added from a different paragraph):

1. Reliable information sources like the CBO are undermined;

2. An increasing amount of our information comes from unreliable subsidized sources like Heritage;

3. Journalists suffer no penalty for publishing inaccurate information;

4. Journalists also fashion for themselves an incredibly self-serving ethical rule that lets them, in the name of balance, avoid the consequences that would have to be faced if they honestly assigned responsibility for screw-ups;

5. A growing tendency to converge on a narrative makes the media easier to manipulate.
All of these factors make it more difficult for our society to deal with bad data and contrarians are a rich source of some of the worst.

In a healthy journalistic system, counter-intuitive claims would be held to a higher standard (at least if we think like Bayesians) and if a logically or factually flawed argument made it through, both the authors and the editors would feel pressure to see that it didn't happen again.

In our current system, counter-intuitive claims are held to a lower standard (because they generate traffic) and serial offenders can actually build careers by badly arguing points that probably aren't true. Editors have lost all interest in fact-checking and outside efforts at debunking are usually treated as he said/she said.

It's easy to object to the positions Landsburg takes, but perhaps the truly offensive aspect here is the way Landsburg and the other contrarians reach those positions.


Sunday, April 10, 2011

Hamsters, fitness landscapes and an excuse for a repost

All Things Considered had an interesting little story today about the origin of the hamster. I was particularly intrigued by this part:

More troubles followed in the lab. There was more hamster cannibalism, and five others escaped from their cage — never to be found. Finally, two of the remaining three hamsters started to breed, an event hailed as a miracle by their frustrated caretakers.

Those Adam-and-Eve hamsters produced 150 offspring, Dunn says, and they started to travel abroad, sent between labs or via the occasional coat pocket. Today, the hamsters you see in pet stores are most likely descendants of Aharoni's litter.

Because these hamsters are so inbred, they typically have heart disease similar to what humans suffer. Dunn says that makes them ideal research models.

This reminded me of a post from almost a year ago on the subject of lab animals. It also reminded me that I still haven't gotten around to the follow-up I had in mind. Maybe all this reminding will translate into some motivating and I'll actually get the next post on the subject written.
In this post I discussed gradient searches and the two great curses of the gradient searcher, small local optima and long, circuitous paths. I also mentioned that by making small changes to the landscape being searched (in other words, perturbing it) we could sometimes (with luck) improve our search metrics without significantly changing the size and location of our optima.

The idea that you can use a search on one landscape to find the optima of a similar landscape is the assumption behind more than just perturbing. It is also the basis of all animal testing of treatments for humans. This brings genotype into the landscape discussion, but not in the way it's normally used.

In evolutionary terms, we look at an animal's genotype as a set of coordinates for a vast genetic landscape where 'height' (the fitness function) represents that animal's fitness. Every species is found on that landscape, each clustering around its own local maximum.

Genotype figures in our research landscape, but instead of being the landscape itself, it becomes part of the fitness function. Here's an overly simplified example that might clear things up:

Consider a combination of two drugs. If we use the dosage of each drug as an axis, this gives us something that looks a lot like our first example with drug A being north/south, drug B being east/west and the effect we're measuring being height. In other words, our fitness function has a domain of all points on our AB plane and a range corresponding to the effectiveness of that dosage. Since we expect genetics to affect the subjects' reaction [corrected a small typo here] to the drugs, genotype has to be part of that fitness function. If we ran the test on lab rats we would expect a different result than if we tested it on humans but we would hope that the landscapes would be similar (or else there would be no point in using lab rats).
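
Here is roughly what that looks like as a sketch. The numbers are entirely made up (the peak dosages for each species are assumptions, not data), and the function names are mine. The point is only that species enters as an argument of the fitness function rather than as a coordinate, and that a search on the rat landscape is useful to the extent that its optimum lands near the human one.

```python
# Hypothetical best dosages (drug A, drug B) for each species.
PEAKS = {"rat": (4, 6), "human": (5, 6)}


def effectiveness(dose_a, dose_b, species):
    """Toy fitness function: height over the AB plane, shaped by species."""
    peak_a, peak_b = PEAKS[species]
    return 100 - (dose_a - peak_a) ** 2 - (dose_b - peak_b) ** 2


def best_dose(species, max_dose=10):
    """Exhaustively search the integer dosage grid for the highest point."""
    grid = [(a, b) for a in range(max_dose + 1) for b in range(max_dose + 1)]
    return max(grid, key=lambda d: effectiveness(d[0], d[1], species))


rat_optimum = best_dose("rat")
print(rat_optimum)                              # (4, 6)
print(best_dose("human"))                       # (5, 6)
# How well does the rat optimum do on the human landscape?
print(effectiveness(*rat_optimum, "human"))     # 99
```

In this toy case the landscapes are similar enough that the rat answer is nearly the human answer. If the two peaks were far apart, the rat search would tell us very little.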

Scientists who use animal testing are acutely aware of the problems of going from one landscape to another. For each system studied, they have spent a great deal of time and effort looking for the test species that functions most like humans. The idea is that if you could find an animal with, say, a liver that functions almost exactly like a human liver, you could do most of your controlled studies of liver disease on that animal and only use humans for the final stages.

As sound and appealing as that idea is, there is another way of looking at this.

On a sufficiently high level with some important caveats, all research can be looked at as a set of gradient searches over a vast multidimensional landscape. With each study, researchers pick a point on the landscape, gather data in the region then use their findings [another small edit] and those of other researchers to pick their next point.

In this context, important similarities between landscapes fall into two distinct categories: those involving the positions and magnitudes of the optima; and those involving the search properties of the landscape. Every point on the landscape corresponds to four search values: a max; the number of steps it will take to reach that max; a min; and the number of steps it will take to reach that min. Since we usually want to go in one direction (let's say maximizing), we can generally reduce that to two values for each point, optima of interest and time to converge.

All of this leads us to an interesting and somewhat counterintuitive conclusion. When searching on one landscape to find the corresponding optimum of another, we are vitally interested in seeing a high degree of correlation between the size and location of the optima but given that similarity between optima, similarity in search statistics is at best unimportant and at worst a serious problem.

The whole point of repeated perturbing then searching of a landscape is to produce a wide range of search statistics. Since we're only keeping the best one, the more variability the better. (Best here would generally be the one where the global optimum is associated with the largest region though time to converge can also be important.)

Wednesday, January 26, 2011

Repost -- Fitness Landscapes, Ozark Style

[I'm working on a long post that uses fitness landscapes, so I thought I'd rerun some previous posts to get the conversation going.]

I grew up with a mountain in my backyard... literally. It wasn't that big (here in California we'd call it a hill) but back in the Ozarks it was a legitimate mountain and we owned about ten acres of it. Not the most usable of land but a lovely sight.

That Ozark terrain is also a great example of a fitness landscape because, depending on which side you look at, it illustrates the two serious challenges for optimization algorithms. Think about a mountainous area at least partially carved out by streams and rivers. Now remove all of the rocks, water and vegetation, then drop a blindfolded man somewhere in the middle, lost but equipped with a walking stick and a cell phone that can get a signal if he can get to a point with a clear line of sight to a cell tower.

With the use of his walking stick, the man has a reach of about six feet so he feels around in a circle, finds the highest point, takes two paces that direction then repeats the process (in other words, performs a gradient search). He quickly reaches a high point. That's the good news; the bad news is that he hasn't reached one of the five or six peaks that rise above the terrain. Instead, he has found the top of one of the countless hills and small mountains in the area.

Realizing the futility of repeating this process, the man remembers that an engineer friend (who was more accustomed to thinking in terms of landscape minima) suggested that if they became separated he should go to the lowest point in the area so the friend would know where to look for him. The man follows his friend's advice only to run into the opposite problem. This time his process is likely to lead to his desired destination (if he crosses the bed of a stream or a creek he's pretty much set) but it's going to be a long trip (waterways have a tendency to meander).

And there you have the two great curses of the gradient searcher, numerous small local optima and long, circuitous paths. This particular combination -- multiple maxima and a single minimum associated with indirect search paths -- is typical of fluvial geomorphology and isn't something you'd generally expect to see in other areas, but the general problems of local optima and slow convergence show up all the time.
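
For anyone who would rather see the blindfolded man as code, here is a minimal sketch of his uphill search on a one-dimensional strip of terrain. The heights are invented and the function name is mine; it is a plain greedy hill climb, nothing fancier.

```python
def hill_climb(heights, start):
    """Greedy ascent: keep stepping to the higher neighbor until none is higher."""
    pos = start
    while True:
        neighbors = [i for i in (pos - 1, pos + 1) if 0 <= i < len(heights)]
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[pos]:
            return pos          # no neighbor is higher, so stop here
        pos = best


# Made-up terrain: a small hill at index 2 and the real peak at index 7.
terrain = [1, 3, 5, 4, 2, 6, 9, 12, 8]

print(hill_climb(terrain, 0))   # 2  (stuck on the small hill)
print(hill_climb(terrain, 4))   # 7  (a luckier starting point finds the peak)
```

Where he ends up depends entirely on where he was dropped, which is the first of the two curses.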

There are, fortunately, a few things we can do that might make the situation better (not what you'd call realistic things but we aren't exactly going for verisimilitude here). We could tilt the landscape a little or slightly bend or stretch or twist it, maybe add some ridges to some patches to give it that stylish corduroy look. (In other words, we could perturb the landscape.)

Hopefully, these changes shouldn't have much effect on the size and position of the major optima,* but they could have a big effect on the search behavior, changing the likelihood of ending up on a particular optimum and the average time to optimize. That's the reason we perturb landscapes; we're hoping for something that will give us a better optimum in a reasonable time. Of course, we have no way of knowing if our bending and twisting will make things better (it could just as easily make them worse), but if we do get good results from our search of the new landscape, we should get similar results from the corresponding point on the old landscape.
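
A small sketch of the tilting idea, under the assumption of a one-dimensional terrain with invented heights (the names and numbers are mine). The same greedy climb gets stuck on the original terrain but reaches the main peak after a slight tilt, and the main peak sits at the same coordinate on both.

```python
def hill_climb(heights, start):
    """Greedy ascent: keep stepping to the higher neighbor until none is higher."""
    pos = start
    while True:
        neighbors = [i for i in (pos - 1, pos + 1) if 0 <= i < len(heights)]
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[pos]:
            return pos
        pos = best


# Made-up terrain: a minor bump at index 2, the major peak at index 7.
terrain = [0, 20, 30, 25, 30, 60, 100, 140, 90]

# Perturb by tilting: raise each point in proportion to its position.
tilt = 6
tilted = [h + tilt * i for i, h in enumerate(terrain)]

print(hill_climb(terrain, 0))   # 2  (stuck on the bump)
print(hill_climb(tilted, 0))    # 7  (the tilt erased the bump)

# The major peak did not move, so the point found on the tilted
# landscape is also the high point of the original.
print(terrain.index(max(terrain)), tilted.index(max(tilted)))   # 7 7
```

This particular tilt happened to help. A tilt in the other direction would have made things worse, which is why you try a number of perturbations and keep the best result.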

In the next post in the series, I'll try to make the jump from mountain climbing to planning randomized trials.

* I showed this post to an engineer who strongly suggested I add two caveats here. First, we are working under the assumption that the major optima are large relative to the changes produced by the perturbation. Second, our interest in each optimum is based on its size, not whether it is global. Going back to our original example, let's say that the largest peak on our original landscape was 1,005 feet tall and the second largest was 1,000 feet even but after perturbation their heights were reversed. If we were interested in finding the global max, this would be a big deal, but to us the difference between the two landscapes is trivial.

These assumptions will be easier to justify when we start applying these concepts in the next post in the series. For now, though, just be warned that these are big assumptions that can't be made that often.

Wednesday, June 30, 2010

Bob Dylan, the Monkees and the flooded landscape analogy

Seth Godin's comments on Bob Dylan and the Monkees (which come to us via Gelman via DeWitt via Tinkers via Evers via Chance) got me thinking about fitness landscapes. Here's the quote:
Let me first describe a distinction between the Monkees and Bob Dylan. Bob Dylan gets laughed or booed off the stage every ten years, whether he wants to or not. He got booed off the stage when he went electric and again when he went gospel, and most recently with his horrendous Christmas album. The Monkees never get booed off stage, because the Monkees play "Last Train to Clarksville" exactly the same way they did it 30 or 40 years ago. Here's the thing: Bob Dylan keeps selling out stadiums and no one goes to see the Monkees, because the Monkees aren't doing anything worth noticing. There are people who have succeeded who just keep playing the same song over and over again, whatever that is that they do.
Think of a musician's career as a landscape where creative decisions like repertory, genre, style, arrangements give the location and concert sales are the fitness function. (see here and here for previous posts on landscapes)

In Godin's example, the Monkees have stuck very close to a local maximum that has sunk over the years (the sticking close part doesn't actually match reality all that well -- Mike Nesmith had a run of innovative and interesting projects in the early days of music video -- but for the sake of the post let's overlook that part). Any small to moderate change in repertory or arrangement or style would move them to a lower point on the landscape.

I think I may be stealing this from Stuart Kauffman, but let's flesh out the metaphor a bit and add water. Our landscape dwellers can travel freely on dry land but they can only swim very short distances. Exactly how does this relate to our real life example? Remember that altitude in our landscape corresponds to ticket sales. In order to stay viable, ticket sales for a touring act have to stay above a certain level. If the sales fall below that level, the act loses bookings and can no longer cover its expenses. Of course, like any other business, the act can run at a loss for a while (swim) but that's obviously not a long term solution.

Godin suggests that a willingness to, in our analogy, move to another optimum is the key to success. Dylan made the move and thrived. The Monkees stayed put and withered. But how comparable were the two situations?

Dylan had a steady source of income from other artists covering his songs. In landscape terms, he was a good swimmer (of course, so was Nesmith who got a tiny check every time you used that little Liquid Paper brush). More importantly, Dylan didn't have that far to swim. He might not even have needed to get wet. At least a portion of Dylan's fan base were going to stay with him no matter where he went on the musical landscape and given his reputation (and phenomenal talent, though I'm trying to leave that out of the discussion), there was a maximum waiting for him at pretty much every genre and subgenre of popular music. Those moves might not have been as artistically or commercially successful as the ones he made but Dylan was going to remain viable no matter where he went.

What about the Monkees? Musically they weren't a bad line-up. Dolenz was a veteran child actor, Jones was a Tony nominee for Oliver! and Tork and Nesmith were both accomplished musicians. Highly successful careers have certainly been built on less, but what did their career landscape look like? Compared to Dylan's collection of tightly-packed peaks, the Monkees had a lonely island surrounded by what looked like a large and empty ocean. The vast majority of their fan base was location specific. When they moved away from that location they hit deep water very quickly.
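
The flooded landscape is simple enough to sketch. Everything below is invented for illustration: the two strips of terrain, the water level and the swimming limit are assumptions, and the function name is mine. An act walks from one spot to another and fails if it has to spend more consecutive steps under water (below viable ticket sales) than it can swim.

```python
def can_reach(heights, start, goal, water_level, max_swim):
    """Walk from start to goal; fail if too many consecutive steps are under water."""
    step = 1 if goal > start else -1
    swum = 0
    for i in range(start + step, goal + step, step):
        if heights[i] < water_level:
            swum += 1                # another step spent running at a loss
            if swum > max_swim:
                return False
        else:
            swum = 0                 # back on dry land
    return True


WATER_LEVEL = 5   # hypothetical break-even ticket sales

# Tightly packed peaks versus a lonely island with a distant second one.
dylan = [9, 4, 8, 3, 9, 4, 8]
monkees = [1, 1, 1, 9, 1, 1, 1, 1, 8]

print(can_reach(dylan, 0, 6, WATER_LEVEL, max_swim=1))     # True
print(can_reach(monkees, 3, 8, WATER_LEVEL, max_swim=1))   # False
```

Same swimming ability, same water level, very different outcomes. The difference is in the terrain, not in the willingness to move.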

It is, of course, possible that the group could have focused on coming up with new songs and a new sound with the hope of finding a new audience. This is a dynamic landscape, and where the artist chooses to go is one of the factors that affects it. There might not be a concert market for the Monkees playing new grass or thrash metal now but that doesn't mean there won't be one in the future. Sometimes, by playing music no one wants to hear, you can create a demand for that music. To return to the landscape analogy, treading water in one spot can cause an island to rise up beneath you. It has been known to happen but it's probably not something you want to count on.

In the case of the Monkees, the water-treading strategy would be particularly risky since their reputation is likely to work against them if they try something radically new. This is probably why Nesmith chose to use his own much less well known name for the Grammy-winning Elephant Parts rather than trying to sell it as a Monkees project.

Which brings us back to Mr. Godin and the advice books he and other business gurus dump on the market every year. These books gush out at such a rate that there are actually companies that put out fifteen page versions so that executives can at least give the impression that they have read the latest releases. The Dylan/Monkees example is sadly representative. It takes one of business gurus' favorite truisms (take risks, i.e. move out of your comfort zone, i.e. they laughed at Henry Ford), bills it as a fundamental key to fabulous success (fabulous as in fabled as in obviously untrue) then backs it up with an irrelevant but impressive sounding example.

Godin is telling businesses to be like Bob Dylan and to make radical moves that may piss off their customers and invite scorn and mockery. The trouble is very few businesses are Dylan-at-Newport. The majority are the Monkees-at-the-state-fair. They have something they do reasonably well. If they stick close to their local maxima they can turn a decent profit and have a pretty good run. If they follow Mr. Godin's advice they will sink like a cinder block and never be heard from again.

Thursday, April 29, 2010

Landscapes and Lab Rats

In this post I discussed gradient searches and the two great curses of the gradient searcher, small local optima and long, circuitous paths. I also mentioned that by making small changes to the landscape being searched (in other words, perturbing it) we could sometimes (with luck) improve our search metrics without significantly changing the size and location of our optima.

The idea that you can use a search on one landscape to find the optima of a similar landscape is the assumption behind more than just perturbing. It is also the basis of all animal testing of treatments for humans. This brings genotype into the landscape discussion, but not in the way it's normally used.

In evolutionary terms, we look at an animal's genotype as a set of coordinates for a vast genetic landscape where 'height' (the fitness function) represents that animal's fitness. Every species is found on that landscape, each clustering around its own local maximum.

Genotype figures in our research landscape, but instead of being the landscape itself, it becomes part of the fitness function. Here's an overly simplified example that might clear things up:

Consider a combination of two drugs. If we use the dosage of each drug as an axis, this gives us something that looks a lot like our first example with drug A being north/south, drug B being east/west and the effect we're measuring being height. In other words, our fitness function has a domain of all points on our AB plane and a range corresponding to the effectiveness of that dosage. Since we expect genetics to affect how the subjects react to the drugs, genotype has to be part of that fitness function. If we ran the test on lab rats we would expect a different result than if we tested it on humans, but we would hope that the landscapes would be similar (or else there would be no point in using lab rats).
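Here is a minimal Python sketch of that setup. Everything in it is an assumption made for illustration: the names `BEST_DOSE`, `effect` and `best_dose` are hypothetical, the response surface is a simple made-up bowl, and the dose numbers are invented rather than taken from any study. The only point is that the test population is an input to the fitness function, so rats and humans give two similar but not identical landscapes over the same AB plane.

```python
# Hypothetical best doses per test population; every number here is invented.
BEST_DOSE = {"rat": (4, 6), "human": (5, 6)}


def effect(dose_a, dose_b, population):
    """Toy fitness function: height drops with squared distance from the best dose."""
    best_a, best_b = BEST_DOSE[population]
    return 100 - (dose_a - best_a) ** 2 - (dose_b - best_b) ** 2


def best_dose(population, doses=range(11)):
    """Check every point on the dosage grid and return the highest one."""
    return max(
        ((a, b) for a in doses for b in doses),
        key=lambda d: effect(d[0], d[1], population),
    )


rat_peak = best_dose("rat")
human_peak = best_dose("human")

# How good is the rat optimum when we score it on the human landscape?
print(rat_peak, human_peak, effect(*rat_peak, "human"))
```

In this toy case the peak found on the rat landscape sits one step away from the human peak and scores almost as well on the human landscape, which is the kind of similarity we are hoping for when we test on animals.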

Scientists who use animal testing are acutely aware of the problems of going from one landscape to another. For each system studied, they have spent a great deal of time and effort looking for the test species that functions most like humans. The idea is that if you could find an animal with, say, a liver that functions almost exactly like a human liver, you could do most of your controlled studies of liver disease on that animal and only use humans for the final stages.

As sound and appealing as that idea is, there is another way of looking at this.

On a sufficiently high level, with some important caveats, all research can be looked at as a set of gradient searches over a vast multidimensional landscape. With each study, researchers pick a point on the landscape, gather data in the region, then use their findings and those of other researchers to pick their next point.

In this context, important similarities between landscapes fall into two distinct categories: those involving the positions and magnitudes of the optima; and those involving the search properties of the landscape. Every point on the landscape corresponds to four search values: a max; the number of steps it will take to reach that max; a min; and the number of steps it will take to reach that min. Since we usually want to go in one direction (let's say maximizing), we can generally reduce that to two values for each point: the optimum of interest and the time to converge.
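To make the four search values concrete, here is a rough sketch on a one-dimensional toy landscape. The height list and the function names `search` and `search_values` are assumptions of mine, and the search is a plain best-neighbor walk standing in for whatever search procedure is actually being used.

```python
def search(heights, start, uphill=True):
    """Walk to the better neighbor until stuck; return (value reached, steps taken)."""
    sign = 1 if uphill else -1
    i, steps = start, 0
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbors, key=lambda j: sign * heights[j])
        if sign * heights[best] <= sign * heights[i]:
            return heights[i], steps  # no neighbor improves on this point
        i, steps = best, steps + 1


def search_values(heights, start):
    """The four values for one starting point: max, steps to max, min, steps to min."""
    top, steps_up = search(heights, start, uphill=True)
    bottom, steps_down = search(heights, start, uphill=False)
    return top, steps_up, bottom, steps_down


# Invented landscape: a small hill at index 3 and a tall peak at index 10.
heights = [0, 1, 2, 3, 2, 1, 4, 7, 10, 13, 16, 13, 10]

for start in (4, 7):
    print(start, search_values(heights, start))
```

Starting at index 4 the walk tops out on the small hill, while starting at index 7 it reaches the tall peak, so two points a few steps apart can carry very different search values.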

All of this leads us to an interesting and somewhat counterintuitive conclusion. When searching on one landscape to find the corresponding optimum of another, we are vitally interested in seeing a high degree of correlation between the size and location of the optima, but given that similarity between optima, similarity in search statistics is at best unimportant and at worst a serious problem.

The whole point of repeated perturbing then searching of a landscape is to produce a wide range of search statistics. Since we're only keeping the best one, the more variability the better. (Best here would generally be the one where the global optimum is associated with the largest region though time to converge can also be important.)

In animal testing, changing your population of test subjects perturbs the research landscape. So what? How does thinking of research using different test animals change the way that we might approach research? I'll suggest a few possibilities in my next post on the subject.

Monday, April 26, 2010

Fitness Landscapes, Ozark Style

[Update: part two is now up.]

I grew up with a mountain in my backyard... literally. It wasn't that big (here in California we'd call it a hill) but back in the Ozarks it was a legitimate mountain and we owned about ten acres of it. Not the most usable of land but a lovely sight.

That Ozark terrain is also a great example of a fitness landscape because, depending on which side you look at, it illustrates the two serious challenges for optimization algorithms. Think about a mountainous area at least partially carved out by streams and rivers. Now remove all of the rocks, water and vegetation and drop a blindfolded man somewhere in the middle, lost but equipped with a walking stick and a cell phone that can get a signal if he can get to a point with a clear line of sight to a cell tower.

With the use of his walking stick, the man has a reach of about six feet so he feels around in a circle, finds the highest point, takes two paces in that direction, then repeats the process (in other words, performs a gradient search). He quickly reaches a high point. That's the good news; the bad news is that he hasn't reached one of the five or six peaks that rise above the terrain. Instead, he has found the top of one of the countless hills and small mountains in the area.
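The blindfolded walk can be sketched in a few lines of Python. This is a simplified stand-in, not a model of the Ozarks: the terrain function `height` is invented (one tall peak and one small hill), the walker probes only eight compass directions instead of a full circle, and the names `height` and `hill_climb` are hypothetical. Strictly speaking it is a steepest-ascent hill climb, the discrete cousin of the gradient search described above.

```python
import math


def height(x, y):
    """Invented terrain: a tall peak at (8, 8) and a small hill at (2, 2)."""
    tall = 10.0 * math.exp(-((x - 8) ** 2 + (y - 8) ** 2) / 4.0)
    small = 3.0 * math.exp(-((x - 2) ** 2 + (y - 2) ** 2) / 4.0)
    return tall + small


def hill_climb(x, y, step=0.25, max_steps=10_000):
    """Probe eight directions, move to the highest probe, stop when none is higher."""
    directions = [
        (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
    ]
    for steps in range(max_steps):
        probes = [(x + dx * step, y + dy * step) for dx, dy in directions]
        best = max(probes, key=lambda p: height(*p))
        if height(*best) <= height(x, y):
            return (x, y), steps  # stuck on a local high point
        x, y = best
    return (x, y), max_steps


for start in [(1.0, 1.0), (6.0, 6.0)]:
    top, steps = hill_climb(*start)
    print(start, "->", top, "height", round(height(*top), 2), "in", steps, "steps")
```

A walker dropped at (1, 1) stops on the small hill and never learns that the tall peak exists, while one dropped at (6, 6) reaches it. Same procedure, different starting point, very different result.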

Realizing the futility of repeating this process, the man remembers that an engineer friend (who was more accustomed to thinking in terms of landscape minima) suggested that if they became separated he should go to the lowest point in the area so the friend would know where to look for him. The man follows his friend's advice only to run into the opposite problem. This time his process is likely to lead to his desired destination (if he crosses the bed of a stream or a creek he's pretty much set) but it's going to be a long trip (waterways have a tendency to meander).

And there you have the two great curses of the gradient searcher, numerous small local optima and long, circuitous paths. This particular combination -- multiple maxima and a single minimum associated with indirect search paths -- is typical of fluvial geomorphology and isn't something you'd generally expect to see in other areas, but the general problems of local optima and slow convergence show up all the time.

There are, fortunately, a few things we can do that might make the situation better (not what you'd call realistic things but we aren't exactly going for verisimilitude here). We could tilt the landscape a little or slightly bend or stretch or twist it, maybe add some ridges to some patches to give it that stylish corduroy look. (In other words, we could perturb the landscape.)

Hopefully, these changes shouldn't have much effect on the size and position of the major optima,* but they could have a big effect on the search behavior, changing the likelihood of ending up on a particular optimum and the average time to optimize. That's the reason we perturb landscapes; we're hoping for something that will give us a better optimum in a reasonable time. Of course, we have no way of knowing if our bending and twisting will make things better (it could just as easily make them worse), but if we do get good results from our search of the new landscape, we should get similar results from the corresponding point on the old landscape.
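A toy version of the tilt, again in Python and again built on assumptions: the height list, the size of the tilt and the names `climb` and `summarize` are all invented for the example. We search on either the original or the tilted landscape from every starting point, then score whatever we find on the original.

```python
def climb(heights, start):
    """Move to the higher neighbor until neither is higher; return (index, steps)."""
    i, steps = start, 0
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbors, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i, steps
        i, steps = best, steps + 1


# Invented landscape: a small hill at index 3 and a tall peak at index 10.
original = [0, 1, 2, 3, 2, 1, 4, 7, 10, 13, 16, 13, 10]
# The perturbation: tilt the whole thing upward to the right.
tilted = [h + 2 * i for i, h in enumerate(original)]


def summarize(search_on):
    """Search from every start, then score the end points on the ORIGINAL landscape."""
    ends = [climb(search_on, start) for start in range(len(search_on))]
    found = [original[i] for i, _ in ends]
    share_best = sum(h == max(original) for h in found) / len(found)
    mean_steps = sum(steps for _, steps in ends) / len(ends)
    return share_best, mean_steps


print("original:", summarize(original))
print("tilted:  ", summarize(tilted))
```

In this example the tilt leaves the tall peak where it was but erases the small hill, so every starting point now ends up on the tall peak instead of 8 out of 13. The price is a longer average walk, and a tilt in the other direction would have made things worse, which is the gamble described above.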

In the next post in the series, I'll try to make the jump from mountain climbing to planning randomized trials.

* I showed this post to an engineer who strongly suggested I add two caveats here. First, we are working under the assumption that the major optima are large relative to the changes produced by the perturbation. Second, our interest in each optimum is based on its size, not whether it is global. Going back to our original example, let's say that the largest peak on our original landscape was 1,005 feet tall and the second largest was 1,000 feet even, but after perturbation their heights were reversed. If we were interested in finding the global max, this would be a big deal, but to us the difference between the two landscapes is trivial.

These assumptions will be easier to justify when we start applying these concepts in the next post in the series. For now, though, just be warned that these are big assumptions that can't be made that often.

Thursday, April 15, 2010

While on the subject of evolution...

I can't miss a chance to recommend Ian Stewart's "Through the Evolvoscope," a clever and elegant discussion of fitness landscapes. I believe this originally appeared in Stewart's Scientific American column, but you can find it in Another Fine Math You've Gotten Me Into.

Way cool.