Newport’s So Good They Can’t Ignore You: Why Skills Trump Passion in the Quest for Work You Love

I suspect I would have enjoyed Cal Newport’s So Good They Can’t Ignore You more if it had been written by a grumpy armchair economist. Newport’s advice is just what you would expect that economist to give:

  • Get good at what you do (build human capital), then someone might be willing to pay you for it. If you simply follow your passion but you don’t offer anything of value, you likely won’t succeed.
  • If you become valuable, you might be able to leverage that value into control of your career and a mission. Control without value is dangerous – ask anyone who tries to set up their own business or passive income website without having something that people are willing to pay for.

Since we have Newport’s version and not the grumpy economist’s, the advice is framed somewhat less bluntly and Newport tells us a series of stories about people who became good (through deliberate practice) and leveraged that skill into a great career. It’s not easy to connect with many of the examples – TV hosts, script writers, organic farmers, a programmer working at the boundary of programming and music – but I suppose they are more interesting than stories of those in dreary jobs who simply bite the bullet, skill up and get promoted.

In TED / self-help style, Newport introduces us to a new set of buzzwords (“career capital”, “craftsman mindset” etc.) and “laws”. I’m glad Newport independently discovered the “law of financial viability” – do what people are willing to pay for – but at many points in the book we are left witnessing a battle between common sense and “conventional wisdom” rather than the discovery of deep new insights.

One piece of advice the economist might not have given concerns how to find a mission. Newport’s advice is that you should become so skilled that you are at the frontier of your field. You then might be able to see new possibilities in the “adjacent possible” that you can turn into a mission. And not only does the approach need to be cutting edge, it should also be “remarkable”, defined as being so compelling that people remark on it, and launched in a venue that compels remarking (luckily Newport has the venue of peer review…). This might be interesting advice for a few people, but I suspect it is not a lot of help for the average person stuck behind a desk.

Despite entering the book with a high degree of mood affiliation – I believe the basic argument is right – there was little in the book that convinced me either way. The storytelling and buzzwords were accompanied by little data. Threads such as those on the 10,000 hours rule and the unimportance of innate ability were somewhat off-putting.

That said, some points were usefully framed. Entering the workplace expecting to follow your passion will likely lead to chronic unhappiness and job shifting. Instead, suck it up and get good. There are a lot of books and blogs encouraging you to follow your passion, and most of them are garbage. So if more people follow Newport’s fluffed up way of giving some basic economic advice, that seems like a good thing.

Rosenzweig’s Left Brain, Right Stuff: How Leaders Make Winning Decisions

I was triggered to write my recent posts on overconfidence and the illusion of control – pointing to doubts about the pervasiveness of these “biases” – by Phil Rosenzweig’s entertaining Left Brain, Right Stuff: How Leaders Make Winning Decisions. Some of the value of Rosenzweig’s book comes from his examination of some classic behavioural findings, as those recent posts show. But his major point concerns the application of behavioural findings to real-world decision making.

Rosenzweig’s starting point is that laboratory experiments have greatly added to our understanding of how people make decisions. By carefully controlling the setup, we are able to focus on individual factors affecting decisions and tease out where decision making might go wrong (replication crisis notwithstanding). One result of this body of work is the famous catalogue of heuristics and biases by which we depart from the model of the perfectly rational decision maker.

Some of this work has been applied with good results to areas such as public policy, finance or forecasting political and economic events. Predictable errors in how people make decisions have been demonstrated, and in some cases substantial changes in behaviour have been generated by changing the decision environment.

But as Rosenzweig argues – and this is the punchline of the book – this research does not easily translate across to many areas of decision-making. Laboratory experiments typically involve choices from options that cannot be influenced, involve absolute payoffs, provide quick feedback, and are made by individuals rather than leaders. Change any of these elements, and crude applications of the laboratory findings to the outside world can go wrong. In particular, we should be careful not to treat predictions about events we cannot influence as equivalent to decisions where we can shape the outcome and are competing with others.

Let’s take the first, whether outcomes can be influenced. Professional golfers believe they sink around 70 per cent of their 6 foot putts, compared to an actual success rate closer to 55 per cent. This is typically labelled as overconfidence and an error (although see my recent post on overconfidence).

Now, is this irrational? Not necessarily, suggests Rosenzweig, as the holder of the belief can influence the outcome. Thinking you are better at sinking six-foot putts than you actually are increases the chance that you will sink them.

In one experiment, participants putted toward a hole that was made to look bigger or smaller by using lighting to create an optical illusion. Putting from a little less than six feet, the (amateur) participants sank almost twice as many putts when putting toward the larger looking hole. They were more likely to sink the putts when it appeared an easier task.

This points to the question of whether we want to ward off biases. Debiasing might be good practice if you can’t influence the outcome, but if it’s up to you to make something happen, that “bias” might be an important part of making it happen.

More broadly, there is evidence that positive illusions allow us to take action, cope with adversity and persevere in the face of competition. Positive people have more friends and stronger social bonds, suggesting a “healthy” person is not necessarily someone who sees the world exactly as it is.

Confidence may also be required to lead people. If confidence is required to inspire others to succeed, it may be necessary rather than excessive. As Rosenzweig notes, getting people to believe they can perform is the supreme act of leadership.

A similar story about the application of laboratory findings concerns the difference between relative and absolute payoffs. If payoffs are relative, playing it safe may guarantee failure. The person who comes out ahead will almost always be the one who takes the bigger risk, meaning that an exaggerated level of confidence may be essential to operate in some areas – although, as Rosenzweig argues, the “excessive” risk may be calculated.

One section of the book focuses on people starting new ventures. With massive failure rates – around 50% after five years, depending on the study – entrepreneurs are commonly labelled overconfident or naive. Sometimes their “reckless ambition” and “blind faith” are praised as necessary for the broader economic benefits that flow from new business formation. (We rarely hear people lamenting that we aren’t starting enough businesses.)

Rosenzweig points to evidence that calls this view into question – from the evidence of entrepreneurs as persistent tinkerers rather than bold, arrogant visionaries, to the constrained losses they incur in the event of failure. While there are certainly some wildly overconfident entrepreneurs, closure of a business should not always be taken as failure, nor overconfidence as the cause. There are many types of errors – calculation, memory, motor skills, tactics and so on – and even good decisions sometimes turn out badly. Plus, as many as 92% of firms close with no debt – 25% with a profit.

Rosenzweig also notes evidence that, at least in an experimental setting, entrepreneurs enter at less than optimal rates. As noted in my recent post on overconfidence, people tend to overplace themselves relative to the rest of the population for easy tasks (e.g. most drivers believe they are above average). But for hard tasks, they underplace. In experiments on firm entry, Don Moore and friends found a similar effect – excess entry when the industry appeared an easy one in which to compete, but too little entry when it appeared difficult. Hubristic entrepreneurs didn’t flood into all areas, and myopia about one’s own and competing firms’ abilities appears a better explanation for the pattern of entry than overconfidence.

There is the occasional part of the book that falls flat with me – the section on the limitations of mathematical models and some of the storytelling around massive one-off decisions – but it’s generally a fine book.

* Listen to Russ Roberts interview Rosenzweig on EconTalk for a summary of some of the themes from the book.

The illusion of the illusion of control

In the spirit of my recent post on overconfidence, the illusion of control is another “bias” where imperfect information might be a better explanation for what is occurring.

The illusion of control is a common finding in psychology that people believe they can control things that they cannot. People prefer to pick their own lottery numbers rather than have them randomly allocated – and are even willing to pay for the privilege. In laboratory games, people often report having control over outcomes that were randomly generated.

This effect was labelled by Ellen Langer as the illusion of control (for an interesting read about Langer’s other work, see here). The decision making advice that naturally flows out of this – and you will find in plenty of books building on the illusion of control literature – is that we need to recognise that we can control less than we think. Luck plays a larger role than we believe.

But when you ask about people’s control of random events, which is the typical experimental setup in this literature, you can only get errors in one direction – the belief that they have more control than they actually do. It is not possible to believe you have less than no control.

So what do people believe in situations where they do have some control?

In Left Brain, Right Stuff, Phil Rosenzweig reports on research (pdf) by Francesca Gino, Zachariah Sharek and Don Moore in which people had varying degrees of control over whether clicking a mouse would change the colour of the screen. Those who had little or no control (clicking the mouse worked 0% or 15% of the time) tended to believe they had more control than they did – an illusion of control.

But when it came to those who had high control (clicking the mouse worked 85% of the time), they believed they had less control than they did. Rather than having an illusion of control, they failed to recognise the degree of control that they had. The one point where there was accurate calibration was when there was 100% control.

The net finding of this and other experiments is that we don’t systematically have an illusion of control. Rather, we have imperfect information about our level of control. When control is low, we tend to overestimate it. When it is high (but not perfect), we tend to underestimate it.
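To see how imperfect information alone could generate that pattern, here is a minimal sketch (my own toy model and numbers, not the Gino, Sharek and Moore design): estimate your level of control from a handful of noisy trials, shrinking the estimate toward a middling prior.

```python
import random

def perceived_control(true_control, trials=20, prior=0.5, prior_weight=10):
    """Toy model: infer control from noisy trials, shrunk toward a vague prior."""
    successes = sum(random.random() < true_control for _ in range(trials))
    return (successes + prior * prior_weight) / (trials + prior_weight)

random.seed(1)
for true_control in (0.0, 0.15, 0.5, 0.85):
    estimates = [perceived_control(true_control) for _ in range(10_000)]
    mean_estimate = sum(estimates) / len(estimates)
    print(f"true control {true_control:.2f} -> mean perceived control {mean_estimate:.2f}")
# Low levels of control are overestimated and high (but imperfect) levels are
# underestimated, without any systematic "illusion" built into the model.
```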

That the illusion of control was previously seen to be acting largely in one direction was due to experimental design. When people have no control and can only err in one way, that is naturally what will be found. Gino and friends term this problem the illusion of the illusion of control.

So when it comes to decision making advice, we need to be aware of the context. If someone is picking stocks or something of that nature, the illusion of control is not helpful. But in their day-to-day life where they have influence over many outcomes, underestimating control could be a serious error.

Should we be warning against underestimating control? If we are going to err consistently in one direction, it is not clear to me that an illusion of control is the greater concern. Maybe we should err on the side of believing we can get things done.

*As an aside, there is a failed replication (pdf) of one of Langer’s 1975 experiments from the paper for which the illusion is named.

Overconfident about overconfidence

In 1995 Werner De Bondt and Richard Thaler wrote “Perhaps the most robust finding in the psychology of judgment and choice is that people are overconfident.” They are hardly alone in making such a proclamation. And looking at the evidence, they seem to have a case. Take the following examples:

  • When asked to estimate the length of the Nile by providing a range the respondent is 90% sure contains the correct answer, the estimate typically contains the correct answer only 50% of the time.
  • PGA golfers typically believe they sink around 75% of 6 foot putts – some even believe they sink as many as 85% – when the average is closer to 55%.
  • 93% of American drivers rate themselves as better than average. 25% of high school seniors believe they are in the top 1% in ability to get along with others.

There is a mountain of similar examples, all seemingly making the case that people are generally overconfident. 

But despite all being labelled as showing overconfidence, these examples are actually quite different. As pointed out by Don Moore and Paul Healy in “The Trouble with Overconfidence” (pdf), several different phenomena are being captured. Following Moore and Healy, let’s call them overprecision, overestimation and overplacement.

Overprecision is the tendency to believe that our predictions or estimates are more accurate than they actually are. The typical study seeking to show overprecision asks for someone to give confidence ranges for their estimates, such as estimating the length of the Nile. The evidence that we are overprecise is relatively robust (although I have to admit I haven’t seen any tests asking for 10% confidence intervals).

Overestimation is the belief that we can perform at a level beyond that which we realistically can (I tend to think of this as overoptimism). The evidence here is more mixed. When attempting a difficult task such as a six-foot putt, we typically overestimate. But on easy tasks, the opposite is often the case – we tend to underestimate our performance. Whether over- or underestimation occurs depends on the domain.

Overplacement is the erroneous relative judgment that we are better than others. Obviously, we cannot all be better than average. But this relative judgment, like overestimation, tends to vary with task difficulty. For easy tasks, such as driving a car, we overplace and consider ourselves better than most. But as Phil Rosenzweig points out in his book Left Brain, Right Stuff (which contains a great summary of Moore and Healy’s paper), ask people where they rate for a skill such as drawing, and most people will rate themselves as below average. People don’t suffer from pervasive overplacement. Whether they overplace depends on what the situation is.

You might note from the above that we tend to both underestimate and overplace our performance on easy tasks. We can also overestimate but underplace our performance on difficult tasks.

So are we both underconfident and overconfident at the same time? The blanket term of overconfidence does little justice to what is actually occurring.

Moore and Healy’s explanation for what is going on in these situations is that, after performing a task, we have imperfect information about our own performance, and even less perfect information about that of others. As Rosenzweig puts it, we are myopic, which is a better descriptor of what is going on than saying we are biased.

Consider an easy task. We do well because it is easy. But because we imperfectly assess our performance, our assessment is regressive – that is, it tends to revert to the typical level of performance. Since we have even less information about others, our assessment of them is even more regressive. The net result is we believe we performed worse than we actually did but better than others.
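Here is a rough simulation of that logic (the parameter choices are mine, not Moore and Healy’s): everyone scores well on an easy task, but each person reads their own score through noise and a prior, and reads others’ scores through even more noise.

```python
import random

def regressive_estimate(actual, weight_on_signal, prior=5.0, noise_sd=1.0):
    """Blend a noisy signal of performance (0-10 scale) with a prior expectation."""
    signal = actual + random.gauss(0, noise_sd)
    return weight_on_signal * signal + (1 - weight_on_signal) * prior

def mean(xs):
    return sum(xs) / len(xs)

random.seed(2)
actual_scores = [random.gauss(9.0, 0.5) for _ in range(10_000)]  # an easy task

own_estimates = [regressive_estimate(s, weight_on_signal=0.7) for s in actual_scores]
# We know even less about others, so estimates of them regress further to the prior.
other_estimates = [regressive_estimate(s, weight_on_signal=0.4) for s in actual_scores]

print(f"actual mean score:       {mean(actual_scores):.2f}")
print(f"estimated own score:     {mean(own_estimates):.2f}")    # below actual: underestimation
print(f"estimated others' score: {mean(other_estimates):.2f}")  # below own: overplacement
```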

Rosenzweig provides a couple of more intuitive examples of myopia at work. Taking one, we know about our excellent driving record and that there are plenty of people out there who die in car accidents. With a narrow view of that information, it seems logical to place ourselves above average.

But when considering whether we are an above or below average juggler, the knowledge of our own ineptitude and the knowledge of the existence of excellent jugglers makes for a myopic assessment of being below average. In one example Rosenzweig cites, 94% of students believed they would be below average in a quiz on indigenous Amazon vegetation – hardly a tendency for overplacement, but rather the result of myopic consideration of the outcomes from a difficult task.

The conflation of these different effects under the umbrella of overconfidence often plays out in stories of how overconfidence (rarely assessed before the fact) led to someone’s fall. Evidence that people tend to believe they are better drivers than average (overplacement) is not evidence that overconfidence led someone to pursue a disastrous corporate merger (overestimation). Evidence that people tend to be overprecise in estimating the year of Mozart’s birth is not evidence that hubris led the US into the Bay of Pigs fiasco.

Putting this together, claims that we are systematically overconfident can be somewhat overblown and misapplied. I am not sure Moore and Healy’s labelling is the best available, but recognising that differing forces are at play seems important in understanding how “overconfidence” affects our decisions.

Henrich’s The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter

When humans compete against chimps in tests of working memory, information processing or strategic play, chimps often come out on top. If you briefly flash 10 digits on a screen before covering them up, a trained chimp will often better identify the order in which the numbers appeared (see here). Have us play matching pennies, and the chimp can converge on the predicted (Nash equilibrium) result faster than the slow to adapt humans.

So given humans don’t appear to dominate chimps in raw brain power (I’ll leave contesting this particular fact until another day), what can explain the ecological dominance of humans?

Joe Henrich’s answer to this question, laid out in The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter, is that humans are superior learning machines. Once there is an accumulated stock of products of cultural evolution – fire, cutting tools, clothing, hunting tools and so on – natural selection favoured those who were better cultural learners. Natural selection shaped us to be a cultural species, as Henrich explains:

The central argument in this book is that relatively early in our species’ evolutionary history, perhaps around the origin of our genus (Homo) about 2 million years ago … cultural evolution became the primary driver of our species genetic evolution. The interaction between cultural and genetic evolution generated a process that can be described as autocatalytic, meaning that it produces the fuel that propels it. Once cultural information began to accumulate and produce cultural adaptations, the main selection pressure on genes revolved around improving our psychological abilities to acquire, store, process, and organize the array of fitness-enhancing skills and practices that became increasingly available in the minds of the others in one’s group. As genetic evolution improved our brains and abilities for learning from others, cultural evolution spontaneously generated more and better cultural adaptations, which kept the pressure on for brains that were better at acquiring and storing this cultural information.

The products of cultural evolution make us (in a sense) smarter. We receive a huge cultural download when growing up, from a base 10 counting system, to a large vocabulary allowing us to communicate complex concepts, to the ability to read and write, not to mention the knowhow to survive. Henrich argues that we don’t have all these tools because we are smart – we are smart because we have these tools. These cultural downloads can’t be devised in a few years by a few smart people. They comprise packages of adaptations developed over generations.

As one illustration of this point, Henrich produces a model in which people can be either geniuses who produce more ideas, or social types with more friends. Parameterise the model right and the social groups end up much “smarter”, with a larger stock of ideas. It is better to be able to learn and have more friends to learn from (again, within certain parameters) than to have fewer, smarter friends. The natural extension of this is that larger populations will have more complex technologies (as Michael Kremer and others have argued – although see my extension on the evolving capacity to generate ideas).

One interesting feature of these cultural adaptations is that their bearers don’t necessarily understand how they work. They simply know how to use them effectively. An example Henrich draws on is food processing techniques developed over generations to remove toxins from otherwise inedible plants. People need, to a degree, to learn on faith. An unwillingness to learn can kill.

Take the consumption of unprocessed manioc (cassava), which can cause cyanide poisoning. South American groups that have consumed it for generations have developed multi-stage processes involving grating, scraping, separating, washing, boiling and waiting. Absent those, the poisoning emerges slowly after years of eating. Given the non-obvious nature of the negative outcomes and link between the practices and outcomes, the development of processing techniques is a long process.

When manioc was transported from South America to West Africa by the Portuguese, minus the cultural protocols, the result was hundreds of years of cyanide poisoning – a problem that remains today. Some African groups have evolved processing techniques to remove the cyanide, but these are only slowly spreading.

Beyond the natural selection for learning ability, Henrich touches on a few other genetic and biological angles. One of the more interesting is the idea that gene-culture co-evolution can lead to non-genetic biological adaptations. The culture we are exposed to shapes our minds during development, leading to taxi drivers in London having a larger hippocampus, or people from different cultures having different perceptual ability when it comes to judging relative or absolute size. Growing up in different cultures also alters fairness motivations, patience, response to honour threats and so on.

Henrich is right to point out that observing differences between groups does not imply that those differences are cultural. They could be genetic, with different cultures over time having moulded group differences. That said, Henrich also suggests genes play a tiny role, although it’s not a position brimming with analysis. As an example, he points to the high levels of violence among Scottish immigrants in the US Deep South, who transported and retained an honour culture, compared to the low levels of violence in Scotland itself (or in New England, where there were also Scottish immigrants), without investing much effort in exploring other possibilities.

Henrich briefly addresses some of the competing hypotheses for why we evolved large brains and developed a theory of mind (the ability to infer others’ goals). For example, the Machiavellian hypothesis posits that our brains evolved to outthink each other in strategic competition. As Henrich notes, possessing a theory of mind can also allow us to more effectively copy and learn from others (the cultural intelligence hypothesis). Successful Machiavellians must be good cultural learners – you need to learn the rules before you can bend them.

Since the release of Henrich’s book, I have seen little response from the Steven Pinkers and evolutionary psychologists of the world, and I am looking forward to some critiques of Henrich’s argument.

So let me pose a few questions. As a start, until the last few hundred years, most of the world’s population didn’t use a base 10 counting system, couldn’t write and so on. Small scale societies might have a vocabulary of 3,000 to 5,000 words, compared to the 40,000 to 60,000 words held in the mind of the typical American 17-year old. The cultural download has shifted from something that could be passed on in a few years to something that takes a couple of decades of solid learning. Why did humans have so much latent capacity to increase the size of the cultural download? Was that latent capacity possibly generated by other mechanisms? Or has there been strong selection to continue to increase the stock of cultural knowledge we can hold?

Second, is there any modern evidence for the success of those with better cultural learning abilities? We have evidence of the higher reproductive success of those who kill in battle (see Napoleon Chagnon’s work) or those with higher income. What would an equivalent study showing the higher reproductive success of better cultural learners look like (assuming selection for that trait is still ongoing)? Or is it superior learning ability that leads people to higher success in battle or greater income? And in that case, are we just talking IQ?

Having been reading about cultural evolution for a few years now, I still struggle to grasp the extent to which it is a useful framework.

Partly, this question arises due to the lack of a well-defined cultural evolution framework. The definition of culture is often loose (see Arnold Kling on Henrich’s definition) and it typically varies between cultural evolution proponents. Even once it is defined, what is the evolutionary mechanism? If it is natural selection, what is the unit of selection? And so on.

Then there is the question of whether evolution is the right framework for all forms of cultural transmission. Are models for the spread of disease a better fit? You will find plenty of discussions of this type of question across the cultural evolution literature, but little convergence.

Contrast cultural evolution with genetic natural selection. In the latter, high fidelity information is transmitted from parent to offspring in particulate form. Cultural transmission (whatever the cultural unit is) is lower-fidelity and can be in multiple directions. For genetic natural selection, selection is at the level of the gene, but the future of a gene and its vessels are typically tightly coupled within a generation. Not so with culture. As a result we shouldn’t expect to see the types of results we see in population/quantitative genetics in the cultural sphere. But can cultural evolution get even close?

You get a flavour of this when you look through the bespoke models produced in Henrich’s past work or, say, the work by Boyd and Richerson. Lots of interesting thinking tools and models, but hardly a unified framework.

A feature of the book that I appreciated was that Henrich avoided framing the group-based cultural evolutionary mechanisms he describes as “group selection”, preferring instead to call them “intergroup competition” (the term group selection only appears in the notes). In the cultural evolution space, group selection is a label that tends to be attached to all sorts of dynamics – whether they resemble genetic group selection processes or not – only leading to confusion. Henrich notes at one point that there are five forms of intergroup competition, perhaps one of which might be described as approaching a group selection mechanism. (See West and friends on the point that, in much of the cultural evolution literature, group selection is used to refer to many different things.) By avoiding going down this path, Henrich has thankfully not added to the confusion.

One thread that I have rarely seen picked up in discussion of the book (excepting Arnold Kling) is the inherently conservative message that can be taken out of it. A common story through the book is that the bearers of cultural adaptations rarely understand how they work. In that world, one should be wary of replacing existing institutions or frameworks.

When Henrich offers his closing eight “insights”, he also seems to be suggesting we use markets (despite the absence of that word). Don’t design what we believe will work and impose it on all:

Humans are bad at intentionally designing effective institutions and organizations, although I’m hoping that as we get deeper insights into human nature and cultural evolution this can improve. Until then, we should take a page from cultural evolution’s playbook and design “variation and selection systems” that will allow alternative institutions or organizational forms to compete. We can dump the losers, keep the winners, and hopefully gain some general insights during the process.

Jones’s Hive Mind: How Your Nation’s IQ Matters So Much More Than Your Own

Garett Jones has built much of his excellent Hive Mind: How Your Nation’s IQ Matters So Much More Than Your Own on foundations that, while relatively well established, are likely surprising (or even uncomfortable) for some people. Here’s a quick list off the top of my head:

  • High scores in one area of IQ tests tend to show up in others – be that visual, maths, vocabulary etc. The “g factor” can capture almost half of the variation in performance across the different tests.
  • IQ is as good as the best types of interviews at predicting employee performance (and most interviews aren’t the “best type”).
  • IQ is the best single predictor of executive performance, and of performance in the middle-to-high end of the workforce.
  • IQ predicts practical social skills. If you know someone’s IQ and are trying to predict job or school performance, there is little benefit in learning their EQ score. Conversely, if you only know their EQ score, their IQ score still adds valuable information.
  • IQ scores in poor countries predict earning power, just as they do in developed countries.
  • Scores on tests such as PISA are better predictors of a country’s economic performance than years of education.
  • Corruption correlates strongly (negatively) with IQ.
  • IQ scores are good predictors of cooperative behaviour.

And so on.

On that last point, there was one element that I had not fully appreciated. Jones reports an experiment in which players were paired in a cooperative game. High-IQ pairs were five times more cooperative than high-IQ individuals. The link between IQ and cooperation came from smart pairs of players, not smart individual players.

Once you put all those pieces together, you reach the punchline of the book, which is an attempt to understand why the link between income and IQ, while positive both across and within countries, is of a larger magnitude across countries.

Jones’s argument builds on Michael Kremer’s classic paper, The O-Ring Theory of Economic Development. Kremer’s insight was that if production in an economy consists of many discrete tasks and failure in any one of those tasks can ruin the final output (such as an O-ring failure on a space shuttle), small differences in skills can drive large differences in output between firms. This can lead to high levels of inequality as the high-skilled work together in the same firms, leading them to be disproportionately more productive.
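As a back-of-the-envelope illustration (using the textbook form of the O-ring production function, where output is proportional to the product of task success rates, rather than anything specific from Jones’s book): small skill gaps compound multiplicatively across tasks.

```python
def oring_output(skill, n_tasks):
    """O-ring style output: every one of n_tasks must succeed, so expected
    output scales with skill ** n_tasks."""
    return skill ** n_tasks

for n_tasks in (1, 5, 10, 20):
    high = oring_output(0.95, n_tasks)
    low = oring_output(0.90, n_tasks)
    print(f"{n_tasks:>2} tasks: high-skill {high:.3f}, low-skill {low:.3f}, "
          f"ratio {high / low:.2f}")
# A five-point skill gap is a ~6% output gap on a single task, but roughly a
# threefold gap once production involves 20 fragile tasks.
```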

Jones extends Kremer’s argument by contemplating what the world would look like if it comprised a combination of what he calls an O-ring sector and a foolproof sector. Here’s what I wrote previously about Jones’s argument, based on an article he wrote:

The foolproof sector is not as fragile as the more complex O-ring sector and includes jobs such as cleaning, gardening and other low-skill occupations. The key feature of the foolproof sector is that being of low skill (which Jones suggests relates more to IQ than quantity of education) does not necessarily destroy the final product. It only reduces the efficiency with which it is produced. A couple of low-skill workers can substitute for a high-skill worker in the foolproof sector, but they cannot effectively fill the place of a high-skill O-ring sector worker, no matter how many low-skill workers are supplied.

In this economy, low-skill workers will work in the foolproof sector as these firms will pay them more than an O-ring sector firm. High-skill workers are found in both sectors, with their level of participation in each sector such that high-skill workers are paid the same regardless of which sector they work in (the law of one price).

Thus, within a country, firms will pay high-skill workers more than their low-skill counterparts, but not dramatically so. Their wage differential is determined by the difference in their outputs in the foolproof sector.

Across countries, however, things are considerably different. The highest skill workers in a country provide labour for the O-Ring sector. If they are low skilled relative to the high-skilled in other countries, their output in that fragile sector will be much lower. This occurs even for relatively small skill differences. Their income will reflect their low output, with wages also lower in the foolproof sector as high-skill workers apportion themselves between sectors such that the law of one price holds. The net result is much lower wages for workers in comparison to another country with a higher-skill elite.

The picture is a bit more subtle than that, depending on the mix of skills in the economy (which Jones describes in more detail in both the paper and book). But the basic pattern of large income gaps between countries and small gaps within is relatively robust.

One thing I would have liked to see more of in the book – although I suspect this might have been somewhat counter to Jones’s objective – is Jones challenging some of the research. At times it feels like he is tiptoeing through a minefield – the book is peppered with distracting qualifications that you feel he has to make to broaden its audience.

But that said, I’m likely not the target audience. And I like the thought of that new audience hearing what Jones has to say.

Mandelbrot (and Hudson’s) The (mis)Behaviour of Markets: A Fractal View of Risk, Ruin, and Reward

If you have read Nassim Taleb’s The Black Swan you will have come across some of Benoit Mandelbrot’s ideas. However, Mandelbrot and Hudson’s The (mis)Behaviour of Markets: A Fractal View of Risk, Ruin, and Reward offers a much clearer critique of the underpinnings of modern financial theory (there are many parts of The Black Swan where I’m still not sure I understand what Taleb is saying). Mandelbrot describes and pulls apart the contributions of Markowitz, Sharpe, Black, Scholes and friends in a way likely understandable to the intelligent lay reader. I expect that might flow from science journalist Richard Hudson’s involvement in writing the book.

Mandelbrot’s critique rests on two main pillars. The first is that – seemingly stating the obvious – markets are risky. Less obviously, Mandelbrot’s point is that market changes are more violent than often assumed. Second, trouble runs in streaks.

While Mandelbrot’s critique is compelling, it’s much harder to construct plausible alternatives. Mandelbrot offers two new metrics – α (a measure of how wildly prices vary) and H (a measure of the dependence of price changes upon past changes) – but as he notes, the method used to calculate each can result in wild variation in those measures themselves. On H, he states that “If you look across all the studies to date, you find a perplexing range of H values and no clear pattern among them.”

I’ll close this short note with a brief excerpt from near the end of the book painting a picture of what it is like to live in the world Mandelbrot describes (which just happens to be our world):

What does it feel like, to live through a fractal market? To explain, I like to put it in terms of a parable:

Once upon a time, there was a country called the Land of Ten Thousand Lakes. Its first and largest lake was a veritable sea 1,600 miles wide. The next biggest lake was 919 miles across; the third, 614; and so on down to the last and smallest at one mile across. An esteemed mathematician for the government, the Kingdom of Inference and Probable Value, noticed that the diameters scaled downwards according to a tidy, power-law formula.

Now, just beyond this peculiar land lay the Foggy Bottoms, a largely uninhabited country shrouded in dense, confusing mists and fogs through which one could barely see a mile. The Kingdom resolved to chart its neighbour; and so the surveyors and cartographers set out. Soon, they arrived at a lake. The mists barred their sight of the far shore. How broad was it? Before embarking on it, should they provision for a day or a month? Like most people, they worked out what they knew: They assumed this new land was much like their own and that the size of lakes followed the same distribution. So, as they set off blindly in their boats, they assumed they had at least a mile to go and, on average, five miles.

But they rowed and rowed and found no shore. Five miles passed, and they recalculated the odds of how far they had to travel. Again, the probability suggested: five miles to go. So they rowed further – and still no shore in sight. They despaired. Had they embarked upon a sea, without enough provisions for the journey? Had the spirits of these fogs moved the shore?

An odd story, but one with a familiar ring, perhaps, to a professional stock trader. Consider: The lake diameters vary according to a power law, from largest to smallest. Once you have crossed five miles of water, odds are you have another five to go. If you are still afloat after ten miles, the odds remain the same: another ten miles to go. And so on. Of course, you will hit shore at some point; yet at any moment, the probability is stretched but otherwise unchanged.

Why prediction is pointless

One of my favourite parts of Philip Tetlock’s Expert Political Judgment is his chapter examining the reasons for “radical skepticism” about forecasting. Radical skeptics believe that Tetlock’s mission to improve forecasting of political and economic events is doomed as the world is inherently unpredictable (beyond conceding that no expertise was required to know that war would not erupt in Scandinavia in the 1990s). Before reading Expert Political Judgment, I largely fell into this radical skeptic camp (and much of me still resides in it).

Tetlock suggests skeptics have two lines of intellectual descent – ontological skeptics who argue that the properties of the world make prediction impossible, and psychological skeptics who point to the human mind as being unsuited to teasing out any predictability that might exist. Below are excerpts of Tetlock’s examinations of each (together with the occasional rejoinder by Tetlock).

Ontological skeptics

Path dependency and punctuated equilibria

Path-dependency theorists argue that many historical processes should be modeled as quirky path-dependent games with the potential to yield increasing returns. They maintain that history has repeatedly demonstrated that a technology can achieve a decisive advantage over competitors even if it is not the best long-run alternative. …

Not everyone, however, is sold on the wide applicability of increasing-returns, path-dependency views of history. Traditionalists subscribe to decreasing-returns approaches that portray both past and future as deducible from assumptions about how farsighted economic actors, working within material and political constraints, converge on unique equilibria. For example, Daniel Yergin notes how some oil industry observers in the early 1980s used a decreasing-returns framework to predict, thus far correctly, that OPEC’s greatest triumphs were behind it. They expected the sharp rises in oil prices in the late 1970s to stimulate conservation, exploration, and exploitation of other sources of energy, which would put downward pressure on oil prices. Each step from the equilibrium is harder than the last. Negative feedback stabilizes social systems because major changes in one direction are offset by counterreactions. Good judges appreciate that forecasts of prolonged radical shifts from the status quo are generally a bad bet.

Complexity theorists

Embracing complexity theory, they argue that history is a succession of chaotic shocks reverberating through incomprehensibly intricate networks. To back up this claim, they point to computer simulations of physical systems that show that, when investigators link well-established nonlinear relationships into positive feedback loops, tiny variations in inputs begin to have astonishingly large effects. …

McCloskey illustrates the point with a textbook problem of ecology: predicting how the population of a species next year will vary as a function of this year’s population. The model is x_{t+1} = f(x_t), a one-period-back nonlinear differential equation. The simplest equation is the hump: x_{t+1} = βx_t[1 – x_t], where the tuning parameter, β, determines the hump’s shape by specifying how the population of deer at t + 1 depends on the population in the preceding period. More deer mean more reproductive opportunities, but more deer also exhaust the food supply and attract wolves. The higher β is, the steeper the hump and the more precipitous the shift from growth to decline. McCloskey shows how a tiny shift in beta from 3.94 to 3.935 can alter history. The plots of populations remain almost identical for several years but, for mysterious tipping-point reasons, the hypothetical populations decisively part ways twenty-five years into the simulation.

We could endlessly multiply these examples of great oaks sprouting from little acorns. For radical skeptics, though, there is a deeper lesson: the impossibility of picking the influential acorns before the fact. Joel Mokyr compares searching for the seeds of the Industrial Revolution to “studying the history of Jewish dissenters between 50 A.D. and 50 B.C.”
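As an aside from me rather than Tetlock: McCloskey’s hump is just the logistic map, and the sensitivity he describes is easy to replicate with a few lines of code (the two β values come from the excerpt above; the starting population is an arbitrary choice of mine).

```python
def hump(beta, x0=0.3, periods=30):
    """Iterate McCloskey's hump: x_{t+1} = beta * x_t * (1 - x_t)."""
    xs = [x0]
    for _ in range(periods):
        xs.append(beta * xs[-1] * (1 - xs[-1]))
    return xs

a, b = hump(3.94), hump(3.935)
for t in range(0, 31, 5):
    print(f"year {t:>2}: beta=3.94 -> {a[t]:.3f}   beta=3.935 -> {b[t]:.3f}")
# The two populations track each other early on and then decisively part ways:
# a tiny change in the tuning parameter, a very different history.
```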

Game theorists

Radical skeptics can counter, however, that many games have inherently indeterminate multiple or mixed strategy equilibria. They can also note that one does not need to buy into a hyperrational model of human nature to recognize that, when the stakes are high, players will try to second-guess each other to the point where political outcomes, like financial markets, resemble random walks. Indeed, radical skeptics delight in pointing to the warehouse of evidence that now attests to the unpredictability of the stock market.

Probability theorists

If a statistician were to conduct a prospective study of how well retrospectively identified causes, either singly or in combination, predict plane crashes, our measure of predictability—say, a squared multiple correlation coefficient—would reveal gross unpredictability. Radical skeptics tell us to expect the same fate for our quantitative models of wars, revolutions, elections, and currency crises. Retrodiction is enormously easier than prediction.

Psychological skeptics

Preference for simplicity

However cognitively well equipped human beings were to survive on the savannah plains of Africa, we have met our match in the modern world. Picking up useful cues from noisy data requires identifying fragile associations between subtle combinations of antecedents and consequences. This is exactly the sort of task that work on probabilistic-cue learning indicates people do poorly. Even with lots of practice, plenty of motivation, and minimal distractions, intelligent people have enormous difficulty tracking complex patterns of covariation such as “effect y1 rises in likelihood when x1 is falling, x2 is rising, and x3 takes on an intermediate set of values.”

Psychological skeptics argue that such results bode ill for our ability to distill predictive patterns from the hurly-burly of current events. …

We know—from many case studies—that overfitting the most superficially applicable analogy to current problems is a common source of error.

Aversion to ambiguity and dissonance

People for the most part dislike ambiguity—and we shall discover in chapter 3 that this is especially true of the hedgehogs among us. History, however, heaps ambiguity on us. It not only requires us to keep track of many things; it also offers few clues as to which things made critical differences. If we want to make causal inferences, we have to guess what would have happened in counterfactual worlds that exist—if “exist” is the right word—only in our imaginative reenactments of what-if scenarios. We know from experimental work that people find it hard to resist filling in the missing data points with ideologically scripted event sequences.

People for the most part also dislike dissonance … Unfortunately, the world can be a morally messy place in which policies that one is predisposed to detest sometimes have positive effects and policies that one embraces sometimes have noxious ones. … Dominant options—that beat the alternatives on all possible dimensions—are rare.

Need for control

[P]eople will generally welcome evidence that fate is not capricious, that there is an underlying order to what happens. The core function of political belief systems is not prediction; it is to promote the comforting illusion of predictability.

The unbearable lightness of our understanding of randomness

Our reluctance to acknowledge unpredictability keeps us looking for predictive cues well beyond the point of diminishing returns. I witnessed a demonstration thirty years ago that pitted the predictive abilities of a classroom of Yale undergraduates against those of a single Norwegian rat. The task was predicting on which side of a T-maze food would appear, with appearances determined—unbeknownst to both the humans and the rat—by a random binomial process (60 percent left and 40 percent right). The demonstration replicated the classic studies by Edwards and by Estes: the rat went for the more frequently rewarded side (getting it right roughly 60 percent of the time), whereas the humans looked hard for patterns and wound up choosing the left or the right side in roughly the proportion they were rewarded (getting it right roughly 52 percent of the time). Human performance suffers because we are, deep down, deterministic thinkers with an aversion to probabilistic strategies that accept the inevitability of error. … This determination to ferret out order from chaos has served our species well. We are all beneficiaries of our great collective successes in the pursuit of deterministic regularities in messy phenomena: agriculture, antibiotics, and countless other inventions that make our comfortable lives possible. But there are occasions when the refusal to accept the inevitability of error—to acknowledge that some phenomena are irreducibly probabilistic—can be harmful.

Political observers run the same risk when they look for patterns in random concatenations of events. They would do better by thinking less. When we know the base rates of possible outcomes—say, the incumbent wins 80 percent of the time—and not much else, we should simply predict the more common outcome.
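As an aside, the arithmetic behind the rat’s win is worth spelling out: always choosing the more frequent side is right 60 per cent of the time, while matching the 60:40 frequencies is right only 0.6 × 0.6 + 0.4 × 0.4 = 52 per cent of the time. A quick simulation of the two strategies (the labels are mine):

```python
import random

random.seed(3)
N = 100_000
food_on_left = [random.random() < 0.6 for _ in range(N)]  # food appears left 60% of the time

# "Rat" strategy: always pick the more frequently rewarded side.
rat_hit_rate = sum(food_on_left) / N

# "Undergraduate" strategy: probability matching -- guess left 60% of the time.
matcher_hits = sum((random.random() < 0.6) == left for left in food_on_left)
matcher_hit_rate = matcher_hits / N

print(f"maximising: {rat_hit_rate:.1%} correct, matching: {matcher_hit_rate:.1%} correct")
# Roughly 60% versus 52%, as in the demonstration Tetlock describes.
```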

In a later post I’ll touch on some of Tetlock’s tests of whether the perspective of the radical skeptics holds up.

Rosenzweig’s The Halo Effect … and the Eight Other Business Delusions That Deceive Managers

Phil Rosenzweig’s The Halo Effect … and the Eight Other Business Delusions That Deceive Managers is largely an exercise in shooting fish in a barrel, but it is an entertaining read regardless.

The central premise of the book is that most blockbuster business books (think Good to Great), for all the claims of scientific rigour, are largely exercises in storytelling.

The problem starts because it is difficult to understand company performance, even as it unfolds before our eyes. Most people don’t know what good leadership looks like. It is hard to know what makes for good communication, optimal cohesion or good customer service. The result of this difficulty is that people tend to allow good performance in the areas they can measure (such as profits) to contaminate their assessment of other company attributes. They endow the company with a halo.

So when a researcher asks people to rate company attributes when they know the business outcome, those ratings are contaminated. If profits are up, people will assign positive attributes to that company. If times are bad, they will assign negative attributes. We exaggerate strengths during booms, and faults during falls. All the factors responsible for a company’s rise might suddenly become the reasons for its fall, or be claimed to have never existed in the first place.

As an example, Apple currently has a clean sweep of all nine attributes in Fortune’s “World’s Most Admired Companies” poll – everything from social responsibility to long-term investment value. Is there not a single company in the world that is better than Apple on any of these nine? As Rosenzweig notes, when asked nine questions, people don’t have nine different opinions. They just give their general impression nine times.

Rosenzweig points to one nice experiment by Barry Staw (replicated?), in which Staw asked groups to project sales and earnings based on financial data. These groups were then given random feedback on their performance. Those given better feedback described their groups as cohesive, motivated and open to change, while those told by the experimenter that they had performed poorly reported a lack of communication, poor motivation and so on.

Many of the other delusions in the book are likely familiar to someone who knows a bit about stats or experimental design. Don’t confuse correlation and causation. Rigour is not defined by quantity of data. Do not use samples comprising only successes. Social science isn’t physics.

Other delusions are less often stated. Don’t be deluded into believing single explanations. If you added up the explained variance across the various single-explanation business studies, you’d explain 100% of the variance many times over. The explanations are likely correlated. And following a simple formula won’t necessarily work for a business, as performance is relative. What if all your competitors also follow the same formula?

The book closes with Rosenzweig’s spin on what leads to company success, which seems out-of-place after the preceding chapters. Some of it makes sense, such as when Rosenzweig points to the need to acknowledge the role of chance, which is almost never threaded into stories of business success. But Rosenzweig’s punchline of the need for strategy and execution feels just like the type of storytelling that he critiques.

Further, when Rosenzweig assesses the performance of three managers he holds up as models – who he approvingly notes share a probabilistic view of the world, realise the role of luck, can make deals under uncertainty, and recognise the need to be vigilant about the changing competitive landscape – it is hard to even agree that all of their actions were successes. Robert Rubin was one of the three. Rosenzweig classes Rubin’s support of the decision to bail out Mexico during the 1995 peso crisis (or, more accurately, the exposed US banks – moral hazard anyone?) as a good decision based on the outcome. What is the objective fact not contaminated by a halo? Rosenzweig ends by defending Rubin – who supported deregulation of derivatives trading and was on the board of Citigroup when it was bailed out during the financial crisis – as being more often right than wrong. If nothing else, the strange close to an otherwise good book did give me one more book for the reading pile – Rubin’s In an Uncertain World.

Tetlock and Gardner’s Superforecasting: The Art and Science of Prediction

Philip Tetlock and Dan Gardner’s Superforecasting: The Art and Science of Prediction doesn’t quite measure up to Tetlock’s superb Expert Political Judgment (read EPJ first), but it contains more than enough interesting material to make it worth the read.

The book emerged from a tournament conducted by the Intelligence Advanced Research Projects Activity (IARPA), designed to pit teams of forecasters against each other in predicting political and economic events. These teams included Tetlock’s Good Judgment Project (also run by Barbara Mellers and Don Moore), a team from George Mason University (for which I was a limited participant), and teams from MIT and the University of Michigan.

The result of the tournament was such a decisive victory by the Good Judgment Project during the first 2 years that IARPA dropped the other teams for later years. (It wasn’t a completely open fight – prediction markets could not use real money. Still, Tetlock concedes that the money-free prediction markets did pretty well, and there is scope to test them further in the future.)

Tetlock’s formula for a successful team is fairly simple. Get lots of forecasts, calculate the average forecast, and give extra weight to the top forecasters – a version of the wisdom of crowds. Then extremise the forecast. If the aggregate is a 70% probability, bump it up to 85%. If it is 30%, cut it to 15%.

The idea behind extremising is quite clever. No one in the group has access to all the dispersed information. If everyone had all the available information, this would tend to raise their confidence, which would result in a more extreme forecast. Since we can’t give everyone all the information, extremising is an attempt to simulate what would happen if we did. Getting the benefits of extremising, however, requires diversity. If everyone holds the same information, there is no sharing of information to be simulated.
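A stylised sketch of that recipe (the averaging and the push away from 0.5 follow the description above; the odds-power form of extremising and the exponent are illustrative choices of mine, not the Good Judgment Project’s actual parameters):

```python
def aggregate(forecasts, weights=None, extremise_power=2.5):
    """Weighted average of probability forecasts, then pushed away from 0.5
    (extremised) to stand in for the dispersed information no one forecaster holds."""
    if weights is None:
        weights = [1.0] * len(forecasts)
    p = sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)
    odds = (p / (1 - p)) ** extremise_power  # transform the odds, not the probability
    return odds / (1 + odds)

crowd = [0.6, 0.75, 0.7, 0.65, 0.8]
print(f"simple average:       {sum(crowd) / len(crowd):.2f}")   # 0.70
print(f"extremised aggregate: {aggregate(crowd):.2f}")          # pushed up towards 0.9
```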

But the book is not so much about why the Good Judgment Project was superior to the other teams. Mostly it is about the characteristics of the top 2% of the Good Judgment Project forecasters – a group that Tetlock calls superforecasters.

Importantly, “superforecaster” is not a label given on the basis of blind luck. The correlation in forecasting accuracy for Good Judgment Project members between one year and the next was around 0.65, and 70% of superforecasters stay in the top 2% the following year.

Some of the characteristics of superforecasters are to be expected. Whereas the average Good Judgment participant scored better than 70% of the population on IQ, superforecasters were better than about 80%. They were smarter, but not markedly so.

Tetlock argues much of the difference lies in technique, and this is where he focuses. When faced with a complex question, superforecasters tended to first break it into manageable pieces. For the question of whether French or Swiss inquiries would discover elevated levels of polonium in Yasser Arafat’s body (had he been poisoned?), they might ask whether polonium (which decays) could still be found in a man dead for years, in what ways polonium could have made its way into his body, and so on. They don’t jump straight to the implied question of whether Israel poisoned Arafat (which the question was technically not about).

Superforecasters also tended to take the outside view on each of these sub-questions: what is the base rate of this event? (Not so easy for the Arafat question.) Only then do they take the “inside view”, looking for information idiosyncratic to that particular question.

The most surprising finding (to me) was that superforecasters were highly granular in their probability forecasts and granularity predicts accuracy. People who stick to tens (10%, 20%, 30% etc) are less accurate than those who stick to fives (5%, 10%, 15% etc), who are less accurate than those who use ones (35%, 36%, 37% etc). Rounding superforecaster estimates reduces their accuracy, although this has little effect on regular forecasters. A superforecaster will distinguish between 63% and 65%, and this makes them more accurate.
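A toy Brier-score calculation shows why rounding a well-calibrated forecaster hurts (illustrative numbers only: the simulated forecaster’s 63 per cent events really do occur 63 per cent of the time):

```python
import random

def brier(forecasts, outcomes):
    """Mean squared error of probability forecasts against 0/1 outcomes (lower is better)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

random.seed(4)
# A calibrated forecaster whose stated probabilities match the true frequencies.
forecasts = [random.choice([0.04, 0.16, 0.37, 0.63, 0.96]) for _ in range(100_000)]
outcomes = [random.random() < f for f in forecasts]

rounded_to_tens = [round(f * 10) / 10 for f in forecasts]

print(f"granular forecasts: Brier = {brier(forecasts, outcomes):.4f}")
print(f"rounded to tens:    Brier = {brier(rounded_to_tens, outcomes):.4f}")
# Rounding the probabilities of a well-calibrated forecaster can only add error;
# for a poorly calibrated forecaster there is much less to lose.
```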

Partly this granularity is reflected in the updates they make when new information is obtained (although they are also more accurate on their initial estimate). Being a superforecaster requires monitoring the news, and reacting the right amount. There are occasional big updates – which Tetlock suggests superforecasters can make because they are not tied to their forecasts like a professional pundit – but most of the time the tweaks represent an iteration toward an answer.

Tetlock suggests such fine-grained distinctions would not come naturally to people, as making them would not have been evolutionarily favourable. If there is a lion in the grass, there are three likely responses – yes, no, maybe – not 100 shades of grey. But the reality is that there needs to be a threshold for each response, and evolution can act on fine distinctions. A gene that leads people to apply “run” with 1% greater accuracy will, over many generations, spread.

Superforecasters also suffer less from scope insensitivity. People will pay roughly the same amount to save 2,000 or 200,000 migrating birds. Similarly, when asked whether an event will occur in the next 6 or 12 months, regular forecasters predict approximately the same probability for either timeframe. Superforecasters, by contrast, tend to spot the difference in timeframes and adjust their probabilities accordingly, although their scope sensitivity is not perfect. I expect an explicit examination of base rates would help in reducing scope insensitivity, as a base rate tends to relate to a timeframe.

A couple of the characteristics Tetlock gives to the superforecasters seem a bit fluffy. Tetlock describes them as having a “growth mindset”, although the evidence presented simply suggests that they work hard and try to improve.

Similarly, Tetlock labels the superforecasters as having “grit”. I’ll just call them conscientious.

Beyond the characteristics of superforecasters, Tetlock revisits a couple of themes from Expert Political Judgment. As a start, there is a need to apply numbers to forecasts, or else they are fluff. Tetlock relates the story of Sherman Kent asking intelligence officers what they took the words “serious possibility” in a National Intelligence Estimate to mean (the wording related to the possibility of a Soviet invasion of Yugoslavia in 1951). The answers turned out to range anywhere between a 20% and an 80% probability.

Then there is a need for scoring against appropriate benchmarks – such as no change or the base rate. As Tetlock points out, lauding Nate Silver for picking 50 of 50 states in the 2012 Presidential election is a “tad overwrought” if compared to the no-change prediction of 48.

One contrast with the private Expert Political Judgment project was that forecasters in the public IARPA tournament were better calibrated. While the nature of the questions may have been a factor – the tournament questions related to shorter timeframes to allow the tournament to deliver results in a useful time – Tetlock suggests that publicity creates a form of accountability. There was also less difference between foxes and hedgehogs in the public environment.

One interesting point buried in the notes is where Tetlock acknowledges the various schools of thought around how accurate people are, such as the work by Gerd Gigerenzer and friends on the accuracy of our gut instincts and simple heuristics. Without going into a lot of detail, Tetlock declares that the “heuristics and biases” program is the best approach for bringing error rates in forecasting down. The short training guidelines – contained in the appendix to the book and targeted at typical biases – improved accuracy by 10%. While Tetlock doesn’t really put his claim to the test by comparing all approaches (what would a Gigerenzer-led team do?), the evidence of the success of the Good Judgment team makes it hard, at least for the moment, to argue with.