I’ve been reading a very interesting book called “The Flaw of Averages” by Sam Savage. It looks at why using average data only produces the correct answers in very limited circumstances. The flaw of averages is that plans based on average assumptions are, on average, wrong.

For example, assume you are a manager deciding how big a factory (or fab) to build. Your marketing manager tells you he is certain that you’ll sell between 80,000 and 120,000 units per year. But you insist on a single number, are given the average of 100,000, and build a factory with a capacity of 100,000. Let’s assume that the marketing manager nailed the numbers precisely (don’t we always?). On average, how much money will you make? Well, demand will be somewhere between 80,000 and 120,000. If demand is less than 100,000 you make less money than you expected. If demand is greater than 100,000 you don’t make more money because your capacity is maxed out. So, on average, you make less money than you expected even though your factory is sized for exactly the average demand.
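The factory arithmetic is easy to check with a few lines of simulation. This is only a sketch: it assumes demand is equally likely to fall anywhere between 80,000 and 120,000 (the marketing manager gave a range, not a distribution), and it counts units sold rather than dollars. The names `CAPACITY`, `LOW`, `HIGH` and `average_sales` are made up for the illustration.

```python
import random

CAPACITY = 100_000            # units per year the factory can build
LOW, HIGH = 80_000, 120_000   # demand range from the marketing manager


def average_sales(trials=100_000, seed=1):
    """Estimate the average units sold when sales are capped at capacity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        demand = rng.uniform(LOW, HIGH)   # assumption: demand is uniform over the range
        total += min(demand, CAPACITY)    # you cannot sell more than you can build
    return total / trials


# The plan built on the average: demand of 100,000 means sales of 100,000.
sales_at_average_demand = min((LOW + HIGH) / 2, CAPACITY)

print(f"sales at average demand: {sales_at_average_demand:,.0f}")
print(f"average sales:           {average_sales():,.0f}")
```

Under the uniform assumption the true average is 95,000 units: half the time you sell 100,000, and the other half you sell 90,000 on average. The simulation should land close to that, about 5% short of the plan.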

There are other fascinating things. You may have heard of Simpson’s paradox. One of the most famous examples of this was a 1986 kidney stone study in which treatment A looked more effective than treatment B overall. But if you looked only at small kidney stones, then treatment B was better than treatment A. And if you looked only at large kidney stones, then again treatment B was better than treatment A. Yet when the two groups were combined, A was better than B. WTF?

Another example: in each of 1995, 1996 and 1997 David Justice had a higher baseball batting average than Derek Jeter. But taking the three years together, Derek Jeter had a higher average than Justice. WTF?
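The baseball version can be checked by hand, or in a few lines of code. The hits and at-bats below are not from the book excerpt above; they are the figures usually quoted for this example, so treat them as an assumption. The `STATS` table and the helper names are invented for the sketch.

```python
# (hits, at_bats) per season. Commonly quoted figures, used here as an assumption.
STATS = {
    "Jeter":   {1995: (12, 48),   1996: (183, 582), 1997: (190, 654)},
    "Justice": {1995: (104, 411), 1996: (45, 140),  1997: (163, 495)},
}


def season_average(player, year):
    """Batting average for one season: hits divided by at-bats."""
    hits, at_bats = STATS[player][year]
    return hits / at_bats


def combined_average(player):
    """Batting average over all seasons: total hits divided by total at-bats."""
    hits = sum(h for h, _ in STATS[player].values())
    at_bats = sum(ab for _, ab in STATS[player].values())
    return hits / at_bats


for year in (1995, 1996, 1997):
    print(f"{year}: Jeter {season_average('Jeter', year):.3f}  "
          f"Justice {season_average('Justice', year):.3f}")

print(f"all:  Jeter {combined_average('Jeter'):.3f}  "
      f"Justice {combined_average('Justice'):.3f}")
```

The trick is in the at-bats. Jeter's worst season had only 48 at-bats, so it barely drags his total down, while Justice's weakest season was the one where he batted most. The combined average weights each year by at-bats, and the two players' weights are very different.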

A lot of what you learned in school about statistics (means, variance, correlation, etc.) matters much less now that we can run large numbers of simulations in seconds to see what is really going on. Means and standard deviations were an attempt to get at something important before this capability existed; Sam Savage calls them “steam era” statistics. Now we can use computation to make sure we don’t fall into traps.

There’s also lots of material about options, and how pricing them depends on thinking about (or computing) this sort of thing properly. If a stock is $20 today and on average will be $20 in 12 months’ time, how much should you pay for an option to buy it for $21 in a year? If you’d succeeded in answering this a few decades ago you’d have won the Nobel prize. You may have heard of Black-Scholes option pricing, which does the math to work this out. Even though at the average stock price ($20) an option to purchase at $21 is worth nothing (because you’d simply not exercise your option), it clearly is worth something, since there is some chance that the stock will end up above $21 and you can make money by exercising your option and selling the stock at the market price.
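To see why the option is worth something, you can simulate the payoff instead of plugging in the average price. This is emphatically not Black-Scholes: it is a rough sketch that assumes the price in a year is normally distributed around $20 with a standard deviation of $4 (a number picked purely for illustration), and it ignores interest rates and everything else a real pricing model cares about. `SPREAD` and `average_payoff` are made-up names.

```python
import random

STRIKE = 21.0       # price at which the option lets you buy the stock
MEAN_PRICE = 20.0   # average stock price in a year, from the example
SPREAD = 4.0        # assumption: standard deviation of the price, in dollars


def average_payoff(trials=100_000, seed=1):
    """Estimate the average payoff of the option across many possible prices."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Crude price model: normal around the mean, floored at zero.
        price = max(rng.gauss(MEAN_PRICE, SPREAD), 0.0)
        # Exercise only when the stock ends above the strike; otherwise walk away.
        total += max(price - STRIKE, 0.0)
    return total / trials


# Payoff if the stock lands exactly on its average price: nothing.
payoff_at_average_price = max(MEAN_PRICE - STRIKE, 0.0)

print(f"payoff at the average price: ${payoff_at_average_price:.2f}")
print(f"average payoff:              ${average_payoff():.2f}")
```

With these assumed numbers the average payoff comes out at a bit over a dollar, even though the payoff at the average price is zero. Make `SPREAD` bigger and the option gets more valuable, which is the point: the value lives in the uncertainty, not in the average.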

I haven’t finished the book yet, but I can already see that some of the ideas are important in thinking about business plans, and that they formalize some of the sensitivity analysis that it is always good to do (how much more money do we need to raise if the first orders come 6 months later than expected? Or if the product costs 30% more to develop?).

Consider a drunk walking down the middle of a highway. His average position is in the center of the road, on the yellow line. But on average, where is he? Dead.

And don’t forget, almost everyone has more than the average number of legs.