The Drunkard’s Walk, How Randomness Rules Our Lives, by Leonard Mlodinow

Although not presented as such, this book turns out to be a layman’s reading of the history of probability and statistics. It includes high-level historical accounts, often centered on personalities, and conspicuously avoids even a single mathematical equation, to the detriment of clarity in some cases. By the end, it was still not clear to me whether this book presents any new concepts or just repackages ideas that have been in the common knowledge base for at least a century. At a minimum, it packages concepts of probability and statistics in a way that will be accessible to a wide audience, and it includes many interesting anecdotes. Highlights for me include:

p. 8 “regression toward the mean”: this is the first time this concept is addressed in the book. A good video covering regression toward the mean is here.

p. 28 “availability bias” due to the “availability heuristic”: this concept is easily understood when you hear someone say something like “There’s always bad traffic when I have to be somewhere on time.” Besides being overly dramatic, that person is probably selectively remembering the times there was bad traffic when they were running late, because of the stress and other negative consequences they experienced in those instances, and not remembering the times when there was no bad traffic, simply because those instances were not remarkable. Mlodinow shows availability bias can be exploited in jury trials: “…the side with the more vivid presentation of the evidence always prevailed, and the effect was enhanced when there was a forty-eight-hour delay before rendering the verdict (presumably because the recall gap was even greater).” (p. 29)

p. 55 “the Monty Hall problem” is introduced and explained. The problem is this: say you’re on a game show, and the grand prize is placed randomly behind one of three doors. What is the probability of picking the winning door? Easy: 1 in 3. Say you’ve made your pick, and then the game show host, who knows where the prize is, opens one of the doors you didn’t pick to show that the prize is not there. Given the opportunity, should you now switch your choice? Intuitively we’d say no, the prize is still equally likely to be behind either remaining door, right? Well, in actuality the probability is 2/3 that the prize is behind the door you didn’t originally pick, so switching doubles your chances!

More details and multiple explanations here: en.wikipedia.org/wiki/Monty_Hall_problem. I like problems like this because they demonstrate limitations in human intuition – we’re simply not wired for this stuff!
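The 1/3 versus 2/3 split is easy to check by simulation. The following is a minimal sketch in Python, not something from the book; the function name `monty_hall_win_rate` and the fixed random seed are my own choices for illustration. It assumes the host always opens a door that is neither your pick nor the prize.

```python
import random

def monty_hall_win_rate(trials: int, switch: bool, seed: int = 0) -> float:
    """Estimate the chance of winning when the player stays or switches."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)   # door hiding the prize
        pick = rng.randrange(3)    # player's first choice
        # The host opens a door that is neither the player's pick nor the prize.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            # Move to the one remaining unopened door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == prize
    return wins / trials

print("stay:  ", monty_hall_win_rate(100_000, switch=False))
print("switch:", monty_hall_win_rate(100_000, switch=True))
```

With enough trials the "stay" rate should settle near 1/3 and the "switch" rate near 2/3.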

p. 71 Discussion of the statistical significance of World Series wins – i.e. does winning the World Series mean you’re MLB’s best team? Great discussion here: http://freakonomics.com/2012/11/09/does-the-%E2%80%9Cbest%E2%80%9D-team-win-the-world-series/

p. 79 Winning the Lottery. This is one I’ve thought about, and I was surprised to find out it actually happened. The idea is this: the odds of winning the grand prize of a given lottery are fixed (e.g. for Powerball they are currently 1 in 175,223,510), so when the grand prize exceeds the cost of buying every possible combination (i.e. 175,223,510 x $2, since Powerball is $2 per ticket, or $350,447,020), why not purchase every possible ticket and guarantee a win? Of course you’d have to have the cash, and somehow overcome the logistical difficulty of purchasing $350,447,020 worth of lottery tickets, but certainly it could be worth it when Powerball reaches record highs (e.g. almost $600 million in May 2013). It turns out a group of 2,500 investors attempted that exact strategy in an Australian lottery in 1992. The investors sought to purchase all 7,059,052 possible ticket combinations for a grand prize of more than $27 million. The logistics proved to be too daunting and the group only purchased around 5 million tickets, but they still won, and lottery officials couldn’t find any valid reason to deny them their prize.
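The arithmetic behind this strategy can be written out in a few lines. This is only a rough sketch using the figures quoted above; the helper names are hypothetical, "around 5 million" is taken as exactly 5,000,000 for illustration, and the calculation ignores taxes, split jackpots, lump-sum discounts, and smaller prizes.

```python
POWERBALL_COMBINATIONS = 175_223_510  # possible tickets, as quoted above
POWERBALL_TICKET_PRICE = 2            # dollars per ticket

def cost_to_cover(combinations: int, price: int) -> int:
    """Dollars needed to buy one ticket for every combination."""
    return combinations * price

def win_probability(tickets_bought: int, combinations: int) -> float:
    """Chance of holding the winning combination with distinct tickets."""
    return tickets_bought / combinations

break_even_jackpot = cost_to_cover(POWERBALL_COMBINATIONS, POWERBALL_TICKET_PRICE)
print(f"Cost to cover Powerball: ${break_even_jackpot:,}")

# Australian case: about 5 million of 7,059,052 combinations were purchased.
partial = win_probability(5_000_000, 7_059_052)
print(f"Chance of winning with partial coverage: {partial:.1%}")
```

So the Australian group did not actually guarantee a win; under these assumptions they had roughly a 7 in 10 chance.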

p. 84 “Benford’s Law” is a concept that states: in sets of numbers generated in real-world conditions (e.g. street addresses, stock prices, business expenses), the frequency of each digit as the leading digit follows a logarithmic distribution, with the digit 1 occurring approximately 30% of the time. Mlodinow recounts an example where this concept was applied to reveal accounting fraud: a series of expenses did not follow the “law” and were found to be fabricated. The concept is also applied to election, economic, and genome data, for example. Some data sets do not follow the law, however, such as 10-digit U.S. phone numbers (since none start with 1).
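For reference, the usual statement of Benford’s Law is that a leading digit d appears with probability log10(1 + 1/d). The sketch below computes those expected shares and compares them with one data set that is known to follow the law closely, the powers of two. That data set is my own choice for illustration and is not an example from the book.

```python
import math
from collections import Counter

def benford_probability(digit: int) -> float:
    """Expected share of numbers whose leading digit is `digit` (1 through 9)."""
    return math.log10(1 + 1 / digit)

def leading_digit_shares(values):
    """Observed share of each leading digit among positive integers."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Illustrative data set (an assumption, not from the book): 2^1 through 2^1000.
observed = leading_digit_shares(2 ** n for n in range(1, 1001))

for d in range(1, 10):
    print(d, f"expected {benford_probability(d):.3f}", f"observed {observed[d]:.3f}")
```

A fraud check along the lines Mlodinow describes would compare the observed shares of a set of expense figures against the expected column in the same way.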

p. 113 the “two-girl” problem. The issue at hand is this: assuming the probability of having a boy or girl is 50/50, what is the probability that a couple with two children has two girls? It’s 1/4, because there are four possibilities: boy, boy; boy, girl; girl, boy; and girl, girl, one of which meets the criteria.

Given that, say we know the couple has at least one girl, what is the probability of them having two girls? The probability of two girls knowing at least one is a girl is 1/3, because the possibilities are: boy, girl; girl, boy; and girl, girl.

Here’s the kicker: in the same situation, say we know at least one of the kids is a girl as before, but now we know her name is Claudia. What is the probability of two girls now? Most people would say it’s still 1/3, but it’s actually very nearly 1/2 (!), because there are 4 possibilities, 2 of which meet the criteria: boy, girl named Claudia; girl named Claudia, boy; girl named Claudia, girl; and girl, girl named Claudia. (This assumes the name is rare enough that we can ignore the chance of both girls being named Claudia.) It’s quite counter-intuitive that knowing the name of one of the kids changes the probability, but it does.
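All three versions of the two-girl problem can be checked by enumerating the possible families. The sketch below is my own model rather than the book’s: each child is independently a boy or a girl with probability 1/2, and each girl is independently named Claudia with some probability `p_name`, a hypothetical parameter that must be greater than zero.

```python
from fractions import Fraction
from itertools import product

def p_two_girls_given_claudia(p_name: Fraction) -> Fraction:
    """P(two girls | at least one child is a girl named Claudia)."""
    half = Fraction(1, 2)
    # Three kinds of child, with exact probabilities that sum to 1.
    child = {
        "boy": half,
        "claudia": half * p_name,
        "other_girl": half * (1 - p_name),
    }
    evidence = Fraction(0)    # families with at least one Claudia
    two_girls = Fraction(0)   # ...that also have no boy
    for (first, p1), (second, p2) in product(child.items(), repeat=2):
        if "claudia" in (first, second):
            evidence += p1 * p2
            if "boy" not in (first, second):
                two_girls += p1 * p2
    return two_girls / evidence

print(p_two_girls_given_claudia(Fraction(1)))        # every girl is "Claudia"
print(p_two_girls_given_claudia(Fraction(1, 1000)))  # a rare name
```

Under this model the answer works out to (2 - p) / (4 - p). When every girl is named Claudia the name tells us nothing and the result is 1/3, the "at least one girl" case; as the name becomes rarer the result approaches 1/2.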

p. 117 “Bayes’s Theory shows that the probability that A will occur if B occurs will generally differ from the probability that B will occur if A occurs.” Mlodinow gives the example of physicians asked to estimate the probability that an asymptomatic woman between the ages of 40 and 50 who has a positive mammogram actually has breast cancer, if the false positive rate for mammograms is 7%, the false negative rate is 10%, and the actual incidence of cancer is about 0.8%. According to Mlodinow, Bayes’s method leads to a probability of about 9% (I’d like to see this calculation done), whereas the physicians estimated 70% on average.
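The 9% figure can be reproduced directly from the three numbers quoted above. This is a minimal sketch of the standard Bayes calculation, not Mlodinow’s own working; the function name `posterior` is hypothetical, and the sensitivity of 90% is simply one minus the 10% false negative rate.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """P(condition | positive test) by Bayes's rule."""
    true_positives = prior * sensitivity                  # has cancer and tests positive
    false_positives = (1 - prior) * false_positive_rate   # no cancer but tests positive
    return true_positives / (true_positives + false_positives)

# Figures from the example: 0.8% incidence, 10% false negatives, 7% false positives.
p_cancer_given_positive = posterior(prior=0.008, sensitivity=0.90, false_positive_rate=0.07)
print(f"{p_cancer_given_positive:.1%}")
```

Put another way, out of 1,000 such women about 8 have cancer and roughly 7 of them test positive, while about 69 of the 992 healthy women also test positive, so only about 7 of the 76 positives are real.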

Another example he gives is the famous case of Sally Clark, who had back-to-back children die of Sudden Infant Death Syndrome (SIDS). She was convicted of murder in Britain in 1999 based on the erroneous calculation that 1 in 8,543 children die of SIDS, so the probability of two children dying of SIDS back-to-back would be 1 in about 73 million (8,543 x 8,543 = 72,982,849). Then, given about 700,000 births in Britain each year, back-to-back SIDS deaths could be expected only once every 100 years. A couple of misleading assumptions in this argument were later brought to light:

- By calculating the odds as the probability of one event x the probability of the other event, you’re asserting that the two events are independent, i.e. that the chances of one occurring are completely unaffected by the other occurring, like two coin flips in a row. In reality, although the exact cause(s) of SIDS are not clearly known, it is assumed genetic and/or environmental factors play a role, such that SIDS deaths in the same family would not be independent.
- The figure 1 in 73 million was portrayed as the odds Clark was innocent, when in fact it was an erroneous calculation of the odds of back-to-back SIDS deaths. Even if the figure were correct, it would need to be weighed against the probability of Clark committing double murder, which is also very unlikely. The odds ratio for double SIDS to double murder was estimated to be somewhere between 4.5:1 and 9:1.
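The numbers in this case are simple enough to check. The sketch below first reproduces the prosecution’s flawed arithmetic, then converts the estimated odds ratios into probabilities. The conversion is my own illustration and assumes that double SIDS and double murder are the only two explanations being weighed.

```python
SIDS_RISK_ONE_IN = 8_543     # "1 in 8,543", as quoted above
BIRTHS_PER_YEAR = 700_000    # approximate births in Britain each year

# The flawed step: squaring the risk treats the two deaths as independent.
double_sids_one_in = SIDS_RISK_ONE_IN ** 2
years_between_cases = double_sids_one_in / BIRTHS_PER_YEAR

def odds_to_probability(odds_for: float) -> float:
    """Convert odds of the form 'odds_for : 1' into a probability."""
    return odds_for / (odds_for + 1)

print(f"1 in {double_sids_one_in:,}, about once every {years_between_cases:.0f} years")
for odds in (4.5, 9):
    print(f"{odds}:1 for double SIDS -> {odds_to_probability(odds):.0%}")
```

Under that assumption, odds of 4.5:1 to 9:1 correspond to a probability of roughly 82% to 90% that the deaths were SIDS, a long way from 1 in 73 million.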

Clark was eventually released.

Another interesting misuse of statistics in jury cases is described on pp. 36-37, having to do with DNA evidence. You often hear that the chance of a false DNA match is 1 in 100 million or something outrageous like that, which reflects only the complexity of the DNA molecule. In reality, the process of DNA matching is much less reliable, due to errors in collection, possible contamination, equipment operator error, etc. In all likelihood, lab errors occur in at least 1 in 100 cases, which makes the statement “less than 1 in 100 million chance” meaningless.
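To see why the lab error rate swamps the headline figure, here is a small sketch that combines the two numbers mentioned above. It treats a coincidental match and a lab error as independent ways of producing a false match, which is a simplifying assumption of mine, and the function name is hypothetical.

```python
def false_match_probability(random_match: float, lab_error: float) -> float:
    """Chance that at least one of two independent failure modes yields a false match."""
    return 1 - (1 - random_match) * (1 - lab_error)

random_match = 1 / 100_000_000   # the headline "1 in 100 million"
lab_error = 1 / 100              # the rough lab error rate mentioned above

combined = false_match_probability(random_match, lab_error)
print(f"combined chance of a false match: {combined:.8f}")
print(f"that is about {combined / random_match:,.0f} times the headline figure")
```

The combined figure is essentially the lab error rate; the 1 in 100 million term barely registers.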

p. 122: good explanation of the difference between probability and statistics: “[probability] concerns predictions based on fixed probabilities; [statistics] concerns the inference of those probabilities based on observed data.”

p. 131: “…the sense of taste comes from five types of receptor cells on the tongue: salty, sweet, sour, bitter, and umami.” Uh, umami? I had never heard of this, but apparently it’s been an official taste type since 1985. See here: http://en.wikipedia.org/wiki/Umami. Fun article on how chefs leverage this taste type here.

p. 133: “expectancy bias” is another bias, describing how the human brain will fool you into sensing what you think you’re going to sense based on context. Mlodinow gives the example of wine tasters who are fooled when white wine is colored with food coloring. He also describes an experiment where self-proclaimed Coke or Pepsi lovers were not able to discern whether or not they were drinking the other brand when it was presented in their brand’s can.

p. 136: First mention of the bell curve, i.e. the normal distribution.

p. 152: a “life table,” which shows data on life expectancy, was apparently first assembled in London in 1662. At that time more than 25% of people died before the age of 6 and 60% were dead by the age of 16! That’s a pretty strong argument for improved quality of life in the modern age. Can you imagine more than half of the people you grew up with dying before finishing high school? Another thing to note is that people who did make it into adulthood were likely to live just about as long as people today, i.e. around 75 years.

p. 156: “Justin Wolfers, an economist at the Wharton School, found evidence of fraud in the results of about 70,000 college basketball games.” More here. Well, that’s disillusioning.

p. 157: More on how sports betting works: “Though … point spreads are set by the bookies, they are really fixed by the mass of bettors because the bookies adjust them to balance the demand. (Bookies make their money on fees and seek to have an equal amount of money bet on each side so that they can’t lose, whatever the outcome.) To measure how well bettors assess two teams, economists use a number called the forecast error, which is the difference between the favored team’s margin of victory and the point spread determined by the marketplace. It may come as no surprise that forecast error, being a type of error, is distributed according to the normal distribution. Wolfers found that its mean is 0…”

p. 160: “…not all that happens in society, especially in the financial realm, is governed by the normal distribution. For example, if film revenue were normally distributed, most films would earn near some average amount, and two-thirds of all film revenue would fall within a standard deviation of that number. But in the film business, 20 percent of the movies bring in 80 percent of the revenue.” One thing I find missing from this book is a discussion of other distributions, for instance the power law alluded to here, and of *why* movie revenues do not follow the normal distribution. And what about the mechanism by which a normal distribution arises?
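The “two-thirds within a standard deviation” figure, and one standard answer to how a normal distribution arises, can both be illustrated in a few lines. This sketch is mine and not from the book: it computes the exact share for a normal distribution, then simulates sums of 50 fair coin flips (an arbitrary choice), which the central limit theorem says should be approximately normal.

```python
import random
from statistics import NormalDist, mean, pstdev

# Exact share of a normal distribution lying within one standard deviation.
exact = NormalDist().cdf(1) - NormalDist().cdf(-1)

def share_within_one_sd(samples) -> float:
    """Fraction of samples within one standard deviation of their mean."""
    mu, sigma = mean(samples), pstdev(samples)
    return sum(abs(x - mu) <= sigma for x in samples) / len(samples)

# Each sample is the number of heads in 50 fair coin flips (seeded for repeatability).
rng = random.Random(0)
sums = [sum(rng.random() < 0.5 for _ in range(50)) for _ in range(20_000)]
simulated = share_within_one_sd(sums)

print(f"normal distribution: {exact:.3f}")
print(f"sums of coin flips:  {simulated:.3f}")
```

Quantities built by adding up many small independent effects tend toward the bell curve. Film revenue, where a few hits dominate, does not behave like such a sum, which is at least consistent with the 80/20 pattern in the quote.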

p. 162: another discussion of regression toward the mean, this time having to do with the physical characteristics of humans. He gives the example that if two taller-than-average people have offspring, their children will not continue to be taller and taller, but will instead regress toward the mean, which makes sense. But what if I were shorter than average for my population and had offspring with a woman who was taller than average for another population? Would there be a conflict in the regression, since one is going in one direction and the other is going in the opposite direction?

p. 163: a powerful statistical concept is briefly touched on: the coefficient of correlation.

p. 164: another is mentioned: the chi-squared test.

p. 168: on Brownian motion: “…much of the order we perceive in nature belies an invisible underlying disorder and hence can be understood only through the rules of randomness.” Again, one thing I think is missing from this book is a discussion of how apparently pervasive randomness fits in with cause and effect.

p. 174: “It is human nature to look for patterns and to assign them meaning when we find them. Kahneman and Tversky analyzed many of the shortcuts we employ in assessing patterns in data and in making judgments in the face of uncertainty. They dubbed those shortcuts heuristics.”

p. 175: cute example of human perception of randomness: “Apple ran into that issue with the random shuffling method it initially employed in its iPod music players: true randomness sometimes produces repetition, but when users heard the same song or songs by the same artist played back-to-back, they believed the shuffling wasn’t random. And so the company made the feature ‘less random to make it feel more random,’ said Apple founder Steve Jobs.”

p. 185: the importance of feeling “control” in our lives. “In another study, in a group of subjects who were told they were going to take a battery of important tests, even the pointless power to control the order of those tests was found to reduce anxiety levels.”

p. 186: another example: “In one of [Langer’s] studies, participants were found to be more confident of success when competing against a nervous, awkward rival than when competing against a confident one even though the card game in which they competed, and hence the probability of succeeding, was determined purely by chance.”

p. 189: “When we are in the grasp of an illusion – or, for that matter, whenever we have a new idea – instead of searching for ways to prove our ideas wrong, we usually attempt to prove them correct. Psychologists call this the confirmation bias.”

p. 191: “The human brain has evolved to be very efficient at pattern recognition, but as the confirmation bias shows, we are focused on finding and confirming patterns rather than minimizing our false conclusions. Yet we needn’t be pessimists, for it is possible to overcome our prejudices. It is a start simply to realize that chance events, too, produce patterns. It is another great step if we learn to question our perceptions and our theories. Finally, we should learn to spend as much time looking for evidence that we are wrong as we spend searching for reasons we are correct.”

p. 194: “…Lorenz found that such small differences led to massive changes in the result. The phenomenon was dubbed the butterfly effect.”

p. 195: “…the Nobel laureate Max Born wrote, ‘Chance is a more fundamental conception than causality.’” But what is the relationship between chance and causality? Clearly both exist and they complement each other somehow. I wish there were more discussion on that point.

p. 197: “That fundamental asymmetry is why in day-to-day life the past often seems obvious even when we could not have predicted it. It’s why weather forecasters can tell you the reasons why three days ago the cold front moved like this…but the same forecasters are much less successful at knowing how the fronts will behave in three days hence…”

p. 202: “It is easy to concoct stories explaining the past or to become confident about dubious scenarios for the future. That there are traps in such endeavors doesn’t mean we should not undertake them. But we can immunize ourselves against our errors of intuition. We can learn to view both explanations and prophecies with skepticism. We can focus on the ability to react to events rather than relying on the ability to predict them, on qualities like flexibility, confidence, courage, and perseverance.”

p. 205: “That is the deterministic view of the marketplace, a view in which it is mainly the intrinsic qualities of the person or the product that governs success. But there is another way to look at it, a nondeterministic view. In this view there are many high-quality but unknown books, singers, actors, and what makes one or another come to stand out is largely a conspiracy of random and minor factors – that is, luck.” Okay, but there are things that contribute to “high-quality” and luck, such as talent and tenacity. What is the relationship between these things and randomness?

p. 214: “If it is easy to fall victim to expectations, it is also easy to exploit them. That is why struggling people in Hollywood work hard to look as though they are not struggling, why doctors wear white coats and place all manner of certificates and degrees on their office walls, why used-car salesmen would rather repair blemishes on the outside of a car than sink money into engine work…”

p. 215: Mlodinow gives the example of a 2005 blind vodka tasting, in which less expensive Smirnoff beat out “luxury” brands Grey Goose and Ketel One. Details here.

p. 217: “For even a coin weighted toward failure will sometimes land on success. Or as IBM pioneer Thomas Watson said, ‘If you want to succeed, double your failure rate.’”

Overall, The Drunkard’s Walk by Leonard Mlodinow was a good, thought-provoking read. It contains a lot of anecdotes and little hard math or science. Again, I would have liked to see him include equations where they could have greatly improved clarity. I’d also like to have seen more discussion of how randomness complements cause and effect, because the book almost reads as if life is totally random and there is no cause and effect.