# Why Does Chutes and Ladders Explain Hemoglobin Levels? Some Thoughts on the Normal Distribution

Andrew J. Vickers, PhD

September 07, 2006

One of the great challenges of parenthood is how to lose games of chance. How can I let my son win at Chutes and Ladders (thereby improving his self-esteem and reducing family tension) without cheating, which, as I understand it, would send the wrong message? I can't succeed at both, of course, but "just one more game" does at least allow me to reflect on the nature of statistical distributions.

Chutes and Ladders is a bit like a coin flip, in that there is exactly a 50:50 chance that I'll win a game. If you tell me that we are going to play a certain number of games, I can tell you the probability of each possible combination of wins and losses. As an easy example, if I play 2 games of Chutes and Ladders with my son, there is a 25% chance that I'll lose both, a 50% chance that we'll each win one, and a 25% chance that he'll throw a hissy fit. I can show this as a graph: The y-axis gives the probability that I'll win each particular number of games shown on the x-axis (Figure 1).

Figure 1. Graph for 2 games. The y-axis gives the probability that the author will win each particular number of games shown on the x-axis.

The math is a bit more complicated for 4 games, but as it turns out, there is a 37.5% chance that we split it with 2 games each, and a 6.25% chance of a total meltdown (Figure 2).

Figure 2. Graph for 4 games, indicating a 37.5% chance that the author and his son split the wins with 2 games each, and a 6.25% chance of a meltdown by the author's son.
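The probabilities quoted for 2 and 4 games follow from the binomial formula. Below is a minimal sketch, assuming each game is an independent 50:50 event as stated above; the function name `win_distribution` is made up for illustration.

```python
from math import comb

def win_distribution(n_games, p_win=0.5):
    """Return the probability of exactly k wins, for k = 0..n_games.

    Assumes independent games, each won with probability p_win.
    """
    return [
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(n_games + 1)
    ]

for n in (2, 4):
    for wins, prob in enumerate(win_distribution(n)):
        print(f"{n} games, {wins} wins: {prob:.2%}")
```

For 4 games, the middle entry (2 wins each) is 37.5% and each end of the list (4 wins for one player) is 6.25%, matching the figures in the text.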

Something that you may notice here is that this second graph is starting to look a little bit like the bell-shaped curve that is usually described as the "normal" distribution. Now let's imagine a really wet weekend in which I play 100 games of Chutes and Ladders (Figure 3).

Figure 3. Graph for 100 games, which approaches a normal distribution.
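One way to check how close the 100-game graph comes to a bell curve is to put the exact win probabilities next to a normal curve with the same mean and standard deviation. This is only a sketch under the 50:50 assumption; the helper names are illustrative, and the mean (50) and standard deviation (5) come from the standard binomial formulas rather than from the article.

```python
from math import comb, exp, pi, sqrt

N_GAMES = 100
P_WIN = 0.5

# Mean and standard deviation of the number of wins in N_GAMES games.
MEAN = N_GAMES * P_WIN                      # 50 wins
SD = sqrt(N_GAMES * P_WIN * (1 - P_WIN))    # 5 wins

def binomial_pmf(k):
    """Exact probability of winning exactly k of the N_GAMES games."""
    return comb(N_GAMES, k) * P_WIN**k * (1 - P_WIN) ** (N_GAMES - k)

def normal_pdf(x):
    """Height of the normal curve with the same mean and SD."""
    return exp(-((x - MEAN) ** 2) / (2 * SD**2)) / (SD * sqrt(2 * pi))

for k in (40, 45, 50, 55, 60):
    print(f"{k} wins: exact {binomial_pmf(k):.4f}, normal curve {normal_pdf(k):.4f}")
```

The two columns agree closely at every number of wins, which is why the bar chart and the bell curve are hard to tell apart by eye.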

We now have something that really looks like a normal distribution. We also have something that looks very much like many natural biological phenomena. As an example, this is the distribution of hemoglobin in a cohort of Swedish men aged 40-50 taking part in a heart study (Figure 4).

Figure 4. Distribution of hemoglobin in a cohort of Swedish men aged 40-50 taking part in a heart study.

If you concluded that the blood of middle-aged Swedes depended on games of Chutes and Ladders, you wouldn't be far wrong: Like the outcome of a dice-throwing game, a man's hemoglobin level is the result of numerous chance events (genes, environment, diet, lifestyle, and medical history) all added together. When you add up a lot of chance events, what you get is a normal distribution. To a statistician, of course, the normal distribution is actually a complicated formula, including e, mu, pi, and sigma all raised to the power of each other, but the formula is just a mathematical way of describing what you get when you add up lots of chance events.
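For readers who want to see it, the formula in question is the standard normal density, written here in the usual textbook notation (mu for the mean, sigma for the standard deviation):

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
$$

The mean sets where the peak of the bell sits, and the standard deviation sets how wide the bell is.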

One example of a set of chance events that is of particular importance to medical research is the result of an experiment. Let's say that I do a big trial of a new heart drug and report death rates of 8.2% in the treatment group compared with 10.1% in controls. This is a "risk difference" of 1.9%. Now if I repeated the trial, we wouldn't expect to get exactly the same result; we'd expect some chance variation: For example, if death rates were 10.3% and 8.5% (a risk difference of 1.8%), you'd probably say that I'd done a pretty good replication. Figure 5 shows a computer simulation of the results of the trial if I'd repeated it 100,000 times and there was really no effect of the drug.

Figure 5. Simulated results of the trial repeated 100,000 times, assuming no effect of the drug.
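A simulation of this kind can be sketched in a few lines. The article does not give the trial's size or the details of the simulation, so everything specific here is an assumption made for illustration: 1,000 patients per arm, a shared death rate of 9.15% (the average of the two reported rates) to represent "no effect," and 2,000 repeats rather than 100,000 so that it runs quickly.

```python
import random
from statistics import mean, stdev

def simulate_null_differences(n_per_arm, death_rate, n_repeats, seed=0):
    """Simulate risk differences (control minus treated) when the drug does nothing.

    Both arms share the same true death rate, so any difference is pure chance.
    All argument values are hypothetical; the article does not report them.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_repeats):
        deaths_control = sum(rng.random() < death_rate for _ in range(n_per_arm))
        deaths_treated = sum(rng.random() < death_rate for _ in range(n_per_arm))
        differences.append((deaths_control - deaths_treated) / n_per_arm)
    return differences

# Hypothetical trial size and repeat count, chosen only to keep the run short.
diffs = simulate_null_differences(n_per_arm=1000, death_rate=0.0915, n_repeats=2000)

print(f"mean difference: {mean(diffs):.4f}")
print(f"standard deviation of differences: {stdev(diffs):.4f}")
```

A histogram of `diffs` should show the familiar bell shape centered near zero. A larger trial gives a narrower bell, so how unusual any particular difference looks depends on the assumed trial size.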

What we see is pretty close to the graph of hemoglobin levels. The normal distribution can therefore be used to describe both the natural variation that we can see in the world around us (such as hemoglobin) and the hypothetical variation of research results (such as those from a clinical trial). This means that we can use the mathematical formula for the normal distribution both to help describe data sets and to work out whether our results are interesting: As can be seen from the graph, a 2% difference would be unusual if there was no effect of the drug. Accordingly, we can conclude that the drug probably helps. This is pretty much the basis of P values.
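To make "unusual if there was no effect" concrete, here is a rough sketch of how a P value could be worked out from the normal distribution. The trial size is not given in the article, so `n_per_arm=5000` is a hypothetical value, and the pooled two-proportion calculation is one common textbook approach rather than necessarily the method behind the figure.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(rate_treated, rate_control, n_per_arm):
    """Approximate two-sided P value for a difference in death rates.

    Uses the normal approximation with a pooled death rate and assumes
    equal numbers of patients in each arm (a simplifying assumption).
    """
    difference = rate_control - rate_treated
    pooled_rate = (rate_treated + rate_control) / 2
    # Standard error of the difference if the drug truly had no effect.
    standard_error = sqrt(2 * pooled_rate * (1 - pooled_rate) / n_per_arm)
    z = difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Death rates from the article; the trial size is hypothetical.
p = two_sided_p_value(rate_treated=0.082, rate_control=0.101, n_per_arm=5000)
print(f"P value: {p:.4f}")
```

With these assumed numbers the P value is small, meaning a difference this large would rarely arise by chance alone. With a much smaller trial, the same 1.9% difference would be far less convincing.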

Oh, and the normal distribution can also be used to predict the results of Chutes and Ladders, and therefore exactly anticipates how often I'll be suggesting ice cream on a given weekend.