On Causality, Confounders, and Lots of Rosé

F. Perry Wilson, MD, MSCE


April 17, 2019

Welcome to Impact Factor. I'm Perry Wilson.

And now for something completely different. Instead of talking about a breaking medical study (my usual modus operandi), I want to take a moment to discuss something that applies to so many breaking medical studies: the concept of confounding.

Most docs have an intuition for what a confounder is. We read a study that suggests that breast-feeding is linked to higher IQ, and we get a bit skeptical. We seem to have a gut feeling that maybe the act of breast-feeding isn't causing higher IQs in children; maybe it's merely a marker for some other thing, like maternal education. That other thing is a confounder, and confounders make assessment of causality really difficult.

Why does causality matter? For the simple reason that if A causes B, then changing A should lead to a change in B. That's the model for a new therapy and the goal of much of medical science.

But if you want to really understand confounding—and who doesn't—you need to know a bit more. Let me walk you through it.

Let's imagine a study that started, as most studies do, with a clinical observation. A researcher noted that individuals who drink rosé are shorter, on average, than those who don't. The hypothesis: Rosé stunts your growth.

We can represent this hypothesis with a schematic. You can call it a directed acyclic graph if you want to sound smart at parties, but I often refer to it as a "causal diagram." We are asking: Does drinking rosé cause short stature?


Now, you are probably way ahead of me. There is a major confounder here: Women drink more rosé than men, and women are shorter than men, on average. In this case, sex confounds the rosé-height relationship.


Here's the critical thing to note: Being a woman is associated with both the exposure of interest and the outcome of interest. That is the definition of a confounder.

Once we've identified a confounder, we can adjust for it. Adjusting cuts the causal lines running from the confounder to the exposure and the outcome, allowing the true relationship between exposure and outcome to emerge.
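
If you'd like to see adjustment in numbers, here is a minimal simulation sketch in Python. Everything in it is made up for illustration: the rosé-drinking rates (60% of women, 20% of men), the average heights (162 cm and 176 cm), and the function names are all assumptions, not real data. In this toy world rosé has no effect on height at all, yet the crude comparison makes drinkers look shorter. Comparing drinkers with nondrinkers within each sex, which is one simple way of adjusting, should make the gap all but disappear.

```python
import random
from statistics import mean


def simulate(n=20_000, seed=0):
    """Toy population in which rosé has NO effect on height (assumed numbers)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        female = rng.random() < 0.5
        # Assumption: women are more likely to drink rosé than men.
        drinks = rng.random() < (0.6 if female else 0.2)
        # Assumption: height depends on sex only, never on rosé.
        height = rng.gauss(162.0 if female else 176.0, 7.0)
        rows.append((female, drinks, height))
    return rows


def height_gap(rows):
    """Mean height of drinkers minus mean height of nondrinkers, in cm."""
    drinkers = [h for _, d, h in rows if d]
    others = [h for _, d, h in rows if not d]
    return mean(drinkers) - mean(others)


rows = simulate()

# Crude comparison: ignores sex, so it is confounded.
crude_gap = height_gap(rows)

# Adjusted comparison: look within each sex separately (stratification).
gap_women = height_gap([r for r in rows if r[0]])
gap_men = height_gap([r for r in rows if not r[0]])

print(f"crude gap:        {crude_gap:.2f} cm")
print(f"gap among women:  {gap_women:.2f} cm")
print(f"gap among men:    {gap_men:.2f} cm")
```

The crude gap comes out clearly negative, while the two within-sex gaps hover near zero. Stratifying is only one way to adjust; regression does the same job when there are more confounders than you can comfortably split by.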

Now let's reset and ask about a variable that is only associated with the outcome but not the exposure—say, parental height.


Not a confounder. Practically speaking, you don't need to worry about it. Testing the rosé-height hypothesis does not require measurement of parents' height. Adjusting for parents' height gets you no closer to the causality question than you were when you started.

Okay, let's reset and add our confounder back in, the factor associated with both exposure and outcome. What about factors associated with the exposure but not the outcome?

Now, this can be special. If you can find a factor that is associated with the exposure but has no plausible link to the outcome, you have identified what is called an instrumental variable, and these are super-cool.

Let's imagine that there is a gene called ROSE1. It codes for a receptor on the tongue that makes rosé taste absolutely bonkers amazing. People born with this gene will drink more rosé because it just tastes so good.

Let's posit further that the gene is not on a sex chromosome, so it has no relationship to sex. The gene also has no plausible link to height; it doesn't code for any growth proteins.

If rosé really does cause short stature, people born with this gene will be shorter on average due to all that rosé they have been drinking. There is a path from the gene to stature.


If, on the other hand, the observed rosé-height relationship is all due to confounding, people born with the ROSE1 gene will be no taller or shorter than the rest of us.


See? Without a causal link between rosé drinking and stature, the gene promoting rosé drinking has no path to get to stature. That is what is so special about an instrumental variable—it allows for a decent assessment of causality.
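
The two scenarios can be sketched with another small simulation. As before, every number here is an assumption chosen for illustration: a 30% carrier rate for the hypothetical ROSE1 gene, a 30-percentage-point bump in rosé drinking among carriers, and a 3 cm height penalty in the "rosé really stunts growth" world. The last step divides the height gap between carriers and noncarriers by the gap in rosé drinking, a simple instrumental variable estimate often called the Wald ratio. Treat it as a rough sketch, not a full Mendelian randomization analysis.

```python
import random
from statistics import mean


def simulate(rose_effect_cm, n=200_000, seed=1):
    """Toy population; rose_effect_cm is the assumed causal effect of rosé."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        female = rng.random() < 0.5
        # Assumption: the hypothetical ROSE1 gene is independent of sex.
        gene = rng.random() < 0.3
        # Sex still confounds: women drink more rosé. Carriers drink more too.
        p_drink = (0.5 if female else 0.2) + (0.3 if gene else 0.0)
        drinks = rng.random() < p_drink
        height = rng.gauss(162.0 if female else 176.0, 7.0)
        if drinks:
            height += rose_effect_cm
        rows.append((gene, drinks, height))
    return rows


def carrier_gap(rows, column):
    """Mean of a column among gene carriers minus the mean among noncarriers."""
    carriers = [float(r[column]) for r in rows if r[0]]
    others = [float(r[column]) for r in rows if not r[0]]
    return mean(carriers) - mean(others)


# World 1: the rosé-height link is pure confounding (no causal effect).
null_rows = simulate(rose_effect_cm=0.0)
gap_null = carrier_gap(null_rows, column=2)

# World 2: rosé really does cost drinkers 3 cm (assumed).
causal_rows = simulate(rose_effect_cm=-3.0)
gap_causal = carrier_gap(causal_rows, column=2)

# Wald ratio: height gap by gene divided by drinking gap by gene.
wald_causal = gap_causal / carrier_gap(causal_rows, column=1)

print(f"carrier height gap, no causal effect: {gap_null:.2f} cm")
print(f"carrier height gap, causal effect:    {gap_causal:.2f} cm")
print(f"Wald estimate of rosé's effect:       {wald_causal:.2f} cm")
```

Notice that sex never appears in the comparison. Because the gene is handed out independently of sex, carriers and noncarriers are balanced on the confounder without any adjustment, and a height gap between them can only arise through the rosé they drink.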

A genetic instrumental variable is even more special. In fact, if you ever read a study referencing Mendelian randomization, this is exactly what they are talking about. They found an instrumental variable that just happened to be a gene, and it allows for all sorts of causal inference that you couldn't do otherwise because genes are, basically, assigned at birth.

So that's more than you ever wanted to know about confounding, but I hope it helps the next time you are reading an observational study while drinking rosé. Cheers.

