You Can’t Test Instrument Validity

Instrumental variable (IV) estimation is an important technique in causal inference and applied empirical work. The canonical IV setting looks like the following:

Here, the relationship between X and Y is confounded by unobservable influence factors (denoted by the dashed bidirected arrow). Therefore we cannot estimate the causal effect of X on Y by a simple regression. But since the instrument Z induces variation in X that is unrelated to the unobserved confounders, we can use Z as an auxiliary experiment that allows us to identify the so-called local average treatment effect (or LATE) of X on Y.¹
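To make this concrete, here is a minimal simulation sketch in Python. The linear model, the coefficients, and the variable names (such as `u` for the unobserved confounder) are illustrative assumptions and are not taken from any real data set. Because the assumed effect of X on Y is a constant 2, the LATE coincides with the average effect here, so the simple ratio (Wald) estimator cov(Z, Y) / cov(Z, X) should recover it, while a naive regression of Y on X should not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed data-generating process (illustrative only)
u = rng.normal(size=n)                        # unobserved confounder of X and Y
z = rng.normal(size=n)                        # instrument, independent of u
x = 1.0 * z + 1.0 * u + rng.normal(size=n)    # Z and u both move X
y = 2.0 * x + 1.5 * u + rng.normal(size=n)    # true effect of X on Y is 2

# Naive OLS slope of Y on X: biased because u drives both X and Y
ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV (Wald) estimator: reduced form divided by first stage
iv_slope = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS: {ols_slope:.2f}, IV: {iv_slope:.2f}")
```

Under these assumed coefficients the OLS slope converges to 2.5 in large samples, whereas the IV ratio converges to the true value of 2.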

For this to work it’s crucial that Z doesn’t directly affect Y (i.e., no arrow from Z to Y). Moreover, there shouldn’t be any unobservable confounders (i.e., other dashed bidirected arcs) between Z and Y, otherwise the identification argument breaks down. These two assumptions need to be justified purely by theoretical reasoning and cannot be tested with the help of data.

Unfortunately, however, you will frequently come across people who don’t accept that the assumption of instrument validity isn’t testable. Usually, these folks then ask you to do one of the following two things in order to convince them:

1. Show that Z is uncorrelated with Y (conditional on the other control variables in your study), or
2. Show that Z is uncorrelated with Y when adjusting for X (again, conditional on the other controls).

Both of these requests are wrong. The first one is particularly moronic. To avoid a weak-instruments problem, we want Z to exert a strong influence on X. If X also affects Y, there will be a correlation between Z and Y by construction, through the causal chain Z $\rightarrow$ X $\rightarrow$ Y.
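A quick simulation sketch illustrates the point. The data-generating process below is an assumption made purely for illustration: Z is a perfectly valid instrument (no direct effect on Y, independent of the hypothetical confounder `u`), and yet Z and Y are clearly correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Assumed model with a valid instrument (illustrative only)
u = rng.normal(size=n)                        # unobserved confounder of X and Y
z = rng.normal(size=n)                        # instrument: affects Y only via X
x = 1.0 * z + 1.0 * u + rng.normal(size=n)    # strong first stage
y = 2.0 * x + 1.5 * u + rng.normal(size=n)    # no direct Z -> Y arrow

# Marginal correlation between instrument and outcome
corr_zy = np.corrcoef(z, y)[0, 1]
print(f"corr(Z, Y) = {corr_zy:.3f}")
```

With these assumed coefficients the population correlation is 2 / sqrt(21.25), roughly 0.43. A "test" that rejects instrument validity whenever Z and Y are correlated would therefore reject a valid instrument.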

The second request is likewise mistaken, because adjusting for X doesn’t d-separate Z and Y. On the contrary, as X is a collider on Z $\rightarrow$ X $\dashleftarrow \dashrightarrow$ Y, conditioning on X opens up the path and thus creates a correlation between Z and Y.²
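The collider effect can be sketched in the same way. The simulation below again uses an assumed linear model with a valid instrument (all coefficients and names are illustrative) and computes the partial correlation of Z and Y given X by residualizing both on X.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Assumed model with a valid instrument (illustrative only)
u = rng.normal(size=n)                        # unobserved confounder of X and Y
z = rng.normal(size=n)                        # instrument, independent of u
x = 1.0 * z + 1.0 * u + rng.normal(size=n)    # X is a collider of Z and u
y = 2.0 * x + 1.5 * u + rng.normal(size=n)    # no direct Z -> Y arrow


def residualize(a, b):
    """Return residuals from a simple OLS regression of a on b (with intercept)."""
    a_c = a - a.mean()
    b_c = b - b.mean()
    slope = np.dot(a_c, b_c) / np.dot(b_c, b_c)
    return a_c - slope * b_c


# Partial correlation of Z and Y given X: correlate the two sets of residuals
partial_corr = np.corrcoef(residualize(z, x), residualize(y, x))[0, 1]
print(f"corr(Z, Y | X) = {partial_corr:.3f}")
```

Under these assumptions the partial correlation is negative (about -0.39 in the population): holding X fixed, a higher Z implies a lower value of the confounder, which in turn lowers Y. Adjusting for X thus induces an association between Z and Y instead of removing one.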

So neither “test” will tell you anything about whether the causal structure in the graph above is correct. Z and Y can be significantly correlated (also conditional on X) even though the instrument is perfectly valid. These tests have no discriminating power whatsoever. Instead, all you can do is argue on theoretical grounds that the IV assumptions are fulfilled.

In general, there is no such thing as purely data-driven causal inference. At some point, you will always have to rely on untestable assumptions that need to be substantiated by expert knowledge about the empirical setting at hand. Causal graphs are of great help here, though, because they make these assumptions transparent and tractable. I see way too many people, all across the ranks, who are confused about the untestability of IV assumptions. If we taught them causal graph methodology more thoroughly, I’m sure this would be less of a problem.

¹ Identification of the LATE additionally requires that the effect of Z on X is monotone. If you want to know more about these and other details of IV estimation, you can have a look at my lecture notes on causal inference here.

² I explain the terms d-separation and colliders both here and here (the latter source is more technical).

No Free Lunch in Causal Inference

Last week I was teaching about graphical models of causation at a summer school in Montenegro. You can find my slides and accompanying R code in the teaching section of this page. It was lots of fun and I got great feedback from students. After the workshop we had stimulating discussions about the usefulness of this new approach to causal inference in economics and business. I’d like to pick up one of those points here, as this is an argument I frequently hear when talking to people with a classical econometrics training.

Causality for Policy Assessment and Impact Analysis

Here is a great introductory lecture on causal inference and the power of directed acyclic graphs / Bayesian networks. It repeats a point I made earlier on this blog: big data alone, without a causal model (i.e., theory) to support it, is simply not sufficient for making causal claims.