A frequent point of criticism against Directed Acyclic Graphs is that writing them down for a real-world problem can be a difficult task. There are numerous possible variables to consider and it’s not clear how we can determine all the causal relationships between them. We recently had a Twitter discussion where exactly this argument popped up again.
I’ve written about this problem before, arguing that DAGs actually don’t have to be that complex if we look at, for example, the models we work with in structural econometrics or economic theory. But Jason Abaluck, professor at the Yale School of Management, brought up an interesting example that might be useful for illustrating what I have in mind.
Here is my reply:
It’s a good point that mapping out what we know in a DAG – especially for uncharted territory – can be complex. Related to the specific example of the college wage premium, I would advise a grad student who studies this question to first do a thorough literature review. That’s the basis for synthesizing what we’ve learned in 50 years or so about the topic. The DAG then serves as a perfect tool for organizing this body of knowledge. Now, for some arrows the decision to include or omit them might be ambiguous. But these are exactly the cases where there is a need for future research. A great opportunity for a fresh grad student.
This process is of course quite tedious, but there isn’t really an alternative to it. When we justify the exogeneity of our instruments, we also need to know all possible confounders that might play a role. The same goes for arguing that there is no self-selection around the discontinuity threshold or that common trends hold. We can only justify these assumptions by synthesizing the prior knowledge we have about the subject under study.
Some people think this would be different with potential outcome methods, but only because we’ve accepted loose standards for arguing verbally about ignorability, exogeneity and causal mechanisms in our papers and seminars. This process is highly non-transparent and prone to arguments by authority.
Going through the entire body of knowledge about a specific problem and casting it into a DAG is cumbersome, I realize. Once we start to make our assumptions more explicit though, others will be able to build on our work. They can then test the proposed model against the available data or look for experimental evidence for ambiguous causal relationships. This process of knowledge curation is not something one paper can achieve alone; it has to be a truly collaborative exercise. I don’t see how we can have real progress in a field without it.
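To make the idea of writing assumptions down explicitly a bit more concrete, here is a minimal sketch in Python. The variables and arrows are purely illustrative assumptions for the college wage premium example, not a summary of the literature, and the helper functions (`descendants`, `common_causes`) are hypothetical names made up for this sketch. It encodes a small DAG as a list of edges and lists the variables that cause both the treatment and the outcome, which is the simplest notion of a confounder.

```python
# Hypothetical DAG for the college wage premium. Every variable and arrow
# here is an illustrative assumption, not an established result.
EDGES = [
    ("ability", "college"),
    ("ability", "wage"),
    ("family_background", "college"),
    ("family_background", "wage"),
    ("distance_to_college", "college"),
    ("college", "wage"),
]


def children(edges, node):
    """Direct effects of `node`: all nodes it points to."""
    return {child for parent, child in edges if parent == node}


def descendants(edges, node, blocked=frozenset()):
    """Nodes reachable from `node` along directed edges, never entering `blocked`."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children(edges, current):
            if child not in seen and child not in blocked:
                seen.add(child)
                stack.append(child)
    return seen


def common_causes(edges, treatment, outcome):
    """Variables that cause the treatment and also reach the outcome
    along a directed path that does not run through the treatment."""
    nodes = {n for edge in edges for n in edge}
    result = set()
    for node in nodes - {treatment, outcome}:
        causes_treatment = treatment in descendants(edges, node)
        reaches_outcome = outcome in descendants(edges, node, blocked={treatment})
        if causes_treatment and reaches_outcome:
            result.add(node)
    return sorted(result)


confounders = common_causes(EDGES, "college", "wage")
print(confounders)  # ['ability', 'family_background']
```

In this toy graph, `distance_to_college` is not flagged because it affects wages only through college attendance. That is exactly the kind of assumption (a missing arrow) that a reader can now see and dispute. A full analysis would use a proper d-separation check rather than this simple common-cause search.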
Last week I attended the AAAI spring symposium on “Beyond Curve Fitting: Causation, Counterfactuals, and Imagination-based AI”, held at Stanford University. Since Judea Pearl and Dana Mackenzie published “The Book of Why”, the topic of causal inference has been gaining momentum in the machine learning and artificial intelligence community. If we want to build truly intelligent machines, which are able to interact with us in a meaningful way, we have to teach them the concept of causality. Otherwise, our future robots will never be able to understand that forcing the rooster to crow at 3 a.m. won’t make the sun appear. Continue reading Beyond Curve Fitting
I just submitted an extended abstract of an upcoming paper to a conference that will discuss new analytical tools and techniques for policymaking. The abstract contains a brief discussion of the importance of causal inference for making informed policy decisions, and I would like to share these thoughts here. Continue reading Causal Inference for Policymaking
Here is an interesting bit of intellectual history. In his 2000 book “Causality”, Judea Pearl describes how he got to the initial idea that sparked the development of causal inference based on directed acyclic graphs. Continue reading The Origins of Graphical Causal Models
In 1957, Zvi Griliches published a seminal article in innovation economics (“Hybrid Corn: An Exploration in the Economics of Technological Change”), which is based on his PhD thesis. It is safe to say that this piece stands at the beginning of innovation developing into an independent subfield of economics. Besides that, Griliches was also a pioneer of modern-style econometric work. In this paper you can clearly see why. It’s a marvellous combination of policy-relevant work (he collected a novel data set on US corn production of the time) and advancement in economic theory. Continue reading How We Started to Study Technological Change
Technology transfer is a big topic for scholars and policy makers. We would like to know how we can harvest the knowledge and ideas that are produced at universities and research institutes and make them available to society. The invention of new technologies is only a first step. They need to be commercialized as innovative products and services to further foster a society’s wealth. Europe especially could do better here. Continue reading How to get knowledge out of the ivory tower?
Fabian Waldinger has pursued an interesting research agenda so far. Let me explain what’s so fascinating about his work. But in order to do so, I first need to describe why (good) empirical research is actually such a hard task. Continue reading Exodus in German science