Beyond Curve Fitting

Last week I attended the AAAI spring symposium on “Beyond Curve Fitting: Causation, Counterfactuals, and Imagination-based AI”, held at Stanford University. Since Judea Pearl and Dana Mackenzie published “The Book of Why”, the topic of causal inference has been gaining momentum in the machine learning and artificial intelligence community. If we want to build truly intelligent machines that are able to interact with us in a meaningful way, we have to teach them the concept of causality. Otherwise, our future robots will never be able to understand that forcing the rooster to crow at 3 a.m. won’t make the sun appear.

Causal inference has always been somewhat of a niche topic in AI. All of the cutting-edge machine learning tools (you know, the ones you’ve heard about, like neural nets, random forests, support vector machines, and so on) remain purely correlational, and therefore cannot discern whether the rooster’s crow causes the sunrise or the other way round. This seems to be changing, though: more and more big shots are starting to recognize the limits of prediction methods and acknowledge the need for major retooling in the community.
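To make the rooster point concrete, here is a minimal sketch in Python. The data-generating mechanism (sunrise time drives crow time, never the reverse), the function names, and all the numbers are assumptions made up for illustration. The correlation coefficient comes out the same in both directions, so it cannot tell us which way the arrow points; only the simulated intervention does.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def observe(n=10_000, seed=0):
    """Observational data under the assumed mechanism: sunrise -> crow."""
    rng = random.Random(seed)
    sunrise = [rng.gauss(6.0, 0.5) for _ in range(n)]          # hour of day
    crow = [s - 0.25 + rng.gauss(0.0, 0.1) for s in sunrise]   # shortly before
    return sunrise, crow


def force_crow(n=10_000, seed=0, crow_time=3.0):
    """Intervention: we make the rooster crow at a fixed time."""
    rng = random.Random(seed)
    sunrise = [rng.gauss(6.0, 0.5) for _ in range(n)]  # mechanism untouched
    crow = [crow_time] * n
    return sunrise, crow


sunrise, crow = observe()
r_forward = pearson(sunrise, crow)
r_backward = pearson(crow, sunrise)

sunrise_do, _ = force_crow()
mean_obs = sum(sunrise) / len(sunrise)
mean_do = sum(sunrise_do) / len(sunrise_do)

print("corr(sunrise, crow):", r_forward)
print("corr(crow, sunrise):", r_backward)
print("mean sunrise, observed vs. forced crow:", mean_obs, mean_do)
```

The symmetric correlation is all a purely predictive method gets to see. The fact that forcing the crow leaves the sunrise unchanged is built into the assumed mechanism, not learned from the data.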

Yoshua Bengio from the University of Montreal, one of the pioneers of deep learning, attended the symposium too. He spoke about transfer learning and causal discovery (the slides are available on the website). One funny anecdote from the event: nobody in the room during his talk (apart from himself, maybe) knew yet that Yoshua would be awarded the 2018 Turing Award (together with Geoffrey Hinton and Yann LeCun) for his contributions to neural networks and AI.

After his presentation, Yoshua excused himself from lunch because he had to take an “important phone call”. That’s when the news broke. So together with Judea Pearl, who gave the keynote on the first day, that already made two Turing Award winners at the symposium.

A personal highlight for me: meeting the father of causal inference in AI — Judea Pearl

Based on Pearl’s seminal work on graph-theoretic causal models (directed acyclic graphs), tremendous progress has been made in the field of causal AI during the last 30 years. But causal inference is obviously also super important in other fields that rely on empirical work, and they have all developed their own idiosyncratic methods for approaching causal questions. The symposium program was thus divided into several “Causality + X” sessions, where X referred to one of the scientific disciplines in which causal inference plays a role:

  • Machine learning and AI
  • Computer vision
  • Social sciences
  • Health sciences / epidemiology

This format created a great opportunity for sharing different perspectives and stimulated learning beyond narrow disciplinary silos.

My session was about causal inference in the social sciences, together with Kosuke Imai from Harvard. I was responsible for representing the economics view.

Session: Causality + social sciences

If you’re interested in my slides, you can have a look here. Soon, Elias Bareinboim (who organized the event, thanks Elias!) and I will also release a working paper in which we’ll go into much more detail on the subject.

To quickly summarize my main message: Having spent considerable time studying the methods for causal inference developed in computer science, I came to the conclusion that economists can learn a lot from engaging with that literature. Of course, that goes the other way round too. So I think we could all benefit tremendously from mutual knowledge exchange, which (I must admit) hasn’t happened to a satisfactory extent so far. But I see many promising signs of improvement. More and more economists express interest in DAG methodology and what it has to offer.

One thing became clear to me when attending the symposium. The field of causal AI is developing rapidly in many directions, and a lot of different fields are currently adopting graph-based approaches to causality. Econ should keep pace if we don’t want to lose touch with these developments. That doesn’t mean that we need to abandon our own unique perspective on causal inference, which is tailored to our specific needs. But coordinating on one basic framework for causal inference has huge potential for cross-fertilization between disciplines. That’s something we’re not nurturing nearly enough at the moment, if you ask me.

No Free Lunch in Causal Inference

Last week I taught graphical models of causation at a summer school in Montenegro. You can find my slides and accompanying R code in the teaching section of this page. It was lots of fun and I got great feedback from students. After the workshop we had stimulating discussions about the usefulness of this new approach to causal inference in economics and business. I’d like to pick up one of those points here, as it is an argument I frequently hear when talking to people with classical econometrics training.

Econometrics and the “not invented here” syndrome: suggestive evidence from the causal graph literature

[This post requires some knowledge of directed acyclic graphs (DAGs) and causal inference. Providing an introduction to the topic goes beyond the scope of this blog, though. But you can have a look at a recent paper of mine in which I describe the method in more detail.]

Graphical models of causation, most notably associated with the name of computer scientist Judea Pearl, have received a lot of pushback from the grandees of econometrics. Heckman had his famous debate with Pearl, arguing that economics looks back on its own tradition of causal inference, going back to Haavelmo, and that we don’t need DAGs.

Judea Pearl on Angrist and Pischke

Today, Judea Pearl commented on a new NBER working paper by Josh Angrist and Jörn-Steffen Pischke in an email to subscribers of the UCLA Causality Blog. I think the text is too good to hide in a mailing list, though. That’s why I will quote it here:

Overturning Econometrics Education
(or, do we need a “causal interpretation”?)

My attention was called to a recent paper by Josh Angrist and Jorn-Steffen Pischke titled; “Undergraduate econometrics instruction” (A NBER working paper)
http://www.nber.org/papers/w23144?utm_campaign=ntw&utm_medium=email&utm_source=ntw

This paper advocates a pedagogical paradigm shift that has methodological  ramifications beyond econometrics instruction;  As I understand it, the shift stands contrary to the traditional teachings of causal inference, as defined by Sewal Wright (1920), Haavelmo (1943), Marschak (1950), Wold (1960), and other founding fathers of econometrics methodology.

In a nut shell, Angrist and Pischke  start with a set of favorite statistical routines such as IV, regression, differences-in-differences among others, and then search for “a set of control variables needed  to insure that the regression-estimated effect of the variable of interest has a causal interpretation” Traditional causal inference (including economics)  teaches us that asking whether the output of a statistical routine “has a causal interpretation” is the wrong question to ask, for it misses the direction of the analysis. Instead, one should start with the target causal parameter itself, and asks whether it is ESTIMABLE (and if so how),  be it by IV, regression, differences-in-differences, or perhaps by some new routine that is yet to be discovered and ordained by name. Clearly, no “causal interpretation” is needed for parameters that are intrinsically causal; for example, “causal effect” “path coefficient”, “direct effect” or “effect of treatment on the treated” or “probability of causation”

In practical terms, the difference  between the two paradigms is that estimability requires a substantive model while interpretability appears to be model-free.
A model exposes its assumptions explicitly, while statistical routines give the deceptive impression that they run assumptions-free ( hence their popular appeal). The former lends itself to judgmental and statistical tests, the latter escapes such scrutiny.

In conclusion, if an educator needs to choose between the “interpretability” and “estimability” paradigms, I would go for the latter. If traditional econometrics education is tailored to support the estimability track, I do not believe a paradigm shift is warranted towards an “interpretation seeking” paradigm as the one proposed by Angrist and Pischke,

I would gladly open this blog for additional discussion on this topic.

I tried to post a comment on NBER (National Bureau of Economic Research), but was rejected for not being an approved “NBER family member”. If any of our readers is a “NBER family member” feel free to post the above.

Note: “NBER working papers are circulated for discussion and comment purposes.” (page 1).

Judea

Update: By now, the text has been published on the causality blog.

Different stages of empirical research

Eventually, the job market stress comes to an end. So I thought I could start the blogging year with a bit of humor. During the last couple of weeks I flew out to both economics and more management-oriented departments. That’s where the inspiration for this little comic came from.

[Comic: different stages of empirical research]

Causality for Policy Assessment and Impact Analysis

Here is a great introductory lecture on causal inference and the power of directed acyclic graphs / Bayesian networks. It repeats a point I made earlier on this blog: big data alone, without a causal model (i.e., theory) to support it, is simply not sufficient for making causal claims.
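A small simulation can illustrate why more data does not help here. The sketch below is a made-up example, not taken from the lecture: it assumes a DAG in which a confounder z affects both x and y, while x has no effect on y at all. The variable names, coefficients, and sample size are all hypothetical.

```python
import random


def mean(vals):
    return sum(vals) / len(vals)


def slope(xs, ys):
    """OLS slope from regressing ys on xs (with intercept)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def residualize(ys, zs):
    """Residuals of ys after regressing on zs (with intercept)."""
    b = slope(zs, ys)
    mz, my = mean(zs), mean(ys)
    return [y - my - b * (z - mz) for y, z in zip(ys, zs)]


# Assumed causal model: z -> x and z -> y, no arrow from x to y.
rng = random.Random(42)
n = 20_000
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [zi + rng.gauss(0.0, 1.0) for zi in z]
y = [2.0 * zi + rng.gauss(0.0, 1.0) for zi in z]  # true effect of x is 0

# Naive regression of y on x picks up the confounded association.
naive = slope(x, y)

# Adjusting for z (via the Frisch-Waugh residualization trick) removes it.
adjusted = slope(residualize(x, z), residualize(y, z))

print("naive slope of y on x:   ", naive)
print("slope after adjusting z: ", adjusted)
```

Increasing n only makes the naive estimate more precisely wrong. What justifies adjusting for z is the assumed graph, i.e. the theory, and nothing in the data by itself tells us that z is a confounder rather than, say, a mediator or a collider.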