Causal Data Science in Business

A while back I posted about Facebook’s causal inference group and how causal data science tools are slowly finding their way from academia into business. Since then I have come across many more examples of well-known companies investing in their causal inference (CI) capabilities: Microsoft released its DoWhy library for Python, providing CI tools based on Directed Acyclic Graphs (DAGs); I recently met people from IBM Research interested in the topic; Zalando is constantly looking for people to join their CI/ML team; and Lufthansa, Uber, and Lyft have research units working on causal AI applications too.

The topic of causal inference seems to be booming at the moment—and for good reasons.

Causal knowledge is crucial for decision-making. Take the example of an advertiser who wants to know how effective her company’s social media marketing campaign on Instagram is. Unfortunately, our current workhorse tools in machine learning are not capable of answering such a question.

A decision tree classifier might give you a very precise estimate that ads which use blue colors and sans-serif fonts are associated with 12% higher click-through rates. But does that mean that every advertising campaign should switch to that combination in order to boost user engagement? Not necessarily. It might just reflect the fact that a majority of Fortune-500 firms—the ones with great products—happen to use blue and sans-serif in their corporate designs.

This is what Judea Pearl, father of causality in artificial intelligence, calls the difference between “seeing” and “doing”. Standard machine learning tools are designed for seeing, observing, discerning patterns. And they’re pretty good at it! But management decisions very often involve “doing”, as long as the goal is to manipulate a variable X (e.g., ad design, team diversity, R&D spending, etc.) in order to achieve an effect on another variable Y (click-through rate, creativity, profits, etc.).
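To make the gap between seeing and doing a bit more tangible, here is a minimal simulation sketch of the ad example in Python. Everything in it is an assumption made up for illustration: the function names (simulate_ads, click_rate), the share of big brands, and all probabilities are hypothetical, and the blue design is given zero causal effect on clicks by construction. It is a toy model, not an analysis of real advertising data.

```python
import random


def simulate_ads(n, seed, force_blue=None):
    """Return n hypothetical ad impressions as (big_brand, blue, clicked) tuples.

    All probabilities below are illustrative assumptions, not real estimates.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Confounder: whether the ad comes from a big brand with great products.
        big_brand = rng.random() < 0.3
        if force_blue is None:
            # "Seeing": big brands choose the blue design far more often.
            blue = rng.random() < (0.8 if big_brand else 0.2)
        else:
            # "Doing": we set the design ourselves, independent of the brand.
            blue = force_blue
        # Clicks depend on the brand only; blue has no causal effect here.
        clicked = rng.random() < (0.10 if big_brand else 0.04)
        rows.append((big_brand, blue, clicked))
    return rows


def click_rate(rows, blue):
    """Click-through rate among impressions with the given design."""
    clicks = [clicked for _, is_blue, clicked in rows if is_blue == blue]
    return sum(clicks) / len(clicks)


N = 100_000

# Seeing: compare blue and non-blue ads in passively observed data.
observed = simulate_ads(N, seed=0)
seeing_gap = click_rate(observed, True) - click_rate(observed, False)

# Doing: force the design in two separate simulated experiments.
all_blue = simulate_ads(N, seed=1, force_blue=True)
none_blue = simulate_ads(N, seed=2, force_blue=False)
doing_gap = click_rate(all_blue, True) - click_rate(none_blue, False)

print(f"seeing (observed association): {seeing_gap:+.3f}")
print(f"doing (effect of intervening): {doing_gap:+.3f}")
```

Under these assumed numbers, the observed gap should come out at roughly three percentage points in favor of blue, while the gap under intervention should hover around zero. A pattern-finding tool trained on the observed data would pick up the first number; a manager deciding whether to switch designs needs the second.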

In my group we recently won a grant for a research project in which we want to learn more about how this crucial difference affects business practices. In particular, we want to know what kind of questions companies are trying to answer with their data science efforts, and whether these questions require causal knowledge. We also want to understand better whether firms are using appropriate tools for their respective business applications, or whether there’s a need for major retooling in the data science community. After all, there might be important questions that currently remain unanswered, because companies lack the causal inference skills to address them. That’s certainly another issue we would like to explore.

So, if you are working in the field of data science and machine learning, and you’re interested in causality, please come talk to us! We would love to hear about your experiences. Slowly but surely, causal inference seems to be developing into one of the hottest trends in the tech sector right now, and our goal is to shed more light on this phenomenon with our research.

Beyond Curve Fitting

Last week I attended the AAAI spring symposium on “Beyond Curve Fitting: Causation, Counterfactuals, and Imagination-based AI”, held at Stanford University. Since Judea Pearl and Dana Mackenzie published “The Book of Why”, the topic of causal inference has been gaining momentum in the machine learning and artificial intelligence community. If we want to build truly intelligent machines that are able to interact with us in a meaningful way, we have to teach them the concept of causality. Otherwise, our future robots will never be able to understand that forcing the rooster to crow at 3 a.m. won’t make the sun appear.

Microsoft Releases New Python Library for Causal Inference

A while ago I blogged about Facebook’s causal inference group. Now Microsoft has followed suit and released a Python library for graph-based methods of causal inference.