October 28, 2018

Repurposing trees for causal research: talk at BU

I'll be giving a talk on Monday, Oct 29, 2018, at Boston University's Questrom School of Business, titled "Repurposing trees for causal research".


Classification & Regression Trees ("trees") and their variants are popular predictive tools in machine learning applications and predictive research. Although studying causal effects and structures is central to research in many areas, trees are not commonly used in causal-explanatory research. In this talk I will describe special uses of trees that we developed for tackling two causal-explanatory issues: self-selection and confounder detection. For self-selection, we develop a novel tree-based approach for adjusting for observable self-selection bias in intervention studies. The approach scales to big data and provides a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. For tackling confounders, we use trees for automated detection of potential Simpson's paradoxes in data with few or many potential confounding variables, even with large samples (big data). Our approach relies on the tree structure and the location of the cause vs. the confounders in the tree. I will illustrate these approaches on applications in eGov, labor economics, and healthcare.
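For readers unfamiliar with Simpson's paradox, here is a minimal Python sketch of what a "reversal" looks like for a single candidate confounder. This is not the tree-based method from the talk; it is only a brute-force check of the paradox itself. The function names, the group labels ("treated", "control"), the stratum labels ("low", "high"), and the counts are all made up for illustration.

```python
def success_rate(cells):
    """Pooled success rate over a list of (successes, total) pairs."""
    successes = sum(s for s, _ in cells)
    total = sum(n for _, n in cells)
    return successes / total


def has_simpson_reversal(counts):
    """Return True if the overall treated-vs-control difference has the
    opposite sign to the difference within every stratum of the confounder.

    counts maps (group, stratum) -> (successes, total), where group is
    "treated" or "control" (hypothetical labels for this sketch).
    """
    strata = sorted({stratum for _, stratum in counts})
    overall_diff = (
        success_rate([counts[("treated", s)] for s in strata])
        - success_rate([counts[("control", s)] for s in strata])
    )
    stratum_diffs = [
        success_rate([counts[("treated", s)]])
        - success_rate([counts[("control", s)]])
        for s in strata
    ]
    all_positive = all(d > 0 for d in stratum_diffs)
    all_negative = all(d < 0 for d in stratum_diffs)
    return (overall_diff < 0 and all_positive) or (overall_diff > 0 and all_negative)


# Illustrative counts only: (successes, total) per (group, stratum).
toy_counts = {
    ("treated", "low"): (81, 87),
    ("treated", "high"): (192, 263),
    ("control", "low"): (234, 270),
    ("control", "high"): (55, 80),
}

print(has_simpson_reversal(toy_counts))
```

In these toy counts the treated group does better within each stratum but worse overall, because the treated cases are concentrated in the "high" stratum where success is rarer for everyone. Checking every candidate confounder this way becomes impractical as the number of variables grows, which is the motivation for an automated approach.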

Relevant papers:

  • Yahav, Shmueli, and Mani (2016), "A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data", MIS Quarterly, vol 40 no 4, pp. 819-848.
  • Shmueli and Yahav (2018), "The Forest or the Trees? Tackling Simpson’s Paradox with Classification Trees", Production and Operations Management, vol 27 no 4, pp. 696-716.