"The Forest or the Trees? Tackling Simpson's Paradox in Big Data with Trees" - at ECIS 2014

Earlier this month, Inbal Yahav (Bar Ilan University) and I presented our joint work on detecting Simpson's Paradox in big data as a poster at ECIS 2014 (thanks to the many interested visitors!) and at SCECR 2014. The work describes an unusual use of classification and regression trees: serving a causal goal rather than their usual predictive one. We develop a tree variant that helps detect possible paradoxes in large datasets. The research-in-progress paper is available here, and the longer version is available on SSRN.