
Many studies use quasi-experiments, which resemble randomized experiments except that subjects are not randomly assigned to the treatment and control groups. The resulting "self-selection bias" must be corrected in the analysis before valid inferences can be drawn about the treatment effect. In a joint paper with Deepa Mani (ISB) and Inbal Yahav (Bar-Ilan University), we propose a new method based on classification and regression trees: "A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data", MIS Quarterly, vol. 40, no. 4, pp. 819-848. The method is also useful for randomized experiments and observational impact studies.
Check out the slide deck for a quick walk-through.
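
To make the self-selection problem concrete, here is a minimal Python sketch on simulated data. It is a simplified illustration of the general idea of using a tree to form comparable subgroups, not the procedure from the paper. Everything in it is an assumption made for the example: the data-generating process, the single covariate `x`, the constant `TRUE_EFFECT`, and the helper names (`simulate`, `best_split`, `leaf_adjusted_effect`). The "tree" is a single split that predicts who chose treatment; the treatment effect is then estimated within each leaf and averaged.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # assumed effect used to generate the toy data


def simulate(n=20000, seed=0):
    """Toy data: units with x > 0.5 opt into treatment more often
    and also have higher baseline outcomes (self-selection)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        high = x > 0.5
        treated = rng.random() < (0.8 if high else 0.2)
        y = TRUE_EFFECT * treated + 3.0 * high + rng.gauss(0.0, 1.0)
        rows.append((x, treated, y))
    return rows


def gini(rows):
    """Gini impurity of the treatment indicator within a group."""
    if not rows:
        return 0.0
    p = sum(t for _, t, _ in rows) / len(rows)
    return 2.0 * p * (1.0 - p)


def best_split(rows):
    """One-split tree with treatment as the target: pick the threshold on x
    that minimizes the size-weighted Gini impurity of the two leaves."""
    best_t, best_score = None, float("inf")
    for t in (i / 20 for i in range(1, 20)):
        left = [r for r in rows if r[0] <= t]
        right = [r for r in rows if r[0] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def leaf_adjusted_effect(rows, threshold):
    """Estimate the effect inside each leaf, then average weighted by leaf size."""
    leaves = [
        [r for r in rows if r[0] <= threshold],
        [r for r in rows if r[0] > threshold],
    ]
    return sum(len(leaf) * diff_in_means(leaf) for leaf in leaves) / len(rows)


if __name__ == "__main__":
    data = simulate()
    split = best_split(data)
    print("naive difference in means:", diff_in_means(data))
    print("split on x at:", split)
    print("leaf-adjusted estimate:", leaf_adjusted_effect(data, split))
```

In this toy setup the naive comparison mixes the treatment effect with the baseline advantage of the units that chose treatment, so it overstates the effect. Comparing treated and untreated units only within leaves, where the propensity to self-select is similar, brings the estimate much closer to the value used to generate the data.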