November 5, 2015

Explain or Predict @ Microsoft Research

The Microsoft Research group in NYC invited me to give a talk on "To Explain or To Predict? How Prediction Can Advance Research". I spent half a day on Nov 3, 2015 at their beautiful lab and learned what "computational social scientists" study. The audience at my talk included folks with computer science backgrounds from the computational social science and machine learning groups, among others. The lively discussion touched on what the scientific method means and requires in social science research.

Special thanks to Shawndra Hill, Duncan Watts, and Hanna Wallach for super interesting conversations and hearty hospitality.

November 1, 2015

Talk @ INFORMS: Trees for Detecting Simpson's Paradox in Big Data

Tomorrow at INFORMS's Data Mining Cluster @ 1:30pm, I'll be presenting my work (with Inbal Yahav) "The Forest or the Trees? Tackling Simpson’s Paradox with Classification and Regression Trees". I'll show how we exploit the tree structure to detect whether a dataset exhibits Simpson's Paradox (a reversal in the direction of a cause-effect relationship when the data are disaggregated). See our working paper on SSRN for more details.
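To make the reversal concrete, here is a minimal sketch in Python. It is not the tree-based method from the paper: it simply checks, for a grouping variable that is already known, whether the better-performing treatment in every subgroup differs from the better-performing treatment in the aggregate. The counts are illustrative, patterned on the widely cited kidney-stone example, and all function names are my own.

```python
# Illustrative counts only: (successes, trials) for each treatment within each subgroup.
data = {
    "small stones": {"A": (81, 87), "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}


def aggregate(groups):
    """Pool the counts over all subgroups, per treatment."""
    totals = {}
    for group in groups.values():
        for treatment, (successes, trials) in group.items():
            s0, n0 = totals.get(treatment, (0, 0))
            totals[treatment] = (s0 + successes, n0 + trials)
    return totals


def better(counts):
    """Return the treatment with the higher success rate."""
    return max(counts, key=lambda t: counts[t][0] / counts[t][1])


def has_reversal(groups):
    """True if every subgroup agrees on a winner that differs from the aggregate winner."""
    within = {better(group) for group in groups.values()}
    return len(within) == 1 and better(aggregate(groups)) not in within


overall_winner = better(aggregate(data))
subgroup_winners = {name: better(group) for name, group in data.items()}
reversal = has_reversal(data)

print("Aggregate winner:", overall_winner)
print("Subgroup winners:", subgroup_winners)
print("Reversal detected:", reversal)
```

The hard part in practice, and the part this sketch sidesteps, is finding which variable (or combination of variables) to disaggregate by when there are many candidates.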

October 27, 2015

Tree-based approach for addressing self-selection in Big Data: forthcoming in MIS Quarterly

My paper A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data with Deepa Mani (Indian School of Business) and Inbal Yahav (Bar-Ilan University) is forthcoming in MIS Quarterly, in the special issue on Transformational Issues of Big Data and Analytics in Networked Business. The paper introduces a novel method based on a classification and regression tree - a tool typically used for prediction in data mining - for use in studies that might suffer from self-selection bias, where observations self-select into the treatment or control group. Our method is an alternative to the well-known Propensity Score approach: it is more automated, simpler to understand, more flexible in terms of assumptions and data types, and especially useful with Big Data.

A working paper of an earlier version is available on SSRN.

October 16, 2015

"Big Data & Analytics in the Digital Creative Industries" Talk at Taipei National University of the Arts

On Oct 17, 2015 @ 10am, I'll be giving a talk on "Big Data & Analytics in the Digital Creative Industries" at Taipei National University of the Arts' Film Making Department, as part of Professor Randy Finch's course Digital Media Entrepreneurship. I'll discuss how Big Data is obtained and used (with Analytics), both by the big content providers and platforms for TV, film, music, etc. and by "outsiders" - entrepreneurs, developers, and researchers.

August 11, 2015

Modeling bimodal discrete data - paper now in print!

My paper Modeling Bimodal Discrete Data Using Conway-Maxwell-Poisson Mixture Models with co-authors Smarajit Bose, Pragya Sur and Paromita Dubey (ISI Kolkata) is finally in print in the ASA's Journal of Business & Economic Statistics. We develop a method for modeling the distribution of bimodal discrete data, such as rankings (on a 5-star scale) and even censored data.

For some mysterious reason, our paper went through two rounds of independent proofs, hence the delay in publication. The good news is that the link (above) to the paper provides a free eprint for the first 30 downloads.
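For readers unfamiliar with the Conway-Maxwell-Poisson (CMP) distribution, here is a minimal sketch in Python of how a two-component CMP mixture can produce a bimodal distribution. The CMP probability is proportional to λ^x / (x!)^ν, where ν = 1 gives the ordinary Poisson and ν > 1 gives under-dispersion. This is only an illustration of the distribution, not the estimation method from the paper; the parameter values, the truncation of the normalizing constant at 200 terms, and the function names are all my own assumptions.

```python
import math


def cmp_pmf(x, lam, nu, terms=200):
    """CMP probability P(X = x); the normalizing sum is truncated at `terms` terms."""
    def log_weight(j):
        # log of lam**j / (j!)**nu, computed on the log scale for stability
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    log_ws = [log_weight(j) for j in range(terms)]
    m = max(log_ws)
    log_z = m + math.log(sum(math.exp(w - m) for w in log_ws))
    return math.exp(log_weight(x) - log_z)


def mixture_pmf(x, p, lam1, nu1, lam2, nu2):
    """Two-component CMP mixture with weight p on the first component."""
    return p * cmp_pmf(x, lam1, nu1) + (1 - p) * cmp_pmf(x, lam2, nu2)


# Hypothetical parameters: one component concentrated near 1, the other near 8.
params = dict(p=0.5, lam1=1.5, nu1=2.0, lam2=70.0, nu2=2.0)
pmf = [mixture_pmf(x, **params) for x in range(30)]

# Interior local modes: values whose probability exceeds both neighbours.
modes = [x for x in range(1, 29) if pmf[x] > pmf[x - 1] and pmf[x] > pmf[x + 1]]

for x in range(13):
    print(x, round(pmf[x], 4))
print("Local modes:", modes)
```

Because each component has its own dispersion parameter ν, the two humps can differ in how tightly they concentrate, which a mixture of ordinary Poissons cannot do.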