Statistical Strategy

I have been tackling some "big picture" questions related to using statistical methods in practice.

My major focus is on assessing the differences among explanatory, predictive, and descriptive modeling across the statistical modeling process (from data collection and goal definition to model use). My paper To Explain or To Predict? discusses the distinction from a statistical point of view. The paper Predictive Analytics in Information Systems Research examines the value of predictive modeling for theory building, testing, and validation, illustrated in information systems research, a field dominated by explanatory modeling.

Quality Control

Runs and Scans in Industrial Statistics

During my Ph.D. I developed a method for computing exact probabilities for random variables that arise when runs or scans are used. A run is a sequence of consecutive successes in a series of Bernoulli trials. A scan is a “window” of consecutive Bernoulli trials that includes at least a given number of successes. Runs and scans are applied in various fields. Although they are easy to understand and use, the random variables that arise tend to have characteristics (e.g., probability functions, moments) that are complicated to compute.
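
As a small illustration of the two definitions above, the following Python sketch computes the exact probability of seeing at least one run, or at least one scan, in n independent Bernoulli trials. It uses a plain dynamic program over the recent trial history. This is a generic approach chosen for illustration and is not necessarily the method developed in the Ph.D. work; the function names and parameters (prob_run, prob_scan, n, k, w, p) are hypothetical and introduced only for this example.

```python
from collections import defaultdict


def prob_run(n, k, p):
    """P(at least one run of k consecutive successes in n Bernoulli(p) trials)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # alive[j]: probability that no run of length k has occurred yet and the
    # current trailing streak of successes has length j (0 <= j < k).
    alive = [1.0] + [0.0] * (k - 1)
    hit = 0.0
    for _ in range(n):
        nxt = [0.0] * k
        nxt[0] = (1.0 - p) * sum(alive)  # a failure resets the streak
        for j in range(k - 1):
            nxt[j + 1] = p * alive[j]    # a success extends the streak
        hit += p * alive[k - 1]          # the streak reaches k: run observed
        alive = nxt
    return hit


def prob_scan(n, w, k, p):
    """P(some window of w consecutive trials contains at least k successes)."""
    if not 1 <= k <= w <= n:
        raise ValueError("need 1 <= k <= w <= n")
    # State: the most recent (at most w - 1) outcomes, tracked only for
    # sequences in which no qualifying window has been seen so far.
    alive = {(): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for history, prob in alive.items():
            for outcome, weight in ((1, p), (0, 1.0 - p)):
                window = history + (outcome,)
                if sum(window) >= k:
                    continue  # scan observed; this path leaves the "alive" set
                keep = window[1:] if len(window) == w else window
                nxt[keep] += prob * weight
        alive = nxt
    return 1.0 - sum(alive.values())
```

For a fair coin, prob_run(3, 2, 0.5) gives 0.375, matching the three sequences out of eight (HHT, THH, HHH) that contain two consecutive successes. A scan with w equal to k reduces to a run, so prob_scan(3, 2, 2, 0.5) returns the same value. The scan version stores up to 2^(w-1) states, which is one simple way to see why these quantities become expensive to compute for wide windows.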

Count Data Model

Shmueli et al. (2005) revived a useful discrete distribution called the COM-Poisson (the Conway–Maxwell–Poisson) and introduced its statistical and probabilistic properties. This distribution is a two-parameter extension of the Poisson distribution that generalizes some well-known discrete distributions (Poisson, Bernoulli and geometric).
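
To make the two-parameter form concrete, here is a minimal Python sketch of the COM-Poisson probability function, P(X = x) proportional to λ^x / (x!)^ν, with the normalizing constant approximated by a truncated sum. The function name com_poisson_pmf, the argument names lam and nu, and the truncation length terms are assumptions made for this illustration and do not come from any particular library.

```python
import math


def com_poisson_pmf(x, lam, nu, terms=200):
    """Approximate P(X = x) for a COM-Poisson(lam, nu) random variable.

    The normalizing constant is an infinite series; here it is truncated
    after `terms` terms, which is an illustrative choice and assumes the
    series has effectively converged by then.
    """
    if lam <= 0 or nu < 0:
        raise ValueError("need lam > 0 and nu >= 0")

    def log_weight(j):
        # log of lam**j / (j!)**nu, computed in log space to avoid overflow
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    logs = [log_weight(j) for j in range(terms)]
    top = max(logs)
    log_z = top + math.log(sum(math.exp(v - top) for v in logs))
    return math.exp(log_weight(x) - log_z)


# nu = 1 recovers the Poisson distribution.
poisson_check = math.exp(-2.0) * 2.0 ** 3 / math.factorial(3)
assert abs(com_poisson_pmf(3, 2.0, 1.0) - poisson_check) < 1e-12

# nu = 0 with lam < 1 recovers a geometric distribution: (1 - lam) * lam**x.
assert abs(com_poisson_pmf(2, 0.5, 0.0) - 0.125) < 1e-12
```

The two checks at the end correspond to the special cases named above: ν = 1 gives the Poisson distribution, and ν = 0 with λ < 1 gives the geometric distribution. The Bernoulli case arises in the limit as ν grows large. Note that for ν = 0 and λ ≥ 1 the series diverges, so the truncated sum is not meaningful there.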

Online Auctions

Empirical research on online auctions such as eBay has been dominated by researchers from economics and information systems. Together with colleagues from information systems and statistics, I have been working on developing statistical methods for visualizing, collecting, modeling, predicting and analyzing such data. Bid data (and other types of eCommerce data) have non-standard structures and therefore require careful and specialized methods.

Two upcoming talks at CityU Hong Kong

This week I will give two talks at the Information Systems Department, College of Business, City University of Hong Kong:

To Explain or To Predict?
Date: 17 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-6-208, Academic Building 3, City University of Hong Kong

A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
Date: 18 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-7-211, Academic Building 3, City University of Hong Kong

Paper on trees for addressing self-selection in impact studies now published in MIS Quarterly

Many studies use quasi-experiments, which are similar to randomized experiments except that subjects are not randomly assigned to the treatment and control groups. The result is what's called "self-selection bias", which must be corrected for in the analysis to obtain valid inference about the treatment effect.

Tree-based approach for addressing self-selection in Big Data: forthcoming in MIS Quarterly

My paper A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data with Deepa Mani (Indian School of Business) and Inbal Yahav (Bar-Ilan University) is forthcoming in MIS Quarterly, in the special issue on Transformational Issues of Big Data and Analytics in Networked Business.

