Statistical Strategy

I have been tackling some "big picture" questions related to using statistical methods in practice.

My major focus is on assessing the differences between explanatory, predictive, and descriptive modeling in terms of the statistical modeling process (from data collection and goal definition to model use). My paper To Explain or To Predict? discusses the distinction from a statistical point of view. The paper Predictive Analytics in Information Systems Research examines the value of predictive modeling to theory building, testing, and validation, illustrated in information systems research, which is monopolized by explanatory modeling.

I have also been working on answering a question that many of my non-statistician colleagues have asked me: how to deal with inference in analyses of large samples. It turns out that quite a few things work differently from small-sample analysis. Our paper Too Big To Fail: Large Samples and the p-Value Problem in Information Systems Research discusses one such issue. Another paper, Linear Probability Models: The Good, The Bad, and The Ugly, re-examines the use of linear regression models for a binary output variable.

A third topic that I am working on with Ron Kenett is the notion of Information Quality, which is the potential of a dataset to answer a particular scientific/practical question using a given data analysis method. Our paper On Information Quality formalizes the concept in terms of definition, characterization, and assessment.

See relevant publications.