Too Big To Fail: Large Samples and the P-Value Problem -- forthcoming in ISR

This weekend, an important paper that I co-authored with Hank Lucas and Mingfeng Lin was accepted to the prestigious journal Information Systems Research. The paper, entitled "Too Big to Fail: Large Samples and the P-Value Problem", describes a critical challenge that arises when modeling large samples. Publications in Information Systems and other social sciences have begun to rely on very large samples for testing theories. The problem is that using "good old statistical significance" to derive practical conclusions can lead researchers astray: with a large enough sample, even very small effects will be statistically significant at traditional levels.
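To make the point concrete, here is a minimal sketch (not taken from the paper) of how the p-value for a fixed, tiny effect shrinks as the sample grows. It assumes a simple two-sample z-test with equal group sizes and unit variance. The function name `two_sided_p_value` and the effect size of 0.01 standard deviations are illustrative choices of mine.

```python
import math


def two_sided_p_value(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test for a standardized mean difference.

    Assumes two equal-sized groups with unit variance, so the standard
    error of the difference in means is sqrt(2 / n_per_group).
    """
    z = effect_size / math.sqrt(2.0 / n_per_group)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical effect: a difference of 1% of a standard deviation.
EFFECT_SIZE = 0.01

for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(EFFECT_SIZE, n)
    print(f"n per group = {n:>9,}: p = {p:.3g}")
```

The effect size is identical in all three rows; only the sample size changes. With a hundred or even ten thousand observations per group the difference is nowhere near significant, while with a million per group the p-value is far below any traditional threshold, even though the effect is just as small in practical terms.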

I hope that the publication of the paper creates more awareness and opens a discussion about new ways of carrying out empirical research in the social sciences with the advent of "big data".

Here's the abstract:
The Internet has provided IS researchers with the opportunity to conduct studies with extremely large samples, frequently well over 10,000 observations. There are many advantages to large samples, but researchers using statistical inference must be aware of the p-value problem associated with them. In very large samples, p-values go quickly to zero, and solely relying on p-values can lead the researcher to claim support for results of no practical significance. In a survey of large sample IS research, we found that a significant number of papers rely on a low p-value and the sign of a regression coefficient alone to support their hypotheses. This research commentary recommends a series of actions the researcher can take to mitigate the p-value problem in large samples and illustrates them with an example of over 300,000 camera sales on eBay. We believe that addressing the p-value problem will increase the credibility of large sample IS research as well as provide more insights for readers.
-------
Note: We already published a working paper in 2009, but toned it down a bit during the review process... The working paper is available on SSRN.

Research, Publications, Modeling
