Finally, some new avenues are opening for publishing work on analytics and big data that are not necessarily statistics journals, machine learning journals, or operations research journals. One such avenue is Big Data, an open-access peer-reviewed journal with Vasant Dhar as editor-in-chief. I'm glad to join the editorial board.
Earlier this month, Inbal Yahav (Bar Ilan University) and I presented our joint work on detecting Simpson's Paradox in big data as a poster at ECIS 2014 (thanks to the many interested visitors!), and at 2014 SCECR. This work describes an unusual use of classification and regression trees for a causal goal, rather than their normal use in prediction. We develop a tree variant that helps detect possible paradoxes in large datasets.
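To make the notion of a paradox concrete, here is a minimal sketch of what a Simpson's Paradox looks like in aggregated versus disaggregated data. It is not the tree-based approach described above. The counts, the group names, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical counts (successes, total) for each subgroup and study arm.
data = {
    "group_1": {"treated": (8, 10), "control": (70, 100)},
    "group_2": {"treated": (30, 100), "control": (2, 10)},
}


def success_rate(successes, total):
    """Proportion of successes."""
    return successes / total


def treated_beats_control(treated, control):
    """True if the treated arm has the higher success rate."""
    return success_rate(*treated) > success_rate(*control)


def pooled(arm):
    """Sum successes and totals for one arm across all subgroups."""
    successes = sum(group[arm][0] for group in data.values())
    total = sum(group[arm][1] for group in data.values())
    return (successes, total)


# Direction of the effect inside each subgroup.
within = [
    treated_beats_control(group["treated"], group["control"])
    for group in data.values()
]

# Direction of the effect after pooling the subgroups.
aggregate = treated_beats_control(pooled("treated"), pooled("control"))

# This sketch checks one direction only: treated wins in every subgroup
# but loses once the subgroups are pooled.
paradox = all(within) and not aggregate
print("within subgroups:", within, "| pooled:", aggregate, "| paradox:", paradox)
```

In this toy example the treated arm does better in both subgroups (0.8 vs. 0.7 and 0.3 vs. 0.2) yet worse overall, because the two arms are spread very unevenly across the subgroups.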
Linear regression is among the most popular statistical models in social sciences research. Linear probability models (LPMs), which are linear regression models applied to a binary outcome, are commonly used for various reasons, despite criticisms of such usage.
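As a small illustration of what an LPM is, the sketch below fits an ordinary least-squares line to a 0/1 outcome. The data and the helper `fit_simple_ols` are made up for this example. It also shows one of the commonly cited criticisms: fitted "probabilities" can fall outside the [0, 1] range.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data (hypothetical): x is a numeric predictor, y is a binary outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

intercept, slope = fit_simple_ols(x, y)

# Fitted values are read as estimated probabilities that y = 1.
fitted = [intercept + slope * xi for xi in x]
prob_at_10 = intercept + slope * 10  # extrapolation beyond the observed x

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"fitted at x = 1: {fitted[0]:.3f}, predicted at x = 10: {prob_at_10:.3f}")
```

Here the slope is interpreted as the change in the probability of the outcome per unit of x, which is part of the model's appeal. The fitted value at x = 1 is negative and the prediction at x = 10 exceeds 1.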
This weekend, an important paper that I co-authored with Hank Lucas and Mingfeng Lin was accepted to the prestigious journal Information Systems Research. The paper, entitled "Too Big to Fail: Large Samples and the P-Value Problem," describes a critical challenge that occurs in modeling large samples. Publications in fields such as Information Systems, as well as other social sciences, have begun to rely on very large samples for testing theories.
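A generic way to see why p-values behave differently in very large samples is to hold a tiny effect fixed and let the sample size grow. The sketch below is not taken from the paper. It assumes a one-sample z-test with known unit variance, and the effect size of 0.01 standard deviations is a hypothetical value.

```python
import math


def two_sided_p_value(effect_size, n):
    """Two-sided p-value of a one-sample z-test with known unit variance.

    effect_size is the observed mean in standard-deviation units.
    """
    z = effect_size * math.sqrt(n)
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


effect_size = 0.01  # hypothetical, practically negligible effect
sample_sizes = [100, 10_000, 1_000_000]
p_values = [two_sided_p_value(effect_size, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n:>9,}  p-value = {p:.3g}")
```

The effect is the same in all three rows. Only the sample size changes, and with it the p-value moves from clearly non-significant to vanishingly small.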
On Information Quality is a paper that I co-author with Ron Kenett. It introduces Information Quality, or InfoQ, a high-level analytics concept that asks about the potential of a particular dataset to answer the question of interest, given a particular data analysis.
My co-authors Ravi Bapna and Wolfgang Jank and I have just been notified that our paper Consumer Surplus in Online Auctions (ISR, 2008) was selected as one of the five best Information Systems papers published in 2008. We're all very excited about the news!
The Matlab modules below contain procedures for plotting control charts of different types, including ones that are based on wavelets. The functions are designed to be independent of specialized Matlab toolboxes. The code is distributed under the GNU General Public License. Note that you can use it (or change it according to the license), but I carry no responsibility for its accuracy or use.
Here are a few snapshots from the software output.