March 10, 2017

Keynote at Discovery Summit Europe (Prague, Mar 22)

To Explain or To Predict travels to Prague! I'll be delivering a keynote address at the upcoming Discovery Summit on Wednesday, March 22, 2017, at 11 AM at the Prague Marriott Hotel. The two other keynote speakers are John Sall (co-founder and executive VP of SAS) and Prof. Ron Kenett.

At this event, Ron Kenett and I will also officially launch our new book Information Quality: The Potential of Data and Analytics to Generate Knowledge!

About this annual event, organized by SAS JMP: "At Discovery Summit Europe, brilliant data analysts gather to exchange best practices in data exploration, and play with both new and proven statistical techniques."

Talk abstract: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction and description. In many disciplines, there is near-exclusive use of statistical modeling for causal explanation with the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge and for proper use in practice.
Understanding the differences between explanatory and predictive modeling and assessment is crucial for being able to assess a data set’s information quality – its potential to achieve a scientific/practical goal using data analysis. While the explain-predict distinction has been recognized in the philosophy of science, the statistical and data mining literatures lack a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. In this talk I will clarify the distinction between explanatory and predictive modeling and reveal its practical implications for data analysis.

January 14, 2017

Two upcoming talks at CityU Hong Kong

This week I will give two talks at the Information Systems Department, College of Business, City University of Hong Kong:

To Explain or To Predict?
Date: 17 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-6-208, Academic Building 3, City University of Hong Kong

A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
Date: 18 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-7-211, Academic Building 3, City University of Hong Kong

January 12, 2017

New book now out: Information Quality

I'm excited to announce that our book Information Quality: The Potential of Data and Analytics to Generate Knowledge (with Ron Kenett) is finally out, just in time for the new year. The book introduces the Information Quality (InfoQ) framework, which is useful for evaluating the potential of a dataset to answer a specific question or achieve a specific goal, given the use of data analysis (statistical modeling, data mining, etc.). It is also useful for evaluating studies that use data analysis.

A bit on the history of InfoQ:

  • Ron and I started thinking about and discussing the topic more than 10 years ago.
  • In 2010 we introduced the framework in our paper On Information Quality (JRSS-A vol 177(1), pp. 3-38, with 6 discussion papers and rejoinder).
  • In 2013 I presented InfoQ in a 30-min webinar hosted by the Royal Statistical Society journal club.
  • We published papers applying the InfoQ framework to different domains: reviewing empirical articles (Helping Reviewers Ask the Right Questions: The InfoQ Framework for Reviewing Applied Research) and official statistics (From Quality to Information Quality in Official Statistics, Journal of Official Statistics, vol 32 no 4, pp. 1–19).
  • We used the InfoQ dimension of Generalization to discuss Reproducibility, Replicability, and Repeatability (Clarifying the terminology that describes scientific reproducibility, Nature Methods, Vol. 12(8), p 699, August 2015).
  • I gave the opening keynote “Information Quality: Can Your Data Do the Job?” at the 11th Statistical Challenges in eCommerce Research (SCECR) Symposium, Addis Ababa, Ethiopia, June 2015.

The book has three parts: (1) the InfoQ framework, (2) applications of InfoQ in different fields (education, healthcare, customer surveys, and more), and (3) implementing InfoQ in software (with a special JMP add-on).

Where now?
Above is InfoQ 101. There's much more going on! See more publications and talks on the InfoQ website and follow news on the Facebook page.

November 20, 2016

Keynote at Israeli Conference on Mechanical Engineering at Technion

On Wednesday morning (Nov 23), I'll be giving a keynote talk at the 2016 Israeli Conference on Mechanical Engineering with the title "Research Using Behavioral Big Data: A Tour and Why Mechanical Engineers Should Care". This is the first time I'll be presenting to an audience of mechanical engineers and I see it as an important opportunity to foster collaborations between the designers and creators of "things" and those using the data generated by the "things". Mechanical engineering is also embracing the era of big data and IoT - the theme of the conference is "Mechanical Engineering in the Internet of Things and Big Data Era". We're experiencing the convergence of engineering, data analytics, and the social sciences; it's a good idea to figure out the landscape!

When and Where: Technion (Haifa, Israel) Churchill Building, Wed 23/11, ~10am.

November 18, 2016

Paper on trees for addressing self-selection in impact studies now published in MIS Quarterly

Many studies use quasi-experiments, which are similar to randomized experiments except that subjects are not randomly assigned to the treatment and control groups. The result is what's called "self-selection bias", which must be corrected in the analysis to allow valid inference about the treatment effect. In a joint paper with Deepa Mani (ISB) and Inbal Yahav (Bar-Ilan U) we propose a new method that is based on classification and regression trees: "A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data", MIS Quarterly, vol 40 no 4, pp. 819-848. The method is also useful for randomized experiments and observational impact studies.

Check out the slide deck for a quick walk-through.
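
For readers who want to see why self-selection matters, here is a minimal simulation sketch. It is not the tree-based method from the paper. It only illustrates the bias itself, and then applies plain stratification on a single observed covariate as a stand-in correction. Everything in it is an assumption made for illustration: the covariate name `motivated`, the opt-in probabilities, the effect sizes, and the function names.

```python
import random


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate a quasi-experiment in which subjects choose their own treatment.

    Hypothetical setup: 'motivated' subjects opt in more often AND have
    higher outcomes regardless of treatment, which creates self-selection bias.
    Returns a list of (motivated, treated, outcome) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        motivated = rng.random() < 0.5
        p_treat = 0.8 if motivated else 0.2  # self-selection into treatment
        treated = rng.random() < p_treat
        outcome = 5.0 * motivated + true_effect * treated + rng.gauss(0, 1)
        rows.append((motivated, treated, outcome))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_estimate(rows):
    """Difference in mean outcome, treated minus control, ignoring selection."""
    treated = [y for _, d, y in rows if d]
    control = [y for _, d, y in rows if not d]
    return mean(treated) - mean(control)


def stratified_estimate(rows):
    """Treated-minus-control difference within each level of the covariate,
    averaged with weights proportional to stratum size."""
    estimate = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        treated = [y for _, d, y in stratum if d]
        control = [y for _, d, y in stratum if not d]
        estimate += len(stratum) / len(rows) * (mean(treated) - mean(control))
    return estimate


if __name__ == "__main__":
    data = simulate()
    print("true effect:        ", 2.0)
    print("naive estimate:     ", round(naive_estimate(data), 2))
    print("stratified estimate:", round(stratified_estimate(data), 2))
```

Under these assumed numbers the naive comparison is inflated well above the true effect of 2 (its expected value is about 5), because the treated group contains many more motivated subjects. Comparing within levels of the covariate brings the estimate back close to 2. Real studies rarely have a single, known selection variable, which is the kind of situation where a data-driven approach such as trees becomes attractive.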