May 1, 2017

Received Outstanding Research Award from Taiwan's Ministry of Science & Technology

I'm delighted to receive (along with five other NTHU professors) the 2016 Outstanding Research Award from the Ministry of Science & Technology.

Here's my best attempt at translating the award's goal (thank you Google Translate): "The Ministry of Science and Technology Outstanding Research Award is given for outstanding long-standing scientific and technological achievements in academic or industry research." (If you can read Chinese, the award description is here and you can try to find my name in the list.)

Thanks for all the warm wishes from colleagues and friends!

May 1, 2017

Talk at NCCU on May 4: "Research Using Behavioral Big Data"

This week I'll deliver a talk at National Chengchi University's Department of MIS on "Research Using Behavioral Big Data". In this talk I'll discuss why Behavioral Big Data (BBD) is different from inanimate big data, physiological big data, and small behavioral data. I'll describe various methodological, technical, ethical and practical issues that behavioral researchers encounter when using BBD.

The topic should be of interest to anyone conducting research with behavioral big data. This talk is part of the MIS PhD seminar taught by Prof. Eldon Li -- if you're interested in joining, please let me know.

Where: Room 260210, Commerce Building, National Chengchi University (#64 Zihnan Road, Section 2, Taipei, Taiwan)
When: Thursday, May 4, 7-9pm

March 10, 2017

Keynote at Discovery Summit Europe (Prague, Mar 22)

To Explain Or To Predict travels to Prague! I'll be delivering a keynote address at the upcoming Discovery Summit on Wednesday, March 22, 2017, at 11am at the Prague Marriott Hotel. The two other keynote speakers are John Sall (co-founder and executive VP of SAS) and Prof. Ron Kenett.

At this event, Ron Kenett and I will also officially launch our new book Information Quality: The Potential of Data and Analytics to Generate Knowledge!

About this annual event, organized by SAS JMP: "At Discovery Summit Europe, brilliant data analysts gather to exchange best practices in data exploration, and play with both new and proven statistical techniques."

Talk abstract: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction and description. In many disciplines, there is near-exclusive use of statistical modeling for causal explanation with the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge and for proper use in practice.
Understanding the differences between explanatory and predictive modeling and assessment is crucial for being able to assess a data set’s information quality – its potential to achieve a scientific/practical goal using data analysis. While the explain-predict distinction has been recognized in the philosophy of science, the statistical and data mining literatures lack a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. In this talk I will clarify the distinction between explanatory and predictive modeling and reveal its practical implications for data analysis.

January 14, 2017

Two upcoming talks at CityU Hong Kong

This week I will give two talks at the Information Systems Department, College of Business, City University of Hong Kong:

To Explain or To Predict?
Date: 17 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-6-208, Academic Building 3, City University of Hong Kong

A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data
Date: 18 Jan, 2017
Time: 2:00pm to 3:30pm
Venue: AC3-7-211, Academic Building 3, City University of Hong Kong

January 12, 2017

New book now out: Information Quality

I'm excited to announce that our book Information Quality: The Potential of Data and Analytics to Generate Knowledge (with Ron Kenett) is finally out, just in time for the new year. The book introduces the Information Quality (InfoQ) framework, which is useful for evaluating the potential of a dataset to answer a specific question or achieve a specific goal, given the use of data analysis (statistical modeling, data mining, etc.). It is also useful for evaluating studies that use data analysis.

A bit on the history of InfoQ:

  • Ron and I started thinking about and discussing the topic more than 10 years ago
  • In 2010 we introduced the framework in our paper On Information Quality (JRSS-A vol 177(1), pp. 3-38, with 6 discussion papers and rejoinder)
  • In 2013 I presented InfoQ in a 30-min webinar by the Royal Statistical Society journal club.
  • We published papers applying the InfoQ framework to different domains: reviewing empirical articles (Helping Reviewers Ask the Right Questions: The InfoQ Framework for Reviewing Applied Research) and official statistics (From Quality to Information Quality in Official Statistics, Journal of Official Statistics, vol 32 no 4, pp. 1–19).
  • We used the InfoQ dimension of Generalization to discuss Reproducibility, Replicability, and Repeatability (Clarifying the terminology that describes scientific reproducibility, Nature Methods, Vol. 12(8), p 699, August 2015).
  • I gave the opening keynote “Information Quality: Can Your Data Do the Job?” at the 11th Statistical Challenges in eCommerce Research (SCECR) Symposium, Addis Ababa, Ethiopia, June 2015.

The book has three parts: (1) the InfoQ framework, (2) applications of InfoQ in different fields (education, healthcare, customer surveys, and more), and (3) implementing InfoQ in software (with a special JMP add-on).

Where now?
Above is InfoQ 101. There's much more going on! See more publications and talks on the InfoQ website and follow news on the FB page.