This morning I delivered the opening keynote address at the 9th International Conference on PLS and Related Methods (2017PLS), in Macau, on "When Prediction Met PLS: What We Learned in 3 Years of Marriage". My slides are now publicly available on SlideShare. Two more sessions today were dedicated to prediction, and even outside those sessions there were several talks focusing on prediction and PLS models.
Keynote at 2017PLS on "When Prediction Met PLS"
Talk at HKUST (May 16): "A tree-based approach for modeling self-selection"
Next Tuesday I'll give a seminar talk on "A Tree-based Approach for Addressing Self-Selection in Impact Studies with Big Data" at The Hong Kong University of Science & Technology, in the department of Information Systems, Business Statistics, and Operations Management (lovely combination!). In the talk, I'll describe the cool tree-based method we developed for addressing self-selection as an alternative to propensity score matching (based on our 2016 MISQ paper with Inbal Yahav and Deepa Mani).
For more details (where, when) see the poster.
For a very light non-technical description, see this 5-min video.
A major challenge in deriving insights from impact studies is differences between the treatment groups due to self‐selection or other factors unrelated to the intervention. We introduce a tree‐based approach adjusting for observable self‐selection bias in intervention studies in management research. In contrast to traditional propensity score matching methods, including those using classification trees as a subcomponent, our tree‐based approach provides a standalone, automated, data‐driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre‐intervention profiles, (3) identification of pre‐intervention variables that correlate with the self‐selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree‐based approach is a useful tool for analyzing observational impact studies as well as for post‐analysis of experimental data. The tree‐based approach is particularly advantageous in the analyses of big data. I'll illustrate the method and the insights it yields in the context of two impact studies with different study designs: reanalysis of a field experiment and observational data on the effect of training on earnings in the US; and analysis of a quasi‐experiment examining the impact of an e‐governance service in India.
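For readers who like to see the idea in code, here is a minimal sketch of one plausible reading of the abstract: grow a small classification tree that predicts who self-selects into the treatment from pre-intervention variables, then compare treated and control outcomes within each leaf. This is not the implementation from the paper. The synthetic data, the variable names (`x0`, `x1`, `treat`, `y`), the Gini splitting rule, and the minimum-gain threshold are all assumptions made for illustration.

```python
import random
from statistics import mean

def gini(labels):
    """Gini impurity of a list of 0/1 treatment labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, features, min_gain=0.01):
    """Pick the binary feature that best separates treated from control.

    min_gain is an illustrative stopping rule, not a value from the paper.
    """
    base = gini([r["treat"] for r in rows])
    best, best_gain = None, min_gain
    for f in features:
        left = [r["treat"] for r in rows if r[f] == 0]
        right = [r["treat"] for r in rows if r[f] == 1]
        if not left or not right:
            continue
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if base - weighted > best_gain:
            best, best_gain = f, base - weighted
    return best

def grow(rows, features, path=()):
    """Recursively split on treatment assignment; return (path, rows) per leaf."""
    f = best_split(rows, features)
    if f is None:
        return [(path, rows)]
    remaining = [g for g in features if g != f]
    leaves = []
    for value in (0, 1):
        subset = [r for r in rows if r[f] == value]
        leaves += grow(subset, remaining, path + ((f, value),))
    return leaves

def effect(rows):
    """Difference in mean outcome between treated and control rows."""
    treated = [r["y"] for r in rows if r["treat"] == 1]
    control = [r["y"] for r in rows if r["treat"] == 0]
    return mean(treated) - mean(control)

# Synthetic data (an assumption): x0 drives both self-selection and the outcome,
# x1 is an irrelevant covariate, and the true treatment effect is 2.
rng = random.Random(0)
rows = []
for _ in range(4000):
    x0, x1 = int(rng.random() < 0.5), int(rng.random() < 0.5)
    treat = int(rng.random() < (0.8 if x0 else 0.2))
    y = 2 * treat + 5 * x0 + rng.gauss(0, 1)
    rows.append({"x0": x0, "x1": x1, "treat": treat, "y": y})

naive = effect(rows)                   # confounded by self-selection on x0
leaves = grow(rows, ["x0", "x1"])      # tree on treatment, not on the outcome
leaf_effects = [(path, effect(leaf_rows)) for path, leaf_rows in leaves]

print("naive effect:", round(naive, 2))
for path, eff in leaf_effects:
    print("leaf", path, "effect:", round(eff, 2))
```

In this toy setup the tree splits only on the variable that drives self-selection, and the within-leaf comparisons sit much closer to the simulated effect than the naive overall difference does. A real analysis would use an established tree algorithm with proper pruning and significance testing.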
Received Outstanding Research Award by Taiwan Ministry of Science & Technology
I'm delighted to receive (along with 5 other NTHU professors) the 2016 Outstanding Research Award from the Ministry of Science & Technology.
Here's my best attempt at translating the award's goal (thank you Google Translate): "The Ministry of Science and Technology Outstanding Research Award is given for outstanding long-standing scientific and technological achievements in academic or industry research". (If you can read Chinese, the award description is here and you can try to find my name in the list.)
Thanks for all the warm wishes from colleagues and friends!
Talk at NCCU on May 4: "Research Using Behavioral Big Data"
This week I'll deliver a talk at National Chengchi University's Department of MIS on "Research Using Behavioral Big Data". In this talk I'll discuss why Behavioral Big Data (BBD) is different from inanimate big data, physiological big data, and small behavioral data. I'll describe various methodological, technical, ethical and practical issues that behavioral researchers encounter when using BBD.
The topic should be of interest to anyone conducting research with behavioral big data. This talk is part of the MIS PhD seminar taught by Prof. Eldon Li -- if you're interested in joining, please let me know.
Where: Room 260210, Commerce Building, National Chengchi University (#64 Zihnan Road, Section 2, Taipei, Taiwan)
When: Thursday, May 4, 7-9pm
Keynote at Discovery Summit Europe (Prague, Mar 22)
To Explain Or To Predict travels to Prague! I'll be delivering a keynote address at the upcoming Discovery Summit on Wednesday March 22, 2017, 11AM at the Prague Marriott Hotel. The two other keynote speakers are John Sall (co-founder and executive VP of SAS) and Prof. Ron Kenett.
At this event, Ron Kenett and I will also officially launch our new book Information Quality: The Potential of Data and Analytics to Generate Knowledge!
About this annual event, organized by SAS JMP: "At Discovery Summit Europe, brilliant data analysts gather to exchange best practices in data exploration, and play with both new and proven statistical techniques."
Talk abstract: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction and description. In many disciplines, there is near-exclusive use of statistical modeling for causal explanation with the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge and for proper use in practice.
Understanding the differences between explanatory and predictive modeling and assessment is crucial for being able to assess a data set’s information quality – its potential to achieve a scientific/practical goal using data analysis. While the explain-predict distinction has been recognized in the philosophy of science, the statistical and data mining literatures lack a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. In this talk I will clarify the distinction between explanatory and predictive modeling and reveal the practical implications in terms of data analysis.
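One narrow, practical piece of this distinction can be sketched in a few lines: a measure of in-sample fit (such as the R² often reported as explanatory power) need not match out-of-sample predictive power. The example below is not taken from the talk. The simulated data, the two toy models (a straight-line fit and a 1-nearest-neighbor rule), and all function names are assumptions chosen only to make the gap visible.

```python
import random
from statistics import mean

def r_squared(actual, predicted):
    """1 - SSE/SST, computed on whichever sample is passed in."""
    m = mean(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - m) ** 2 for a in actual)
    return 1 - sse / sst

def fit_line(xs, ys):
    """Ordinary least squares with a single predictor."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_1nn(xs, ys):
    """Predict with the outcome of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(1)

def simulate(n):
    """Hypothetical data-generating process: y = 2x + noise."""
    xs = [rng.uniform(0, 3) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

x_train, y_train = simulate(500)
x_hold, y_hold = simulate(500)   # holdout set, never used for fitting

results = {}
for name, fit in (("line", fit_line), ("1nn", fit_1nn)):
    model = fit(x_train, y_train)
    r2_in = r_squared(y_train, [model(x) for x in x_train])
    r2_out = r_squared(y_hold, [model(x) for x in x_hold])
    results[name] = (r2_in, r2_out)
    print(name, "in-sample R2:", round(r2_in, 3), "holdout R2:", round(r2_out, 3))
```

The nearest-neighbor rule reproduces the training data perfectly yet predicts new observations worse than the simple line, which is why predictive power has to be assessed on data that were not used for fitting.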