January 4, 2018

Paper "Efficient estimation of COM–Poisson regression and GAM"

Our paper Efficient estimation of COM–Poisson regression and a generalized additive model is now available in Computational Statistics & Data Analysis (with Suneel Chatla). This link gives 50 free downloads before Feb 22, 2018. 

Abstract: The Conway–Maxwell–Poisson (CMP) or COM–Poisson regression is a popular model for count data due to its ability to capture both under dispersion and over dispersion. However, CMP regression is limited when dealing with complex nonlinear relationships. With today’s wide availability of count data, especially due to the growing collection of data on human and social behavior, there is need for count data models that can capture complex nonlinear relationships. One useful approach is additive models; but, there has been no additive model implementation for the CMP distribution. To fill this void, we first propose a flexible estimation framework for CMP regression based on iterative reweighted least squares (IRLS) and then extend this model to allow for additive components using a penalized splines approach. Because the CMP distribution belongs to the exponential family, convergence of IRLS is guaranteed under some regularity conditions. Further, it is also known that IRLS provides smaller standard errors compared to gradient-based methods. We illustrate the usefulness of this approach through extensive simulation studies and using real data from a bike sharing system in Washington, DC.
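
For readers unfamiliar with IRLS, here is a minimal sketch of the idea for ordinary Poisson regression with a log link and a single covariate. This is not the paper's CMP algorithm: the Poisson model is only the special case of CMP with dispersion parameter equal to 1, and the function name irls_poisson and the two-parameter setup are assumptions made for illustration.

```python
import math

def irls_poisson(x, y, n_iter=25, tol=1e-10):
    """Fit log(mu) = b0 + b1*x by iteratively reweighted least squares.

    Illustrative only: plain Poisson regression, one covariate.
    """
    # Start from the intercept-only fit (requires sum(y) > 0).
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        # Linear predictor and fitted means at the current estimates.
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # For Poisson with log link: working weight = mu,
        # working response z = eta + (y - mu) / mu.
        w = mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted least squares of z on (1, x) via the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx ** 2
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1
```

Calling irls_poisson([0, 1, 2, 3, 4], [1, 2, 4, 7, 12]) returns the intercept and slope on the log scale. Each pass is just a weighted least squares fit, which is what makes it natural to swap in penalized spline terms for the additive extension described in the abstract.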


October 25, 2017

Talk at UT Dallas on "Researcher Dilemmas Using Behavioral Big Data"

I'll be talking about "Researcher Dilemmas Using Behavioral Big Data" at The University of Texas at Dallas Naveen Jindal School of Management's Information Systems seminar today.
Time: 10:30-12:00
Location: JSOM Room 13.501

October 19, 2017

"A Tree-based Approach for Addressing Self-Selection in Impact Studies with Big Data" at INFORMS Data Science Workshop

On Saturday morning, I'll present a talk on "A Tree-based Approach for Addressing Self-Selection in Impact Studies with Big Data" at the 1st INFORMS Workshop on Data Science in Houston, TX. I'll describe our proposed tree-based approach, an alternative to propensity score methods that has several advantages over propensity score matching (PSM). This is joint work with Inbal Yahav and Deepa Mani, published in MISQ in 2016.

When: Saturday, Oct 21, Session 1A (9:00 – 10:30)
Where: Hilton Americas-Houston, Level 3, Room 344

October 19, 2017

"Researcher Dilemmas using Behavioral Big Data in Healthcare": Keynote at INFORMS DMDA Workshop

This coming Saturday I'll deliver a keynote talk on "Researcher Dilemmas using Behavioral Big Data in Healthcare" at the 12th INFORMS Workshop on Data Mining and Decision Analytics in Houston, TX.

When: Saturday, Oct 21, 13:45-14:30
Where: Hilton Americas-Houston, Level 3, Room 339

Behavioral big data (BBD) refers to very large and rich multidimensional data sets on human and social behaviors, actions, and interactions, which have become available to companies, governments, and researchers. A growing number of researchers acquire and analyze BBD for the purpose of extracting knowledge and scientific discoveries. However, the relationships between the researcher, data, human subjects, and research questions differ in the BBD context compared to non-BBD and even traditional behavioral data. Researchers using BBD face not only methodological and technical challenges but also ethical and moral dilemmas. In this talk, I will discuss several dilemmas, challenges, and trade-offs related to acquiring and analyzing BBD in healthcare research.

September 6, 2017

R edition of Data Mining for Business Analytics textbook now available!

Wiley just notified us that our new textbook Data Mining for Business Analytics in R is out! Thanks to all those who've encouraged us to write the R edition, to the beta testers, and to the many folks who've been holding their breath. And thanks to Professors Gareth James and Ravi Bapna for writing wonderful Forewords!

The R edition covers the same topics as the 3rd edition of Data Mining for Business Analytics with XLMiner that came out last year. This Fall I am teaching a course that allows students to choose between the two editions.

As with the other editions, all datasets (and R code!) are available at Adopting instructors can get access to instructor materials that include slides, solutions to end-of-chapter problems and cases, and more.