Machine Learning in Health Care

The NBER's conference on Machine Learning in Health Care took place May 10 in Cambridge. Research Associates David M. Cutler of Harvard University and Sendhil Mullainathan of the University of Chicago, together with Ziad Obermeyer of the University of California, Berkeley, organized the meeting. The following researchers' papers were presented and discussed:

Hagai Rossman and Smadar Shilo, Weizmann Institute of Science

Childhood Obesity Prediction and Risk Factor Analysis from Nationwide Health Records


Jason Abaluck, Yale University and NBER; Leila Agha, Dartmouth College and NBER; and David C. Chan Jr, Stanford University and NBER

Who Should Get Blood? Personalizing Medicine with Heterogeneous Treatment Effects

Randomized clinical trials typically estimate average treatment effects within selected populations. With modern medical records and quasi-experimental research designs, it is now possible to estimate heterogeneous treatment effects using vastly more information. Abaluck, Agha, and Chan present evidence that applying such estimates to inform clinical decisions could lead to large health benefits, outperforming both status quo physician decisions and strict applications of current medical guidelines. They study blood transfusion decisions for 1.6 million patients with anemia receiving inpatient care at Veterans Health Administration hospitals from 2000 to 2015. They first show that observed treatment decisions are largely invariant to a wide array of observable patient characteristics, with the exception of blood hemoglobin levels. In contrast, treatment effects estimated by naively assuming unconfoundedness vary substantially with patient characteristics. Using instruments based on quasi-random assignment of patients to physicians, the researchers find that much of the measured heterogeneity in the naive "observational" treatment effects reflects heterogeneity in underlying causal effects rather than selection. In counterfactual simulations, they find that better targeting the existing number of transfusions would reduce the total 30-day mortality rate in the study population by 1.1 percentage points, from a base of 9 percent.
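The counterfactual exercise described above, holding the number of transfusions fixed while changing who receives them, can be illustrated with a minimal sketch. This is not the authors' simulation: the function names, the toy risks, and the per-patient effect estimates are hypothetical, and the sketch assumes that each patient's estimated effect of transfusion on mortality risk is already in hand (negative values mean transfusion lowers risk). For scale, a 1.1 percentage point reduction from a 9 percent base corresponds to a relative reduction of roughly 12 percent.

```python
from typing import List


def retarget(effect: List[float], treated: List[bool]) -> List[bool]:
    """Reassign the same number of treatments to the patients with the
    most negative (most beneficial) estimated effect on mortality."""
    n_treated = sum(treated)
    # Sort patient indices from largest predicted benefit to smallest.
    order = sorted(range(len(effect)), key=lambda i: effect[i])
    chosen = set(order[:n_treated])
    return [i in chosen for i in range(len(effect))]


def expected_mortality(baseline_risk: List[float],
                       effect: List[float],
                       treated: List[bool]) -> float:
    """Mean mortality risk: untreated risk plus the effect for treated patients."""
    total = sum(r + (e if t else 0.0)
                for r, e, t in zip(baseline_risk, effect, treated))
    return total / len(baseline_risk)


# Hypothetical toy data for four patients (illustration only).
baseline_risk = [0.10, 0.08, 0.12, 0.06]   # risk without transfusion
effect = [-0.04, 0.01, -0.02, 0.02]        # estimated effect of transfusion
status_quo = [False, True, False, True]    # observed decisions

targeted = retarget(effect, status_quo)
before = expected_mortality(baseline_risk, effect, status_quo)
after = expected_mortality(baseline_risk, effect, targeted)
print(f"status quo: {before:.4f}, retargeted: {after:.4f}")
```

The sketch keeps the treatment count constant by construction, so any change in expected mortality comes only from reallocation, which is the comparison the summary describes.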


Emma J. Pierson and Jure Leskovec, Stanford University, David M. Cutler, Sendhil Mullainathan, and Ziad Obermeyer

Using Machine Learning to Explain Socioeconomic and Racial Gaps in Pain (slides)


Rediet Abebe, Cornell University; Shawndra Hill and Jennifer Wortman Vaughan, Microsoft Research; Peter M. Small, Rockefeller Foundation; and H. Andrew Schwartz, Stony Brook University

Using Search Queries to Understand Health Information Needs in Africa

The lack of comprehensive, high-quality health data in developing nations creates a roadblock for combating the impacts of disease. One key challenge is understanding the health information needs of people in these nations. Without understanding people's everyday needs, concerns, and misconceptions, health organizations and policymakers lack the ability to effectively target education and programming efforts. In this paper, Abebe, Hill, Vaughan, Small, and Schwartz propose a bottom-up approach that uses search data from individuals to uncover and gain insight into health information needs in Africa. They analyze Bing searches related to HIV/AIDS, malaria, and tuberculosis from all 54 African nations. For each disease, they automatically derive a set of common search themes or topics, revealing widespread interest in various types of information, including disease symptoms, drugs, and concerns about breastfeeding, as well as stigma, beliefs in natural cures, and other topics that may be hard to uncover through traditional surveys. The researchers show how patterns in health information needs differ by demographic group (age and sex) and by country. They also uncover discrepancies, by topic, in the quality of content that search engines return to users. Combined, their results suggest that search data can help illuminate health information needs in Africa and inform discussions on health policy and targeted education efforts both on- and offline.


Tony Duan, Pranav Rajpurkar, Dillon Laird, Andrew Ng, and Sanjay Basu, Stanford University

Clinical Value of Predicting Individual Treatment Effects for Intensive Blood Pressure Therapy: A Machine Learning Experiment to Estimate Treatment Effects from Randomized Trial Data (slides)

Basu, Duan, Laird, Ng, and Rajpurkar compared a conventional logistic regression approach for modeling heterogeneous treatment effects (HTEs) with the X-learner machine learning (ML) approach, applying both to individual participant data from randomized trials of intensive blood pressure treatment. The ML approach correctly revealed that an individual patient's predicted absolute benefit from intensive treatment was not necessarily proportional to their baseline cardiovascular disease (CVD) risk. This contradicts prior hypotheses that simply calculating baseline risk is sufficient to guide therapy, and it highlights the clinical importance of HTE estimation for predicting individual treatment effects. The researchers also observed that the ML approach had significantly better discriminative ability, evident in higher C-for-benefit and decision value restricted mean survival time (RMST) statistics. The ML approach also partitioned participants into a benefit subgroup that showed a higher empirical absolute risk reduction (ARR) than the no-benefit subgroup, whereas the difference between subgroups was more modest for the logistic regression model. Finally, the ML approach was better calibrated than the logistic regression model, which overestimated the ARR attributable to intensive blood pressure treatment.
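For readers unfamiliar with the X-learner, its three stages can be sketched as follows. This is a toy illustration, not the authors' implementation: the summary does not say which base learners they used, so this sketch substitutes a one-feature least-squares line, uses a continuous synthetic outcome rather than trial event data, and assumes a constant propensity of 0.5 as in a 1:1 randomized trial. All names (fit_line, x_learner, the toy arrays) are hypothetical.

```python
from typing import Callable, List


def fit_line(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Ordinary least squares with one feature; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def x_learner(x: List[float], y: List[float], w: List[int],
              propensity: float = 0.5) -> Callable[[float], float]:
    """X-learner with linear base learners (an assumption of this sketch)."""
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]
    # Stage 1: separate outcome models for control and treated arms.
    mu0, mu1 = fit_line(x0, y0), fit_line(x1, y1)
    # Stage 2: impute individual effects using the other arm's model,
    # then regress the imputed effects on the covariate within each arm.
    d1 = [yi - mu0(xi) for xi, yi in zip(x1, y1)]
    d0 = [mu1(xi) - yi for xi, yi in zip(x0, y0)]
    tau1, tau0 = fit_line(x1, d1), fit_line(x0, d0)
    # Stage 3: combine the two effect models, weighting by the propensity.
    return lambda q: propensity * tau0(q) + (1.0 - propensity) * tau1(q)


# Noiseless toy data: control outcome 1 + 2x, true effect 0.5 - x.
x = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
w = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y = [1 + 2 * xi + (0.5 - xi) * wi for xi, wi in zip(x, w)]

tau = x_learner(x, y, w)
print(round(tau(0.25), 6), round(tau(0.75), 6))
```

In this toy example the estimated effect changes sign across the covariate even though the control-arm outcome rises steadily with it, which mirrors the point in the summary that predicted benefit need not be proportional to baseline risk.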


Michael A. Ribers, University of Copenhagen, and Hannes Ullrich, DIW Berlin

Battling Antibiotic Resistance: Can Machine Learning Improve Prescribing?

Machine learning methods increasingly give economists opportunities to design welfare-improving policies for prediction problems. The alarming increase in antibiotic resistance caused by the misuse of antibiotics presents one such opportunity. To avoid misuse, predicting the bacterial causes of infections is key. Ribers and Ullrich combine administrative and laboratory data from Denmark to evaluate how machine prediction of bacterial causes of urinary tract infections can improve primary care prescribing. They propose assessing machine prediction-based policies against human decision making by defining feasible policies based on information available prior to their implementation. In contrast to existing work on prediction policy problems, in their setting patient test outcomes are observed independently of physician prescription choices. This allows the researchers to address the relevance of unobservables for physician decisions and to directly evaluate prescription policies based on machine prediction. Ribers and Ullrich find that policies combining machine prediction-based rules with physician discretion can lower antibiotic use by 7.5 percent without reducing the number of treated bacterial infections. Because Denmark is one of the most conservative countries in terms of antibiotic prescribing, this result may be a lower bound on what can be achieved elsewhere.
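A policy that combines machine prediction-based rules with physician discretion might look like the minimal sketch below. It is an illustration under stated assumptions, not the rule Ribers and Ullrich evaluate: the thresholds, function names, and toy data are all hypothetical. The machine decides when its predicted probability of a bacterial cause is very low or very high, and the physician's own choice stands in between. Because test outcomes are observed regardless of the prescription decision, both the number of prescriptions and the number of treated bacterial infections can be counted directly under each policy.

```python
from typing import List, Tuple


def combined_policy(risk: List[float], physician_rx: List[bool],
                    low: float = 0.2, high: float = 0.8) -> List[bool]:
    """Withhold below `low`, prescribe above `high`, else defer to the physician.
    The thresholds are hypothetical values chosen for illustration."""
    decisions = []
    for p, rx in zip(risk, physician_rx):
        if p < low:
            decisions.append(False)   # machine rule: bacterial cause unlikely
        elif p > high:
            decisions.append(True)    # machine rule: bacterial cause likely
        else:
            decisions.append(rx)      # physician discretion
    return decisions


def evaluate(rx: List[bool], bacterial: List[bool]) -> Tuple[int, int]:
    """Return (number of prescriptions, number of treated bacterial infections)."""
    n_rx = sum(rx)
    n_treated = sum(1 for r, b in zip(rx, bacterial) if r and b)
    return n_rx, n_treated


# Hypothetical toy data for six consultations (illustration only).
risk = [0.05, 0.10, 0.50, 0.90, 0.95, 0.40]           # predicted P(bacterial)
physician_rx = [True, True, True, False, True, False]  # observed prescribing
bacterial = [False, False, True, True, True, False]    # laboratory test result

policy_rx = combined_policy(risk, physician_rx)
print("physician:", evaluate(physician_rx, bacterial))
print("combined: ", evaluate(policy_rx, bacterial))
```

On this toy data the combined policy issues fewer prescriptions while treating no fewer bacterial infections, which is the kind of comparison the summary reports; the magnitudes here carry no empirical meaning.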