New study reveals flaws in statistical modeling approach used in health services research
Findings from a new study conducted jointly at Dartmouth’s Geisel School of Medicine and Harvard Medical School, and published in Health Services Research, highlight the statistical drawbacks of one form of analysis commonly used in health services research while demonstrating the benefits of another.
“An ongoing goal of health services research has been to understand the reasons for variation in healthcare delivery, whether it’s between physicians, hospitals, or geographic areas–and how that variation may contribute to disparities in healthcare across different types of patients (defined by characteristics such as their sex, race, or socio-economic status),” says James O’Malley, MS, Ph.D., a professor of The Dartmouth Institute for Health Policy and Clinical Practice and of biomedical data science at the Geisel School of Medicine, who served as lead author on the study.
One potential source of variation involves individual physicians who might consciously or unconsciously make different clinical decisions for different types of patients who have similar health conditions.
To determine whether an identified disparity in care stems from individual physicians’ decision-making or from systemic factors that affect specific patient populations across all physicians, researchers use statistical modeling, which helps them account for uncertainty and for variable and, in particular, small sample sizes, says O’Malley.
“But some researchers make the mistake of using stratified approaches in their analyses, which basically involve running a separate analysis to estimate each physician’s treatment patterns for each type of patient and then computing a correlation based on those estimates,” he explains. “This can underestimate the consistency of care patterns and may lead to incorrect conclusions about the sources of variation and disparities in care.”
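The underestimation O’Malley describes can be illustrated with a small simulation. The sketch below is not the study’s own simulation: the function name, the logit-scale model, and every parameter value (number of physicians, patients per type, spread of physician tendencies) are assumptions chosen only for illustration. It gives each hypothetical physician two highly correlated admission tendencies, estimates an admission rate separately for each patient type, and then correlates those noisy estimates, which is the stratified procedure described above.

```python
import numpy as np

def stratified_correlation(n_physicians, n_per_type, rho,
                           sd=0.5, base_logit=-0.5, seed=0):
    """Simulate admissions and return the stratified correlation estimate.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    # True physician admission tendencies (logit scale) for two patient
    # types, drawn from a bivariate normal with true correlation rho.
    cov = sd**2 * np.array([[1.0, rho], [rho, 1.0]])
    effects = rng.multivariate_normal([0.0, 0.0], cov, size=n_physicians)
    probs = 1.0 / (1.0 + np.exp(-(base_logit + effects)))
    # Stratified step: a separate admission-rate estimate for each
    # physician and patient type, each based on n_per_type patients.
    admits = rng.binomial(n_per_type, probs)
    rates = admits / n_per_type
    # Correlate the two columns of noisy per-physician estimates.
    return np.corrcoef(rates[:, 0], rates[:, 1])[0, 1]

# Compare the stratified estimate with the true correlation (0.95 here)
# as the number of patients per physician and type grows.
for n in (20, 200, 2000):
    print(n, round(stratified_correlation(2000, n, rho=0.95), 2))
```

With few patients per physician, the printed correlation should fall well below the true value used to generate the data, and it should move toward that value as the per-physician sample grows.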
To test this hypothesis, the research team used Medicare claims and enrollment data on emergency department (ED) visits from January 2012 to September 2015. The data, which included patient characteristics, hospital status, and identification of the physicians responsible for deciding to hospitalize the patient, were used to assess the physicians’ propensity to admit patients from the ED across different patient types.
Using a three-pronged investigation, including analytical derivation, simulation experiments, and analysis of claims data from the ED application that motivated the research, the researchers compared stratified estimators to those of joint modeling–an approach thought to be more accurate but not yet widely used in health services research.
In the context of the ED application, joint modeling analyzes the data from all patients simultaneously and directly estimates the correlation of physician treatment patterns for different patient types across the population of physicians. This fully utilizes the information in the data and accounts for the uncertainty in each physician’s treatment of each type of patient.
“We were able to demonstrate that the joint modeling approach was substantially less biased than the stratified approach, and that the importance of joint modeling becomes more pronounced when sample sizes are small and the true correlations are large (close to 1–corresponding to high consistency),” O’Malley says.
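One way to see why both conditions matter is the classical attenuation argument. The sketch below rests on simplifying assumptions that are not taken from the paper: each stratified estimate for physician j and patient type k is the physician’s true tendency plus independent sampling noise, the true tendencies have between-physician variance \tau_k^2 and correlation \rho, and the noise variance is \sigma_k^2 / n_k for n_k patients.

```latex
\hat\theta_{jk} = \theta_{jk} + \varepsilon_{jk}, \qquad
\operatorname{Var}(\theta_{jk}) = \tau_k^2, \qquad
\operatorname{Var}(\varepsilon_{jk}) = \frac{\sigma_k^2}{n_k}

\operatorname{Corr}\bigl(\hat\theta_{j1}, \hat\theta_{j2}\bigr)
  = \rho \sqrt{\lambda_1 \lambda_2}, \qquad
\lambda_k = \frac{\tau_k^2}{\tau_k^2 + \sigma_k^2 / n_k}
```

Because each \lambda_k is below 1 for any finite n_k, the correlation of the stratified estimates is pulled toward zero. The shortfall, \rho (1 - \sqrt{\lambda_1 \lambda_2}), grows as n_k shrinks and as \rho approaches 1, which is consistent with the pattern described in the quote above.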
For example, the research team found that the estimated correlation of physician admission tendencies between female and male patients was .98 under the joint model but only .38 using stratified estimation (because the stratified approach tends to underestimate these correlations, the values closer to 1 are considered less biased and more accurate). Similarly, the estimates were .99 versus .28 when comparing white and non-white patients, and .99 versus .31 when comparing insured and non-insured patients.
The fallibility of the stratified approach, says O’Malley, has major implications for analyses that seek to determine the extent to which different types of providers (i.e., physicians and hospitals) contribute to disparities and inequity (such as in cases of racism) in healthcare utilization and outcomes.
“The difference between a correlation of .9 and .3 is really quite profound, and may motivate quite distinct interventions to address disparities, so it’s a situation where there’s a high ante, if you will, on getting the statistical analysis right,” he says.
Mistaken or naive use of stratified estimation may have led in the past to misleading findings being published, particularly for studies of variations in healthcare utilization, quality, cost, and outcomes, says O’Malley.
“We hope this paper increases awareness of concerns with stratified estimation–whenever evaluating similarities or differences of providers’ treatment patterns across patient types–and that this practice is avoided in the future,” he says.
Alistair James O’Malley et al, Weak correlations in health services research: Weak relationships or common error?, Health Services Research (2021). DOI: 10.1111/1475-6773.13882
Health Services Research
The Geisel School of Medicine at Dartmouth