Improving Predictive Healthcare Models by Filtering for Racial Differences in Data

Authored by Ayesha Rajan, Research Analyst at Altheia Predictive Health


Risk calculators are at the forefront of analytics in healthcare. Using ranges to understand who might be at risk of contracting a disease, and when they might contract it, is powerful, but the value of analytics does not stop there. Analytics can also help us better understand disease progression and manage symptoms. However, these tools often underserve racial and ethnic minorities because the metabolic and blood panels they rely on lack race-adjusted ranges. To better serve the general population, racial differences in health data must be taken into consideration and used where applicable so that analytics benefits everyone. In recent years, several studies have explored this topic, and this article looks at some of them.

Current Problem

The biggest challenge in this research space is collecting data. For example, research on the link between breast cancer and race in women suggests that at least some disparities in cancer diagnoses boil down to racial differences. One of the contributing factors to this disparity is that many randomized clinical trials stall due to lack of enrollment (Zewde)[1].

Consequently, data segmented by racial differences can be difficult to obtain. The next biggest challenge is identifying when race is the impactful variable. People of the same race and ethnicity often share similar cultural practices, so relationships, lifestyle, location, and other variables can influence how panel data and race interact. One way that the National Center for Biotechnology Information (NCBI) suggests tackling this is by administering more comprehensive questionnaires so that such parameters can be factored out to identify the root cause of a disparity.
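To make the idea of factoring out a parameter concrete, here is a minimal sketch of one common approach, stratification: compare outcome rates between groups overall, then again within each level of a questionnaire variable. Everything in it is an assumption made for illustration. The function name, the field names ("group", "diet", "flagged"), and the toy records are hypothetical and do not come from NCBI or from any real dataset.

```python
from collections import defaultdict


def outcome_rates(records, group_key, outcome_key, stratum_key=None):
    """Return {stratum: {group: rate}} for a 0/1 outcome.

    With stratum_key=None, all records share one stratum named "all",
    which gives the crude (unadjusted) rate per group.
    """
    # counts[stratum][group] holds [positives, total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for rec in records:
        stratum = rec[stratum_key] if stratum_key else "all"
        cell = counts[stratum][rec[group_key]]
        cell[0] += rec[outcome_key]
        cell[1] += 1
    return {
        stratum: {group: pos / total for group, (pos, total) in groups.items()}
        for stratum, groups in counts.items()
    }


def make(group, diet, flagged, n):
    """Build n identical toy records (synthetic, illustration only)."""
    return [{"group": group, "diet": diet, "flagged": flagged} for _ in range(n)]


# Synthetic records: the two groups differ in how common each diet is,
# but within a given diet they are flagged at the same rate.
records = (
    make("group_a", "high_sodium", 1, 2) + make("group_a", "high_sodium", 0, 2)
    + make("group_a", "low_sodium", 0, 2)
    + make("group_b", "high_sodium", 1, 1) + make("group_b", "high_sodium", 0, 1)
    + make("group_b", "low_sodium", 0, 4)
)

crude = outcome_rates(records, "group", "flagged")
by_diet = outcome_rates(records, "group", "flagged", stratum_key="diet")
print(crude)
print(by_diet)
```

In this toy data the crude rates differ between the two groups, while the rates within each diet stratum are identical, which is the pattern one would expect when a lifestyle variable rather than group membership drives the gap. Real analyses would typically adjust for several variables at once, for example with regression, rather than a single stratum.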

Emerging Technology and Studies

This field of research is central to our mission at Altheia Predictive Health. Our proprietary predictive health models take race into account when creating risk ranges to ensure that each individual receives information that is personalized to their background. We can see in much of our research that risk ranges vary among racial and ethnic groups, with many minorities being classified at a higher risk than Caucasian Americans even when the same variable is measured. By including race as a parameter in predictive algorithms, we can train machines to better interpret and apply the most accurate data possible and, as a result, increase the accuracy of these algorithms.
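As a rough illustration of what group-specific risk ranges can look like in code, the sketch below looks up cut points by group and falls back to a default range when no group-specific range exists. It is not Altheia's proprietary model. The names THRESHOLDS, LABELS, and classify_risk are hypothetical, and the cut points are placeholder numbers with no clinical meaning.

```python
from bisect import bisect_right

# Hypothetical cut points per group: (low/elevated boundary, elevated/high boundary).
# Placeholder numbers for illustration only; they have no clinical meaning.
THRESHOLDS = {
    "default": (10.0, 20.0),
    "group_a": (8.0, 16.0),
}
LABELS = ("low", "elevated", "high")


def classify_risk(value, group, thresholds=THRESHOLDS):
    """Map a panel value to a risk band using the group's cut points.

    Groups without their own entry use the "default" range. A value
    exactly equal to a cut point falls into the higher band.
    """
    cuts = thresholds.get(group, thresholds["default"])
    return LABELS[bisect_right(cuts, value)]


# The same measured value can land in different bands for different groups.
print(classify_risk(9.0, "group_b"))  # no entry for group_b, so default cut points apply
print(classify_risk(9.0, "group_a"))
```

A lookup table keeps the group-specific ranges separate from the classification logic, so ranges can be revised as new evidence arrives without touching the code that applies them.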

This area of research extends beyond diagnosing and managing diseases: analytics can also identify racial disparities in care management programs. In a study at Portland State University, researchers observed patients in a hospital emergency room and studied the way nurses and physicians interacted with people of varying races and ethnicities. Researchers found that “Black patients were 32 percent less likely to receive pain medication than white patients, while Hispanic patients were 21 percent less likely to receive pain medication than their white counterparts. Asian patients were 24 percent less likely to receive pain medication than white individuals. This was despite the fact that black and Hispanic patients reported higher average pain scores than white patients.” [2]


Ultimately, analytics applications are a tool and just one piece of the puzzle; a human touch will always be necessary to bring the entire picture together. Without taking race and ethnicity into account, analytics applications lack accuracy, and human interpretation adds the context that predictive analytics models need to better serve a much wider community. As this field continues to develop, the biggest struggle for researchers will continue to be lack of enrollment in studies. However, by expanding the questions asked and the information documented in Electronic Health Records for those who do participate in studies, we can make great strides in determining when race and ethnicity are strongly correlated with disease contraction and progression.

Works Cited

[1] Zewde, Makda. “Tracking Health Disparities with Big Data.” NPHR Blog, 20 Oct. 2017.

[2] Kent, Jessica. “EHR Data Reveals Racial Disparities in Emergency Pain Treatment.” HealthITAnalytics, 20 Dec. 2019.