Risks of AI Prediction Performance Should Be Measured, Especially in Critical Areas like Health Care

Errors in health-care applications can lead to life-threatening misdiagnoses and lost opportunities for early intervention.

Suppose Tinyiko from Duthuni Village goes to Elim Hospital. The nurse, Thuso, orders an emergency x-ray image of Tinyiko’s lung to ascertain whether he has a condition called a pulmonary embolism. Because no doctor is in sight, Thuso uses an AI system that predicts whether Tinyiko has a pulmonary embolism.

The AI system makes a diagnostic assessment that says Tinyiko has no pulmonary embolism. AI systems like this have been under development for a long time. For example, in 2007, Simon Scurrell, David Rubin and I developed an AI system that predicts whether a patient has a pulmonary embolism.

With the increase in data and computational power, these systems are beginning to exceed the accuracy of human doctors.   

The crucial question is whether an AI system that only predicts whether a patient such as Tinyiko has a pulmonary embolism is enough. Such a system can determine whether Tinyiko has a pulmonary embolism and, in addition, state its confidence in that prediction. For example, the AI system can quantify the prediction risk by stipulating that it is 80% confident that Tinyiko has a pulmonary embolism.
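The idea of a prediction that carries its own confidence level can be sketched in a few lines of Python. This is only an illustration, not the system described in this article: the function name predict_with_confidence, the 0.5 decision threshold and the input probability are all assumptions made for the example.

```python
def predict_with_confidence(p_embolism: float, threshold: float = 0.5):
    """Turn an estimated probability of embolism into a label and a confidence.

    p_embolism is assumed to come from some upstream model (hypothetical here).
    The 0.5 threshold is an illustrative choice, not a clinical recommendation.
    """
    if not 0.0 <= p_embolism <= 1.0:
        raise ValueError("p_embolism must be a probability between 0 and 1")
    has_embolism = p_embolism >= threshold
    # Confidence is the probability assigned to whichever label was chosen.
    confidence = p_embolism if has_embolism else 1.0 - p_embolism
    return has_embolism, confidence


label, confidence = predict_with_confidence(0.8)
print(f"Embolism predicted: {label}, confidence: {confidence:.0%}")
```

A system that returns only the label hides the difference between a prediction made with 55% confidence and one made with 99% confidence, and that difference is what a nurse such as Thuso would need to see.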

Of course, this additional confidence or risk quantification requires further computational and, thus, financial resources. This article argues that failing to measure this risk carefully (the 80% confidence level in the example above) places society in a risky position, exposing it to unanticipated repercussions that may erode confidence and the ethical underpinnings that ought to govern AI.

AI prediction risk

Measuring AI prediction risk is paramount in augmenting AI systems’ transparency. Providing a coherent structure that enables stakeholders to understand predictive models’ constraints and possible modes of failure enhances their capacity to make well-informed decisions.

End users and those impacted by AI-driven choices, in addition to developers and administrators of AI systems, must have transparent access to AI prediction risk information. Such access promotes a climate of responsibility in which AI system developers are incentivized to comply with elevated benchmarks of dependability and security.

Furthermore, measuring prediction performance risk is crucial for establishing and sustaining public confidence in AI technologies. The foundation for the extensive adoption and acceptance of AI is trust. People are more likely to adopt AI solutions when they comprehend the associated risks and know that safety and risk management protocols are in effect.

Conversely, insufficient AI prediction risk quantification and communication may result in adverse public reactions, regulatory repercussions, and a hindrance to productive advancements.

Technical and social requirement

Measuring the risk associated with AI prediction performance is not only a technical requirement but also a social one. AI prediction programs are prone to failure. The consequences can range from moderate to severe, depending on the situation.

For example, the failure of AI-powered financial algorithms might significantly upset the market, while the imprecision of predictive policing models can exacerbate social inequality.

Measuring risk is critical for understanding, mitigating and communicating the possibility of such failures, thereby protecting against their most serious consequences. The quantification of AI prediction risk takes us to the exciting world of Reverend Thomas Bayes.

Thomas Bayes was an English Presbyterian minister, philosopher and statistician born around 1701. His most renowned contribution outside theology is the development of Bayes’ Theorem, which describes how the probability of an occurrence is updated by combining prior knowledge with evidence of potentially associated conditions. Bayes’ contribution, which remains seminal in statistics, was not published during his lifetime.

Following his death, Richard Price, an acquaintance of Bayes, published it on his behalf. 

Bayes’ work has emerged as an essential tool for measuring the risk of AI predictions. So, how does Bayes’ Theorem operate to quantify AI prediction risk?

Robust mechanism

With its probabilistic underpinnings, the Bayesian framework provides a robust mechanism for incorporating prior information and evidence into the AI prediction procedure, thus yielding a measure of AI prediction risk. This Bayesian procedure has been applied successfully to many vital areas.
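How prior information and evidence combine can be illustrated with a small worked example of Bayes’ Theorem, P(H | E) = P(E | H) P(H) / P(E). The sketch below is not the method used in any of the work cited in this article. The function name bayes_posterior is hypothetical, and the prior, sensitivity and specificity values are invented purely for illustration and carry no clinical meaning.

```python
def bayes_posterior(prior: float, p_evidence_if_true: float,
                    p_evidence_if_false: float) -> float:
    """Return P(hypothesis | evidence) using Bayes' Theorem.

    prior               : P(hypothesis) before seeing the evidence
    p_evidence_if_true  : P(evidence | hypothesis is true)
    p_evidence_if_false : P(evidence | hypothesis is false)
    """
    # Total probability of observing the evidence under both possibilities.
    p_evidence = (p_evidence_if_true * prior
                  + p_evidence_if_false * (1.0 - prior))
    return p_evidence_if_true * prior / p_evidence


# Hypothetical numbers, chosen only to show the mechanics.
prior_embolism = 0.20   # assumed prevalence among emergency referrals
sensitivity = 0.90      # assumed P(AI says "embolism" | embolism)
specificity = 0.80      # assumed P(AI says "no embolism" | no embolism)

# Evidence: the AI system says "no embolism".
# P(that output | embolism) is 1 - sensitivity; P(that output | none) is specificity.
residual_risk = bayes_posterior(prior_embolism,
                                1.0 - sensitivity,
                                specificity)
print(f"P(embolism | AI says 'no embolism') = {residual_risk:.3f}")
```

Under these assumed numbers, a "no embolism" output still leaves a residual probability of about 3% that the patient has an embolism. That residual figure is the kind of prediction risk this article argues should be measured and reported.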

One example is my 2001 work applying AI systems based on Bayes’ work to aircraft structures. Another, by Chantelle Gray, examines how Bayes’ work is used to build algorithms that shape our politics.

Although the Bayesian method presents notable benefits in terms of adaptability, accuracy and uncertainty management, it is crucial to consider the substantial investments in computational and financial resources necessary to implement and maintain these approaches successfully.

However, methods have been developed to reduce this computational load. For example, in 2016, Ilyes Boulkaibet, Sondipon Adhikari and I developed robust methods for reducing the computational cost of the Bayesian AI prediction risk quantification procedure.

Furthermore, Tsakane Mongwe, Rendani Mbuvha and I, in our 2023 book, developed a Bayesian risk quantification method for machine learning. Given the viability of AI prediction risk quantification, what are the governance, regulatory and policy implications?

A concerted effort from all parties involved in the development, deployment and governance of AI systems is required to maximize the benefits of AI while mitigating its risks. It is imperative that policymakers champion and enact regulations mandating AI prediction risk quantification.

AI developers and organizations must incorporate risk quantification into their development lifecycle as a fundamental component of ethical AI development, rather than treating it as an afterthought.

End users and the public should be engaged in a transparent dialogue regarding AI prediction risks, ensuring that the design and deployment of AI systems reflect societal values and ethical considerations.

Tinyiko’s hospital visit makes it evident that measuring AI prediction risk is not only advantageous but imperative in the health-care industry. The consequences of health-care decisions on patients are substantial; therefore, it is vital to comprehend the reliability and constraints of AI-powered predictions.

Severe consequences of errors

The consequences of errors in health-care applications are exceedingly severe, given that they may result in life-threatening misdiagnoses, inappropriate treatments or lost opportunities for early intervention.

By measuring the risk associated with AI predictions, health-care personnel can weigh the inherent uncertainties of AI-driven insights against the benefits they provide and so make informed decisions. This methodology facilitates a sophisticated approach to patient care by integrating AI recommendations with clinical expertise in a transparent, accountable, and patient-centric manner.

Moreover, from a regulatory standpoint, it is critical to quantify prediction risk to verify that AI systems satisfy rigorous safety and effectiveness criteria before implementation in essential health-care settings. In the era of AI, this meticulous risk assessment is vital to preserving patient confidence and adhering to the ethical standards of medical practice.

To conclude, measuring the performance risk associated with AI predictions, even though it adds cost, is not merely a technical obstacle but also a social and moral imperative.

Our collective endeavours for safety, fairness and success will be determined by our capacity to quantify and manage the risks associated with these powerful technologies as we approach a future that AI increasingly influences.

Measuring AI prediction risk must become mandatory for all critical applications such as health care. 

This article was first published by Daily Maverick.

Suggested citation: Marwala Tshilidzi. "Risks of AI Prediction Performance Should Be Measured, Especially in Critical Areas like Health Care," United Nations University, UNU Centre, 2024-02-26,