Can Governance Catch Up to Data Science?

Data science teams often don't understand the organization's risk frameworks, and insurance leaders have too little experience with analytics.


Data science techniques are being developed and adopted faster than insurers can build their own understanding of the risk governance and ethics those techniques require.

To make matters more challenging, two distinct groups operate on the front line of data science within most insurers, often in conflict rather than in harmony: data science teams that use cutting-edge techniques without the necessary understanding of their organization’s risk frameworks, and insurance leaders who have limited experience with the latest advanced analytics. This internal disconnect leaves insurers, and the individuals who work for them, exposed to risk.

Balancing governance and control against the continued adoption of data science, and the value that it creates, has become the middle ground on which insurers have set their sights.


Bias

As increasingly complex models are used, a key risk for insurers to consider is bias -- an issue that many firms have so far not fully understood or appreciated. When individuals or groups are differentiated from others based on particular characteristics, insurers need to understand why. Is the bias due to the data collected not representing the entire population? Is it caused by potentially flawed human decision-making that is reflected in the data collected? Or was the bias introduced by the artificial intelligence (AI) and machine learning models trained on the data? Is the model form itself responsible for reinforcing existing bias or even creating new biases?

The ability to detect hidden biases is essential to enabling appropriate strategies to measure, monitor and manage bias. Instead of thinking about bias at every stage of the model building process -- when an insurer first explores its data, when it builds a model and when model outputs are used in a business decision -- data scientists too often consider the risks as an afterthought.
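As one illustration of what "measuring" bias can look like in practice, the sketch below compares the rate of favorable outcomes across groups and reports the ratio of the lowest rate to the highest. It is a minimal example under stated assumptions, not a recommended governance standard: the group labels, the sample decisions and the function names are all hypothetical, and this ratio is only one of many possible fairness measures.

```python
from collections import defaultdict


def favorable_rates(records):
    """Return the share of favorable outcomes for each group.

    `records` is an iterable of (group, outcome) pairs, where outcome
    is True for a favorable decision (hypothetical data layout).
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


# Hypothetical decisions for two illustrative groups.
decisions = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = favorable_rates(decisions)
ratio = rate_ratio(rates)
print(rates, round(ratio, 3))
```

The same check could be run at each of the three stages named above (on the raw data, on model predictions and on final business decisions), which would make it easier to see where a gap first appears.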

Choosing the right algorithm, one that helps an insurer find the optimum balance among interpretability, transparency and predictive power, is another essential capability. A number of custom algorithms are being developed in the market. For example, layered gradient boosting machines (LGBM) are intended to capture the same predictive accuracy as a GBM, while providing a much greater level of transparency and interpretability.

Open source risk

In recent years, open source adoption has seen unprecedented growth. While open source allows incredible flexibility and innovation, it also exposes an insurer to more risk, particularly relating to governance and security. Besides the potential for malicious code hiding in open source packages, key person dependency is another risk created by having either just one individual or a small team responsible for building and maintaining code. 
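One simple control against tampered packages is to record a cryptographic hash of each artifact when it is reviewed and to refuse anything that does not match later. The sketch below shows the idea with Python's standard library only; the artifact contents and function names are hypothetical, and a real process would typically rely on the hash-pinning features of the package manager in use rather than on hand-written checks.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_hash(data: bytes, pinned_hex: str) -> bool:
    """Check an artifact against the hash recorded when it was reviewed."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sha256_of(data), pinned_hex.lower())


# Hypothetical artifact; in practice these bytes would be the package file.
artifact = b"example package contents"
pinned = sha256_of(artifact)  # recorded once, at review time

tampered = artifact + b" plus injected code"

print(matches_pinned_hash(artifact, pinned))   # reviewed artifact
print(matches_pinned_hash(tampered, pinned))   # modified artifact
```

A check like this addresses only integrity after review; it does nothing about key person dependency, which is an organizational rather than a technical risk.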

Large language models (LLMs), such as ChatGPT, are examples of technology evolving and being adopted in a hurry. However, governance, risk and control frameworks have not kept pace, creating significant risks relating to data privacy and intellectual property.

By using LLMs, an insurer could lose sensitive and proprietary data. It may have little or no control over how that data is used later, including by competitors.

Another risk is hallucination, the tendency of LLMs to produce text that appears to be correct but is actually false. This could be driven by bad prompts or by an underlying weakness in the model, which delivers results that are wrong but presented with a lot of certainty. The reputational risk for an insurer is high if the data or model is used improperly.


Taking control

At the end of the day, the stability of open source code is in insurers’ own hands. They alone are responsible for making sure it meets their business needs. It is therefore important that an insurer clearly delineates roles and responsibilities to avoid confusion. Defining who makes which decision ensures that accountability, visibility and the opportunity to challenge decisions are in place at every level.

Open source offers real potential to contribute to a more efficient and innovative insurance market. However, insurers must first address two critical decisions: where they should use open source to gain an advantage, and then how best to integrate it so that good governance and control are in place, creating an optimal balance.

Data science is spreading quickly. If insurers want to compete in this new AI-driven world, they need not only to adopt data science but also to do it in the right way. This means a gradual evolution of governance to ensure that the right oversight, alignment with internal values and regulatory compliance are achieved, combined with an evolving risk management framework to anticipate and mitigate future risks.
