The Importance of 'Self-Organizing Maps'

Three converging trends—private machine learning systems, vendor large language models, and market expansion—push reinsurers toward integration over replacement.

Clear trends in machine learning and artificial intelligence are converging in a growing (re)insurance industry, and this convergence needs attention and reconciliation. For three decades, specialists in insurance and finance have built machine learning systems to solve complex problems. A prominent example is the wide use of genetic algorithms for portfolio optimization. Less well known is the application of self-organizing maps (SOM). A SOM can consume unstructured, multi-dimensional data and classify and order it by properties derived from the key attributes of these large stores of information. In doing so, it reduces dimensionality, imposes order, and learns in the process. SOMs are neural networks capable of unsupervised learning and correction. The Finnish computer scientist and mathematician Teuvo Kohonen introduced the algorithm in the early 1980s and refined it over the following decades. This proliferation of private machine learning systems is our first, well-established trend.

Enter large language models (LLM), built and delivered by Big Tech vendors. These have the capabilities of neural networks and genetic algorithms. However, the advantages of proprietary machine learning systems, trained and refined over time, are manifold. Above all, firms have tested and proven these internal systems over the years, and by now they require minimal supervision from practitioners. Users have ironed out errors and polished performance through countless hours of training and production use. Second, and more significantly, these systems encode the firm's topology of risk factors. This is the core business model and philosophy, which the firm protects keenly. Hence the way for private machine learning systems and vendor LLM to coexist is integration. This is our second, newly exposed trend.

Last but not least, we have the expansion of the reinsurance business into developing and growing markets and regions. This is our third and well-recognized trend. We will take a case in point with the oldest reinsurance contract, the quota share of catastrophe loss.

Figure 1: A 33% quota share treaty applied 'from the first dollar' to the insurer's distribution of gross loss, ceding 33% of the loss to the reinsurer and leaving 67% retained net by the insurer.
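The split in Figure 1 can be illustrated with a minimal sketch. The function name `quota_share_split` and the sample gross losses are hypothetical and chosen only for illustration; the 33% cession rate is the one quoted in the figure.

```python
def quota_share_split(gross_losses, cession_rate=0.33):
    """Split each gross loss proportionally, from the first dollar."""
    if not 0.0 <= cession_rate <= 1.0:
        raise ValueError("cession_rate must be between 0 and 1")
    # The reinsurer takes a fixed share of every loss, however small.
    ceded = [cession_rate * loss for loss in gross_losses]
    # The insurer retains the remainder of each loss.
    retained = [loss - c for loss, c in zip(gross_losses, ceded)]
    return ceded, retained


# Hypothetical gross losses, e.g. in millions
gross = [10.0, 50.0, 200.0]
ceded, retained = quota_share_split(gross)
for g, c, r in zip(gross, ceded, retained):
    print(f"gross={g:8.2f}  ceded={c:7.2f}  retained={r:8.2f}")
```

Because there is no attachment point or limit, the same ratio applies to every loss in the distribution, which is what makes the contract linear.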

The treaty is a fitting instrument for minimizing earnings volatility while supporting ambitious market share targets. This has been the case since Venetian and Genoese bankers reinsured Mediterranean and Black Sea trade. Volatility matters particularly in a growing market, where underwriting targets must keep up with fast expansion and a healthy degree of uncertainty. Reinsurance has also always been an information business. The quality of exposure-at-risk estimation, the process of quantifying how much risk a cedant carries and how much of it a reinsurer assumes, determines the accuracy of pricing, reserving, and capital allocation. A large share of the consequential information that drives exposure lives in unstructured formats: government circulars, regulatory filings, rating agency reviews, accounting standards, broker advisories, and, increasingly, satellite-derived physical damage assessments.

Two pertinent cases in point are Malaysia and Indonesia, with 5 to 8 percent annual growth in gross written premium. This makes the region a dynamic and demanding marketplace.

Figure 2: Impact of flooding in Malaysia and Indonesia during November 2025. Exhibit produced by ReliefWeb.

Reinsurance cycles move quickly. In an environment of sparse data and limited historical experience, simplicity of structure and instantaneous transparency of pricing techniques become an advantage. Quota share is the reinsurance contract with the lowest operational cost and clearest, most stable and most recognizable price. It reduces volatility across the entire book and the entire risk tail of the business. Under the linear and proportional premium-making and loss-ceding rules of the treaty, reducing uncertainty and error in underlying exposure directly and surely reduces uncertainty in loss outcomes and in earnings volatility.
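The proportional pass-through described above can be written out in a few lines. This is a sketch under stated assumptions: q denotes the cession rate (0.33 in the example treaty) and L the gross loss; the last line additionally assumes, purely for illustration, that gross loss is exposure E times a fixed damage ratio d.

```latex
\begin{align}
L^{\mathrm{ced}} &= q\,L, \qquad L^{\mathrm{ret}} = (1-q)\,L \\
\operatorname{Var}\bigl(L^{\mathrm{ced}}\bigr) &= q^{2}\,\operatorname{Var}(L), \qquad
\operatorname{Var}\bigl(L^{\mathrm{ret}}\bigr) = (1-q)^{2}\,\operatorname{Var}(L) \\
\delta L^{\mathrm{ced}} &= q\,d\,\delta E \quad \text{when } L = d\,E \text{ with } d \text{ fixed}
\end{align}
```

Under these assumptions, an error in exposure moves ceded loss by a fixed proportion, with no attachment point or limit to amplify or mask it.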

The self-organizing map is a tried and tested data processing algorithm that can streamline the validation of the exposure an insurer cedes to its partner, the reinsurer. It reduces multi-dimensional data to two-dimensional surfaces according to pre-selected rules and features, while learning, training, and self-correcting. This makes it well suited to ingesting substantial volumes of exposure and premium records, historical losses and claims, rates, and indices, in sync with qualitative data and narrative from brokers, government, and accounting agencies. A SOM can connect directly to exposure databases and data lakes. Its self-organizing and self-learning layers process the ingested data to create an exposure map of linear variables of business interest.

Figure 3: SOM consumes unstructured, multi-dimensional data. As a neural map, it creates neurons and assigns their properties. It then reduces dimensionality and structures and organizes the data.
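The mechanics pictured in Figure 3 can be sketched in plain Python. This is a generic, textbook-style SOM, not any firm's proprietary system: the function names, grid size, decay schedules, and the synthetic three-attribute records are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights,
               key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows=4, cols=4, epochs=200, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights keyed by (row, col)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighborhood measured on the 2-D grid, not in data space
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Synthetic, normalized records standing in for exposure attributes
# (hypothetical data: two well-separated groups of risks).
rng = random.Random(1)
cluster_a = [[0.2 + rng.uniform(-0.05, 0.05) for _ in range(3)] for _ in range(20)]
cluster_b = [[0.8 + rng.uniform(-0.05, 0.05) for _ in range(3)] for _ in range(20)]
som = train_som(cluster_a + cluster_b)

bmu_a = best_matching_unit(som, [0.2, 0.2, 0.2])
bmu_b = best_matching_unit(som, [0.8, 0.8, 0.8])
print("node for group A:", bmu_a, "node for group B:", bmu_b)
```

After training, records from the two groups land on different nodes of the 4-by-4 grid, so three-dimensional records have been ordered on a two-dimensional surface without any labels.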

In our case, the aim is to vet, correct, and fill in sparse data on exposure variables of key business concern, such as insurable values, deductible amounts, and the spatial coordinates of risks. The SOM output is then overlaid on the insurer's ceded exposure, and the procedure itself effectively performs validation, correction, and self-adjustment. As a result, the integrated system reduces uncertainty and error. Because the quota share contract is proportional, this reduction carries through directly to uncertainty and volatility in earnings.

There are various integration concepts capable of reconciling the three trends described so far. We conclude by describing one such concept, a three-layered system, in light, non-technical terms.

[Table: layer, component, and function for each of the three layers]

A clear understanding of the architecture allows one to partition tasks, components, and layers across this multi-dimensional system.

In Layer One, a vendor LLM consumes unstructured, multi-dimensional numerical and qualitative data. Its task is to find and define the modification and feature vectors D from the unstructured data. In this division of labor, the LLM works out the Best Matching Unit [see BMU below] mapping to the neural nodes and layers. Alpha is the trust-level scalar assigned to every document and piece of unstructured data from which the LLM defines the feature vectors. In my view, this is best assigned by the modeler and practitioner, so that the human user keeps control of a critical risk control and mitigation variable.

In Layer Two, the proprietary SOM developed in-house retains control of the definition and mapping of risk factors and all business variables. It is essential that the business owner keep the SOM grid proprietary. It embodies the practitioner's market-making guideline and philosophy of risk topology; it is the business model, and we do not want to outsource it to an LLM. The core equation of the proprietary SOM, which modifies our exposure variable of interest, can be expressed in terms of (re)insurance practice.

[Equation and table: terms of the core SOM equation and their context in (re)insurance]
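For orientation, the standard textbook form of the Kohonen update rule is sketched below. It is the generic algorithm and not necessarily the firm-specific equation referred to above. The symbols are the conventional ones and are assumptions here: w_i is the weight vector of node i, x the input vector, c the index of the Best Matching Unit, η the learning rate, σ the neighborhood radius, and r_i the position of node i on the grid. How the trust scalar Alpha and the vectors D enter is left open.

```latex
\begin{align}
c &= \arg\min_{i}\,\bigl\lVert x(t) - w_i(t) \bigr\rVert \\
w_i(t+1) &= w_i(t) + \eta(t)\,h_{c,i}(t)\,\bigl[x(t) - w_i(t)\bigr] \\
h_{c,i}(t) &= \exp\!\left(-\frac{\lVert r_c - r_i \rVert^{2}}{2\,\sigma(t)^{2}}\right)
\end{align}
```

The first line selects the node closest to the input, and the second pulls that node and its grid neighbors toward the input, with the pull weakening with grid distance according to the third line.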

In Layer Three, the process becomes machine learning through a feedback loop governed by a feedback learning rate and a retrospective learning function. The latter distributes corrections to correlated neural nodes and layers.

[Equation: Layer Three feedback update]

This is as far as we will go into the mechanics and mathematics of SOM algorithms, on which there is a large and thriving literature. For our purposes, it is sufficient to distribute the main tasks and components of the concept system.

Lastly, the modified variable of interest propagates to a catastrophe modeling system, such as Verisk Synergy Studio, where it enters the (re)insurance loss module, which estimates the reinsurer's treaty loss and the insurer's retained loss.

[Equation: reinsurer treaty loss and insurer retained loss]

With this integration concept, we preserve the business intelligence developed and refined within the firm in the form of a private machine learning system, and we couple it with the new power and capabilities of large language models.


Ivelin M. Zvezdov

Ivelin Zvezdov is a financial economist by training with experience in quantitative analysis and risk management for (re)insurance and natural catastrophe modeling, and in fixed income and commodities trading. Since 2013 he has led the product development effort for AIR Worldwide's next-generation modeling platform.
