Achieving a 'Logical Data Fabric'

A logical data fabric has the capacity to knit together disparate data sources in insurers' broad, hybrid universe of data platforms.

Saptarshi Sengupta

May 12, 2021

Time-consuming deals or claims-related interactions with agents are getting replaced by self-service insurance portals and sometimes even by bots. The growth of IoT, artificial intelligence (AI) and machine learning (ML) technology and the prevalence of sensors in wearables, cars, houses, agriculture, transportation and other areas are making risk profiling and precautionary measures much better and faster.

However, the sharing economy brought about by Uber, Airbnb, etc. is making insurance tricky. The pandemic is also forcing insurance companies to evolve to survive the current climate and prepare for an uncertain future.

For many, part of this development has been adopting new technologies and digitizing services.

Insurance companies must rely heavily on their data to embrace these new trends. Unfortunately, many depend on older enterprise data architectures composed of legacy tools and methodologies. Business stakeholders need immediate information for real-time decisions, but this is just not possible when data is scattered across multiple data sources. Relying on rigid technologies such as ETL (extract, transform, load) makes it almost impossible for insurers to make real-time decisions on a claim or engage in predictive analytics with the most current data to underwrite the right insurance product for the right client.

These legacy technologies are resource-intensive, time-consuming and costly. ETL processes deliver data in scheduled batches, meaning there is always lag, which forces business users to wait for the data to be delivered. Depending on the configuration and schedule, batches can be delivered very quickly but never on an instantaneous, as-needed basis. In fact, many ETL processes are still done overnight.
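The lag that a batch schedule introduces can be put into numbers. The following is a minimal sketch, assuming a single nightly batch at 2 a.m.; the function names and the schedule are illustrative assumptions, not taken from any particular ETL tool.

```python
from datetime import datetime, timedelta


def last_batch_time(now: datetime, batch_hour: int = 2) -> datetime:
    """Return the most recent scheduled batch run at or before `now`.

    Assumes one batch per day at `batch_hour` (a hypothetical schedule).
    """
    run = now.replace(hour=batch_hour, minute=0, second=0, microsecond=0)
    if run > now:
        # Today's batch has not run yet, so the freshest data is yesterday's.
        run -= timedelta(days=1)
    return run


def staleness(now: datetime, batch_hour: int = 2) -> timedelta:
    """How old the newest batch-delivered data is at time `now`."""
    return now - last_batch_time(now, batch_hour)


# A claims handler looking at the warehouse at 4:30 p.m.
print(staleness(datetime(2021, 5, 12, 16, 30)))  # 14:30:00
```

Under this assumed schedule, anything that happened after 2 a.m. is invisible to the business user until the next run, no matter how quickly the batch itself executes.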

This leaves insurers with no choice but to initiate complex, expensive and time-consuming engagements with IT just to answer basic questions. On top of that, M&A and other forms of corporate restructuring are constant in the insurance industry, and legacy data architecture poses a huge threat to post-merger data architecture consolidation. As in other industries, cloud adoption and data lake implementation are becoming more prevalent, yet these cloud-first initiatives, application modernization projects and big data analytics are either fraught with downtime and implementation challenges or, in the best-case scenarios, only partially successful.

Data Fabric to Logical Data Fabric: The Modern Way to Keep Businesses Covered

With the volume, variety and velocity of data today, users need a unified view of all the data available to them in near real time. Insurers are looking to capture the ever-changing data from streaming, data lakes and other newer data sources or data repositories and take advantage of all the data types available.

Technologists have attempted to meet the needs of their organizations in many ways. First, they used larger and larger databases. Then they set up data warehouses. Most recently, they have turned to data lakes, cloud repositories and big data implementations. Unfortunately, these latest solutions have only compounded the problem, as different sources of data are still stored in functional silos, separate from other sources of data. Even data lakes continue to contain multiple data silos, which many business users and analysts only realize when they attempt to run a single query across the entire data lake.

To manage the complexity of today’s environments, companies are adopting newer architectural approaches such as data fabric to augment and automate data management. This modern data management approach streamlines data discovery, access and governance by automating much of the labor that would normally be performed at multiple individual junctures using older methods. Forrester analyst Noel Yuhanna defined enterprise data fabric as a set of processes that automate “integration, transformation, preparation, curation, security, governance and orchestration” of data. These have traditionally been some of the most labor-intensive aspects of business intelligence, due to the highly diverse, heterogeneous nature of today’s data landscape.

Most recently, research firms began to evolve their notion of data fabric to that of a “logical data fabric.” Analysts devised this concept based on the idea that even if technology vendors were to automate key aspects of the data pipeline - or turn these processes into services - the resulting data fabric will eventually be limited by certain physical realities; namely, the need to replicate data. To ensure the business continues to gain significant efficiencies, organizations need to change the paradigm from a physical data fabric to a logical data fabric. TDWI analyst David Stodder outlined some of the features of logical data fabric, saying that it had the capacity to “knit together disparate data sources in their broad, hybrid universe of data platforms.”

Why Data Virtualization Is the Key to Stitching Together a Logical Data Fabric

Data virtualization (DV) is a data integration solution, but one that takes a completely different approach from most methods, making it a strong fit for a logical data fabric. It is an approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted at source or where it is physically located. Rather than physically moving the data to a new, consolidated location via an ETL process, data virtualization provides a real-time view of the consolidated data, leaving the source data exactly where it is. The virtualization layer contains the metadata needed to access the various sources, which makes it straightforward to implement.

DV performs many of the same transformation and data-quality-control functions as traditional data integration solutions, but it differs because it can also provide real-time data integration at a lower cost. As a result, it can either replace traditional data integration processes and their associated data marts and data warehouses, or simply augment them by extending their capabilities. Sophisticated data virtualization solutions go one step further by establishing an enterprise data-access layer that provides universal access to all of an organization’s critical data. When insurers need to obtain data, they query the data virtualization layer, which, in turn, gets the data from the applicable data sources. Because the data virtualization layer takes care of the data-access component, it shields business users from complexities such as where the data is stored or what format it is in. Depending on how a data virtualization layer is implemented, business users can ask questions and receive answers easily, because the underlying data virtualization layer handles all the complexity.
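The query path described above can be illustrated with a toy example. This is a minimal sketch, not how any commercial DV product is built: the `VirtualLayer` class, the two sample sources (an in-memory SQLite claims system and a CSV policy export) and all names and values are hypothetical. The point is that the layer stores only the instructions for reaching each source and fetches the data at query time.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational claims system.
claims_db = sqlite3.connect(":memory:")
claims_db.execute(
    "CREATE TABLE claims (claim_id TEXT, policy_id TEXT, amount INTEGER)"
)
claims_db.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [("C1", "P1", 1200), ("C2", "P1", 300), ("C3", "P2", 950)],
)

# Source 2 (hypothetical): a policy export that only exists as CSV text.
policies_csv = "policy_id,holder\nP1,Ada\nP2,Grace\n"


def fetch_claims():
    """Read claims from the relational source at call time."""
    cur = claims_db.execute(
        "SELECT policy_id, amount FROM claims ORDER BY claim_id"
    )
    return [{"policy_id": p, "amount": a} for p, a in cur]


def fetch_policies():
    """Read policies from the CSV source at call time."""
    return list(csv.DictReader(io.StringIO(policies_csv)))


class VirtualLayer:
    """Holds metadata (how to reach each source), never a copy of the data."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        self._sources[name] = fetch

    def rows(self, name):
        # Every call goes back to the source, so results are always current.
        return self._sources[name]()

    def claims_by_holder(self):
        """Combine both sources at query time and total claims per holder."""
        holders = {r["policy_id"]: r["holder"] for r in self.rows("policies")}
        totals = {}
        for claim in self.rows("claims"):
            holder = holders[claim["policy_id"]]
            totals[holder] = totals.get(holder, 0) + claim["amount"]
        return totals


layer = VirtualLayer()
layer.register("claims", fetch_claims)
layer.register("policies", fetch_policies)
print(layer.claims_by_holder())  # {'Ada': 1500, 'Grace': 950}
```

Because nothing is copied, a claim written to the source system a moment later appears in the very next call to `claims_by_holder()`, with no batch to wait for. The business user only ever sees the question and the answer, not the SQLite connection or the CSV parsing behind them.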

Additionally, modern DV solutions offer dynamic data catalogs that not only list all of an organization’s available data sources but also provide access to the data from right there in the catalog. They also leverage their unified metadata capabilities to enable stakeholders to implement data quality, data governance and security protocols across an organization’s disparate data sources, from a single point of control. That is particularly important for insurance companies, which are expected to understand and protect their customers’ personally identifiable information (PII), such as credit card, healthcare and Social Security numbers, credit scores and banking information. That helps organizations comply with regulations such as GDPR, CCPA and the U.K. Data Protection Act, to name a few. Finally, some of the best DV solutions offer premium features such as query acceleration through aggregation awareness, AI/ML-driven dataset recommendation and auto-scaling architecture in the cloud.
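To make the "single point of control" idea concrete, here is a minimal sketch of a masking rule applied at the access layer rather than in each source system. The column names, role names and masking rule are assumptions chosen for illustration; real products express such policies in their own configuration.

```python
# Columns treated as PII in this sketch (assumed names).
PII_COLUMNS = {"ssn", "card_number"}

# One central policy table: which roles may see PII unmasked (assumed roles).
ROLE_CAN_SEE_PII = {"claims_adjuster": True, "analyst": False}


def mask(value: str) -> str:
    """Hide everything except the last four characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def apply_policy(row: dict, role: str) -> dict:
    """Return the row as the given role is allowed to see it.

    Unknown roles are treated as not cleared for PII (default deny).
    """
    if ROLE_CAN_SEE_PII.get(role, False):
        return dict(row)
    return {
        col: mask(str(val)) if col in PII_COLUMNS else val
        for col, val in row.items()
    }


row = {"holder": "Ada", "ssn": "123-45-6789", "premium": 840}
print(apply_policy(row, "analyst"))
# {'holder': 'Ada', 'ssn': '*******6789', 'premium': 840}
```

Because every request passes through the same function, changing who may see PII means editing one policy table instead of reconfiguring each underlying database.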

Supported by data virtualization, a logical data fabric can enable a wide range of benefits, including:

Real-time data integration across disparate systems. Logical data fabric enables real-time access to data across vastly different kinds of sources, including cloud and on-premises systems; streaming and historical data systems; legacy and modern systems; structured, semi-structured and completely unstructured sources; and cloud systems provided by different vendors. It can handle flat files, social media feeds, IoT data and more.
Enterprise data catalogs. Logical data fabric enables comprehensive data catalogs across the entire enterprise and provides seamless access to the data itself. Business users can use the catalog to understand what data is available and any conditions under which it can be used. They can also capture the full lineage of any dataset as well as all applicable associations.
Seamless governance and security. Because data in a logical data fabric is accessed through a unified data access layer, organizations can easily control who is allowed to view or edit which data.
Powerful AI/ML capabilities. With a unified, logical framework, a logical data fabric enables AI/ML capabilities at a variety of different points within the offered solution, including query optimization, data delivery and automated recommendations within the data catalog.
Simplified maintenance. Logical data fabric protects users and administrators from the complexities of accessing each individual source and operates with each data source “as it is.” Unlike ETL scripts -- which need to be re-written, re-tested and re-deployed whenever a source is removed or changed -- logical data fabric accommodates these changes, greatly simplifying the overall administrative burden.
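The maintenance benefit in the last item can be sketched in a few lines. The example below assumes a source system is replaced and its column names change; the `make_view` helper, both source functions and all field names are hypothetical. Only the mapping held in the logical layer changes, while the consuming report stays as it is.

```python
def make_view(fetch, column_map):
    """Build a logical view that exposes canonical column names.

    `column_map` maps canonical name -> the source's own column name.
    """
    def view():
        return [
            {canon: row[src] for canon, src in column_map.items()}
            for row in fetch()
        ]
    return view


def legacy_source():
    # Hypothetical mainframe-style export.
    return [{"POL_NO": "P1", "PREM_AMT": 840}, {"POL_NO": "P2", "PREM_AMT": 610}]


def modern_source():
    # Hypothetical replacement system with different field names.
    return [
        {"policyId": "P1", "annualPremium": 840},
        {"policyId": "P2", "annualPremium": 610},
    ]


def total_premium(view):
    """A consumer written once against the canonical names."""
    return sum(row["premium"] for row in view())


before = make_view(legacy_source, {"policy_id": "POL_NO", "premium": "PREM_AMT"})
after = make_view(modern_source, {"policy_id": "policyId", "premium": "annualPremium"})

print(total_premium(before), total_premium(after))  # 1450 1450
```

In an ETL setup the equivalent change would typically ripple through extraction scripts and downstream tables; here it is confined to one mapping.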

The data landscape is only going to get more complex for insurers, and business users will only want broader, faster access to all of the data available without unnecessary risk or liability. Implementing a logical data fabric built on data virtualization has proven to deliver information immediately to meet business demands and does so without the cost of a monolithic hardware upgrade. More importantly, it promises to reduce the total costs of the data infrastructure by leaving the source data exactly where it is. Put simply, the logical data fabric can act as an indemnity to data management and to leveraging your precious data assets.