In May, I wrote about The Three Phases Insurers Need for Real Big Data Value, assessing how insurance companies progress through levels of maturity as they invest in and innovate around big data. It turns out that there’s a similar evolution around how insurers consume and use feeds from the Internet of Things, whether talking about sensor devices, wearables, drones or any other source of complex, unstructured data. The growth of IoT in the insurance space (especially with automotive telematics) is one of the major reasons insurers have needed to think beyond traditional databases. This is no surprise, as Novarica has explained previously how these emerging technologies are intertwined in their increasing adoption.
The reality on the ground is that the adoption of the Internet of Things in the insurance industry has outpaced the adoption of big data technologies like Hadoop and other NoSQL/unstructured databases. Just because an insurer hasn’t yet built up a robust internal skill set for dealing with big data doesn’t mean that those insurers won’t want to take advantage of the new information and insight available from big data sources. Despite the seeming contradiction in that statement, there are actually three different levels of IoT and big data consumption that allow insurers at various phases of technology adoption to work with these new sources.
Phase 1: Scored IoT Data Only
For certain sources of IoT/sensor data, it’s possible for insurers to bypass the bulk of the data entirely. Rather than pulling the big data into their environment, the insurer can rely on a trusted third party to do the work for it, gathering the data and then using analytics and predictive models to reduce the data to a score. One example in use now is third-party companies that gather telematics data for drivers and generate a “driver score” that assesses a driver’s behavior and ability relative to others. On the insurer’s end, only this high-level score is stored and associated with a policyholder or a risk, much like how credit scores are used.
This kind of scored use of IoT data is good for top-level decision-making, executive review across the book of business or big-picture analysis of the data set. It requires having significant trust in the third-party vendor’s ability to calculate the score. Even when the insurer does trust that score, it’s never going to be as closely correlated to the insurer’s business because it’s built with general data rather than the insurer’s claims and loss history. In some cases, especially insurers with smaller books of business, this might actually be a plus, because a third party might be basing its scores on a wider set of contributory data sets. And even large insurers that have matured to later phases of IoT data consumption might still want to leverage these third-party scores as a way to validate and accentuate the kind of scoring they do internally.
One limitation is that a third party that aggregates and scores the kind of IoT data the insurer is interested in has to already exist. While this is the case for telematics, there may be other areas where that’s not the case, leaving the insurer to move to one of the next phases on its own.
Phase 2: Cleansed/Simplified IoT Data Ingestion
Just because an insurer has access to an IoT data source (whether through its own distribution of devices or by tapping into an existing sensor network) doesn’t mean the insurer has the big data capability to consume and process all of it. The good news is it’s still possible to get value out of these data sources even if that’s the case. In fact, in an earlier survey report by Novarica, while more than 60% of insurers stated that they were using some forms of big data, less than 40% of those insurers were using anything other than traditional SQL databases. How is that possible if traditional databases are not equipped to consume the flow of big data from IoT devices?
What’s happening is that these insurers are pulling the key metrics from an IoT data stream and loading it into a traditional relational database. This isn’t a new approach; insurers have been doing this for a long time with many types of data sets. For example, when we talk about weather data we’re typically not actually pulling all temperatures and condition data throughout the day in every single area, but rather simplifying it to condition and temperature high and low at a zip code (or even county) on a per-day basis. Similarly, an insurer can install telematics devices in vehicles and only capture a slice of the data (e.g. top speed, number of hard breaks, number of hard accelerations—rather than every minor movement), or filter only a few key metrics from a wearable device (e.g. number of steps per day rather than full GPS data).
This kind of reduced data set limits the full set of analysis possible, but it does provide some benefits, too. It allows human querying and visualization without special tools, as well as a simpler overlay onto existing normalized records in a traditional data warehouse. Plus, and perhaps more importantly, it doesn’t require an insurer to have big data expertise inside its organization to start getting some value from the Internet of Things. In fact, in some cases the client may feel more comfortable knowing that only a subset of the personal data is being stored.
Phase 3: Full IoT Data Ingestion
Once an insurer has a robust big data technology expertise in house, or has brought in a consultant to provide this expertise, it’s possible to capture the entire range of data being generated by IoT sensors. This means gathering the full set of sensor data, loading it into Hadoop or another unstructured database and layering it with existing loss history and policy data. This data is then available for machine-driven correlation and analysis, identifying insights that would not have been available or expected with the more limited data sets of the previous phases. In addition, this kind of data is now available for future insight as more and more data sets are layered into the big data environment. For the most part, this kind of complete sensor data set is too deep for humans to use directly, and it will require tools to do initial analysis and visualization such that what the insurer ends up working with makes sense.
As insurers embrace artificial intelligence solutions, having a lot of data to underpin machine learning and deep learning systems will be key to their success. An AI approach will be a particularly good way of getting value out of IoT data. Insurers working only in Phase 1 or Phase 2 of the IoT maturity scale will not be building the history of data in this fashion. Consuming the full set of IoT data in a big data environment now will establish a future basis for AI insight, even if there is a limited insight capability to start.
Different Phases Provide Different Value
These three IoT phases are not necessarily linear. Many insurers will choose to work with IoT data using all three approaches simultaneously, due to the different values they bring. An insurer that is fully leveraging Hadoop might still want to overlay some cleansed/simplified IoT data into its existing data warehouse, and may also want to take advantage of third-party scores as a way of validating its own complete scoring. Insurers need to not only develop the skill set to deal with IoT data, but also the use cases for how they want it to affect their business. As is the case with all data projects, if it doesn’t affect concrete decision-making and business direction, then the value will not be clear to the stakeholders.