The Wild West of Data
Insurers are moving from static data sources to real-time, dynamic information, but privacy concerns have emerged as an imposing obstacle.
Photo Illustration: Andrew Crespo
- Real-Time World: Insurers are moving from outdated, static data sources to dynamic, real-time information about the risks they are covering.
- Long Overdue: Personal lines insurers underwrite essentially the same way they did 25 years ago.
- Banning Factors?: Since January, at least a dozen states have introduced or are considering legislation restricting the use of certain rating factors.
A black box lies hidden from view in nearly every car on the road, and most Americans are unaware of it.
The smartphone in their pocket holds the necessary components to track their location and movements, spending habits and driving behavior—and often does.
Meanwhile, scores of data brokers already glean and analyze reams of their personal information from social media, retailers, search engine providers and even automakers that collect it from a variety of vehicle devices.
And the insurance industry is increasingly interested in using those data streams and similar information in its underwriting.
Insurers are searching for new and unconventional forms of data, specifically dynamic, real-time information to replace outdated, static sources. Carriers are shifting to those next-generation data sets to assess and price risk in auto, homeowners, health, life and small commercial.
Most of that information is readily available to insurers, whether consumers know it or not. But sensitive privacy and regulatory concerns accompany much of it—especially the use of consumers' digital footprints.
“Think of it as kind of a Wild West of data generation,” said Greg Donaldson, Aite Group senior P/C insurance analyst. “Nothing is completely locked down right now.
“Many of the new cars already have these data-generation devices built in. [Insurers] just need to move quickly to figure out how to get that and how to turn it into an actionable rate. They're scrambling with new forms of data to keep up with each other.”
They are exploring it as regulators continue to push the industry to evaluate risk based on behavior and conditions—especially in auto—rather than by who the applicant is. Evolving oversight and societal views on the use of gender, education level and even credit scores as rating factors are pressuring carriers to find additional insight into risk.
No matter the space, insurers are experimenting with emerging streams of big data to improve underwriting accuracy, reduce fraud, gain a competitive advantage in crowded markets and replace that growing list of lost rating factors.
The sources of new data include third-party brokers, public filings, social media and internet of things devices. And a host of insurtechs are using algorithms and analytic tools to translate information into scores, metrics and predictive models.
“You're moving from demographic-based data to actual-use data,” Donaldson said. “You're switching from static data points collected at the time of the application to dynamic data that changes all the time.
“That opens the door to much more accurate rating. It also provides an opportunity for insurance companies to move away from some of the more controversial data sources that they've used in the past.”
Many in the industry think the use of dynamic big data in underwriting is long overdue, even if it is heightening concerns among privacy advocates.
Technological innovation has reshaped society over the past two decades, but it hasn't been integrated into insurance models at nearly the same speed. Personal lines insurers underwrite essentially the same way they did 25 years ago, with the notable exception of credit scores.
“Assessing risk at a single point in time is an antiquated concept. But today that's how the entire insurance world works,” said Max Drucker, CEO of insurtech Carpe Data, which provides data and predictive scoring products for insurers. “They'll pull credit. They'll pull [motor vehicle records]. Then they'll issue the policy. And maybe—maybe—they'll do it again at renewal.
“Being able to continuously evaluate and monitor risk is the future of the industry.”
Real-time data offers the opportunity to precisely assess risk and even reduce it—think offering tips and rewards for safer driving. It also presents the chance to engage with consumers in a way that carriers have traditionally failed to do.
The practice is finally taking hold with usage-based insurance in auto, smart devices in homeowners, wearables in health and life and insurtechs offering “living” streams of data across the industry.
In fact, some analysts say the tipping point is fast approaching when active, real-time data will fully replace conventional information to inform underwriting and pricing.
“It would not surprise me at all if over the next five years they start moving from some of these older, static, data points and shift to a dynamic model,” Donaldson said. “It's always better to rate someone based on what's actually going on versus who they were at the time of the application.
“Some companies might do it even faster. There's enough playing with it now that all it's going to take is one successful run.”
But few use it now in their underwriting, possibly because insurers have to delicately navigate an evolving patchwork of regulatory oversight and shifting public opinion.
“In the U.S., it is sort of the Wild West right now,” said Mike Vogt, executive director of data, analytics and machine learning for technology consulting firm SPR. “It will be up to voters and legislators to determine what level of privacy they're willing to trade for convenience and efficiency.”
Changing Times, Rating Factors
But the drivers behind the rise of unconventional data are clear.
Regulation and competition.
In some cases, emerging data points are supplementing traditional information that offers limited insight.
The commoditization of auto insurance, for instance, has forced companies “to figure out how to keep costs under control and charge a better rate,” Donaldson said. “Therefore credit modeling comes up, and now you're starting to see all this new data.”
In other cases, emerging data is a substitute for factors such as gender or education level that some regulators have banned.
“With unconventional data sources, you're just using different methods to suss out more clearly the profile of that risk that traditional applications might not be able to determine or they can't use,” said Lucian McMahon, a senior research specialist with the Insurance Information Institute. “There's always other ways to price risk if you can't use a certain factor.”
But the list of banned factors is growing rapidly.
This year, California became the latest state to ban the use of gender as an auto rating factor, joining Hawaii, Massachusetts, Montana, Pennsylvania, North Carolina and parts of Michigan.
Other states have outlawed marital status and level of education.
The industry has warned that losing many of the tools it uses to assess and price risk will lead to reduced accuracy in underwriting and higher premiums.
But since January, at least a dozen states have introduced or are considering further legislation restricting the use of rating factors, the American Property Casualty Insurance Association said.
Connecticut, New Mexico, Texas and Virginia have considered banning gender in auto. Maryland and New Jersey are considering eliminating education and occupation as factors. Maryland is also considering banning marital status. And California announced plans to host a public meeting to consider the use of occupation and education in setting auto rates.
Meanwhile, the use of consumer credit information has moved to the front lines in the battle between insurers and consumer advocates.
The industry has found credit information—an insurance score that includes credit elements—to be a reliable predictor of personal responsibility and the likelihood a claim will be filed, analysts said. Now ubiquitous, it became a prevalent factor in auto about 20 years ago and remains a trusted consideration in homeowners, life and other segments.
However, California, Massachusetts and Hawaii prohibit the use of credit in auto insurance. And Connecticut, Indiana, Maryland, New Jersey, Oregon, Rhode Island and West Virginia have introduced bills that would ban it.
Maryland and Hawaii do not allow it as a consideration in homeowners.
In a strongly worded statement in March, the National Association of Mutual Insurance Companies (NAMIC) defended the use of credit information. Eliminating it would make “underwriting less accurate and could lead to an increase in premiums,” said Jimi Grande, NAMIC senior vice president of government affairs.
“Credit is something that is being wrestled with a lot in the regulatory and political arena,” said John Lucker, a principal with Deloitte Risk & Financial Advisory and global advanced analytics market leader. “Credit is widely used. It's been proven in numerous studies, including studies that have been done by regulatory bodies and independent researchers, that it's an excellent predictor of many things from an insurance perspective.”
However, the potential replacements pose their own concerns.
Studies have found third-party data can be inaccurate, outdated and difficult to correct, Lucker said. Sometimes it can even be misleading.
And that is a significant issue when culling data from consumers' digital footprints.
The trail of information people leave after surfing the internet includes the websites they view, the emails they send, their social media posts and any information they submit online for services.
“There's reasonable concern from the insureds' perspective around privacy with these alternative data streams and social media,” McMahon said. “I would be shocked if [internet history] data would be permissible to use just given the privacy angle.”
However, consumers' digital footprints, lifestyle choices and even the magazines they read provide insight into risk.
Many carriers already use social media in their claims operations to combat fraud. They monitor the accounts of drivers after auto accidents or those filing disability or workers' compensation claims.
Some have even experimented with social media as a supplement to rating factors. But just the mere mention of it creates a storm of media headlines.
Given those concerns, insurers continue pursuing more customer-friendly techniques to obtain data—often dangling rewards and discounts in exchange.
In health, wearables such as Apple Watches and Fitbits monitor everything from activity and diet to sleep patterns, data that can help augment underwriting and encourage healthy lifestyles. Life insurers such as John Hancock are collecting that data and rewarding healthy living with retail gift cards and premium discounts.
Telematics support usage-based insurance in auto. The devices, plugged into cars or downloaded as mobile apps, track location, speed, driving conditions, miles driven and braking and accelerating habits.
UBI is an upgrade over traditional factors, which include age, gender, ZIP code, credit information, daily commuting distance and car make and model.
“It's a static set of data. So insurance companies would love to figure out a better predictor of how this person is going to perform over time,” Aite's Donaldson said. “Your history may not be a good indicator of your future risk. That scares insurance companies.”
Carriers such as Progressive and Nationwide and insurtechs like Metromile and Root capture the behavior of the driver and price accordingly.
But some consumers have expressed concern that UBI allows carriers to track their movements. Some companies monitor location and stops through an app during the trial period to determine premiums, even when drivers are not operating a car. A few even continue to record data after the rating period.
Forty-five percent of Americans view trading their driving and location information to an auto insurer for a discount as unacceptable, according to a 2016 Pew Research Center study. Thirty-seven percent found it acceptable, while 16% said it would depend on the circumstances.
Meanwhile, discounts for using UBI amount to only 3% on average.
That combination could explain why only “5% or 7%” of noncommercial U.S. drivers have adopted usage-based insurance, said Tom Scales, head of life and health insurance at Celent.
But it still may become the standard model for auto insurance, with some carriers raising conventional policy premiums to offset UBI rewards and discounts.
The information to do it largely exists even without telematics devices. Nearly every car contains a black box—more formally known as an event data recorder—installed by the auto manufacturer that captures speed, braking and steering angles among other information.
EDRs became common more than a decade ago and record data for short snippets of time in the event of an accident. Seventeen states have passed laws limiting the use of the information EDRs capture.
But other vehicle devices such as built-in navigation systems, diagnostic platforms and radar sensors can record data. Some newer cars can even capture a driver's eye movements, the weight of the front seat passengers and whether the driver's hands are on the wheel.
Smartphones, both those connected and not connected to the car, can track other information.
“The industry has the ability now to monitor your driving in real time. And they can give you feedback in real time to improve your driving profile,” Donaldson said. “That kind of data would be invaluable in the rating process.”
In homeowners, internet of things devices such as smart detectors monitoring smoke and carbon monoxide levels, smart sensors detecting plumbing leaks, smart appliances, thermostat detectors and smart home security sensors provide emerging data.
“IoT is going to play a big role,” Donaldson said. “All of those things make the home a little bit safer from risk.”
But insurers have good reason to be cautious.
They have to maneuver through a web of oversight, including the federal Fair Credit Reporting Act, the EU General Data Protection Regulation (GDPR) and the network of state regulators.
The GDPR requires companies to develop processes to catalog any identifiable data collected about individuals. And California recently passed the Consumer Privacy Act, which will require businesses to disclose the personal information collected, its sources and who it has shared it with, upon request, starting in 2020. Consumers can also request that data be deleted or not shared.
Then there is the shadowy world of data brokers to consider.
Much of the emerging data they supply is not compliant with the FCRA, according to Deloitte's Lucker. That could expose insurers to discrimination and regulatory issues.
“Who knows how accurate or inaccurate this data is,” Lucker said. “The data ecosystem is not necessarily sourced by the company they talk to. Data Broker X might specialize in generating some segment of data, and the other data they present to the marketplace comes from licenses they have with other brokers.
“Sometimes even the data broker can't control the origin of the data they sell. So if there's an error in it, it's very difficult to fix, if not impossible.”
The sources of that information might surprise many consumers.
Nearly every company that asks customers to sign up for services and prompts them to accept its terms and conditions is sharing their data.
Internet search engines. Social media networks. Even your local grocery store through its rewards program.
“Your cell phone company. Your credit card,” Lucker said. “Every single website you go to has some type of terms of service. And no one ever reads it and decides they're not going to shop there or use that search engine.”
But the accuracy and reliability of external data can vary widely.
Social media data is “often unstructured and full of gaps, false statements and hyperbole,” according to a 2015 Verisk report.
The 2018 Experian Global Data Management Benchmark Report found that 33% of U.S. organizations believe their customer and prospect data is inaccurate.
Deloitte conducted its own limited sample survey in 2017 to test the accuracy of commercial data-broker data among 107 of its own employees. Two-thirds of respondents said their information was only zero to 50% accurate as a whole. One-third said it was zero to 25% accurate.
And the context of that data—why a consumer bought something or how they use it—is removed, potentially eliminating the applicability of the information.
Can an algorithm determine that an Apple Watch was purchased as a gift and not a commitment to fitness? Can it decipher that a social media photo of someone smoking was taken eight years ago?
Lucker shared a personal example. He ran an old-style baseball league and maintained its playing field. When it rained, he bought dozens of bags of cat litter to absorb the standing water.
“So I started getting coupons at the grocery store for cat supplies,” he said. “They must have thought I had 100 cats in the house.”
The store incorrectly concluded he was a cat owner and marketed products to him based on that assumption. Innocent as the mistake may be, consider that everything consumers buy and every service they use can be shared with data brokers and analyzed. How many conclusions and predictions from that data would be just as misleading or flatly inaccurate?
But then again everything—even sensitive personal information—has its price.
Insurers continue to devise value propositions to encourage consumers to share their valuable data for discounts or convenience, analysts say.
After all, Americans willingly share their location, movement and routines every day in exchange for free mapping technology.
“We call it Google Maps,” McMahon said. “We give Google and other mapping apps extremely sensitive information because we love the convenience of avoiding traffic.
“That sell needs to be made. But insurance is probably one of those areas where it's a hard sell, because there tends to be more of a skeptical attitude toward it.”
New Data Delivers Commercial Solutions
The clues might seem small and rather unrelated.
Yelp customer reviews. Online employee satisfaction ratings. Risk characteristics such as deep fryers or tanning booths.
But to Max Drucker, CEO of insurtech data broker Carpe Data, they are insightful—and publicly available—indicators of the risk small businesses present that an insurer might not see at first glance.
“We're using new ways to solve existing problems and questions,” he said.
The industry is hunting for new and nontraditional forms of data to inform its underwriting and pricing. And insurtechs are finding it in traditional places such as public filings, from data brokers and from alternative sources such as Yelp and the businesses' own websites.
Carpe Data, based in Santa Barbara, California, is just one insurtech vendor leveraging emerging data and analytics in the small commercial and claims spaces.
Other vendors are applying aerial photography and mapping analysis, via drones or low-flying aircraft and artificial intelligence, to help insurers price the risk of homeowners cover.
They are determining the proximity of houses to water sources, overhanging trees and historic wildfire paths. They also are studying elevations and property and neighborhood drainage under various conditions.
That information can augment traditional rating factors such as the age of the home, the age of its roof, its location and the neighborhood.
Only “a very small percentage” of insurers are using such technology and data, said Greg Donaldson, Aite Group senior P/C insurance analyst. But that will soon change. He's aware of two image analysis vendors who have large insurers as clients.
“It's technology that really just started to come into its own in the last 12 to 18 months,” Donaldson said. “With the data they're generating, it's going to create the opportunity for more accurate ratings, and it will be easy to get regulators on board with it.”
With businesses constantly evolving, carriers need new data streams to accurately assess and rate their risk, Drucker contends.
“The biggest challenge is changing the mindset and processes and helping insurers understand how they can use this new information today and outside just the policy renewal cycle,” he said.
“The objective is to use new ways like AI, computer vision and alternative data sets to solve problems and also identify the rating factors of the future. What are the things that they are not looking at today that have predictive value?”
Carpe Data collects data on its own from publicly available sources and from direct data providers.
The publicly available streams include Yelp customer reviews and company profiles; businesses' websites, Twitter accounts and Facebook pages; and online employee satisfaction ratings.
Carpe Data then uses data analytics tools to help carriers make eligibility decisions, set ratings and eliminate manual underwriting to facilitate automation, Drucker said.
It also teamed with Allstate in 2017 to apply predictive online data to reduce fraud in the insurer's claims processing.
And the firm works with disability and workers' compensation carriers to identify claimants who may not be as disabled or injured as they say. Sometimes it's a workers' comp case where the employee's 5K race time was published or his batting average from his recreation softball league was posted.
“A carrier can't be sitting there Googling people, looking stuff up,” Drucker said. “It doesn't make sense at that scale. But we're able to monitor that claimant activity over the life of that claim.”
Carpe Data also offers products that compile hundreds of small-business risk characteristics across a spectrum of industries to supplement current underwriting factors.
Does the restaurant have a deep fryer? Does the nail salon offer waxing? Tanning? Does the landscaper trim trees?
“Places that are well-run are more likely to have high visibility, to have a good reputation, to have good customer reviews,” Drucker said. “It can be a good indicator to how well that business is run, and in turn, how likely they are to have a loss.”