April 4, 2017
Big Data Can Solve Discrimination
by Alex Timm
With big data, we can better understand the causal paths between data generation and an event, leaving no need for stereotyping.
Big data has the opportunity to end discrimination.
Everyone creates data. Whether it is your bank account information, credit card transactions or cell phone usage, data exists about anyone who is participating in society and the economy.
At Root, we use data for car insurance, an industry where rating variables such as education level or occupation are used directly to price the product. For a product that is legally mandated in 50 states, the consumer’s options are limited: give up driving and likely your ability to earn a living or pay a price based on factors out of your control.
Removing unfair factors such as education and occupation from pricing leaves room for variables within an individual’s control — namely, driving habits. In this way, data can level the playing field for all consumers and provide an affordable option for good drivers whom other companies are painting with a broad brush. In the long term, everyone wins as roads become safer and driving becomes prohibitively expensive for irresponsible drivers.
This is just one example where understanding the consumer’s individual situation deeply allows for more precise — and more rational — decision making.
But we know that the opportunity of big data goes beyond the individual. For example, the practice of naively blanketing entire countries, religions or races as “dangerous” is a major topic in the news. What happens if you apply the lens of big data to this policy?
See also: Industry’s Biggest Data Blind Spot
Causal Paths vs. Assumption-Based Decisions
With the increased availability of data, we are able to better understand the causal paths between data generation and an event. The more direct the causal path, the better predictions of future events (based on data) will perform.
Imagine having something as simple as GPS location data from a suspected terrorist’s smartphone. Variables such as having frequent cell phone conversations with known terrorists or being located within five miles of the last 10 known terrorist attacks would allow us to move away from crude, unjust and discriminatory practices and toward a more just and rational future.
Ahmad Khan Rahami, who placed bombs in New York and New Jersey, was flagged in the FBI’s Guardian system two years earlier. The agency found there weren’t grounds to pursue an investigation — a failure that may have been averted if the FBI had better data capture and analysis capabilities. Rahami purchased bomb-making materials on eBay and had linked to terrorist-related videos online before his attempted attack. Dylann Roof’s activities showed similar patterns in the months leading up to his attack on the Emanuel AME Church in Charleston, SC.
The causal path between a hate-crime or terrorist attack and the actions of Dylann Roof and Ahmad Khan Rahami is much more direct than factors such as religion, race or skin color. Yet we naturally gravitate toward making blanket assumptions, particularly if we don’t understand how data provides a better, more just approach.
Today, this problem is more acute than ever. Discrimination is rampant — and the Trump administration’s ban on travel is unacceptable and unnecessary in the era of big data. For those unmoved by the moral argument, you should also know that policies like the ban are hopelessly outdated. If we don’t begin to use data to make informed, intelligent decisions, we will not only continue to see backlash from discriminatory policies, but our decision making will be systematically compromised.
The Privacy Red Herring
Of course, if data falls into the wrong hands, harm could be done. However, modern techniques for analyzing and protecting data mitigate most of this risk. In our terrorism example, there is no need for a human to ever view GPS data. Instead, this data is collected, passed to a database and assessed using a machine learning algorithm. The output of the algorithm would then direct an individual’s screening process, all without the interference of a human. In this manner, we remove biased decision making from the process and the need for a “spy” to review the data.
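The pipeline described here — raw data collected, reduced to causal features, scored by an algorithm, and routed to a screening decision without any human viewing the underlying data — can be sketched in a few lines. This is purely an illustrative toy: the feature names, weights and threshold below are hypothetical, not any real screening system.

```python
# Illustrative sketch of an automated screening pipeline. All feature
# names, weights, and thresholds are hypothetical, for exposition only.

def extract_features(record):
    """Reduce raw location/communication data to causal-path features.
    No human ever views the raw record; only derived features move on."""
    return {
        "suspect_contacts": record["suspect_calls"],
        "nearest_flagged_site_km": record["nearest_flagged_site_km"],
    }

def screening_score(features, weights):
    """Toy linear score: more contacts and closer proximity -> higher score."""
    proximity_km = max(features["nearest_flagged_site_km"], 0.1)  # avoid /0
    return (weights["contacts"] * features["suspect_contacts"]
            + weights["proximity"] / proximity_km)

def route_screening(record, weights, threshold=5.0):
    """End to end: raw data -> features -> score -> routing decision,
    with no human reviewing the data at any step."""
    score = screening_score(extract_features(record), weights)
    return "enhanced" if score >= threshold else "standard"
```

For example, a record with no calls to known suspects and no nearby flagged sites scores near zero and is routed to standard screening, while the output alone — never the GPS trail — reaches the screening officer.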
See also: Why Data Analytics Are Like Interest
This certainly presents a challenge for the U.S. intelligence community, but it is one that must be met. If used responsibly, analytics can provide insights based on controllable and causal variables. Privacy risk is no longer a valid excuse to delay the implementation of technologies that can solve these problems in a manner consistent with our values.
This world can be made a much better and safer place through data. And we don’t have to sacrifice our privacy; we can have a fair world, a safe world and a world that preserves individual liberties. Let’s not make the mistake of believing we are stuck with an outdated and unjust choice.