Machine Learning – Art or Science?

Is machine learning really bias-free? And how can we leverage this tool much more consciously than we do now?

June 6, 2017

The surge of big data and the challenge of confirmation bias have led data scientists to seek a methodological approach to uncover hidden insights. In predictive analytics, they often turn to machine learning to save the day. Machine learning seems to be an ideal candidate to handle big data using training sets. It also carries a strong scientific scent because its predictions are data-driven. But is machine learning really bias-free? And how can we leverage this tool more consciously?

Why Science:

We often hear that machine-learning algorithms learn from data and make predictions on it. As such, they are supposedly less exposed to human error and biases. We humans tend to seek confirmation of what we already think or believe, leading to confirmation bias that makes us overlook facts that contradict our theory and overemphasize ones that affirm it. In machine learning, the data is what teaches us, and what could be purer than that?

When using a rule-based algorithm or expert system, we are counting on the expert to make up the “right” rules. We cannot avoid having the expert's judgments and positions infiltrate such rules. The study of intuition would go even further and say that we want the expert’s experiences and opinions to influence these rules; they are what make him or her an expert! Either way, when working our way bottom-up from the data, using machine-learning algorithms, we seem to have bypassed this bias.

Why Art:

Facts are not science; neither is data. We invent scientific theories to give data context and explanation, and to help us distinguish causation from correlation. The apple falling on Newton’s head is a fact; gravity is the theory that explains it. But how do we come up with the theory? Is there a scientific way to predict “Eureka!” moments? We test assumptions using scientific tools, but we don’t generate assumptions that way, at least not the innovative ones that come from out-of-the-box thinking.

Art, on the other hand, draws on imaginative skill to express and create something. In behavioral analytics, it can take the form of rational or irrational human behavior. A user clicking on content is a fact; the theory that explains causation could be that the content answered a question the user was asking or that it relates to an area of interest to the user based on previous actions. The inherent ambiguity of human behavior, and even more so of its causation or motivation, gives art its honorable place in predictive analytics. Machine learning is the art of induction. Even unsupervised learning uses objective tools that were chosen, tweaked and validated by a human, based on his or her knowledge and creativity.

Schrödinger:

Another way is to think of machine learning as both an art and a science, much like Schrödinger’s cat (which is both alive and dead), the Buddhist middle way or the quantum physics that tells us light is both a wave and a particle. At least, until we measure it. You see, if we use scientific tools to measure the predictiveness of a machine-learning-based model, we subscribe to the scientific approach, giving our conclusions some sort of professional validation. Yet if we focus on measuring the underlying assumptions or the representation or evaluation method, we realize the model is only as “pure” as its creators.

In behavioral analytics, a lot rides on the interpretation of human behavior into quantifiable events. This piece of the work stems from the realm of art. When merging behavioral analytics with scientific facts, as often occurs when using medical or health research, we truly create an artistic science or a scientific art. We can never again separate the scientific nature from the behavioral nurture.

Practical Implementation

While this might be an interesting philosophical or academic discussion, the purpose here is to help with practical tools and tips. So what does this mean for people developing machine-learning-based models or relying on those models for behavioral analytics?

Invest in the methodology. Data is not enough. The theory that narrates the data is what gives it the context. The choices you make along the three stages of representation, evaluation and optimization are susceptible to bad art. So, when in need of a machine-learning model, consult with a variety of experts about choosing the best methodology for your situation before running to develop something.
Garbage in, garbage out. Machine learning is not alchemy. The model cannot turn coal into diamond. Preparing the data is often more art (or “black art”) than science, and it takes up most of the time. Keep a critical eye out for what goes into the model you are relying on, and be as transparent about it as possible if you are on the designing side. Remember that more relevant data beats smarter algorithms any day.
Data preparation is domain-specific. There is no way to fully automate data preparation (i.e., feature engineering). Some features may only add value in combination with others, creating new events. Often, these events need to make product or business sense just as much as they need to make algorithmic sense. Remember that feature design or event extraction requires a very different skill from modeling.
The key is iteration across the entire chain. You collect raw data, prepare it, learn from it and optimize, test and validate, and finally put the result to use in a product or business context. But this cycle is only the first iteration. A good algorithm often sends you back to re-collect slightly different raw data, carve it at a different angle, then model, tweak and validate it differently, and even use it differently. Your ability to foster collaboration across this chain, especially when it involves Martian modelers and Venusian marketers, is key!
Make your assumptions carefully. Archimedes said, “Give me a lever long enough and a fulcrum on which to place it and I shall move the world.” Machine learning is a lever, not magic. It relies on induction. The knowledge and creative assumptions you make going into the process determine where you stand. The science of induction will take care of the rest, provided you choose the right lever (i.e., methodology). But it’s your artistic judgment that decides on the rules of engagement.
If you can, get experimental data. Machine learning can help predict results based on a training data set. Split testing (aka A/B testing) is used for measuring causal relationships, and cohort analysis helps split and tailor solutions per segment. Combining experimental data from split testing and cohort analysis with machine learning can prove to be more efficient than sticking to one or the other. The way you choose to integrate these scientific approaches is a creative decision.
Contamination alert! Do not let the artistic process of tweaking the algorithm contaminate your scientific testing of its predictiveness. Remember to keep complete separation of training and test sets. If possible, do not expose the test set to the developers until after the algorithm is fully optimized.
The king is dead, long live the king! The model (and its underlying theory) is only valid until a better one comes along. If you don’t want to be the dead king, it is a good idea to start developing the next generation of the model the moment the previous one is released. Don’t spend your energy defending your model; spend your energy trying to replace it. The longer you fail, the stronger it becomes…
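
To make the "Contamination alert" tip concrete, here is a minimal sketch of keeping a locked test set away from the tweaking process. Everything in it is an assumption made for illustration: the synthetic data, the one-feature threshold "model" and the helper names holdout_split and accuracy are hypothetical, and a real project would substitute its own data and model.

```python
import random


def holdout_split(records, test_frac=0.2, seed=0):
    """Shuffle a copy of the records once and set aside a locked test set."""
    if not 0 < test_frac < 1:
        raise ValueError("test_frac must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def accuracy(threshold, rows):
    """Share of rows where the rule `x >= threshold` agrees with the label."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


# Made-up data: one numeric feature and a binary label that follows
# the rule x >= 0.6, with roughly 10% of the labels flipped as noise.
rng = random.Random(42)
data = []
for _ in range(1000):
    x = rng.random()
    label = (x >= 0.6) != (rng.random() < 0.1)
    data.append((x, label))

train, test = holdout_split(data, test_frac=0.2, seed=0)

# The "artistic" tweaking loop only ever looks at the training set.
candidates = [i / 20 for i in range(1, 20)]
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

# The test set is scored exactly once, after all tweaking is finished.
final_score = accuracy(best_threshold, test)
print(f"chosen threshold: {best_threshold}, held-out accuracy: {final_score:.3f}")
```

The design choice that matters is the order of operations: the split happens before any tuning, and nothing derived from the test set feeds back into the choice of threshold. If the test score were used to pick among candidates, it would stop being an honest estimate of predictiveness.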

Machine-learning algorithms are often used to help make data-driven decisions. But machine learning algorithms are not all science, especially when applied to behavioral analytics. Understanding the “artistic” side of these algorithms and its relationship with the scientific one can help make better machine-learning algorithms and more productive use of them.