One of the most popular analytics methods is the “slice and dice” — take some metrics and categorize them. This is such a popular method that people often mistake it for the analytics process itself.
This is a really good method for solving simple problems. But when the problem is a little more complex, the company and its analysts may hit a methodological dead end. For example, to identify which step in the product affects conversion, people apply the familiar slice-and-dice method, only to find that it gives them no clear answers.
With slice and dice, it’s hard to formulate a clear hypothesis, it’s unclear how to classify behavior into categories, and it’s unclear how to account for the influence of past steps. It’s also unclear how to combine different factors with each other.
As soon as you break a metric down by eight yes/no attributes, you get 2^8 = 256 splits. Good luck looking at them in Tableau and searching for “insight.” It starts to feel as though finding an answer requires a different approach, one that draws conclusions from all the splits at once.
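One way to arrive at the 256 figure is to combine eight yes/no attributes: 2^8 = 256. Here is a minimal sketch of that arithmetic; the attribute names are hypothetical and only there for illustration.

```python
from itertools import product

# Hypothetical yes/no attributes (illustrative names, not from any real dataset).
attributes = [
    "is_mobile", "is_paid", "is_new_user", "used_search",
    "saw_tutorial", "has_team", "is_trial", "came_from_ad",
]

# Each segment is one True/False combination across all attributes.
segments = list(product([False, True], repeat=len(attributes)))
print(len(segments))


def segment_count(n_attributes, levels=2):
    """Number of segments when every attribute has `levels` possible values."""
    return levels ** n_attributes


# Two more yes/no attributes already push the count past a thousand.
print(segment_count(10))
```

The count grows exponentially with the number of attributes, which is why eyeballing every split on a dashboard stops working so quickly.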
How could you break any set of metrics into 1,000 categories and find the right answers? How could you map the customer journey and find out which steps are important and which are not?
As soon as you ask such questions, you come face to face with the mother of all sciences: mathematics. Unfortunately, analysts rarely recognize that their problems can be solved with mathematics. More often, I hear talk about A/B tests, Tableau, and Python. But I hear very little about regression, clustering, and causal inference.
From A to B
There seems to be a feeling that the ultimate state-of-the-art solution is an A/B test — running millions of tests for every possible scenario, with everyone trying to find answers to their questions.
But at the same time, a powerful method for finding dependencies, the natural experiment, is gathering dust. And if you don’t have millions of observations and an A/B test machine, you’re in slice-and-dice territory, guessing at answers to your difficult questions. And even if you do have an A/B test machine, it turns out that problems still have to be solved by brute force and enumeration of hypotheses.
When you have the ability to run relatively cheap tests, that’s good. You can make mistakes along the way and learn how to be more efficient. But if you’re wondering why these methods don’t work, here’s a tip: play a game of Kerbal Space Program.
This space flight simulation game lets you create and manage your own space program: building spacecraft, flying them, and helping your Kerbals conquer space. Launch and see what happens; you will reach orbit relatively easily.
But flying to the in-game Mars will not work for you. You will find that you have a lot of hypotheses about what can be improved but, for some reason, they have little effect on the result. You find that you don’t have enough propellant in orbit to get to Mars. You increase the amount of fuel at the start. Then you find that the rocket can’t take off. You change engines, get to orbit, and find yourself low on fuel anyway.
Why? Because you do not understand the laws that connect your plans to your results. You seem to understand the direction but completely misjudge the scale. The problem is that the law is not linear, and there are many factors to take into account; you need to be familiar with celestial mechanics.
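To make “not a linear law” concrete: the ideal (Tsiolkovsky) rocket equation is a standard example, though it is only one piece of the celestial mechanics involved. In the sketch below, the exhaust velocity and masses are made-up round numbers, the dry mass is assumed to stay fixed as fuel is added, and gravity and drag are ignored.

```python
import math


def delta_v(exhaust_velocity, dry_mass, fuel_mass):
    """Ideal rocket equation: dv = v_e * ln(m_start / m_end)."""
    return exhaust_velocity * math.log((dry_mass + fuel_mass) / dry_mass)


V_E = 3000.0  # m/s, assumed exhaust velocity (illustrative only)
DRY = 10.0    # tonnes, assumed dry mass (illustrative only)

# Keep doubling the fuel and watch how slowly delta-v responds.
for fuel in (10.0, 20.0, 40.0, 80.0):
    print(f"fuel={fuel:5.0f} t  delta-v={delta_v(V_E, DRY, fuel):7.0f} m/s")
```

Under these assumptions, eight times the fuel yields only about three times the delta-v, because fuel enters the result through a logarithm. That is the "right direction, wrong scale" trap: adding fuel does help, just nowhere near as much as intuition suggests.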
In this sense, conducting a million A/B tests is the same thing. You can run the tests, but how do you find hypotheses? By instinct? By talking to 10 clients? How do you know before an A/B test whether it can be successful or not?
But what else can we do? Nowadays, data science is focused on building accurate prediction models. We seem to have forgotten that models are a method for describing observations and dependencies. Regression is exactly such a model: it explicitly gives you the significance of the factors used. We as analysts need to establish the “laws” of the firm, just as scientists need to establish the “laws of nature.”
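As a rough illustration of a regression reporting the significance of its factors, here is a minimal ordinary-least-squares sketch in plain Python. The data is synthetic and the factor names (saw_tutorial, changed_avatar) are hypothetical; by construction, only the first one affects the outcome. In practice a statistics library would do this work; the hand-rolled solver is only there to keep the sketch self-contained.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Return (coefficients, t-statistics); rows must include an intercept column."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    # Diagonal of (X'X)^-1, obtained one unit vector at a time.
    inv_diag = [
        solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j]
        for j in range(k)
    ]
    t = [b / math.sqrt(sigma2 * d) for b, d in zip(beta, inv_diag)]
    return beta, t


# Synthetic users: the outcome depends on saw_tutorial only (true effect = 2.0).
rng = random.Random(0)
rows, y = [], []
for _ in range(1000):
    saw_tutorial = rng.randint(0, 1)
    changed_avatar = rng.randint(0, 1)
    rows.append([1.0, float(saw_tutorial), float(changed_avatar)])
    y.append(1.0 + 2.0 * saw_tutorial + rng.gauss(0, 1))

beta, t = ols(rows, y)
for name, b, tv in zip(["intercept", "saw_tutorial", "changed_avatar"], beta, t):
    print(f"{name:15s} coef={b:+.2f}  t={tv:+.1f}")
```

The factor that truly drives the outcome should come out with a coefficient near its real effect and a large t-statistic, while the irrelevant one should hover around zero. That readout, rather than the prediction itself, is the “law” the model hands you.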
The importance of Newton’s equations doesn’t lie in the fact that you will get to orbit with the parameters you entered, but in the fact that you understand which parameters determine whether you reach orbit. In this sense, super-complex neural networks* don’t help us, because they don’t give us laws. Instead, they give us predictions.
(*Neural networks can be decomposed into factors, but this is a story for another day.)
If you want to use data analytics to drive change, then you need to know your “laws.” You can then understand what you need to change in order to reach your goals.
This is the true state-of-the-art solution: when your analytics department has discovered the “laws” of your product.
This article was written by a Wriker, in Wrike.