Experimental validation
Before implementing your ideas, determine whether they will really have the expected effect.
Decision makers want to know whether implementing an idea or candidate solution will actually influence user behavior.
In other words, to choose the best course of action, they need to establish a causal relationship between their actions and the results.
For example, if I modify the variable X (cause), will the result Y (effect) change?
To answer this question, some organizations simply implement the idea and then measure the change in results. The flaw in this approach is that the observed effect may be due to a series of external circumstances, and not to the variable of interest.
Imagine that your company wants to boost sales of a product and launches an ambitious advertising campaign to do so. After two months, the team evaluates the results and finds a 20% growth in sales. Could you say with certainty that the campaign caused this increase?
No.
Because other factors may have influenced the result (for example, changes in competitors' marketing mix, shifts in customers' purchasing power, or seasonal variation in demand), it is not possible to attribute the change in sales with certainty.
The only way to determine with certainty the validity of an idea or solution is through an experiment: a procedure that follows the scientific method to support or reject a hypothesis. There are different types of experiments, but the most effective for establishing a cause-effect relationship is known as the Randomized Controlled Trial (RCT).
Randomized Controlled Trial
A Randomized Controlled Trial reveals the effect of a stimulus or intervention on the indicator of interest by isolating it from other variables that could also be affecting the result. It consists of taking a sample of the user population and randomly dividing it into groups. One group is exposed to the standard stimulus, or status quo (the control group), while the other groups are exposed to the different variants, or treatments, we want to test.
Subsequently, the average result in each group is estimated. The difference between the control group's average and each treatment group's average gives us the effect of the stimulus on user behavior.
However, it is not yet possible to rule out that the observed effect is a product of chance.
To rule this out, a p-value is computed through statistical analysis. When this value falls below a conventional threshold (usually 5%), we can conclude that the effect is unlikely to be due to chance, and therefore that the intervention worked.
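The comparison of group averages and the p-value calculation can be sketched in a few lines of code. The sketch below assumes a binary outcome (each user either adopts the feature or not) and uses a standard two-proportion z-test; the counts are invented for illustration, and a real analysis would be chosen to match the metric and experiment design.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test comparing two conversion rates (binary outcomes)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled proportion under the null hypothesis of "no difference"
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via the error function)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical data: 120/1000 conversions in control vs 160/1000 in treatment
z, p = two_proportion_z_test(120, 1000, 160, 1000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
if p < 0.05:
    print("The difference is statistically significant at the 5% level")
```

In this invented example the observed lift (12% to 16%) yields a p-value below 0.05, so under the conventional threshold we would treat the effect as real rather than as noise.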
Finally, with that statistical evidence, you can roll out your idea or solution to the organization's entire user population.
Experimenting to test ideas
A simulation of a project you could carry out with Heurística researchers and statistics specialists.
Imagine that you are responsible for a digital product. Your team recently implemented a feature with great potential for success. However, so far the results are not as good as you expected: visitors to the web portal explore the functionality, but most do not use it.
After researching your users, two possible solutions emerge. Both alternatives are promising, but you are not sure which to implement. You also have doubts about making any change at all, since this type of modification can be very costly, both in resources and in customer satisfaction.
The optimal way to resolve this trade-off is through a Randomized Controlled Trial. Under this procedure, the first step is to take a sample of your user population and randomly assign each person to one of three groups (Figure 1).
The members of the control group interact with the current version of the website, as if nothing had changed. The treatment groups, meanwhile, are exposed to each of the candidate solutions.
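The random assignment step can be sketched as follows. This is a minimal illustration, assuming users are identified by simple IDs and that a shuffle followed by round-robin assignment gives groups of roughly equal size; the group names and sample are invented.

```python
import random

def assign_groups(user_ids, groups=("control", "idea_1", "idea_2"), seed=None):
    """Randomly assign each user to one experimental group of roughly equal size."""
    rng = random.Random(seed)  # fixed seed makes the assignment reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    # Round-robin over the shuffled list balances group sizes
    return {user: groups[i % len(groups)] for i, user in enumerate(shuffled)}

# Hypothetical sample of 9 users drawn from the population
sample = [f"user_{i}" for i in range(9)]
assignment = assign_groups(sample, seed=42)
for user, group in sorted(assignment.items()):
    print(user, "->", group)
```

Randomizing the assignment is what isolates the treatment from external factors: on average, the groups end up comparable in everything except the stimulus they receive.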
Figure 1
In our example, after statistically analyzing the results, we find that people exposed to idea 2 use the new functionality more than those who see the current version of the platform (the control group). Implementing that modification is therefore a sound decision.
Idea 1, on the other hand, performed worse than the control group, so implementing it without testing would have been a mistake, with potentially serious consequences for the business.
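The comparison described above can be summarized with a short script. The usage counts below are invented to mirror the example's outcome (idea 2 outperforms control, idea 1 underperforms it); a real analysis would also test each difference for statistical significance before acting on it.

```python
# Hypothetical results: users who adopted the feature, per group of 1000
results = {
    "control": {"used": 110, "n": 1000},
    "idea_1":  {"used": 90,  "n": 1000},
    "idea_2":  {"used": 150, "n": 1000},
}

control_rate = results["control"]["used"] / results["control"]["n"]
for name in ("idea_1", "idea_2"):
    rate = results[name]["used"] / results[name]["n"]
    diff = rate - control_rate
    verdict = "better than control" if diff > 0 else "worse than control"
    print(f"{name}: {rate:.1%} usage ({diff:+.1%} vs control) -> {verdict}")
```

With these made-up numbers, only idea 2 would be rolled out; idea 1 would be discarded, which is exactly the costly mistake the experiment prevents.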