Here are some examples of null and alternative hypotheses that we might test during the analytic lifecycle.
- Once we have fit a model, does it predict better than always predicting the mean value of the training data? If we call always predicting the training mean “the null model”, then the null hypothesis is that the average squared prediction error from the model is the same as the average squared prediction error from the null model. The alternative is that the model’s squared prediction error is less than that of the null model. A variation is to determine whether your “new” model predicts better than some “old” model. In that case, your null model is the “old” model, and the null and alternative hypotheses are the same as described above.
- When we are evaluating a model, we sometimes want to know whether or not a given input is actually contributing to the prediction. If we are doing a regression, for example, this is the same as asking if the regression coefficient for a variable is zero. The null hypothesis is that the coefficient is zero; the alternative is that the coefficient is non-zero.
- Once we have settled on and deployed a model, we are making decisions based on its predictions. For example, the model may help us make decisions that are supposed to improve revenue. We can test whether the model is improving revenue by running what are referred to as “A/B tests”. Suppose the model tells us whether or not to make a customer a special offer. Over the next few days, every customer who comes to us is randomly put into either the “A” group or the “B” group. Customers in the A group get special offers (or not) depending on the output of the model. Customers in the B group get special offers “the old way”: either they don’t get them at all, or they get them by whatever algorithm we used before.
If the model and the intervention are successful, then group A should generate higher revenue than group B. If group A does not generate higher revenue than group B (that is, if we fail to reject the null hypothesis that A and B generate the same revenue), then we have to determine whether the problem is that the model makes incorrect predictions or that our intervention is ineffective.
If we are testing more than one intervention at the same time (groups A, B, and C), then we can use ANOVA (analysis of variance) to see if there is a difference in revenue between the groups. We will talk about ANOVA in a bit.
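As a rough preview, the three-group comparison might look like the sketch below. It computes the usual one-way ANOVA F statistic (between-group variance divided by within-group variance) by hand, but, as an assumption made to keep the example dependency-free, it estimates the p-value by shuffling group labels rather than by looking up the F distribution. The data are synthetic and the function names `f_statistic` and `permutation_anova` are hypothetical.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F statistic: mean square between / mean square within."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        m = statistics.fmean(g)
        ss_between += len(g) * (m - grand_mean) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_anova(groups, n_perm=2000, seed=0):
    """Estimate the p-value of the observed F by shuffling group labels."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # cut the shuffled pool back into groups
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic revenue for three groups (made-up numbers, for illustration only).
rng = random.Random(7)
revenue = {
    "A": [rng.gauss(13.0, 3.0) for _ in range(100)],
    "B": [rng.gauss(10.0, 3.0) for _ in range(100)],
    "C": [rng.gauss(10.0, 3.0) for _ in range(100)],
}

f_obs, p_value = permutation_anova(list(revenue.values()))
print(f"F = {f_obs:.2f}, p = {p_value:.4f}")
```

A small p-value says only that at least one group differs from the others; it does not say which one, so a follow-up comparison between pairs of groups would still be needed.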