How to Determine the Rollout of a Recommendation System?
What’s the Goal of the Recommendation System?
To improve performance, as measured by the business metrics
Process of Assessment
Overview
offline testing -> online A/B testing -> rollout phase
Offline testing
Check whether the model reaches the target offline metrics
- MAP (mean average precision)
  - The mean, over all ranked lists, of the average precision of each list
- MRR (mean reciprocal rank)
  - Takes the rank position of the first relevant item into account
- NDCG (normalized discounted cumulative gain)
  - Takes both the rank position and the graded relevance score into account
Offline metrics can be calculated from collected data by running the test through all stages of the recommendation system
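The three offline metrics can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, average precision is normalized by the number of relevant items (one common convention), and DCG uses the linear-gain variant `gain / log2(rank + 1)`.

```python
import math

def average_precision(ranked, relevant):
    """Average precision of one ranked list; `relevant` is a set of item ids."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    # Assumption: normalize by the number of relevant items.
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def ndcg(ranked, gains, k=None):
    """NDCG@k; `gains` maps item id -> graded relevance score."""
    if k is None:
        k = len(ranked)
    dcg = sum(gains.get(item, 0.0) / math.log2(rank + 1)
              for rank, item in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0

# MAP and MRR are the means of the per-list values over all test lists.
lists = [(["a", "b", "c", "d"], {"a", "c"}), (["x", "b", "y"], {"b"})]
map_score = mean(average_precision(r, rel) for r, rel in lists)
mrr_score = mean(reciprocal_rank(r, rel) for r, rel in lists)
```

MAP and MRR only need binary relevance labels, while NDCG needs graded scores, so the choice of metric also depends on what the collected data contains.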
Online A/B Testing
Online metrics
- Determined by the business goal
  - Checkout CVR (conversion rate)
  - Add-to-cart CVR
  - Favorite CVR
  - AOV (average order value)
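As a rough illustration of how these online metrics could be computed, the sketch below aggregates them from session records. The `Session` record and its field names are hypothetical, and using the session as the denominator for each CVR is an assumption; a real system might count per user or per impression instead.

```python
from dataclasses import dataclass

@dataclass
class Session:
    # Hypothetical session record; field names are illustrative only.
    added_to_cart: bool
    favorited: bool
    checked_out: bool
    order_value: float = 0.0

def online_metrics(sessions):
    """Compute the conversion rates and AOV over a list of sessions."""
    n = len(sessions)
    orders = [s.order_value for s in sessions if s.checked_out]

    def rate(count):
        # Assumption: each CVR is conversions divided by total sessions.
        return count / n if n else 0.0

    return {
        "checkout_cvr": rate(len(orders)),
        "add_to_cart_cvr": rate(sum(s.added_to_cart for s in sessions)),
        "favorite_cvr": rate(sum(s.favorited for s in sessions)),
        # AOV is averaged over completed orders, not over all sessions.
        "aov": sum(orders) / len(orders) if orders else 0.0,
    }
```

Computing the same dictionary for the control group and the treatment group gives the pair of numbers the A/B test compares.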
Bucket testing
- Divide all users into 10 buckets, each containing 10% of the users, and run the experiment on a per-bucket basis.
- In each bucket, we can observe whether the new system succeeds.
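One common way to implement this kind of bucketing is to hash the user ID, so that a user always lands in the same bucket without any stored state. The sketch below assumes string user IDs and SHA-256; the function names and the choice of which buckets receive the new system are hypothetical.

```python
import hashlib

NUM_BUCKETS = 10  # 10 buckets, roughly 10% of users each

def bucket_of(user_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a user ID to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

def variant_of(user_id: str, treatment_buckets) -> str:
    """Users in the treatment buckets see the new system; others see the old one."""
    return "treatment" if bucket_of(user_id) in treatment_buckets else "control"

# Example: expose bucket 0 (about 10% of users) to the new system.
variant = variant_of("user-123", treatment_buckets={0})
```

Because the assignment depends only on the user ID, the same user sees a consistent experience across sessions, and the online metrics can be compared bucket by bucket.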
Split testing
- Split testing is needed when there are not enough buckets or not enough traffic in the bucket testing stage.
- If we use a multi-stage ranking model and know that the stages do not interfere with each other's influence on the online metrics (that is, their effects on the online metrics are orthogonal), we can run experiments for different stages in the same bucket.
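Running experiments for several stages on the same users is often done by giving each stage its own independent assignment. The sketch below salts the hash with a stage name so that a user's arm in one stage tells you nothing about their arm in another. The stage names `recall` and `ranking` are hypothetical, and this only randomizes the assignment; it does not by itself verify that the stages' effects are orthogonal.

```python
import hashlib
from collections import Counter

def assign(user_id: str, stage: str, num_arms: int = 2) -> int:
    """Assign a user to an arm for one stage, independently of other stages."""
    # Salting with the stage name decorrelates assignments across stages.
    key = f"{stage}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % num_arms

# Each user gets one arm per stage; all arm combinations should be
# roughly equally populated if the assignments are independent.
users = [f"user{i}" for i in range(10000)]
joint = Counter((assign(u, "recall"), assign(u, "ranking")) for u in users)
```

With two arms per stage, each of the four combinations in `joint` should hold about a quarter of the users, which lets the effect of each stage be estimated while averaging over the other.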
Rollout Phase
- Gradually increase the share of traffic served by the new system
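A gradual ramp can be sketched as a schedule of traffic percentages plus a rule for moving along it. The percentages in `RAMP_STEPS`, the `healthy` flag, and the roll-back-to-zero behavior are all assumptions made for illustration; the source does not specify a schedule or a rollback policy.

```python
import hashlib

# Hypothetical ramp schedule: percent of traffic on the new system.
RAMP_STEPS = [1, 5, 10, 25, 50, 100]

def in_rollout(user_id: str, percent: int) -> bool:
    """True if this user is served by the new system at the given percent."""
    slot = int(hashlib.sha256(user_id.encode("utf-8")).hexdigest(), 16) % 100
    # A user's slot is fixed, so anyone included at 10% stays included at 50%.
    return slot < percent

def next_percent(current: int, healthy: bool) -> int:
    """Advance one step if the online metrics look healthy, else roll back."""
    if not healthy:
        return 0  # assumption: a regression sends all traffic to the old system
    later = [p for p in RAMP_STEPS if p > current]
    return later[0] if later else current
```

Keeping each user's slot fixed means the exposed population only grows as the percentage increases, so users do not flip back and forth between the old and new systems during the ramp.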