Evaluating Recommender Models
August 02, 2025
Offline Metrics
Given test data of impression logs, group the logs by query. This gives a dictionary where each key is a query and each value is the list of contents that were interacted with, in chronological order.
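The grouping step can be sketched roughly as follows. The log schema here (a tuple of query, content id, and timestamp) and the name `group_by_query` are assumptions made for illustration; real impression logs will likely carry more fields.

```python
from collections import defaultdict


def group_by_query(logs):
    """Group interaction logs by query, keeping chronological order.

    `logs` is assumed to be an iterable of (query, content_id, timestamp)
    tuples. This schema is hypothetical.
    """
    grouped = defaultdict(list)
    # Sort by timestamp so each per-query list ends up in chronological order.
    for query, content_id, _timestamp in sorted(logs, key=lambda row: row[2]):
        grouped[query].append(content_id)
    return dict(grouped)


logs = [
    ("q1", "b", 2),
    ("q2", "x", 1),
    ("q1", "a", 1),
    ("q1", "c", 3),
]
ground_truth = group_by_query(logs)
# ground_truth maps "q1" to ["a", "b", "c"] and "q2" to ["x"]
```

The resulting dictionary serves as the ground truth that each metric compares the model's ranked recommendations against.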
Recall@K
Recall@K is well suited to evaluating a retrieval model, because the retrieval model aims to find all the relevant contents for a given query.
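As a minimal sketch, Recall@K for a single query can be computed as the share of relevant contents that show up in the top K recommendations. The function name and the convention of returning 0.0 when a query has no relevant contents are assumptions.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    relevant_set = set(relevant)
    if not relevant_set:
        # Assumed convention: recall is 0.0 when there is nothing to retrieve.
        return 0.0
    hits = set(recommended[:k]) & relevant_set
    return len(hits) / len(relevant_set)


# Two of the three relevant items ("a" and "b") are in the top 3.
score = recall_at_k(["a", "x", "b", "y"], ["a", "b", "c"], k=3)
```

Averaging this value over all queries in the test data gives a single number for the model.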
Average Precision@K
Mean Average Precision@K
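Several definitions of Average Precision@K are in circulation. The sketch below assumes one common variant: sum the precision at each rank where a relevant content appears, then divide by min(K, number of relevant contents). Mean Average Precision@K is then the mean of that value over all queries. The function names are illustrative, and each recommendation list is assumed to contain no duplicates.

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@K for one query, normalized by min(k, number of relevant items)."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant_set:
            hits += 1
            # Precision at this rank = hits so far / items seen so far.
            precision_sum += hits / rank
    return precision_sum / min(k, len(relevant_set))


def mean_average_precision_at_k(recommendations, ground_truth, k):
    """Mean of AP@K over every query in the ground truth."""
    if not ground_truth:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations.get(query, []), relevant, k)
        for query, relevant in ground_truth.items()
    )
    return total / len(ground_truth)


recommendations = {"q1": ["a", "x", "b"], "q2": ["y", "x"]}
ground_truth = {"q1": ["a", "b"], "q2": ["x"]}
map_score = mean_average_precision_at_k(recommendations, ground_truth, k=3)
```

In this example q1 scores (1/1 + 2/3) / 2 and q2 scores (1/2) / 1, so placing relevant contents earlier in the list raises the score.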
Coverage@K
Coverage@K is useful for evaluating whether a model recommends diverse contents.
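One way to sketch Coverage@K is as the fraction of the content catalog that appears in at least one query's top K recommendations. The `catalog` argument (the full set of recommendable contents) and the function name are assumptions introduced here.

```python
def coverage_at_k(recommendations, catalog, k):
    """Fraction of catalog items that appear in any query's top-k list."""
    catalog_set = set(catalog)
    if not catalog_set:
        return 0.0
    recommended = set()
    for items in recommendations.values():
        recommended.update(items[:k])
    # Ignore recommended items that are not part of the catalog.
    return len(recommended & catalog_set) / len(catalog_set)


recommendations = {"q1": ["a", "b", "c"], "q2": ["a", "d", "e"]}
catalog = ["a", "b", "c", "d", "e", "f", "g", "h"]
coverage = coverage_at_k(recommendations, catalog, k=2)
# Top-2 lists cover {"a", "b", "d"}, i.e. 3 of the 8 catalog items.
```

A low value suggests the model keeps recommending the same small set of contents across queries.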
Mean Reciprocal Rank
Mean Reciprocal Rank takes into account the position of the ranked content. However, it only looks at the first content that was correctly recommended, so any later relevant contents do not affect the score.
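A minimal sketch of this metric: for each query, take 1 divided by the rank of the first relevant content (or 0 if none is recommended), then average over queries. The function names are illustrative.

```python
def reciprocal_rank(recommended, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    relevant_set = set(relevant)
    for rank, item in enumerate(recommended, start=1):
        if item in relevant_set:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(recommendations, ground_truth):
    """Mean of the reciprocal rank over every query in the ground truth."""
    if not ground_truth:
        return 0.0
    total = sum(
        reciprocal_rank(recommendations.get(query, []), relevant)
        for query, relevant in ground_truth.items()
    )
    return total / len(ground_truth)


recommendations = {"q1": ["x", "a", "b"], "q2": ["y", "z"], "q3": ["m"]}
ground_truth = {"q1": ["a", "b"], "q2": ["x"], "q3": ["m"]}
mrr = mean_reciprocal_rank(recommendations, ground_truth)
# Reciprocal ranks are 1/2, 0 and 1, so the mean is 0.5.
```

Note that for q1 the second hit ("b" at rank 3) has no effect on the result.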
Normalized Discounted Cumulative Gain@K
Good for taking into account how well the recommender system positions the contents. In DCG@K, $rel_i$ is the relevance of the $i$-th interaction from the ground truth observation in the test data.
IDCG@K is the ideal discounted cumulative gain of the first $K$ items.
NDCG@K is good for evaluating the ranking ability of the system, as it takes into account the position of the ranked contents.
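The sketch below assumes the common linear-gain form, DCG@K = sum over i = 1..K of rel_i / log2(i + 1), with NDCG@K = DCG@K / IDCG@K. Other variants (for example, using 2^rel_i - 1 as the gain) exist, so this formula may differ from the one originally intended. The `relevance` dictionary mapping each content to its ground-truth relevance is a hypothetical input format.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k relevance values."""
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(recommended, relevance, k):
    """NDCG@K for one query.

    `relevance` maps item -> ground-truth relevance; items missing
    from the mapping are treated as having relevance 0.
    """
    gains = [relevance.get(item, 0.0) for item in recommended[:k]]
    # The ideal ranking sorts all known items by descending relevance.
    ideal = sorted(relevance.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    if idcg == 0:
        return 0.0
    return dcg_at_k(gains, k) / idcg


relevance = {"a": 3.0, "b": 2.0, "c": 1.0}
perfect = ndcg_at_k(["a", "b", "c"], relevance, k=3)   # ideal order
reversed_order = ndcg_at_k(["c", "b", "a"], relevance, k=3)
```

The ideal ordering scores 1.0, while the reversed ordering scores lower because the most relevant contents are discounted more heavily at later positions.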
Reference
10 metrics to evaluate recommender and ranking systems
Information Retrieval Evaluation
recmetrics