Retrieval
August 02, 2025
Problem
Ranking a large volume of contents directly results in poor service latency.
Goal
Before ranking contents, use a fast model to find relevant contents in a large content corpus.
How It's Done
High Level
- Model vector representations of queries and contents in the same d-dimensional space.
- For a given query vector, use Approximate Nearest Neighbor (ANN) to retrieve relevant contents.
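To make the second step concrete, here is a minimal sketch of top-k retrieval in a shared vector space. It uses an exact brute-force scan as a stand-in for ANN: a real system would swap the scan for an ANN index, which returns roughly the same results without scoring every content. The function names, the dot-product score, and the toy embeddings are all assumptions for illustration.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, content_vecs, k):
    """Return the ids of the k contents with the highest dot-product score.

    This is an exact scan over every content; an ANN index would return
    approximately the same ids without scoring the whole corpus.
    """
    scored = ((dot(query_vec, vec), cid) for cid, vec in content_vecs.items())
    return [cid for _, cid in heapq.nlargest(k, scored)]


# Hypothetical 2-dimensional embeddings, for illustration only.
contents = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.7, 0.7],
}
top = retrieve_top_k([1.0, 0.2], contents, k=2)
```

The scan costs one score per content per query, which is exactly the cost that ANN is meant to avoid on a large corpus.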
Offline Metric
The key goal of the retrieval model isn't to rank, but to find all the relevant contents corresponding to a given query. Therefore, we want to focus on recall rather than precision. We also want the retrieved contents to be diverse, so coverage is important as well.
For a given test set of queries, we find the top k contents for each query and compute:
- recall@k:
- coverage@k:
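A minimal sketch of how these two metrics are commonly computed. It assumes recall@k is the fraction of a query's labeled relevant contents that appear in its top k (averaged over queries), and coverage@k is the fraction of the corpus that appears in at least one query's top k. These definitions, the function names, and the toy data are assumptions, not necessarily the exact formulas intended here.

```python
def recall_at_k(retrieved, relevant, k):
    """Mean over queries of |top-k ∩ relevant| / |relevant|."""
    scores = []
    for query, relevant_ids in relevant.items():
        if not relevant_ids:
            continue  # skip queries with no labeled relevant contents
        top_k = set(retrieved.get(query, [])[:k])
        scores.append(len(top_k & set(relevant_ids)) / len(relevant_ids))
    return sum(scores) / len(scores) if scores else 0.0


def coverage_at_k(retrieved, corpus_ids, k):
    """Fraction of the corpus that appears in at least one query's top-k."""
    if not corpus_ids:
        return 0.0
    seen = set()
    for ranked_ids in retrieved.values():
        seen.update(ranked_ids[:k])
    return len(seen & set(corpus_ids)) / len(corpus_ids)


# Hypothetical test set: ranked retrieval results and labeled relevant ids.
retrieved = {"q1": ["a", "b", "c"], "q2": ["a", "d", "e"]}
relevant = {"q1": ["a", "c"], "q2": ["e", "f"]}
corpus = ["a", "b", "c", "d", "e", "f"]

recall = recall_at_k(retrieved, relevant, k=2)
coverage = coverage_at_k(retrieved, corpus, k=2)
```

Neither metric looks at the order within the top k, which matches the point that the retrieval stage is not responsible for ranking.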
Model
Two Tower Embedding Model
Learn two independent neural networks that encode queries and contents into the same d-dimensional space.
Given an encoded query vector and the encoded vector of a content relevant to that query, the model should learn to make the similarity between the two vectors high.
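The structure can be sketched as follows: two separate encoders with different input sizes that project into the same d-dimensional space, plus a similarity score between their outputs. This is only an illustration under stated assumptions: each tower is a single untrained linear layer, outputs are L2-normalized, and similarity is a dot product. The training objective is not specified here, so the sketch covers the forward pass only; all names and feature values are hypothetical.

```python
import math
import random


def make_tower(input_dim, d, rng):
    """Create a one-layer linear tower: a d x input_dim weight matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(input_dim)] for _ in range(d)]


def encode(tower, features):
    """Project raw features into the shared space and L2-normalize."""
    vec = [sum(w * x for w, x in zip(row, features)) for row in tower]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def similarity(query_vec, content_vec):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(query_vec, content_vec))


rng = random.Random(0)
d = 4  # shared embedding dimension (assumed value)
query_tower = make_tower(input_dim=3, d=d, rng=rng)    # query features: 3 dims
content_tower = make_tower(input_dim=5, d=d, rng=rng)  # content features: 5 dims

q = encode(query_tower, [0.2, 0.9, 0.1])
c = encode(content_tower, [1.0, 0.0, 0.3, 0.5, 0.2])
score = similarity(q, c)
```

Because the towers share no weights and meet only at the similarity score, content vectors can be computed ahead of time and indexed, leaving only the query tower to run at request time.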