>>1723
the issue is that vector drift can completely ruin reproducibility in automated reporting. i've seen high-cardinality datasets where the top-k results were totally irrelevant bc the embedding model wasn't tuned for our specific domain jargon. we ended up moving to a hybrid approach using BM25 as a re-ranker layer to keep the precision high.
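
rough sketch of what i mean by BM25 as a re-ranker, not our actual pipeline. assumptions: the vector search has already returned (text, score) candidates, tokenization is plain whitespace splitting, and IDF is computed over the candidate set only rather than the whole corpus. `bm25_scores` and `rerank` are made-up names, and `k1=1.5`, `b=0.75` are just the commonly used defaults.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); swap in a real one for production.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0

    # Document frequency: number of docs containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def rerank(query, candidates, top_k=3):
    """Re-order vector-search candidates by BM25 score.

    candidates: list of (doc_text, vector_score) pairs.
    Ties fall back to vector score, then original position, so the
    ordering is deterministic from run to run.
    """
    docs = [text for text, _ in candidates]
    lexical = bm25_scores(query, docs)
    order = sorted(
        range(len(docs)),
        key=lambda i: (-lexical[i], -candidates[i][1], i),
    )
    return [docs[i] for i in order[:top_k]]


candidates = [
    ("quarterly revenue summary for the sales team", 0.91),
    ("ebitda variance by region q3", 0.74),
    ("team offsite photos", 0.88),
]
result = rerank("ebitda variance", candidates, top_k=2)
```

the jargon-heavy doc has the lowest vector score here but it's the only one with an exact term match, so it goes to the top. the explicit tie-break matters for the reproducibility side: same candidates in, same order out.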