Given that vector magnitude affects the similarity score, are there scenarios where using the dot product might accidentally exaggerate similarity, for example for vectors with large magnitudes?
Or do web content/passage embeddings have too few dimensions for that to be a concern?
Absolutely. If direction matters more than intensity, use cosine similarity, or normalize the embeddings before computing the dot product.
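
To make that concrete, here is a minimal sketch with made-up 2-D toy vectors (not real embeddings). The helper names `dot`, `norm`, `cosine`, and `normalize` are only illustrative. It relies on the identity dot(a, b) = |a| * |b| * cos(angle), which holds in any number of dimensions, so a long vector can outscore a better-aligned short one.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (norm(a) * norm(b))

def normalize(a):
    """Scale a vector to unit length, keeping its direction."""
    n = norm(a)
    return [x / n for x in a]

# Toy values, assumed for illustration only.
query = [1.0, 0.0]
aligned_small = [1.0, 0.0]      # same direction as the query, unit length
off_axis_large = [10.0, 10.0]   # 45 degrees away, but much longer

# Raw dot product ranks the long, less-aligned vector higher.
dot_small = dot(query, aligned_small)
dot_large = dot(query, off_axis_large)

# Cosine similarity ignores length and ranks by direction alone.
cos_small = cosine(query, aligned_small)
cos_large = cosine(query, off_axis_large)

# Normalizing first makes the dot product agree with cosine similarity.
normalized_dot_large = dot(normalize(query), normalize(off_axis_large))

print(dot_small, dot_large)
print(cos_small, cos_large)
print(normalized_dot_large)
```

If the embeddings are already unit length, the raw dot product and cosine similarity give the same ranking, so the extra normalization step is only needed when lengths vary.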