
Cosine similarity is probably not the way to go for the reasons you mentioned, so an implementation that ignores semantic similarity is probably safer. Fuzzy matching against a dictionary of known good slugs wouldn’t handle every situation, but it would handle enough to be valuable. I don’t know enough about it to work out the specifics, but I’ve seen it in action for things like URL case handling.
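
To make the fuzzy-matching idea concrete, here is a rough sketch in Python using the standard library's `difflib`. The slug list, the function name `resolve_slug`, and the `0.8` cutoff are all made up for illustration; a real site would load its known slugs from a sitemap or CMS and tune the cutoff.

```python
import difflib

# Hypothetical dictionary of known good slugs (assumed to be stored lowercase).
KNOWN_SLUGS = ["pricing", "about-us", "blog/seo-tips", "contact"]


def resolve_slug(requested, known_slugs=KNOWN_SLUGS, cutoff=0.8):
    """Return the closest known slug, or None if nothing is close enough."""
    # Normalize case and surrounding slashes first; this alone covers
    # the URL case-handling scenario.
    normalized = requested.strip().strip("/").lower()
    if normalized in known_slugs:
        return normalized
    # Purely string-based fuzzy match; no semantic similarity involved.
    matches = difflib.get_close_matches(normalized, known_slugs, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(resolve_slug("/Pricing/"))
print(resolve_slug("abuot-us"))
print(resolve_slug("completely-unrelated-page"))
```

Returning `None` when nothing clears the cutoff is deliberate: a wrong redirect is usually worse than an honest 404.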

Stefan · ChallengesSuggests · · Jun 06, 07:44
1 reply

Semantic similarity can be used as a helping metric, but not a deciding factor.
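
One possible reading of "helping, not deciding", sketched in Python: string similarity alone decides which slugs are eligible, and a semantic score is consulted only to break near-ties among them. The function name `pick_slug`, the `semantic_score` callback, and both thresholds are assumptions for illustration, not an established method.

```python
from difflib import SequenceMatcher


def pick_slug(requested, known_slugs, semantic_score=None, cutoff=0.8, tie_margin=0.1):
    """Pick a slug by string similarity; use semantics only as a tie-breaker."""
    # String similarity is the deciding factor: it gates eligibility.
    scored = [(SequenceMatcher(None, requested, s).ratio(), s) for s in known_slugs]
    eligible = [(ratio, slug) for ratio, slug in scored if ratio >= cutoff]
    if not eligible:
        return None
    best_ratio = max(ratio for ratio, _ in eligible)
    # Candidates whose string score is within tie_margin of the best one.
    near_ties = [slug for ratio, slug in eligible if best_ratio - ratio <= tie_margin]
    if semantic_score is None or len(near_ties) == 1:
        return max(eligible)[1]
    # Semantic similarity only helps choose among already-eligible near-ties.
    return max(near_ties, key=lambda slug: semantic_score(requested, slug))


# Hypothetical stand-in for an embedding-based scorer.
def fake_semantic(query, slug):
    return 1.0 if slug in ("red-shop", "footwear") else 0.0


print(pick_slug("red-shoe", ["red-shoes", "red-shop"]))
print(pick_slug("red-shoe", ["red-shoes", "red-shop"], semantic_score=fake_semantic))
print(pick_slug("red-shoe", ["red-shoes", "footwear"], semantic_score=fake_semantic))
```

In the last call the semantic scorer prefers `footwear`, but that slug never clears the string cutoff, so it cannot win.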

Dan Petrovic · Suggests · · Jun 16, 00:54
