Is there a way to deduplicate vectors if they came from very similar sources? #3268

Answered by generall
dimus asked this question in Q&A
Each dataset requires individual calibration, so I don't think there is an out-of-the-box solution for this. However, you can run a similarity search against the whole dataset, duplicates included, to generate a list of candidates for further deduplication.
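
As a rough illustration of that idea, here is a minimal sketch in plain Python that compares every vector with every other one and keeps the pairs whose cosine similarity is above a threshold. The function names, the sample vectors, and the `0.95` threshold are all assumptions made for the example; the threshold in particular is exactly the part that needs calibration per dataset. The brute-force loop is quadratic, so on a real collection you would more likely query each point against your vector index instead.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def duplicate_candidates(vectors, threshold=0.95):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold.

    `threshold` is a hypothetical starting value, not a recommendation;
    it has to be tuned on the actual data.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Most similar pairs first, so they can be reviewed in priority order.
    pairs.sort(key=lambda pair: pair[2], reverse=True)
    return pairs


# Made-up sample data: doc-1 and doc-2 stand in for near-identical sources.
vectors = {
    "doc-1": [0.90, 0.10, 0.00],
    "doc-2": [0.89, 0.11, 0.01],
    "doc-3": [0.00, 0.20, 0.95],
}

candidates = duplicate_candidates(vectors, threshold=0.95)
for id_a, id_b, score in candidates:
    print(f"{id_a} ~ {id_b}: {score:.4f}")
```

The output is only a candidate list. Deciding which pairs are true duplicates, and which copy to keep, is still a separate step that depends on the data.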

Answer selected by dimus