Adding or removing one sample results in absolutely different embeddings #50
Comments
It depends on how close the new sample is, on average, to the first 1,000 samples. If it is a nearest neighbor of many of the original samples, then the embedding may look a bit different. A few things you can try:
If you give me access to the data I can play around with your example when I have some free time. Additionally, I see that the log contains the following line:
Having duplicates is typically ill-advised (and can sometimes lead to unexpected behavior), since it doesn't really make sense in the context of the embedding problem. You don't need two representations of the same thing.
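For what it's worth, both points above can be checked before running the embedding. The following is only a rough sketch in plain NumPy, not something taken from the library: the function names `find_duplicate_rows` and `count_rows_adopting_new_neighbor` are hypothetical, and the use of exact Euclidean distance and `k=10` are assumptions that may not match the neighbor search the library actually performs.

```python
import numpy as np


def find_duplicate_rows(X):
    """Return indices of rows that exactly repeat an earlier row of X."""
    # return_index gives the first occurrence of each distinct row.
    _, first_idx = np.unique(X, axis=0, return_index=True)
    is_first = np.zeros(X.shape[0], dtype=bool)
    is_first[first_idx] = True
    return np.flatnonzero(~is_first)


def count_rows_adopting_new_neighbor(X_old, x_new, k=10):
    """Count rows of X_old for which x_new would enter their k nearest neighbors.

    Assumption: exact Euclidean distance; the library may use an
    approximate search or a different metric.
    """
    X_old = np.asarray(X_old, dtype=float)
    x_new = np.asarray(x_new, dtype=float)

    # Squared pairwise distances among the original samples.
    sq = np.sum(X_old ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X_old @ X_old.T)
    d2 = np.maximum(d2, 0.0)       # guard against tiny negative values
    np.fill_diagonal(d2, np.inf)   # a point is not its own neighbor

    # Squared distance from each sample to its current k-th neighbor.
    kth = np.sort(d2, axis=1)[:, k - 1]

    # Squared distance from each original sample to the new sample.
    d2_new = np.sum((X_old - x_new) ** 2, axis=1)
    return int(np.sum(d2_new < kth))
```

If `find_duplicate_rows(X)` returns a non-empty array, dropping those rows before fitting avoids the duplicate issue. If `count_rows_adopting_new_neighbor(X[:1000], X[1000])` is a large fraction of 1,000, the extra sample changes many neighborhoods, which would be consistent with a visibly different embedding.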
My code is as follows:
When I use the first 1,000 samples from the input matrix, I get very different results than when I use one more sample (1,001).
Here is the log:
And here the output embeddings:
Is this expected behaviour? I thought adding one sample should not make this much of a difference.
Thank you for helping me out!