Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

golang: separate c functions and cache simsimd_metric_punned #76

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

corani
Copy link
Contributor

@corani corani commented Jan 25, 2024

The changes move the inline C functions from simsimd.go to a new file, simsimd.c.
This separation enhances code organization and readability. It also allows for
better management of the C code, which can now be modified independently from
the Go code.

More importantly, we now cache the results from simsimd_metric_punned
instead of determining the capabilities for each call. This improves the
benchmark from 1940ns/op to 1320ns/op on my system.

The changes move the inline C functions from simsimd.go to a new file, simsimd.c.
This separation enhances code organization and readability. It also allows for
better management of the C code, which can now be modified independently from
the Go code.

More importantly, we now cache the results from simsimd_metric_punned
instead of determining the capabilities for each call. This improves the
benchmark from 1940ns/op to 1320ns/op on my system.
@corani
Copy link
Contributor Author

corani commented Jan 25, 2024

Note: this is still 4x slower than the native Go implementation, but that's better than 6x 🤣

@ashvardanian
Copy link
Owner

Hi, @corani! You are right to evaluate the dynamic dispatch just once. I think we should generalize it and implement in an identical way to how I implement it in StringZilla. That is more laborious, but can be reused across different languages.

@pplanel has recently pushed Rust bindings, but they are slower than native Rust code, because he doesn't cache the pointer in any way. In case any of you guys want to implement it, I'm happy to provide guidance, but won't be able to work on it actively in the coming weeks 🤗

@pplanel
Copy link
Contributor

pplanel commented Jan 28, 2024

Hey @ashvardanian, I'm interested know more about this benchmark and how can the pointer caching be done.

The Rust binding benchmark are comparing cosine and sqeuclidean against their respective implementations in SimSIMD.

And I'm seeing this results:

Cosine
image

SqEuclidean
image

@ashvardanian
Copy link
Owner

This is interesting, @pplanel. I must have misread the timings in the console.

The common approach is to have a static structure with pointers, that is populated when the shared library is loaded. Then, all the function calls go through that lookup table. The StringZilla snippet is a pretty good example, I believe.

@corani
Copy link
Contributor Author

corani commented Jan 29, 2024

I'm unable to update the PR for resolve the conflict:

 ! [remote rejected] corani/perf -> corani/perf (refusing to allow a Personal Access Token to create or update workflow `.github/workflows/prerelease.yml` without `workflow` scope)

@corani
Copy link
Contributor Author

corani commented Jan 29, 2024

Hi, @corani! You are right to evaluate the dynamic dispatch just once. I think we should generalize it and implement in an identical way to how I implement it in StringZilla. That is more laborious, but can be reused across different languages.

That'll have to be done by someone with actual experience writing C code 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants