Speed up index construction by converting vocabulary types while loading the model #768

Open
rlouf opened this issue Mar 25, 2024 · 0 comments · May be fixed by #832
Labels
optimization: Related to performance optimizations
structured generation: Linked to structured generation

Comments

@rlouf
Member

rlouf commented Mar 25, 2024

Because we use Numba to compile the index, we need to convert the vocabulary types, which takes a non-negligible amount of time every time the script is run. A simple way around this is to run the conversion in a separate thread while the model is being loaded. We may also be able to make Numba cache the JIT-compiled functions by compiling the index for a trivial regex.
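
A minimal sketch of the threading idea, using only the standard library. `convert_vocabulary`, `load_model`, and `load_model_and_vocabulary` are hypothetical stand-ins, not functions from this project; the real conversion would build Numba-typed containers rather than a plain list.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def convert_vocabulary(vocabulary):
    """Hypothetical stand-in for the type conversion that Numba requires."""
    # Sort the tokens so the result is deterministic. A real conversion
    # would build Numba-typed containers here instead of a list of tuples.
    return [(token, token_id) for token, token_id in sorted(vocabulary.items())]


def load_model(name):
    """Hypothetical stand-in for the slow model load."""
    time.sleep(0.1)  # simulate I/O-bound weight loading
    return {"name": name}


def load_model_and_vocabulary(name, vocabulary):
    """Convert the vocabulary in a worker thread while the model loads."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        # Start the conversion in the background.
        future = executor.submit(convert_vocabulary, vocabulary)
        # Meanwhile, load the model on the main thread.
        model = load_model(name)
        # Wait for the conversion; this re-raises any worker exception.
        converted = future.result()
    return model, converted


model, converted = load_model_and_vocabulary("toy-model", {"b": 1, "a": 0})
```

How much time this saves depends on whether the two tasks can actually overlap: a thread helps most when model loading is dominated by I/O or by code that releases the GIL, and less when both tasks are pure-Python CPU work.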

rlouf added the "structured generation" label on Mar 25, 2024
rlouf added the "optimization" label on Apr 12, 2024
rlouf pinned this issue on Apr 12, 2024
rlouf linked a pull request on Apr 21, 2024 that will close this issue