limit / decrease tokens fed to model for each query #1427

Open
JonasDoesThings opened this issue May 7, 2024 · 0 comments

Hi, I'm currently evaluating danswer. Is there a way to limit the input tokens fed into the LLM in order to decrease the cost per query?

I already searched GitHub and saw mentions of an env var called NUM_DOCUMENT_TOKENS_FED_TO_GENERATIVE_MODEL, but that has apparently been removed? I couldn't find it in danswer's codebase anymore.

I tried lowering GEN_AI_MAX_TOKENS, but that led to errors.

So is there currently some way to decrease the tokens spent on a query by reducing the amount of context fed into the model? Maybe some limit I can turn down on the selection of relevant chunks? (See the sketch below for what I have in mind.)
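
To make it concrete, here is a rough sketch of the behavior I'm hoping a setting exists for. This is purely illustrative and not danswer's actual API; `tiktoken`, the `cap_context_tokens` name, and the parameters are just my own assumptions for the example:

```python
# Purely illustrative sketch, NOT danswer's actual code: cap the retrieved
# context at a fixed token budget before it is sent to the LLM.
# `tiktoken` and the function/parameter names are assumptions for this example.
import tiktoken


def cap_context_tokens(
    chunks: list[str], budget: int, model: str = "gpt-3.5-turbo"
) -> list[str]:
    """Keep retrieved chunks, in rank order, until the token budget is spent."""
    enc = tiktoken.encoding_for_model(model)
    kept: list[str] = []
    used = 0
    for chunk in chunks:
        n = len(enc.encode(chunk))
        if used + n > budget:
            break  # dropping lower-ranked chunks keeps input tokens (and cost) bounded
        kept.append(chunk)
        used += n
    return kept
```

Something as simple as `cap_context_tokens(retrieved_chunks, budget=2000)` before prompt assembly would be enough for my use case, if a config option like that exists.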

Thanks in advance and greetings from Austria :)
