Token counting and litellm provider customization #1421
Conversation
…rations for max input and output tokens, pulling from litellm when available.
… token-counting # Conflicts: # agenthub/monologue_agent/agent.py # opendevin/config.py
if self.model_info is not None and 'max_output_tokens' in self.model_info:
    self.max_output_tokens = self.model_info['max_output_tokens']
else:
    self.max_output_tokens = 1024
Just curious: where does this number come from? I guess 4096 is because it's the limit of GPT-3.5, but where does this one come from?
I don't have a significant justification for either of these defaults, and I am interested to hear opinions on them. I regularly experienced overruns with a 512 output token limit, and therefore I usually use 1024 or higher locally.
I don't have a strong opinion either. I just feel like it would be better to have some comments explaining where these numbers are from.
I have added comments documenting this:
# Max input tokens for gpt3.5, so this is a safe fallback for any potentially viable model
self.max_input_tokens = 4096
# Enough tokens for most output actions, and not too many for a bad llm to get carried away responding
# with thousands of unwanted tokens
self.max_output_tokens = 1024
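For readers following along, here is a minimal sketch of how these fallbacks can sit next to litellm's per-model metadata. The class shape and attribute names are illustrative rather than a copy of this PR's code; litellm.get_model_info is litellm's model-metadata lookup, and models it doesn't know about simply keep the 4096/1024 defaults discussed above.

# Illustrative sketch, not the PR's exact code: derive token limits from
# litellm's model metadata when available, otherwise keep safe defaults.
import litellm


class TokenLimits:
    def __init__(self, model_name: str):
        try:
            # litellm ships a per-model metadata map (context window, output cap, ...).
            self.model_info = litellm.get_model_info(model_name)
        except Exception:
            # Unknown or custom models (e.g. local providers) have no entry.
            self.model_info = None

        # Max input tokens for gpt-3.5, so a safe fallback for any potentially viable model.
        self.max_input_tokens = 4096
        # Enough for most output actions, without letting a bad LLM run away
        # with thousands of unwanted tokens.
        self.max_output_tokens = 1024

        if self.model_info is not None:
            if 'max_input_tokens' in self.model_info:
                self.max_input_tokens = self.model_info['max_input_tokens']
            if 'max_output_tokens' in self.model_info:
                self.max_output_tokens = self.model_info['max_output_tokens']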
… token-counting # Conflicts: # opendevin/llm/llm.py # opendevin/schema/config.py
… conflict with recent command-r-plus commit.
Co-authored-by: Engel Nyst <[email protected]>
… token-counting # Conflicts: # opendevin/llm/llm.py
Codecov Report
Attention: Patch coverage is …

@@          Coverage Diff           @@
##            main    #1421   +/-   ##
=======================================
  Coverage        ?   60.83%
=======================================
  Files           ?       88
  Lines           ?     3738
  Branches        ?        0
=======================================
  Hits            ?     2274
  Misses          ?     1464
  Partials        ?        0
@@ -32,7 +32,7 @@
if config.get(ConfigType.AGENT_MEMORY_ENABLED):
    from agenthub.monologue_agent.utils.memory import LongTermMemory

MAX_MONOLOGUE_LENGTH = 20000
MAX_TOKEN_COUNT_PADDING = 512
Is this roughly a similar number? 40 chars per token? That seems like a lot to me.
Ah, never mind. I see now that it's being used differently.
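To spell out the difference this thread converged on: MAX_TOKEN_COUNT_PADDING is not a smaller replacement for the 20000-character MAX_MONOLOGUE_LENGTH cap, it is headroom reserved below the model's input-token limit. A rough sketch, with a hypothetical helper name:

# Hypothetical helper, for illustration only: the padding acts as headroom
# below the model's input-token limit, not as a cap on monologue size.
MAX_TOKEN_COUNT_PADDING = 512


def needs_condense(prompt_token_count: int, max_input_tokens: int) -> bool:
    # Condense once the prompt gets within PADDING tokens of the model's limit,
    # instead of comparing a character count against a fixed character cap.
    return prompt_token_count + MAX_TOKEN_COUNT_PADDING > max_input_tokens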
This LGTM!
@enyst looks like we need your 👍
@computer-whisperer looks like it just needs a rebase. Feel free to ping me when it's ready!
… token-counting # Conflicts: # opendevin/core/config.py # opendevin/core/schema/config.py # opendevin/llm/llm.py
@rbren should be ready to go
This will make the behavior much better, thank you! And sorry for the delay here.
Schedule monologue compression using token counting rather than character counting, preventing the occasional input-token overruns that can happen when there is a lot of non-textual output (scenarios where the same number of characters requires more tokens).
(This has a minor conflict with #1417 and will be updated after that PR is merged.)
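As a rough sketch of the counting side of this change (the function name and event-to-message mapping here are assumptions, not the PR's exact code): litellm's token_counter lets the agent measure the monologue with a tokenizer appropriate for the configured model, and the result can then be checked against max_input_tokens minus the padding discussed above.

# Sketch only: count monologue tokens with litellm instead of counting characters.
import litellm


def count_monologue_tokens(events: list[dict], model: str) -> int:
    # litellm.token_counter picks a tokenizer suited to `model`
    # (tiktoken for OpenAI models, with fallbacks for others).
    messages = [{'role': 'user', 'content': str(event)} for event in events]
    return litellm.token_counter(model=model, messages=messages)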