
Added override for agent model and prompt resolution #663

Open · wants to merge 20 commits into base: development

Conversation

gravypower
Contributor

In an effort to help Pilot become more cost-effective I wanted to try having some tasks completed by other models, e.g. GPT-3.5 Turbo for coding.

This PR introduces new env vars:

# Override model for an agent 
#FULL_STACK_DEVELOPER_MODEL_NAME=gpt-4-turbo-preview
#CODE_MONKEY_MODEL_NAME=gpt-4-turbo-preview
#ARCHITECT_MODEL_NAME=gpt-4-turbo-preview
#PRODUCT_OWNER_MODEL_NAME=gpt-4-turbo-preview
#TECH_LEAD_MODEL_NAME=gpt-4-turbo-preview
#DEV_OPS_MODEL_NAME=gpt-4-turbo-preview
#TECHNICAL_WRITER_MODEL_NAME=gpt-4-turbo-preview
DEFAULT_MODEL_NAME=gpt-4-turbo-preview

#PROMPTS_OVERRIDE_FOLDER=

Along with allowing you to change the model used for an agent, this PR also allows you to set an override for prompt resolution:

1. Model-specific overrides: first, the function checks for the existence of a model-specific prompt. If found, this prompt is loaded.
2. General overrides: if a model-specific template is not found, the function looks for a general override prompt.
3. Default templates: as a fallback, if neither a model-specific nor a general override is available, the function loads the default prompt from the primary environment.

The override folder is laid out like this (a lookup sketch follows the layout):

overrides/
├── development/task/
│   └── breakdown.prompt
└── gpt-3.5-turbo-0125/
    └── development/task/
        └── breakdown.prompt
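
For illustration, with the layout above and model gpt-3.5-turbo-0125, a request for development/task/breakdown.prompt is tried in roughly this order (a sketch; the exact root paths depend on PROMPTS_OVERRIDE_FOLDER and the built-in prompts folder):

# Sketch: lookup order for prompt_name='development/task/breakdown.prompt'
# with model='gpt-3.5-turbo-0125' (root paths illustrative)
candidates = [
    'overrides/gpt-3.5-turbo-0125/development/task/breakdown.prompt',  # model-specific override
    'overrides/development/task/breakdown.prompt',                     # general override
    'pilot/prompts/development/task/breakdown.prompt',                 # built-in default
]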

This PR replaces PR #646

@gravypower
Contributor Author

Will fix the conflicts later today.

@gravypower changed the title from "Development" to "Added override for agent model and prompt resolution" on Feb 18, 2024
@gravypower
Contributor Author

I really need to work out how to dev in Python, linting issues all the time :)

@gravypower
Contributor Author

There was an issue resolving component templates, and some other small issues with the order of some args and a few unit tests.

@senko
Collaborator

senko commented Feb 23, 2024

Hey @gravypower this is cool!

Commenting here so you'll know I'm not ignoring the PR; it's still on my TODO list to give it a proper review.

A couple of quick notes:

  • let's use underscore_case for variables in Python instead of camelCase
  • it would be cool if AgentConvo() could also have a model specified in the constructor (it can use the one from the agent by default); since we have a lot of things in Developer that should actually be different agents but are just methods there right now, overriding the AgentConvo model per-instance would allow us to e.g. have the reviewer (that's now either Developer or CodeMonkey) use its own model/prompts (see the sketch after this list)
  • I don't think you actually need a Project instance in tests for the agents; you could just use unittest.mock.MagicMock instead and not worry about project paths, creating the project, etc., which is much simpler to maintain. I know you copied the existing patterns here, but if we can improve on them that's even better
  • I don't think you need multiple jinja environments/loaders?
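
A minimal sketch of the constructor idea (names assumed from context; the real class takes more arguments):

class AgentConvo:
    def __init__(self, agent, model=None):
        self.agent = agent
        # Per-instance override wins; otherwise fall back to the agent's model
        self.model = model if model is not None else agent.model

# e.g. a hypothetical reviewer convo pinned to a cheaper model:
# convo = AgentConvo(developer, model='gpt-3.5-turbo-0125')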

@gravypower
Contributor Author

gravypower commented Feb 25, 2024

> let's use underscore_case for variables in Python instead of camelCase

I can't see where I have done this? Will look into a linting setting that can error on this.

> it would be cool if AgentConvo() could also have a model specified in the constructor (it can use the one from the agent by default); overriding the AgentConvo model per-instance would allow us to e.g. have the reviewer (that's now either Developer or CodeMonkey) use its own model/prompts

This should be an easy change, will do.

> I don't think you actually need a Project instance in tests for the agents; you could just use unittest.mock.MagicMock instead and not worry about project paths, creating the project, etc.

Will look into this. I mainly program in statically typed languages and this is my first time using Python, so MagicMock feels all kinds of wrong :)

> I don't think you need multiple jinja environments/loaders?

You are right; I looked into this some more and you can send a list of paths for the FileSystemLoader to search, and I have changed it to that. When I first looked into this I only considered Jinja environments, not the FileSystemLoader. Something like this is much better:

import logging
import os

from jinja2 import Environment, FileSystemLoader, Template, TemplateNotFound

logger = logging.getLogger(__name__)

primary_prompts_path = os.path.join(os.path.dirname(__file__), '..', 'prompts')
override_prompts_path = os.getenv('PROMPTS_OVERRIDE_FOLDER', primary_prompts_path)
# Override folder first, so its templates shadow the built-in ones
file_loader = FileSystemLoader([override_prompts_path, primary_prompts_path])
env = Environment(loader=file_loader)
...

def resolveTemplate(prompt_name, model=None) -> Template:
    logger.debug(f'resolving prompt: {prompt_name}')

    if model is not None:
        # Prefer a model-specific prompt, e.g. 'gpt-3.5-turbo-0125/development/task/breakdown.prompt'
        try:
            return env.get_template(f'{model}/{prompt_name}')
        except TemplateNotFound:
            pass

    return env.get_template(prompt_name)
    

@gravypower
Contributor Author

@senko I can't work out how to enforce underscore_case with the flake8 linter. Can you give some direction here please?

@gravypower
Contributor Author

Hey @senko, wondering if you are going to accept this PR or not? I will resolve the conflicts if you are.

@senko (Collaborator) left a comment

Hey @gravypower sorry for the delay on this, and thanks for following up!

Here's another round of reviews, a lot more detailed. I tried to give as much feedback as possible, but don't be discouraged: mostly these are small things :)

Looking forward to merging this; together with recent Anthropic additions this will make it possible to do some really cool cost/quality optimizations.

I think in the near future we can also make it work for local LLMs, so for example you'd use GPT-4 or Claude only for the most intensive operations, and use Mistral or Llama or some other local model for other tasks, etc.

pilot/.env.example (outdated, resolved)
self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')

agentModelName = f'{role.upper()}_MODEL_NAME'
Collaborator

Please use underscore case in Python, e.g. agent_model_name.

Contributor Author

I think this should stay as uppercase as it's resolving the environment variable name to get the agent model from. As we define environment variable names in uppercase, I would just need to do a .upper() when trying to fetch that variable.

self.project = project
self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')
Collaborator

The default for getenv here can use DEFAULT_MODEL_NAME instead of hardcoding it.

But since we're talking about model overrides here, I'd actually prefer that the default for Agent be None, and let llm_connection.py or AgentConvo decide what the actual default is (if not overridden anywhere).

Contributor Author

I went to do this and broke the tests. The issue is that if there is no agent model defined in the environment variables, then the model is not set. This line effectively gives us a fallback model that agents will use when their model is not found in the environment variables. I think we should leave it unless we don't want the agent models to be optional.

agentModelName = f'{role.upper()}_MODEL_NAME'
if agentModelName in os.environ:
    self.model = os.getenv(agentModelName, DEFAULT_MODEL_NAME)
Collaborator

This should fall back to the default model we fetch from the environment in line 12.

For example if the user sets default model name to Claude 3 Opus, we don't want agents defaulting to GPT4-Turbo instead.

Contributor Author

It should never fall back because of the guard if agentModelName in os.environ:, so I have updated this to self.model = os.getenv(agentModelName) to make things clearer. Thoughts?
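
A minimal sketch of the resulting lookup (using the underscore_case naming requested above; the first line is whatever default the constructor fetched earlier):

self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')

agent_model_name = f'{role.upper()}_MODEL_NAME'
if agent_model_name in os.environ:
    # The guard guarantees the variable exists, so no second default is needed
    self.model = os.getenv(agent_model_name)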

pilot/helpers/AgentConvo.py (outdated, resolved)
@@ -89,7 +89,7 @@ def generate_messages_from_description(description, app_type, name):
     ]
     """
     # "I want you to create the app {name} that can be described: ```{description}```
-    prompt = get_prompt('high_level_questions/specs.prompt', {
+    prompt = get_prompt('high_level_questions/specs.prompt', original_data={
Collaborator

The reason you need to specify original_data= is because you inserted an argument into get_prompt's definition before it. This is why it's nicer to just append kwargs, to avoid having to hunt around all the usages (and since Python is not statically compiled, it's easy to miss some).

Contributor Author

I think this might have been addressed: def get_prompt(prompt_name, model=None, original_data=None): original_data now has a default value. I am not 100% sure, can you please check?

@@ -76,9 +77,9 @@ def test_api_access(project) -> bool:
     ]

     endpoint = os.getenv('ENDPOINT')
-    model = os.getenv('MODEL_NAME', 'gpt-4')
+    model = os.getenv('DEFAULT_MODEL_NAME', DEFAULT_MODEL_NAME)
Collaborator

This will get a bit more complicated since we merged Anthropic support and also have a notion of a provider/model name split (e.g. anthropic / claude-3-...). This is not actually used in create_chat_gpt_completion because it currently looks up the model directly, but your change will make that part of the code work slightly differently. I'll dive deeper into the side effects when you resolve the conflicts and I can test more.

pilot/utils/llm_connection.py (outdated, resolved)
@@ -34,23 +34,38 @@ def capitalize_first_word_with_underscores(s):
     return capitalized_string


-def get_prompt(prompt_name, original_data=None):
+def get_prompt(prompt_name, model=None, original_data=None):
Collaborator

Let's add the model after the last keyword argument (original_data) to avoid having to modify all the functions that call it.
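
I.e. a sketch of the suggested signature, with the new parameter appended last so existing positional call sites keep working unchanged:

def get_prompt(prompt_name, original_data=None, model=None):
    ...

# Existing calls such as get_prompt('some.prompt', data) are unaffected;
# the new argument is only ever passed by keyword: get_prompt('some.prompt', model='gpt-4')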

        return env.get_template(model_prompt_name)
    except TemplateNotFound:
        # If the model-specific template is not found, log the event or handle accordingly
        logger.debug(f'Model-specific template not found for {model_prompt_name}. Falling back to general template.')
Collaborator

While this is useful for testing, it would create too much noise if output for each model-specific template that's not overridden. I'd prefer to have just return env.get_template(prompt_name) here in the except clause.
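
I.e. the except clause would shrink to just the fallback (sketch):

    try:
        return env.get_template(model_prompt_name)
    except TemplateNotFound:
        return env.get_template(prompt_name)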

Contributor Author

This is why I chose the debug level to log it out; maybe this should be resolved by setting the default level to info or something?

@gravypower
Contributor Author

All good, you guys are very busy I'm sure. It was more about whether you were going to accept it or not.

Will have a look and address your feedback this weekend.

@gravypower
Contributor Author

Oh, and thank you for the detailed comments on the PR.
