
Added override for agent model and prompt resolution #663

Open · wants to merge 20 commits into base: development

Conversation

gravypower
Contributor

In an effort to help Pilot become more cost-effective I wanted to try having some tasks completed by other models, e.g. GPT-3.5 Turbo for coding.

This PR introduces new env vars:

# Override model for an agent 
#FULL_STACK_DEVELOPER_MODEL_NAME=gpt-4-turbo-preview
#CODE_MONKEY_MODEL_NAME=gpt-4-turbo-preview
#ARCHITECT_MODEL_NAME=gpt-4-turbo-preview
#PRODUCT_OWNER_MODEL_NAME=gpt-4-turbo-preview
#TECH_LEAD_MODEL_NAME=gpt-4-turbo-preview
#DEV_OPS_MODEL_NAME=gpt-4-turbo-preview
#TECHNICAL_WRITER_MODEL_NAME=gpt-4-turbo-preview
DEFAULT_MODEL_NAME=gpt-4-turbo-preview

#PROMPTS_OVERRIDE_FOLDER=

Along with allowing you to change the model used for an agent, this PR also allows you to set an override for prompt resolution:

1. Model-specific overrides: first, the function checks for the existence of a model-specific prompt. If found, this prompt is loaded.
2. General overrides: if a model-specific template is not found, the function looks for a general override prompt.
3. Default templates: as a fallback, if neither a model-specific nor a general override is available, the function loads the default prompt from the primary environment.

The override folder is laid out like this (a lookup sketch follows the layout):

overrides/
├── development/task/
│   └── breakdown.prompt
└── gpt-3.5-turbo-0125/
    └── development/task/
        └── breakdown.prompt
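
For illustration, with the layout above and model gpt-3.5-turbo-0125, a request for development/task/breakdown.prompt is tried in roughly this order (a sketch; the exact root paths depend on PROMPTS_OVERRIDE_FOLDER and the built-in prompts folder):

# Sketch: lookup order for prompt_name='development/task/breakdown.prompt'
# with model='gpt-3.5-turbo-0125' (root paths illustrative)
candidates = [
    'overrides/gpt-3.5-turbo-0125/development/task/breakdown.prompt',  # model-specific override
    'overrides/development/task/breakdown.prompt',                     # general override
    'pilot/prompts/development/task/breakdown.prompt',                 # built-in default
]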

This PR replaces PR #646

@gravypower
Contributor Author

Will fix the conflicts later today.

@gravypower changed the title from "Development" to "Added override for agent model and prompt resolution" on Feb 18, 2024
@gravypower
Contributor Author

I really need to work out how to dev in Python, linting issues all the time :)

@gravypower
Contributor Author

There was an issue resolving component templates, and some other small issues with the order of some args and a few unit tests.

@senko
Collaborator

senko commented Feb 23, 2024

Hey @gravypower this is cool!

Commenting here so you'll know I'm not ignoring the PR; it's still on my TODO list to give it a proper review.

A couple of quick notes:

  • let's use underscore_case for variables in Python instead of camelCase
  • it would be cool if AgentConvo() could also have a model specified in the constructor (it can use the one from the agent by default); since we have a lot of things in Developer that should actually be different agents but are just methods there right now, overriding the AgentConvo model per-instance would allow us to e.g. have the reviewer (that's now either Developer or CodeMonkey) use its own model/prompts (see the sketch after this list)
  • I don't think you actually need a Project instance in tests for the agents; you could just use unittest.mock.MagicMock instead and not worry about project paths, creating the project, etc., which is much simpler to maintain. I know you copied the existing patterns here, but if we can improve on them that's even better
  • I don't think you need multiple jinja environments/loaders?
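
A minimal sketch of the constructor idea (names assumed from context; the real class takes more arguments):

class AgentConvo:
    def __init__(self, agent, model=None):
        self.agent = agent
        # Per-instance override wins; otherwise fall back to the agent's model
        self.model = model if model is not None else agent.model

# e.g. a hypothetical reviewer convo pinned to a cheaper model:
# convo = AgentConvo(developer, model='gpt-3.5-turbo-0125')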

@gravypower
Contributor Author

gravypower commented Feb 25, 2024

> let's use underscore_case for variables in Python instead of camelCase

I can't see where I have done this? Will look into a linting setting that can error on this.

> it would be cool if AgentConvo() could also have a model specified in the constructor (it can use the one from the agent by default); overriding the AgentConvo model per-instance would allow us to e.g. have the reviewer (that's now either Developer or CodeMonkey) use its own model/prompts

This should be an easy change, will do.

> I don't think you actually need a Project instance in tests for the agents; you could just use unittest.mock.MagicMock instead and not worry about project paths, creating the project, etc.

Will look into this. I mainly program in statically typed languages and this is my first time using Python, so MagicMock feels all kinds of wrong :)

> I don't think you need multiple jinja environments/loaders?

You are right; I looked into this some more and you can send a list of paths for the FileSystemLoader to search, and I have changed it to that. When I first looked into this I only considered Jinja environments, not the FileSystemLoader. Something like this is much better:

import logging
import os

from jinja2 import Environment, FileSystemLoader, Template, TemplateNotFound

logger = logging.getLogger(__name__)

primary_prompts_path = os.path.join(os.path.dirname(__file__), '..', 'prompts')
override_prompts_path = os.getenv('PROMPTS_OVERRIDE_FOLDER', primary_prompts_path)
# Override folder first, so its templates shadow the built-in ones
file_loader = FileSystemLoader([override_prompts_path, primary_prompts_path])
env = Environment(loader=file_loader)
...

def resolveTemplate(prompt_name, model=None) -> Template:
    logger.debug(f'resolving prompt: {prompt_name}')

    if model is not None:
        # Prefer a model-specific prompt, e.g. 'gpt-3.5-turbo-0125/development/task/breakdown.prompt'
        try:
            return env.get_template(f'{model}/{prompt_name}')
        except TemplateNotFound:
            pass

    return env.get_template(prompt_name)
    

@gravypower
Contributor Author

@senko I can't work out how to enforce underscore_case with the flake8 linter. Can you give some direction here please?

@gravypower
Contributor Author

Hey @senko, wondering if you are going to accept this PR or not? I will resolve the conflicts if you are.

@senko (Collaborator) left a comment

Hey @gravypower sorry for the delay on this, and thanks for following up!

Here's another round of reviews, a lot more detailed. I tried to give as much feedback as possible, but don't be discouraged: mostly these are small things :)

Looking forward to merging this; together with recent Anthropic additions this will make it possible to do some really cool cost/quality optimizations.

I think in the near future we can also make it work for local LLMs, so for example you'd use GPT-4 or Claude only for the most intensive operations, and use Mistral or Llama or some other local model for other tasks, etc.

pilot/.env.example (outdated, resolved)
self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')

agentModelName = f'{role.upper()}_MODEL_NAME'
Collaborator

Please use underscore case in Python, e.g. agent_model_name.

Contributor Author

I think this should stay as uppercase as it's resolving the environment variable name to get the agent model from. As we define environment variable names in uppercase, I would just need to do a .upper() when trying to fetch that variable.

self.project = project
self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')
Collaborator

The default for getenv here can use DEFAULT_MODEL_NAME instead of hardcoding it.

But since we're talking about model overrides here, I'd actually prefer that the default for Agent be None, and let llm_connection.py or AgentConvo decide what the actual default is (if not overridden anywhere).

Contributor Author

I went to do this and broke the tests. The issue is that if there is no agent model defined in the environment variables, then the model is not set. This line effectively gives us a fallback model that agents will use when their model is not found in the environment variables. I think we should leave it unless we don't want the agent models to be optional.

agentModelName = f'{role.upper()}_MODEL_NAME'
if agentModelName in os.environ:
    self.model = os.getenv(agentModelName, DEFAULT_MODEL_NAME)
Collaborator

This should fall back to the default model we fetch from the environment in line 12.

For example if the user sets default model name to Claude 3 Opus, we don't want agents defaulting to GPT4-Turbo instead.

Contributor Author

It should never fall back because of the guard if agentModelName in os.environ:, so I have updated this to self.model = os.getenv(agentModelName) to make things clearer. Thoughts?
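
A minimal sketch of the resulting lookup (using the underscore_case naming requested above; the first line is whatever default the constructor fetched earlier):

self.model = os.getenv('DEFAULT_MODEL_NAME', 'gpt-4-turbo-preview')

agent_model_name = f'{role.upper()}_MODEL_NAME'
if agent_model_name in os.environ:
    # The guard guarantees the variable exists, so no second default is needed
    self.model = os.getenv(agent_model_name)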

pilot/helpers/AgentConvo.py (outdated, resolved)
@@ -89,7 +89,7 @@ def generate_messages_from_description(description, app_type, name):
     ]
     """
     # "I want you to create the app {name} that can be described: ```{description}```
-    prompt = get_prompt('high_level_questions/specs.prompt', {
+    prompt = get_prompt('high_level_questions/specs.prompt', original_data={
Collaborator

The reason you need to specify original_data= is because you inserted an argument into get_prompt's definition before it. This is why it's nicer to just append kwargs, to avoid having to hunt around all the usages (and since Python is not statically compiled, it's easy to miss some).

Contributor Author

I think this might have been addressed: def get_prompt(prompt_name, model=None, original_data=None): original_data now has a default value. I am not 100% sure, can you please check?

@@ -76,9 +77,9 @@ def test_api_access(project) -> bool:
     ]

     endpoint = os.getenv('ENDPOINT')
-    model = os.getenv('MODEL_NAME', 'gpt-4')
+    model = os.getenv('DEFAULT_MODEL_NAME', DEFAULT_MODEL_NAME)
Collaborator

This will get a bit more complicated since we merged Anthropic support and also have a notion of a provider/model name split (e.g. anthropic / claude-3-...). This is not actually used in create_chat_gpt_completion because it currently looks up the model directly, but your change will make that part of the code work slightly differently. I'll dive deeper into the side effects when you resolve the conflicts and I can test more.

pilot/utils/llm_connection.py (outdated, resolved)
@@ -34,23 +34,38 @@ def capitalize_first_word_with_underscores(s):
     return capitalized_string


-def get_prompt(prompt_name, original_data=None):
+def get_prompt(prompt_name, model=None, original_data=None):
Collaborator

Let's add the model after the last keyword argument (original_data) to avoid having to modify all the functions that call it.
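
I.e. a sketch of the suggested signature, with the new parameter appended last so existing positional call sites keep working unchanged:

def get_prompt(prompt_name, original_data=None, model=None):
    ...

# Existing calls such as get_prompt('some.prompt', data) are unaffected;
# the new argument is only ever passed by keyword: get_prompt('some.prompt', model='gpt-4')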

        return env.get_template(model_prompt_name)
    except TemplateNotFound:
        # If the model-specific template is not found, log the event or handle accordingly
        logger.debug(f'Model-specific template not found for {model_prompt_name}. Falling back to general template.')
Collaborator

While this is useful for testing, it would create too much noise if output for each model-specific template that's not overridden. I'd prefer to have just return env.get_template(prompt_name) here in the except clause.
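
I.e. the except clause would shrink to just the fallback (sketch):

    try:
        return env.get_template(model_prompt_name)
    except TemplateNotFound:
        return env.get_template(prompt_name)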

Contributor Author

This is why I chose the debug level to log it out; maybe this should be resolved by setting the default level to info or something?

@gravypower
Contributor Author

All good, you guys are very busy I'm sure. It was more about whether you were going to accept it or not.

Will have a look and address your feedback this weekend.

@gravypower
Contributor Author

Oh, and thank you for the detailed comments on the PR.
