
Issue: Inconsistent token counting in AzureChatOpenAI Agent with Tools #510

Open
Phobos97 opened this issue Mar 7, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Phobos97

Phobos97 commented Mar 7, 2024

Issue you'd like to raise.

I am using an OpenAI agent with tools, with AzureChatOpenAI as the chat model. The model can invoke a tool to search for additional context, which in my use case often returns 1k+ tokens. The issue is that these ToolMessage tokens are not counted in the LangSmith trace, even though they are part of the context for the next agent step.

The inconsistent part is that when setting model='gpt-35' on the AzureChatOpenAI constructor, it DOES count the tokens (model is an optional parameter here), while both model='gpt-4' and model='gpt-35-turbo-16k' do not work. So with AzureChatOpenAI I can actually set azure_deployment='gpt-4' with model='gpt-35' in order to get the proper token count (but then my cost is way off, since it is computed at gpt-35 rates).
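For illustration, the workaround looks like this (a minimal sketch; the endpoint and API key are assumed to come from the usual AZURE_OPENAI_* environment variables):

from langchain_openai import AzureChatOpenAI

workaround_model = AzureChatOpenAI(
    azure_deployment="gpt-4",  # the real Azure deployment
    model="gpt-35",            # deliberately mismatched: tool tokens get counted,
                               # but cost is then computed at gpt-35 rates
    api_version="2023-08-01-preview",
)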

I have created an example below, in which the agent is prompted into using its CustomTool. When setting model_name = "gpt-4", the trace in LangSmith reports ~55 tokens, even though the agent uses its tool, which adds another ~350 tokens. When setting model_name = "gpt-35", the LangSmith trace reports ~400 tokens, which is the expected behaviour (except that the deployment and model names then no longer match, which is obviously wrong, but that would just be a user error).

TL;DR: The ToolMessage in the AgentExecutor trace is sometimes counted in the token count and sometimes not, depending on the model parameter.

See this example (does not include setting environment variables):

from typing import Type, Optional

import httpx
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import BaseTool
from langchain_openai import AzureChatOpenAI
from langsmith import Client
from pydantic.v1 import BaseModel, Field

# Local helper modules (not included in this snippet)
from constants import LANGCHAIN_API_KEY, set_environment_variables
from utility.tracer import NewLangChainTracer

deployment_name = "gpt-4"

model_name = "gpt-4"  # does not work, does not count tokens from tool output
# model_name = "gpt-35"  # works, counts tokens from tool output, but with gpt-35 costs which are obviously wrong with a gpt-4 deployment

set_environment_variables()

http_client = httpx.Client(verify=False)
chat_model = AzureChatOpenAI(
    azure_endpoint="https://localhost:8080/",
    model=model_name,
    azure_deployment=deployment_name,
    api_version="2023-08-01-preview",
    http_client=http_client,
    tags=["gpt-4-agent"],
    temperature=0,
    max_tokens=50,
)


class ToolInput(BaseModel):
    question: str = Field(description="Please input the word 'Birthday'.")


class CustomTool(BaseTool):
    name = "birthday_search_tool"
    description = "useful for when you need to know a users birthday."
    args_schema: Type[BaseModel] = ToolInput

    def _run(
            self, question: str, run_manager: Optional[CallbackManagerForToolRun] = None
    ) -> str:
        """Use the tool."""
        return "The users birthday is yesterday!!!!!!!!!!\n" * 50


agent_prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="instructions"),
    MessagesPlaceholder(variable_name="input"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

tool = CustomTool()

agent = create_openai_tools_agent(chat_model, [tool], agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=[tool], verbose=True,
                               return_intermediate_steps=True)

instructions = ["You are a helpful assistant"]
test_input = ["What is my birthday?"]

tracer = NewLangChainTracer(
    project_name="reproduce-tracer-error",
    tags=["tag1"],
    client=Client(
        api_url="https://api.smith.langchain.com",
        api_key=LANGCHAIN_API_KEY
    )
)

config = RunnableConfig(
    callbacks=[tracer]
)

result = agent_executor.invoke(
    input={"input": test_input, "instructions": instructions},
    config=config
)
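
As a sanity check, the chat model's own counter can be pointed at the tool output directly; a minimal sketch, assuming the same chat_model as above (ToolMessage and get_num_tokens_from_messages are standard langchain-core / langchain-openai APIs, though I have not verified that this path matches what LangSmith does):

from langchain_core.messages import ToolMessage

tool_output = "The users birthday is yesterday!!!!!!!!!!\n" * 50
tool_msg = ToolMessage(content=tool_output, tool_call_id="call_1")

# Roughly what the tool output should add to the next step's prompt tokens.
print(chat_model.get_num_tokens_from_messages([tool_msg]))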
Versions:

langchain 0.1.11
langchain-core 0.1.30
langchain-openai 0.0.8
langsmith 0.1.22

Image from LangSmith below: the AzureChatOpenAI step claims 34 tokens, while on the right it is obvious that the tool added many more than 34 tokens to the context.

[Screenshot 2024-03-07 at 19 58 28]

Suggestion:

Agent tool output that is added to the context should count towards the token count. It seems the code parsing ToolMessages only works with model='gpt-35' specified, possibly related to the model name being optional on AzureChatOpenAI. One quick client-side probe is to check which of these model names tiktoken recognizes at all, since langchain-openai's local token counting resolves the encoding from the model name; whether LangSmith's trace counting takes the same path is an assumption on my part:
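
import tiktoken

# Which of the model names involved does tiktoken recognize?
for name in ("gpt-4", "gpt-35", "gpt-35-turbo-16k"):
    try:
        enc = tiktoken.encoding_for_model(name)
        print(f"{name}: resolves to encoding {enc.name}")
    except KeyError:
        print(f"{name}: not recognized by tiktoken")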

Thank you for your help.

@hinthornw
Collaborator

Thanks for flagging, I'll coordinate with the team

hinthornw added the bug label Apr 10, 2024