
Issue: Understanding inconsistencies in data recording for LangSmith and LangChain JS/TS #516

Open · elliotmrodriguez opened this issue Mar 12, 2024 · 4 comments · Labels: bug (Something isn't working)

elliotmrodriguez commented Mar 12, 2024

Issue you'd like to raise.

Hello, I am using LangSmith to evaluate platform tools for cost and performance tracking, and I am noticing some inconsistencies I do not understand.

For the azure-openai and openai JS/TS packages exported from LangChain, I get inconsistent cost tracking. When I use the azure-openai package, costs are not recorded: token counts seem consistent, but I get no cost information at all. I have seen similarly unrecorded costs when making AWS Bedrock calls. When I use the openai package, I get costs, but I'm not confident they are accurate.

[screenshot of the cost columns in LangSmith]

No LLM calls record time to first token either, unless it is a local ChatOllama call. For the other runs, LangSmith reports "this run did not stream output", which is interesting, since I am using the chat model objects exported from LangChain.

It doesn't matter whether I try this in a RunnableSequence or with a direct invoke call: Azure calls do not get cost information.
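For reference, here is a rough sketch of both call shapes on my end (the endpoint, key, deployment name, and prompt below are placeholders, not my real values):

import { AzureChatOpenAI } from "@langchain/azure-openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";

// Placeholder credentials and names, not my real values.
const model = new AzureChatOpenAI({
  azureOpenAIEndpoint: "https://my-cool-endpoint.openai.azure.com",
  azureOpenAIApiKey: "super-secret",
  azureOpenAIApiDeploymentName: "my-deployment",
  modelName: "gpt-35-turbo",
});

// Direct invoke call.
const direct = await model.invoke("Say hello");

// The same model wired into a RunnableSequence via pipe().
const prompt = ChatPromptTemplate.fromTemplate("Say hello to {name}");
const chain = prompt.pipe(model);
const chained = await chain.invoke({ name: "LangSmith" });

// Both traces show token counts, but neither shows cost.
console.log(direct.content, chained.content);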

When I inspect traces for these Azure runs, I see the following message when trying to interact with the Playground:
[screenshot of the Playground error message]

OpenAI runs, on the other hand, are correctly identified as OpenAI for the purposes of the Playground.

What am I doing wrong?

hinthornw (Collaborator) commented Mar 12, 2024

We need to relax the default model-matching rules we use for cost estimation. (I think Azure OpenAI returns gpt-35-turbo instead of gpt-3.5-turbo, for instance.)

FYI, you can also customize the cost rules yourself in case your pricing doesn't match the defaults:

[attached screenshot]

elliotmrodriguez (Author) commented Mar 13, 2024

Hi @hinthornw thank you for your reply!

I discovered the list of models as well, and noticed the related regexes. I've been using the model name specified in my deployment (gpt-35-turbo), which should match the second regex on the unpinned model name (without a version; at least, that is the OpenAI behavior):

[screenshot of the model list and regexes in LangSmith]

And this is what the deployments in Azure show for their model names:

[screenshot of the deployment model names in Azure]

But it sounds like you are saying that adding/cloning the model with a looser regex should address this, because on closer inspection we are using the legacy 0613 version.
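As a quick sanity check on the name mismatch, I tried a strict pattern against a relaxed one on both spellings (these are my own guesses at the patterns, not the actual LangSmith defaults):

// Guesses at a strict default pattern vs. a relaxed one; not the actual LangSmith rules.
const strict = /^gpt-3\.5-turbo/;
const relaxed = /^gpt-3\.?5-turbo/; // dot made optional so the Azure spelling also matches

console.log(strict.test("gpt-3.5-turbo"));  // true
console.log(strict.test("gpt-35-turbo"));   // false: the Azure-style name falls through
console.log(relaxed.test("gpt-3.5-turbo")); // true
console.log(relaxed.test("gpt-35-turbo"));  // true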

Thanks again

elliotmrodriguez (Author) commented Mar 13, 2024

Hello @hinthornw, I'm reopening this issue because it doesn't seem as if adding new regexes has helped, at least when using the exported AzureChatOpenAI object from @langchain/azure-openai.

Even for gpt-4, no cost attribute is collected at all. It isn't that the names don't match; costs simply are not returned at all. I've tried values that regex testers indicate should match, like "gpt-4" and "gpt-3.5-turbo", but they still fail to return costs.

const azureChatModel = new AzureChatOpenAI({
  azureOpenAIEndpoint: "my-cool-endpoint",           // placeholder
  azureOpenAIApiKey: "super-secret",                  // placeholder
  azureOpenAIApiDeploymentName: "also-super-secret",  // placeholder
  modelName: "gpt-4",
});

For 3.5 turbo, the other params are the same, but the modelName argument I am passing, just gpt-3.5-turbo, doesn't appear to work either, at least not with the AzureChatOpenAI object.

Is there something else I should be inspecting?

hinthornw (Collaborator) commented Apr 9, 2024

Thanks for raising this; I will pass it on.

If you click on the metadata tab for one of those LLM runs, what does it show? Any chance you could share a link to a run to help us debug?
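If it's easier than sharing screenshots, something like the sketch below should pull the same information with the langsmith SDK (the exact shape of extra can vary by SDK version, so treat the field access as approximate):

import { Client } from "langsmith";

// Assumes a LangSmith API key is configured in the environment.
const client = new Client();

// The run id is the one shown in the trace URL; "..." is a placeholder.
const run = await client.readRun("...");

// The metadata / invocation params should include the model name the
// cost rules try to match, so this is the first thing I'd check.
console.log(run.extra?.metadata);
console.log(run.extra?.invocation_params);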

hinthornw added the bug (Something isn't working) label on Apr 10, 2024