
Issue: Understanding inconsistencies in data recording for LangSmith and LangChain JS/TS #516

Open · elliotmrodriguez opened this issue Mar 12, 2024 · 4 comments · Labels: bug (Something isn't working)

elliotmrodriguez commented Mar 12, 2024

Issue you'd like to raise.

Hello, I am using LangSmith to evaluate platform tools for cost and performance tracking, and I am noticing some inconsistencies I do not understand.

For the azure-openai and openai JS/TS packages exported from LangChain, I get inconsistent cost tracking. When I use the azure-openai package, costs are not recorded: token counts seem consistent, but I get no cost information at all. I have seen similarly unrecorded costs when making AWS Bedrock calls. When I use the openai package, I get costs, but I'm not confident they are accurate.

[screenshot of the cost columns in LangSmith]

No LLM calls record time to first token either, unless it is a local ChatOllama call. For the other runs, LangSmith reports "this run did not stream output", which is interesting, since I am using the chat model objects exported from LangChain.

It doesn't matter whether I try this in a RunnableSequence or with a direct invoke call: Azure calls do not get cost information.
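For reference, here is a rough sketch of both call shapes on my end (the endpoint, key, deployment name, and prompt below are placeholders, not my real values):

import { AzureChatOpenAI } from "@langchain/azure-openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";

// Placeholder credentials and names, not my real values.
const model = new AzureChatOpenAI({
  azureOpenAIEndpoint: "https://my-cool-endpoint.openai.azure.com",
  azureOpenAIApiKey: "super-secret",
  azureOpenAIApiDeploymentName: "my-deployment",
  modelName: "gpt-35-turbo",
});

// Direct invoke call.
const direct = await model.invoke("Say hello");

// The same model wired into a RunnableSequence via pipe().
const prompt = ChatPromptTemplate.fromTemplate("Say hello to {name}");
const chain = prompt.pipe(model);
const chained = await chain.invoke({ name: "LangSmith" });

// Both traces show token counts, but neither shows cost.
console.log(direct.content, chained.content);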

When I inspect traces for these Azure runs, I see the following message when trying to interact with the Playground:
[screenshot of the Playground error message]

OpenAI runs, on the other hand, are correctly identified as OpenAI for the purposes of the Playground.

What am I doing wrong?

hinthornw (Collaborator) commented Mar 12, 2024

We need to relax the default model-matching rules we use for cost estimation. (I think Azure OpenAI returns gpt-35-turbo instead of gpt-3.5-turbo, for instance.)

FYI, you can also customize the cost rules yourself in case your pricing doesn't match the defaults:

[attached screenshot]

elliotmrodriguez (Author) commented Mar 13, 2024

Hi @hinthornw thank you for your reply!

I discovered the list of models as well, and noticed the related regexes. I've been using the model name specified in my deployment (gpt-35-turbo), which should match the second regex on the unpinned model name (without a version; at least, that is the OpenAI behavior):

[screenshot of the model list and regexes in LangSmith]

And this is what the deployments in Azure show for their model names:

[screenshot of the deployment model names in Azure]

But it sounds like you are saying that adding/cloning the model with a looser regex should address this, because on closer inspection we are using the legacy 0613 version.
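As a quick sanity check on the name mismatch, I tried a strict pattern against a relaxed one on both spellings (these are my own guesses at the patterns, not the actual LangSmith defaults):

// Guesses at a strict default pattern vs. a relaxed one; not the actual LangSmith rules.
const strict = /^gpt-3\.5-turbo/;
const relaxed = /^gpt-3\.?5-turbo/; // dot made optional so the Azure spelling also matches

console.log(strict.test("gpt-3.5-turbo"));  // true
console.log(strict.test("gpt-35-turbo"));   // false: the Azure-style name falls through
console.log(relaxed.test("gpt-3.5-turbo")); // true
console.log(relaxed.test("gpt-35-turbo"));  // true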

Thanks again

elliotmrodriguez (Author) commented Mar 13, 2024

Hello @hinthornw, I'm reopening this issue because it doesn't seem as if adding new regexes has helped, at least when using the exported AzureChatOpenAI object from @langchain/azure-openai.

Even for gpt-4, no cost attribute is collected at all. It isn't that the names don't match; costs simply are not returned at all. I've tried values that regex testers indicate should match, like "gpt-4" and "gpt-3.5-turbo", but they still fail to return costs.

const azureChatModel = new AzureChatOpenAI({
  azureOpenAIEndpoint: "my-cool-endpoint",           // placeholder
  azureOpenAIApiKey: "super-secret",                  // placeholder
  azureOpenAIApiDeploymentName: "also-super-secret",  // placeholder
  modelName: "gpt-4",
});

For 3.5 turbo, the other params are the same, but the modelName argument I am passing, just gpt-3.5-turbo, doesn't appear to work either, at least not with the AzureChatOpenAI object.

Is there something else I should be inspecting?

hinthornw (Collaborator) commented Apr 9, 2024

Thanks for raising this; I will pass it on.

If you click on the metadata tab for one of those LLM runs, what does it show? Any chance you could share a link to a run to help us debug?
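If it's easier than sharing screenshots, something like the sketch below should pull the same information with the langsmith SDK (the exact shape of extra can vary by SDK version, so treat the field access as approximate):

import { Client } from "langsmith";

// Assumes a LangSmith API key is configured in the environment.
const client = new Client();

// The run id is the one shown in the trace URL; "..." is a placeholder.
const run = await client.readRun("...");

// The metadata / invocation params should include the model name the
// cost rules try to match, so this is the first thing I'd check.
console.log(run.extra?.metadata);
console.log(run.extra?.invocation_params);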

hinthornw added the bug (Something isn't working) label on Apr 10, 2024