
feat: basic Support for OpenTelemetry Metrics and Token Usage Metrics in OpenAI V1 #369

Merged: 42 commits into traceloop:main on Feb 27, 2024

Conversation

@Humbertzhang (Contributor) commented Jan 27, 2024

Hello all, this is my first PR, for issue #251.

This PR introduces the fundamental components for OpenTelemetry Metrics support, and it adds counters for the token usage data of openai.resources.chat.completions in OpenAI V1.

My goal with this PR is to present my approach to addressing the issue at hand and to seek your feedback on the implementation. Your insights and suggestions will be extremely valuable for refining this solution. Moving forward, based on your feedback, I plan to enhance this implementation further and extend support to other instrumentors.

Looking for your reviews and constructive criticism to help improve this integration.
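
To illustrate the general approach, here is a minimal sketch using the standard OpenTelemetry Python metrics API; it is not the exact code in this PR, and the meter and metric names are assumptions based on the discussion further down this thread:

from opentelemetry.metrics import get_meter

# Sketch only: obtain a meter for the instrumentation and create a token counter.
meter = get_meter("opentelemetry.instrumentation.openai")

token_counter = meter.create_counter(
    name="llm.openai.chat_completions.tokens",
    unit="token",
    description="Number of tokens used in prompts and completions",
)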

@CLAassistant commented Jan 27, 2024

CLA assistant check
All committers have signed the CLA.

@nirga (Member) commented Jan 27, 2024

Nice work @Humbertzhang! Let's connect it to an observability platform to see if it works?

@nirga (Member) commented Jan 28, 2024

Sharing some work that's being done now with the OpenTelemetry community that we should align with - traceloop/semantic-conventions#2

@Humbertzhang (Contributor, Author)

Hi @nirga!
I set up a demo environment using otel-collector and Prometheus, and then set up an openai + traceloop demo to generate metrics.
You can find those files at: https://github.com/Humbertzhang/demo_otel_prometheus

And I think it reports metrics as expected!

(screenshot: Prometheus showing the reported metrics)
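
For reference, a minimal sketch of wiring the SDK's metrics to a local OTLP collector (which Prometheus then scrapes) might look like this; the endpoint and export interval are assumed values, not taken from the linked demo repo:

from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.metrics import set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Sketch only: push metrics over OTLP/gRPC to a collector running locally.
exporter = OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True)
reader = PeriodicExportingMetricReader(exporter, export_interval_millis=5000)
set_meter_provider(MeterProvider(metric_readers=[reader]))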

@Humbertzhang (Contributor, Author)

Sharing some work that's being done now with the OpenTelemetry community that we should align with - traceloop/semantic-conventions#2

OK @nirga, I will look into this PR and align with it!

@Humbertzhang (Contributor, Author)

Update to Align with traceloop/semantic-conventions#2

Hello @nirga, I have updated my code to align with the changes proposed for the openai.chat_completions.tokens metric in traceloop/semantic-conventions#2.

You can see the results under normal conditions in the attached image. I have added attributes such as llm.response.model and llm.usage.token_type.

(screenshot: metrics reported under normal conditions)

You can observe the results under exceptional conditions in the images below.
During this run, the network was initially stable but became unreachable for OpenAI requests after a few minutes, resulting in an APIConnectionError.

(screenshots: metrics reported after the APIConnectionError)

However, I encountered challenges in retrieving server.address. I attempted to acquire it in the same manner as the span does (like https://github.com/traceloop/openllmetry/blob/main/packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py#L40), but it did not return the base_url in practical scenarios. Do you have any alternative suggestions or insights on this matter?
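
For illustration, recording against these conventions might look roughly like this; response_dict and token_counter are assumed to come from the surrounding instrumentation code, and this is a sketch rather than the exact change in this PR:

# Sketch only: record prompt and completion tokens with the attributes above.
usage = response_dict.get("usage") or {}
shared_attributes = {"llm.response.model": response_dict.get("model", "")}

token_counter.add(
    usage.get("prompt_tokens", 0),
    attributes={**shared_attributes, "llm.usage.token_type": "prompt"},
)
token_counter.add(
    usage.get("completion_tokens", 0),
    attributes={**shared_attributes, "llm.usage.token_type": "completion"},
)
# On failure (e.g. APIConnectionError), an error.type attribute would be
# recorded as well, as in the exception screenshots above.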

@nirga (Member) commented Jan 31, 2024

Looks good @Humbertzhang! Before I fully review it, can you:

  • rebase
  • fix lint issues
  • add tests (this actually helped me better review PRs in the past 😅)
  • add all suggested metrics

Regarding your question: can you give an example of when it didn't work for you?

@Humbertzhang (Contributor, Author)

Hi @nirga!
I have added all the suggested OpenAI metrics and their corresponding tests in this PR:

  • llm.openai.chat_completions.tokens
  • llm.openai.chat_completions.choices
  • llm.openai.chat_completions.duration
  • llm.openai.embeddings.tokens
  • llm.openai.embeddings.vector_size
  • llm.openai.embeddings.duration
  • llm.openai.image_generations.duration

I have manually tested all of these metrics in Prometheus.

Additionally, I've completed the rebase and addressed all linting issues.

Regarding server.address, I added a function called _get_openai_base_url, but calling it always returns "".
This function is intended to mirror the method used by the spans to retrieve the OpenAI URL.
I'm uncertain whether there's a more effective approach to obtain the OpenAI URL; any suggestions would be appreciated.

For the openai.embeddings.vector_size metric, I'm contemplating whether a Counter is the most suitable instrument for recording it. Maybe a Gauge would be better? What do you think?

I look forward to your review and any feedback you may have!
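
To make the instrument question concrete, here is an illustrative comparison of the two options being weighed; this is not the code in this PR, and meter is the instrumentation's meter as in the earlier sketch:

# Illustrative only: two ways the embeddings vector size could be recorded.
# A Counter accumulates the sizes across calls, so backends see a running sum;
# a Histogram records each observed size, which is closer to a per-request,
# gauge-like view of the value.
vector_size_counter = meter.create_counter(
    name="llm.openai.embeddings.vector_size",
    unit="element",
    description="Accumulated size of returned embedding vectors",
)
vector_size_histogram = meter.create_histogram(
    name="llm.openai.embeddings.vector_size",
    unit="element",
    description="Distribution of returned embedding vector sizes",
)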

@Humbertzhang (Contributor, Author) commented Feb 18, 2024

Hi @nirga, I have updated the metrics tests to use VCR, and I hope they meet the expected format.

@gyliu513 (Contributor)

Thanks @Humbertzhang !

Hey @nirga , can we get this merged? I was planning to create a PR to enable watsonx as well, and it will depend on this PR, thanks!

@nirga (Member) commented Feb 19, 2024

@gyliu513 yeah, probably today or tomorrow. I need to see why the tests are failing, and I want to move this to common semantic conventions.

@nirga (Member) commented Feb 21, 2024

@Humbertzhang looks like there's a regression in the OpenAI streaming test? 🤔

@nirga (Member) commented Feb 26, 2024

Nice work @Humbertzhang! It's a significant milestone for OpenLLMetry :)

@paolorechia (Contributor)

@Humbertzhang thanks for the work here, could you resolve the conflicts so we can merge? :)

Review thread on the added def _get_openai_base_url(): (diff hunk @@ -129,6 +131,13 @@, in the region of _set_response_attributes):

Contributor:

❓ Is this function still always returning empty string?

I can confirm this behavior if no initialization is done.

In [2]: def _get_openai_base_url():
   ...:     base_url = openai.base_url if hasattr(openai, "base_url") else openai.api_base
   ...:     if not base_url:
   ...:         return ""
   ...:     return base_url
   ...: 

In [3]: import openai

In [4]: _get_openai_base_url()
Out[4]: ''

In [5]: 

On the other hand, if you inspect the client instance, you should get the base url:

In [6]: client = openai.OpenAI()

In [7]: client.base_url
Out[7]: URL('https://api.openai.com/v1/')

Does this help you?

Contributor:

I've reviewed your code, sadly I don't see an easy way to extract this URL from an instance in the current instrumentation code.

My only idea was to instrument the client constructor so you can inject a listener which retrieves the URL for you. If you store it in a Singleton, you could retrieve it while calling this function:

def _handle_response(response, span, token_counter=None, choice_counter=None, duration_histogram=None, duration=None):
    if is_openai_v1():
        response_dict = model_as_dict(response)
    else:
        response_dict = response

    # metrics record
    _set_chat_metrics(token_counter, choice_counter, duration_histogram, response_dict, duration)

    # span attributes
    _set_response_attributes(span, response_dict)

    if should_send_prompts():
        _set_completions(span, response_dict.get("choices"))

    return response

Or maybe even do it directly inside _set_chat_metrics.

But again, quite some effort for just one URL :)
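
As a concrete (purely illustrative) version of that idea, the constructor could be wrapped so the most recent base URL is stashed in a module-level variable; the names here are hypothetical and not part of the instrumentation code:

import openai

_LAST_BASE_URL = ""  # naive "singleton" holding the most recently seen base URL
_original_init = openai.OpenAI.__init__

def _wrapped_init(self, *args, **kwargs):
    global _LAST_BASE_URL
    _original_init(self, *args, **kwargs)
    _LAST_BASE_URL = str(self.base_url)  # e.g. "https://api.openai.com/v1/"

openai.OpenAI.__init__ = _wrapped_init

def _get_openai_base_url():
    return _LAST_BASE_URL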

Contributor:

@nirga what do you think? Should we track this URL as a separate issue?

Member:

@paolorechia not sure I follow - I think this function does return the right base URL, no?

Contributor:

Yes, @Humbertzhang mentioned in a comment that he couldn't get this URL to work correctly, which is why I looked into it a bit.

Edit: no, I think it's not working, if I understood correctly.

Member:

@Humbertzhang I think this is the logic we should use to get it: https://github.com/traceloop/openllmetry/blob/main/packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py
(depending on the SDK version, it will be in different attributes).

Member:

For v1, it should be instance._client.base_url, and for v0 it should be the code you wrote below.
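
A rough, self-contained sketch of that version-dependent lookup (the helper names and the hasattr-based version check are illustrative, not the exact merged code):

import openai

def _is_openai_v1():
    # The OpenAI client class only exists in the v1 SDK.
    return hasattr(openai, "OpenAI")

def _get_openai_base_url(instance=None):
    if _is_openai_v1():
        # v1: the bound client on the resource instance carries the base URL.
        client = getattr(instance, "_client", None)
        return str(client.base_url) if client is not None else ""
    # v0: module-level attribute.
    return getattr(openai, "api_base", "") or ""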

@Humbertzhang (Contributor, Author)

@Humbertzhang thanks for the work here, could you resolve the conflicts so we can merge? :)

Hi @paolorechia, I resolved the conflicts in my local environment, but when I run the traceloop-sdk and openai tests, I got

FAILED tests/test_privacy_no_prompts.py::test_simple_workflow - openai.APIConnectionError: Connection error.
FAILED tests/test_prompt_management.py::test_prompt_management - openai.APIConnectionError: Connection error.
FAILED tests/test_sdk_initialization.py::test_resource_attributes - openai.APIConnectionError: Connection error.
FAILED tests/test_workflows.py::test_simple_workflow - openai.APIConnectionError: Connection error.

and

>           raise CannotOverwriteExistingCassetteException(cassette=cassette, failed_request=vcr_request)
E           vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/Users/maxzhang/Desktop/githubpjs/openllmetry/packages/traceloop-sdk/tests/cassettes/test_workflows/test_simple_workflow.yaml') in your current record mode ('none').
E           No match for the request (<Request (POST) https://api.openai.com/v1/chat/completions>) was found.
E           Found 1 similar requests with 0 different matcher(s) :
E           
E           1 - (<Request (POST) https://api.openai.com/v1/chat/completions>).
E           Matchers succeeded : ['method', 'scheme', 'host', 'port', 'path', 'query']
E           Matchers failed :

errors and exceptions.

Have you or @nirga encountered similar exceptions when using VCR?

@nirga (Member) commented Feb 27, 2024

@Humbertzhang I'd try to re-generate the cassettes. Comment out the OpenAI mock key from conftest.py, specify one of your own and then run poetry run pytest --record-mode=once. If you don't have an OpenAI API key I can do that for you :)

@Humbertzhang (Contributor, Author)

Still getting connection errors even though I can curl OpenAI... Maybe you can help me with that, @nirga?
I have pushed my conflict-resolution commits.

@Humbertzhang (Contributor, Author)

@nirga it wasn't pushed successfully just now 😅; it is pushed now.

@nirga (Member) commented Feb 27, 2024

@Humbertzhang All tests pass now 🤩
Thanks so much for this, it's a significant project!
Can we also fix and test the API base? (where @paolorechia and I commented above)

@nirga changed the title from "Basic Support for OpenTelemetry Metrics and Token Usage Metrics in OpenAI V1" to "feat: basic Support for OpenTelemetry Metrics and Token Usage Metrics in OpenAI V1" on Feb 27, 2024
@Humbertzhang (Contributor, Author)

Hi @nirga and @paolorechia, I have fixed the _get_openai_base_url function (@paolorechia's code and #522 really helped me, thanks!); it can now get the URL.
I also added assert checks for it in openai's tests/metrics tests.
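
For context, that kind of assertion might look roughly like this, sketched with the SDK's in-memory reader; the fixture wiring in the real tests differs:

from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import InMemoryMetricReader

reader = InMemoryMetricReader()
provider = MeterProvider(metric_readers=[reader])
# ... instrument OpenAI with this provider and make a chat completion call ...

metrics_data = reader.get_metrics_data()
for resource_metrics in metrics_data.resource_metrics:
    for scope_metrics in resource_metrics.scope_metrics:
        for metric in scope_metrics.metrics:
            if metric.name == "llm.openai.chat_completions.tokens":
                for point in metric.data.data_points:
                    # server.address should now be populated from the base URL.
                    assert point.attributes.get("server.address")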

@nirga merged commit 3eba03e into traceloop:main on Feb 27, 2024. 7 checks passed.