
[Feature Request] Allow users to connect pr_agent to an existing SageMaker inference endpoint #609

Open
mattiaciollaro opened this issue Jan 18, 2024 · 4 comments

Comments

@mattiaciollaro

Context: assume a user has a pre-configured LLM inference endpoint in SageMaker (for example, a self-hosted Llama model as described here). It would be nice to allow the user to configure pr-agent to leverage that endpoint, e.g. by means of a dedicated AI handler.

Discord chat: https://discord.com/channels/1057273017547378788/1057273018084237344/1197261978591309884

cc: @krrishdholakia

@mattiaciollaro mattiaciollaro changed the title [feature request] Allow users to connect pr_agent to an existing SageMaker inference endpoint [Feature Request] Allow users to connect pr_agent to an existing SageMaker inference endpoint Jan 18, 2024
@mrT23
Collaborator

mrT23 commented Jan 23, 2024

@krrishdholakia do you think this request is feasible?

@krrishdholakia
Contributor

Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker

What am I missing?
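As a sketch of what the linked LiteLLM support looks like (the endpoint name below is a placeholder, not one from this thread, and an actual call requires AWS credentials):

```python
# Sketch of LiteLLM's SageMaker routing convention, per
# https://docs.litellm.ai/docs/providers/aws_sagemaker: the "sagemaker/"
# prefix on the model name selects the SageMaker provider.
def sagemaker_model_id(endpoint_name: str) -> str:
    """Build the LiteLLM model string for a SageMaker endpoint."""
    return f"sagemaker/{endpoint_name}"

# Usage (requires AWS credentials and a live endpoint; not executed here):
# from litellm import completion
# response = completion(
#     model=sagemaker_model_id("my-endpoint"),  # placeholder endpoint name
#     messages=[{"role": "user", "content": "Hello"}],
# )
```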

@mrT23
Collaborator

mrT23 commented Jan 28, 2024

@mattiaciollaro is this PR still relevant?

@mattiaciollaro
Author

Sorry for the delay, guys.

Hey @mattiaciollaro @mrT23 we already support sagemaker - https://docs.litellm.ai/docs/providers/aws_sagemaker

https://docs.litellm.ai/docs/providers/aws_sagemaker seems to support SageMaker JumpStart models specifically.

I am thinking of a different situation where a model is already deployed via SageMaker and a reference to the inference endpoint name is available (as in here). In that case, how can we instruct pr-agent to leverage the LLM behind that pre-existing endpoint? I am not sure I see a way of doing this via https://docs.litellm.ai/docs/providers/aws_sagemaker

In the context of a POC with my team, we accomplished this by hacking pr-agent's default AI handler (the LiteLLM AI handler) and using the sagemaker SDK (specifically, the HF predictor) to make requests to the pre-existing SageMaker endpoint.

I imagine a cleaner solution would be to implement a dedicated AI handler for this use case?
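A minimal sketch of the kind of handler described above, assuming the sagemaker SDK's HuggingFacePredictor and a typical Hugging Face text-generation payload shape (the endpoint name, payload format, and function names are illustrative assumptions, not pr-agent's actual code):

```python
# Hedged sketch: querying a pre-existing SageMaker endpoint via the
# sagemaker SDK's HuggingFacePredictor. Payload shape is an assumption
# based on common Hugging Face text-generation containers.
def build_payload(prompt: str, max_new_tokens: int = 256) -> dict:
    """Build a text-generation request body (shape is an assumption)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def query_endpoint(endpoint_name: str, prompt: str) -> str:
    """Send a prompt to an existing SageMaker endpoint (needs AWS credentials)."""
    # Lazy import so the sketch is readable without the sagemaker SDK installed.
    from sagemaker.huggingface import HuggingFacePredictor

    predictor = HuggingFacePredictor(endpoint_name=endpoint_name)
    result = predictor.predict(build_payload(prompt))
    return result[0]["generated_text"]
```

A dedicated AI handler would presumably wrap something like `query_endpoint` behind pr-agent's AI-handler interface, taking the endpoint name from configuration.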

@mattiaciollaro is this PR still relevant?

I don't have a PR out for this, but yes: I think the feature request is still relevant :) My apologies again for the delay!
