
Inference Component Name header is required #64

Open
ecdedios opened this issue Jan 18, 2024 · 9 comments

ecdedios commented Jan 18, 2024

I'm getting the following error

botocore.errorfactory.ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Inference Component Name header is required for endpoints to which you plan to deploy inference components. Please include Inference Component Name header or consider using SageMaker models.

when I run python kendra_chat_llama_2.py

Name: boto3
Version: 1.34.21

llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs={"CustomAttributes": "accept_eula=true"},
    content_handler=content_handler,
)
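For reference, content_handler here is a LangChain LLMContentHandler, which the script defines itself. A minimal sketch, assuming the JumpStart Llama 2 text-generation schema ({"inputs": ..., "parameters": ...} in, [{"generation": ...}] out; the chat variants use a different dialog payload, so match your endpoint's actual schema):

import json

from langchain_community.llms.sagemaker_endpoint import LLMContentHandler


class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        # Assumed request shape for JumpStart Llama 2 text generation
        return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode("utf-8")

    def transform_output(self, output) -> str:
        # output is the streaming Body of the InvokeEndpoint response
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generation"]


content_handler = ContentHandler()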
MithilShah (Contributor) commented Jan 22, 2024

JumpStart inference endpoints now need an InferenceComponentName:

import json

import boto3

client = boto3.client("sagemaker-runtime")
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    InferenceComponentName='jumpstart-dft-meta-textgeneration-l-xx',
    ContentType="application/json",
    Body=json.dumps(payload),
)

The change needs to happen in the langchain library first. Following up on that.

https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/sagemaker_endpoint.py#L126

3coins (Contributor) commented Jan 24, 2024

@MithilShah
Did you try using InferenceComponentName in the endpoint_kwargs?

ecdedios (Author) commented Jan 25, 2024

@3coins @MithilShah

Yes, that did it. Thanks! I modified kendra_chat_llama_2.py to this:

llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs={
        "CustomAttributes": "accept_eula=true",
        "InferenceComponentName": "jumpstart-dft-meta-textgeneration-l-###",
    },
    content_handler=content_handler,
)

utility-aagrawal commented

@ecdedios I have the same issue. I'm trying to understand: what's the difference between endpoint_name and InferenceComponentName?

If my SageMaker endpoint is meta-textgeneration-llama-2-7b-f-20240201-XXXXXX, what is endpoint_name and what is InferenceComponentName? Appreciate your help with this!

ecdedios (Author) commented Feb 1, 2024

@utility-aagrawal I forget exactly which one I used for InferenceComponentName, but it's either the endpoint name or the model name. Here are some screenshots; basically, you get the model name by clicking on the endpoint name.

[Two screenshots of the SageMaker console (Feb 1, 2024): the endpoint list, and the endpoint detail page showing the model name listed under the endpoint]
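If you'd rather not click through the console, the boto3 SageMaker client also exposes list_inference_components, which should return the component names attached to an endpoint (a sketch, assuming a boto3 recent enough to include the inference-components API, e.g. the 1.34.x mentioned above):

import boto3

sm = boto3.client("sagemaker")
# List the inference components deployed to this endpoint
resp = sm.list_inference_components(EndpointNameEquals=endpoint_name)
for ic in resp["InferenceComponents"]:
    print(ic["InferenceComponentName"])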

MithilShah (Contributor) commented Feb 1, 2024

I am working on a fix, but @ecdedios is right. The one starting with "jumpstart..." is the endpoint name, and the one in the "model" section is the inference component name. Testing the fix now; will release soon.

utility-aagrawal commented

Thanks @ecdedios @MithilShah! I tried with InferenceComponentName in endpoint_kwargs and got this error:

ValueError: Error raised by inference endpoint: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Inference Component Name header is not allowed for endpoints to which you dont plan to deploy inference components. Please remove the Inference Component Name header and try again.

It just worked without InferenceComponentName. It's weird because the same code wasn't working yesterday and was asking me to include InferenceComponentName. I am not sure what's changed since yesterday.

MithilShah (Contributor) commented

@utility-aagrawal Can you please try again? I have added a new variable. If you deploy the endpoint via the console, it deploys the model to an inference component, and you need to specify an INFERENCE_COMPONENT_NAME environment variable. However, if you deploy via the SDK, you have the option of deploying directly via the endpoint without using an inference component. If you do that, just ignore the INFERENCE_COMPONENT_NAME environment variable; see the sketch below.
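A rough sketch of what that conditional wiring could look like (INFERENCE_COMPONENT_NAME is the environment variable named above; endpoint_name, region, and content_handler follow the earlier snippets):

import os

# Console/JumpStart deployments host the model behind an inference component,
# so InferenceComponentName must be passed; SDK deployments without a
# component must omit it (the header is rejected otherwise).
endpoint_kwargs = {"CustomAttributes": "accept_eula=true"}
inference_component_name = os.environ.get("INFERENCE_COMPONENT_NAME")
if inference_component_name:
    endpoint_kwargs["InferenceComponentName"] = inference_component_name

llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs=endpoint_kwargs,
    content_handler=content_handler,
)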

utility-aagrawal commented

Thanks @MithilShah! I'll try it and let you know.
