Clarification and supplement to the online docs example #1904
Hi @paulcx, thanks for pointing this out; we should make this clearer in the docs. In TGI, the chat endpoint can be used with `InferenceClient` from `huggingface_hub`:

```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:3000")
chat = client.chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
                    },
                },
            ],
        },
    ],
    seed=42,
    max_tokens=100,
)
print(chat)
# ChatCompletionOutput(choices=[ChatCompletionOutputComplete(finish_reason='length', index=0, message=ChatCompletionOutputMessage(role='assistant', content=" The image you've provided features an anthropomorphic rabbit in spacesuit attire. This rabbit is depicted with human-like posture and movement, standing on a rocky terrain with a vast, reddish-brown landscape in the background. The spacesuit is detailed with mission patches, circuitry, and a helmet that covers the rabbit's face and ear, with an illuminated red light on the chest area.\n\nThe artwork style is that of a", name=None, tool_calls=None), logprobs=None)], created=1714589614, id='', model='llava-hf/llava-v1.6-mistral-7b-hf', object='text_completion', system_fingerprint='2.0.2-native', usage=ChatCompletionOutputUsage(completion_tokens=100, prompt_tokens=2943, total_tokens=3043))
```

Note that when using the chat endpoint, images are sent as typed messages rather than in markdown format. I hope this helps clarify! Please let me know if you have any questions.
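On the multi-turn part of the question: a minimal sketch, assuming the same local TGI server, is to append the assistant's reply and the next user turn to `messages` before calling `chat_completion` again. The helper names below (`build_followup`, `ask`) and the follow-up text are illustrative, not part of any API:

```python
from typing import Any


def build_followup(messages: list, assistant_reply: str, followup_text: str) -> list:
    """Append the assistant's reply and a new user turn to the chat history."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": followup_text},
    ]


# First turn: text plus an image, as in the example above.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png"
                },
            },
        ],
    },
]

# Second turn: carry the assistant's answer forward as chat history.
messages = build_followup(
    messages,
    assistant_reply="An anthropomorphic rabbit in a spacesuit.",
    followup_text="What style is the artwork?",
)


def ask(messages: list) -> Any:
    # Imported lazily so the history-building above can run without a server.
    from huggingface_hub import InferenceClient

    client = InferenceClient("http://127.0.0.1:3000")  # assumed local TGI server
    return client.chat_completion(messages=messages, max_tokens=100, seed=42)
```

The key point is that the server is stateless: each call must resend the full conversation so far.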
System Info
docs[main]: https://huggingface.co/docs/text-generation-inference/basic_tutorials/visual_language_models
vlm: https://huggingface.co/llava-hf/llava-v1.6-34b-hf
Information
Tasks
Reproduction
In the current docs, there are a few examples of how to query a VLM model, for example:
Expected behavior
However, there is no example of how to deal with the model's default chat template. For example, the chat template of llava-hf/llava-v1.6-34b-hf is as follows:
Should we ignore it and use the TGI format as shown above? And how should multi-turn queries be handled? Any examples would be appreciated.
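For context on the chat-template question: with the chat route, TGI applies the model's own chat template server-side, so the client only sends typed messages. A minimal sketch of the request body for TGI's OpenAI-compatible `/v1/chat/completions` route follows; the local URL and the `send` helper are assumptions for illustration:

```python
import json


def chat_payload(messages: list, max_tokens: int = 100, seed: int = 42) -> dict:
    """Build the JSON body for TGI's OpenAI-compatible chat route."""
    return {
        "model": "tgi",  # placeholder; TGI serves whichever model it was launched with
        "messages": messages,
        "max_tokens": max_tokens,
        "seed": seed,
    }


def send(payload: dict, base_url: str = "http://127.0.0.1:3000") -> dict:
    # Requires a running TGI server; not executed in this sketch.
    import urllib.request

    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = chat_payload([{"role": "user", "content": "Describe the image style."}])
```

Because the template is applied on the server, the client does not need to replicate the model's chat template itself.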