Error using Xenova/nanoLLaVA in pipeline #758
I appreciate your enthusiasm with testing the model out, since I only added it a few hours ago... but I'm still adding support for it to the library! I will let you know when it is supported.
Brilliant, thank you very much! I'm closely watching this feature, and if you link a PR for this I can glean from the work and help maintain the code!
You can follow along in the v3 branch: #545

Here's some example code which should work:

```js
import { AutoTokenizer, AutoProcessor, RawImage, LlavaForConditionalGeneration } from '@xenova/transformers';

// Load tokenizer, processor and model
const model_id = 'Xenova/nanoLLaVA';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await LlavaForConditionalGeneration.from_pretrained(model_id, {
  dtype: {
    embed_tokens: 'fp16',
    vision_encoder: 'q8', // or 'fp16'
    decoder_model_merged: 'q4', // or 'q8'
  },
});

// Prepare text inputs
const prompt = 'Describe this image in detail';
const messages = [
  { role: 'user', content: `<image>\n${prompt}` },
];
const text = tokenizer.apply_chat_template(messages, { tokenize: false, add_generation_prompt: true });
const text_inputs = tokenizer(text, { padding: true });

// Prepare vision inputs
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
const image = await RawImage.fromURL(url);
const vision_inputs = await processor(image);

// Generate response
const inputs = { ...text_inputs, ...vision_inputs };
const output = await model.generate({
  ...inputs,
  do_sample: false,
  max_new_tokens: 64,
});

// Decode output
const decoded = tokenizer.batch_decode(output, { skip_special_tokens: false });
console.log('decoded', decoded);
```

Note that this may change in future, and I'll update the model card when I've done some more testing.
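Since the example decodes with `skip_special_tokens: false`, the decoded string still contains the full chat template, including the prompt and special tokens. As a minimal sketch of post-processing, here is a hypothetical helper (not part of transformers.js) that extracts only the assistant's reply. The `<|im_start|>assistant` and `<|im_end|>` markers are an assumption based on the ChatML-style template that Qwen-based models such as nanoLLaVA typically use; adjust them to match your model's actual template.

```javascript
// Hypothetical helper: pull the assistant's reply out of a decoded
// ChatML-style transcript. Assumes '<|im_start|>assistant' begins the
// reply and '<|im_end|>' terminates turns.
function extractAssistantReply(decoded) {
  const marker = '<|im_start|>assistant';
  const start = decoded.lastIndexOf(marker);
  if (start === -1) return decoded.trim(); // marker absent: return as-is
  return decoded
    .slice(start + marker.length)   // drop everything up to the reply
    .replace(/<\|im_end\|>/g, '')   // strip the end-of-turn token
    .trim();
}

// Example transcript shaped like the template above (illustrative only)
const sample =
  '<|im_start|>user\n<image>\nDescribe this image in detail<|im_end|>\n' +
  '<|im_start|>assistant\nTwo cats are lying on a pink blanket.<|im_end|>';
console.log(extractAssistantReply(sample));
```

Alternatively, passing `skip_special_tokens: true` to `batch_decode` removes the special tokens for you, though the prompt text itself will still be present in the output.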
The model card has been updated with example code 👍 https://huggingface.co/Xenova/nanoLLaVA

We also put an online demo out for you to try: https://huggingface.co/spaces/Xenova/experimental-nanollava-webgpu

Example videos: nanollava-webgpu.mp4, nanollava-webgpu-2.mp4
System Info
Using:
Environment/Platform
Description
https://huggingface.co/Xenova/nanoLLaVA
New model nanoLLaVA threw this error:
Reproduction
Use Xenova/nanoLLaVA like this: package.json