
Error using Xenova/nanoLLaVA in pipeline #758

Closed
kendelljoseph opened this issue May 12, 2024 · 4 comments
Labels
bug Something isn't working

Comments


kendelljoseph commented May 12, 2024

System Info

Using:

  • Node v21.7.1
  • Mac M1

Environment/Platform

  • Website/web-app
  • Server-side (e.g., Node.js, Deno, Bun)

Description

https://huggingface.co/Xenova/nanoLLaVA

The new nanoLLaVA model threw this error:

Unknown model class "llava", attempting to construct from base class.
Model type for 'llava' not found, assuming encoder-only architecture. 
 Error: Could not locate file: "https://huggingface.co/Xenova/nanoLLaVA/resolve/main/onnx/model_quantized.onnx".
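
For context on the 404, when no loading options are given, v2 of the library requests a single quantized ONNX file per model. A rough sketch of the URL it ends up fetching (helper name hypothetical, not the library's actual internals):

```javascript
// Hypothetical sketch of how the default (quantized) model file URL is built,
// matching the missing URL reported in the error above.
function resolveModelUrl(modelId, { quantized = true } = {}) {
    const file = quantized ? 'model_quantized.onnx' : 'model.onnx';
    return `https://huggingface.co/${modelId}/resolve/main/onnx/${file}`;
}

console.log(resolveModelUrl('Xenova/nanoLLaVA'));
```

Since Xenova/nanoLLaVA does not ship a single `onnx/model_quantized.onnx` file (it is split into per-module ONNX files), the fetch fails.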

Reproduction

Use Xenova/nanoLLaVA like this:

const featureExtractor = await transformers.pipeline('image-feature-extraction', 'Xenova/nanoLLaVA')

package.json

"@xenova/transformers": "^2.17.1",
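
For completeness, the surrounding package.json might look like this (only the dependency version above is from the report; the rest is a minimal sketch, and "type": "module" is assumed since the examples use ESM imports):

```json
{
  "type": "module",
  "dependencies": {
    "@xenova/transformers": "^2.17.1"
  }
}
```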
@kendelljoseph kendelljoseph added the bug Something isn't working label May 12, 2024
xenova (Owner) commented May 12, 2024

I appreciate your enthusiasm in testing the model out, since I only added it a few hours ago... but I'm still adding support for it to the library! I will let you know when it is supported.

kendelljoseph (Author) commented:

Brilliant, thank you very much!

I'm watching this feature closely, and if you link a PR for it, I can learn from the work and help maintain the code!

xenova (Owner) commented May 12, 2024

You can follow along in the v3 branch: #545

Here's some example code which should work:

import { AutoTokenizer, AutoProcessor, RawImage, LlavaForConditionalGeneration } from '@xenova/transformers';

// Load tokenizer, processor and model
const model_id = 'Xenova/nanoLLaVA';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await LlavaForConditionalGeneration.from_pretrained(model_id, {
    dtype: {
        embed_tokens: 'fp16',
        vision_encoder: 'q8', // or 'fp16'
        decoder_model_merged: 'q4', // or 'q8'
    },
});

// Prepare text inputs
const prompt = 'Describe this image in detail';
const messages = [
    { role: 'user', content: `<image>\n${prompt}` },
];
const text = tokenizer.apply_chat_template(messages, { tokenize: false, add_generation_prompt: true });
const text_inputs = tokenizer(text, { padding: true });

// Prepare vision inputs
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
const image = await RawImage.fromURL(url);
const vision_inputs = await processor(image);

// Generate response
const inputs = { ...text_inputs, ...vision_inputs };
const output = await model.generate({
    ...inputs,
    do_sample: false,
    max_new_tokens: 64,
});

// Decode output
const decoded = tokenizer.batch_decode(output, { skip_special_tokens: false });
console.log('decoded', decoded);

Note that this may change in future, and I'll update the model card when I've done some more testing.

xenova (Owner) commented May 22, 2024

The model card has been updated with example code 👍 https://huggingface.co/Xenova/nanoLLaVA

We also put an online demo out for you to try: https://huggingface.co/spaces/Xenova/experimental-nanollava-webgpu

Example videos:

nanollava-webgpu.mp4
nanollava-webgpu-2.mp4

@xenova xenova closed this as completed May 22, 2024