404 when trying Qwen in V3 #723

flatsiedatsie · 2024-04-22T19:14:17Z

Question

This is probably just because V3 is a work in progress, but I wanted to make sure.

When trying to run Qwen 1.5 - 0.5B it works with the V2 script, but when swapping to V3 I get a 404 not found.

type not specified for model. Using the default dtype: q8.
GET https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_quantized.onnx 404 (Not Found)

It seems V3 is looking for a file that was renamed 3 months ago.
Rename onnx/model_quantized.onnx to onnx/decoder_model_merged_quantized.onnx

I've tried setting dtype to 16 and 32, which does change the URL it tries to get, but those URL's also do not exist :-D

e.g. https://huggingface.co/Xenova/Qwen1.5-0.5B-Chat/resolve/main/onnx/model_fp16.onnx when using dtype: 'fp16'.

Is there something I can do to make V3 find the correct files?

(I'm still trying to find that elusive small model with a large context size to do document summarization with)

The text was updated successfully, but these errors were encountered:

Th3G33k · 2024-05-09T07:57:14Z

#745

Hi there 👋 v3 will use the name model instead of decoder_merged_model, as the latter is the result of a legacy conversion process which created multiple versions of the model (w/ and w/o past key value inputs). So, this change isn't needed.

If you want to override the behaviour yourself, you can use the model_file_name option when loading the model.

JohnReginaldShutler · 2024-05-28T07:38:27Z

Hello! Just a beginner here, could someone help me demonstrate with example code how to override the behaviour yourself using the model_file_name option when loading the model

Th3G33k · 2024-05-28T08:25:20Z

@JohnReginaldShutler

model: The default filename prefix can be change using the option model_file_name.
_quantized.onnx: The default filename suffix cannot be change, and will depend on the precision used.
Example:

// using pipeline function
let pipe = await pipeline('text-generation', 'Xenova/Qwen1.5-0.5B-Chat', {model_file_name: 'decoder_model_merged'})
// using AutoModel class
let model = await AutoModel.from_pretrained('Xenova/Qwen1.5-0.5B-Chat', {model_file_name:'decoder_model_merged'})
// will fetch decoder_model_merged_quantized.onnx

flatsiedatsie added the question Further information is requested label Apr 22, 2024

Th3G33k mentioned this issue May 9, 2024

v3: fix model fetch DecoderOnly #745

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

404 when trying Qwen in V3 #723

404 when trying Qwen in V3 #723

flatsiedatsie commented Apr 22, 2024

Th3G33k commented May 9, 2024 •

edited

JohnReginaldShutler commented May 28, 2024

Th3G33k commented May 28, 2024 •

edited

404 when trying Qwen in V3 #723

404 when trying Qwen in V3 #723

Comments

flatsiedatsie commented Apr 22, 2024

Question

Th3G33k commented May 9, 2024 • edited

JohnReginaldShutler commented May 28, 2024

Th3G33k commented May 28, 2024 • edited

Th3G33k commented May 9, 2024 •

edited

Th3G33k commented May 28, 2024 •

edited