Stuck at Quantization? or just taking a long time to run? #4

P15V · 2024-04-23T20:10:07Z

Hello all,

I hope whoever reads this is doing well!! :)

So I'm trying to get this going on my Jetson Nano 8GB, and I think I'm getting stuck at quantization. When I run the command below, the terminal says it's quantizing the model and that this will take a while, and then it seems to lock up there. I've had it going for the past 1-1.5 hours with no further output, and the entire Jetson Nano is locked up: I can't interact with it and can't SSH into it.

Do you know if this is normal, or is something going wrong? Am I doing something wrong? I'm going to let it run for a few hours to see if it accomplishes anything.

Thanks for everyone's time!! :) My run command inside the container:
python3 -m nano_llm.chat --api=mlc \
  --model Efficient-Large-Model/VILA-2.7b \
  --max-context-len 128 \
  --max-new-tokens 32

dusty-nv · 2024-04-23T20:42:54Z

@P15V an hour and a half is too long; it has probably frozen up. Try rebooting it, then mounting more swap memory and disabling ZRAM, and if needed disable the desktop GUI as described here:

https://github.com/dusty-nv/jetson-containers/blob/master/docs/setup.md#mounting-swap

Also, try testing --model princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT first (this is the base model for VILA-2.7B) and see if you can get that going for text-only chat.
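
Before re-running quantization, it may help to confirm how much RAM and swap the board actually has after following the swap steps above. The following is only a rough sketch, not part of nano_llm or jetson-containers: it reads the standard Linux /proc/meminfo file, and the function names and the 16 GiB threshold are hypothetical choices for illustration. Note that zram-backed swap is also counted in SwapTotal.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}.

    Most fields are reported in kB (kibibytes); a few counters such as
    HugePages_Total carry no unit and are stored as plain integers.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_report(info, min_total_gib=16.0):
    """Summarise RAM and swap from a parsed meminfo dict.

    min_total_gib is a hypothetical threshold for illustration only;
    it is not a documented requirement of any tool.
    """
    kib_per_gib = 1024 * 1024
    ram_gib = info.get("MemTotal", 0) / kib_per_gib
    swap_gib = info.get("SwapTotal", 0) / kib_per_gib
    return {
        "ram_gib": round(ram_gib, 2),
        "swap_gib": round(swap_gib, 2),
        "enough": ram_gib + swap_gib >= min_total_gib,
    }


if __name__ == "__main__":
    # Run this on the Jetson itself (host or container) before quantizing.
    with open("/proc/meminfo") as f:
        print(memory_report(parse_meminfo(f.read())))
```

If swap_gib still reads 0.0 after a reboot, the swap file was most likely not mounted, and the quantization step may freeze the board again.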
