Model stays in VRAM after transcribe #126
Hey there! Thanks for this detailed feedback! Model / memory management is now top priority in our backlog, so we will deal with this ASAP.
This is a little bit more complex to handle though. For example, in our editing room, we queue up more than one timeline for transcription/indexing, so loading the model for each job would slow down the process by 10-20%, and if you're dealing with 30 jobs at a time, that's not ideal. One way to do it is to have the tool check the queue and, if there are no jobs left, unload the model(s) — we'll find a way to handle that soon!
The reloading shouldn't happen though, since there's a check in place that should prevent it. How can you tell it's reloading? Do you see something in the logs / console? Cheers!
Is the overflow happening on the second transcription even without Resolve being open? Do the logs / console say it's loading the model again on the second transcription? Cheers!
I haven't checked whether it's model-specific, because I only run the large V2 model, but I've had this issue in all previous versions and on different systems, so I was under the impression that, for now, it's not a bug but a feature that will get optimized away in the future. The below is without Resolve running, but it acts the same with Resolve running.
Got it, thanks! We'll look into it!
No problem, thank you for taking the time.
One last addition.
Thanks again for the suggestions! I'll code a patch soon for unloading the model when there are no more transcription jobs in the queue. I'm not sure why the model re-loads once it's already loaded on your end though — I don't see this happening here, and the variable that tracks the loaded model should prevent it. Would you mind trying to transcribe 2 consecutive things using the same model, but without having any other software that might use the GPU open? Cheers!
I have noticed that after the transcription process is done, VRAM stays completely loaded to the max.
My guess is the Whisper model stays in VRAM after it's done. This is a bit of a pain in the ass, because after transcription you want to place markers in Resolve, but since the transcription model (I always use large V2) still fills the VRAM, Resolve performance is mediocre.
Also, when you want to translate another timeline, it overflows VRAM, spills into main memory, and transcription becomes very slow.
It then seems to load the whole model again while keeping the first one in memory as well.
I don't know if it's something only I experience, but I have it on multiple systems I tried it on, and with older versions too.
It's not an enormous problem, because I just have to close the app, reopen the transcript, and continue working in Resolve with it, but it feels like something that could be solved fairly simply by unloading the model after transcription.
Windows 10 Enterprise LTSC 21H2
StoryToolkitAI Version 0.19.04
Nvidia driver 536.25
Resolve 18.5