Model stays in Vram after transcribe #126

Open · ryzen88 opened this issue Aug 4, 2023 · 8 comments
Labels: bug (Something isn't working)

Comments

ryzen88 commented Aug 4, 2023

I have noticed that after the transcription process is done, the VRAM stays completely loaded to the max.
My guess is that the Whisper model stays in VRAM after it is done. This is a bit of a pain, because after transcription you want to place markers in Resolve, but since the transcription model (I always use large-v2) still fills the VRAM, Resolve performance suffers.
Also, when you want to transcribe another timeline, it overflows the VRAM, fills up main memory, and transcription becomes very slow.
It then seems to load the whole model again while also keeping the first one in memory.
I don't know if it's something only I experience, but I do have it on multiple systems I tried it on, also with older versions.

It's not an enormous problem, because I just have to close the app, reopen the transcript and continue working in Resolve with it, but it feels like something that could be solved fairly simply by unloading the model after transcription, something like the sketch below.
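
By "unloading the model" I mean roughly the usual PyTorch pattern below. This is just a rough sketch for illustration: the file name is a placeholder and none of this is StoryToolkitAI's actual code.

```python
import gc
import torch
import whisper

# Load and run the model as usual (this is what fills the VRAM).
model = whisper.load_model("large-v2")
result = model.transcribe("timeline_audio.wav")  # placeholder file name

# After the job: drop the reference and clear PyTorch's CUDA cache,
# otherwise the caching allocator keeps the freed blocks and the VRAM
# still shows up as used in Task Manager / nvidia-smi.
del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```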

Windows 10 Enterprise LTSC 21H2
StoryToolkitAI Version 0.19.04
Nvidia driver 536.25
Resolve 18.5

[screenshot: VRAM usage after transcription]

ryzen88 added the bug label Aug 4, 2023
octimot (Owner) commented Aug 5, 2023

Hey there!

Thanks for this detailed feedback!

Model / memory management is now top priority in our backlog, so we will deal with this asap.

> It's not an enormous problem, because I just have to close the app, reopen the transcript and continue working in Resolve with it, but it feels like something that could be solved fairly simply by unloading the model after transcription.

This is a little bit more complex to handle, though. For example, in our editing room we queue up more than one timeline for transcription/indexing, so loading the model for each job would slow down the process by 10-20%, and if you're dealing with 30 jobs at a time that's not ideal. A better approach is probably to have the tool check the queue and, if there are no more jobs left, unload the model(s). We'll find a way to handle that soon, roughly along the lines of the sketch below.
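
For illustration, the queue check could look something like this. Only self.whisper_model is a variable actually mentioned in this thread; the class, queue attribute and method name are hypothetical, so read this as a sketch rather than the tool's real code:

```python
import gc
import torch

class TranscriptionService:
    """Minimal stand-in for the class holding the model (illustrative only)."""

    def __init__(self):
        self.whisper_model = None      # attribute name taken from this thread
        self.transcription_queue = []  # hypothetical queue of pending jobs

    def _on_transcription_job_done(self):
        # If nothing else is waiting, release the model instead of keeping it warm.
        if not self.transcription_queue:
            self.whisper_model = None        # drop the only reference to the model
            gc.collect()                     # collect the now-unreferenced model object
            if torch.cuda.is_available():
                torch.cuda.empty_cache()     # hand cached VRAM back to the driver
```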

> It then seems to load the whole model again while also keeping the first one in memory.

The reloading shouldn't happen, though, since there's a check in place that should prevent it. How can you tell it's reloading? Do you see something in the logs / console?

Cheers!

ryzen88 (Author) commented Aug 6, 2023

First of all, I only use Whisper large-v2, so that fills up the 16 GB with Resolve also running, and after the job it stays full.
When transcribing again after that, it overflows into shared GPU memory and is significantly slower.
The first blue bar is loading and running the first time; the second blue bar is a second transcription run.

After some more testing, it seems that after the second transcription (the one that causes the bigger VRAM overflow), further transcriptions do not add any extra VRAM. So it only overflows on the second transcription.

It does not matter whether I run the transcriptions with the same or different parameters or languages. The second transcription always adds significantly to the VRAM usage; in other words, the VRAM used by the first transcription is not released.
This test was done without Resolve running, so it happens regardless of Resolve, and the log does not show anything useful or out of the ordinary.
[screenshot: VRAM usage across both transcription runs]

octimot (Owner) commented Aug 7, 2023

Is the overflow happening on the second transcription even without Resolve being open?

Are the logs / console telling you that it's loading the model on the second transcription?

Cheers!

ryzen88 (Author) commented Aug 7, 2023

I have not tested whether it is model-specific, because I only run the large-v2 model, but I have had this issue in all previous versions and on different systems, so I was more under the impression that for now it's not a bug but a feature that will get optimized away in the future.

Below is without Resolve running, but it acts the same with Resolve running.

[screenshots: VRAM usage for run 1 and run 2]

octimot (Owner) commented Aug 7, 2023

Got it, thanks! We'll look into it!

ryzen88 (Author) commented Aug 7, 2023

No problem, thank you for taking the time.

ryzen88 (Author) commented Aug 11, 2023

One last addition: I had to do some transcribing again.
I use the transcript to quickly select quotes and mark them in the timeline.
But since the Whisper large-v2 model remains in memory and Resolve also wants to use VRAM, it gets really full quickly.
Resolve timeline performance drops significantly, so I do the following:
1. Resolve -> transcribe timeline
2. Transcription complete -> close StoryToolkitAI to free up VRAM
3. Reopen StoryToolkitAI -> load the transcript, then select quotes from the transcript and mark the timeline
Anyway, thanks.

octimot (Owner) commented Aug 15, 2023

Thanks again for the suggestions!

I'll soon code a patch to unload the model when there are no more transcription jobs in the queue.

I'm not sure why the model re-loads on your end once it's already loaded, though. I don't see this happening here, and it shouldn't, since the variable self.whisper_model is always checked before loading the model. Also, the variable is overwritten if the requested model is different from the one that is already loaded, so the old model should be cleared by Python's garbage collector. A simplified sketch of that check is below.
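
To make the check concrete, it works roughly like this. Only self.whisper_model is taken from the actual code; the class, the model-name attribute and the method name are illustrative, so treat this as a sketch rather than the real implementation:

```python
import whisper

class TranscriptionService:
    """Minimal stand-in class (illustrative, not the tool's real code)."""

    def __init__(self):
        self.whisper_model = None       # attribute name taken from this thread
        self.whisper_model_name = None  # hypothetical helper attribute

    def get_whisper_model(self, model_name="large-v2"):
        # Re-use the model that is already loaded if it's the same one.
        if self.whisper_model is not None and self.whisper_model_name == model_name:
            return self.whisper_model

        # Otherwise overwrite the reference; the previous model object becomes
        # garbage-collectable. Even then, PyTorch's caching allocator may keep
        # reporting the VRAM as used until torch.cuda.empty_cache() is called.
        self.whisper_model = whisper.load_model(model_name)
        self.whisper_model_name = model_name
        return self.whisper_model
```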

Would you mind trying to run two consecutive transcriptions using the same model, but without having any other software open that might use the GPU?

Cheers!
