Hello, I've been trying to train the sota/2019 pre-trained dev-clean transformer model for one more epoch using flashlight's train continue mode. However, training fails to start because the pre-trained models from wav2letter are not compatible with flashlight. I then installed wav2letter v0.2 and tried to continue training the pre-trained model with train continue, but it aborts with this error:
I0401 12:50:50.303958 29472 Train.cpp:80] Parsing command line flags
I0401 12:50:50.303997 29472 Train.cpp:81] Overriding flags should be mutable when using `continue`
terminate called after throwing an instance of 'std::runtime_error'
  what():  unhandled system error
*** Aborted at 1680324650 (unix time) try "date -d @1680324650" if you are using GNU date ***
PC: @ 0x7f4669af9e87 gsignal
*** SIGABRT (@0x3e800007320) received by PID 29472 (TID 0x7f4697288380) from PID 29472; stack trace: ***
@ 0x7f468f59f980 (unknown)
@ 0x7f4669af9e87 gsignal
@ 0x7f4669afb7f1 abort
@ 0x7f466a4ee957 (unknown)
@ 0x7f466a4f4ae6 (unknown)
@ 0x7f466a4f4b21 std::terminate()
@ 0x7f466a4f4d54 __cxa_throw
@ 0x55673215c6f8 fl::detail::ncclCheck()
@ 0x55673215ddd7 fl::distributedInit()
@ 0x5567320cb387 w2l::initDistributed()
@ 0x556731e3eab2 main
@ 0x7f4669adcc87 __libc_start_main
@ 0x556731ea7e4a _start
Aborted
I also tried train fork, and the error persists. The error does not occur when using train alone.
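For reference, here is a minimal Python sketch that pulls the named frames out of the stack trace above. It is not part of wav2letter or flashlight, and the helper names named_frames and throw_site are made up for illustration. It only restates what the trace already shows: the exception is thrown from fl::detail::ncclCheck(), reached through fl::distributedInit() and w2l::initDistributed() from main.

```python
import re
from typing import List, Optional

# Stack-trace lines copied from the log above (glog "@ <address> <symbol>" format).
TRACE = """\
@ 0x7f468f59f980 (unknown)
@ 0x7f4669af9e87 gsignal
@ 0x7f4669afb7f1 abort
@ 0x7f466a4ee957 (unknown)
@ 0x7f466a4f4ae6 (unknown)
@ 0x7f466a4f4b21 std::terminate()
@ 0x7f466a4f4d54 __cxa_throw
@ 0x55673215c6f8 fl::detail::ncclCheck()
@ 0x55673215ddd7 fl::distributedInit()
@ 0x5567320cb387 w2l::initDistributed()
@ 0x556731e3eab2 main
@ 0x7f4669adcc87 __libc_start_main
@ 0x556731ea7e4a _start
"""

# One frame per line: "@", a hex address, then the symbol (or "(unknown)").
FRAME_RE = re.compile(r"^@\s+(0x[0-9a-f]+)\s+(.+)$")


def named_frames(trace: str) -> List[str]:
    """Return symbol names, innermost frame first, skipping unresolved frames."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(2) != "(unknown)":
            frames.append(match.group(2))
    return frames


def throw_site(frames: List[str]) -> Optional[str]:
    """Return the frame listed right after __cxa_throw, i.e. the function that threw."""
    if "__cxa_throw" in frames:
        idx = frames.index("__cxa_throw")
        if idx + 1 < len(frames):
            return frames[idx + 1]
    return None


frames = named_frames(TRACE)
print(" <- ".join(frames))
print("thrown from:", throw_site(frames))
```

The trace is embedded as a string so the sketch runs on its own; to use it on another crash log, replace TRACE with the new "@ ..." lines.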
Reproduction Steps
This is what I ran:
wav2letter/build/Train continue /mnt/d/198 --minloglevel=0 --logtostderr=1 --rndv_filepath=
Is there another way to train the pre-trained models for just 1 epoch?