Value Error Occurred in training dataset with Jasper Model #145

secrisa11 · 2021-07-08T10:01:46Z

python ./bin/main.py model=jasper train=jasper_train train.dataset_path=$DATASET_PATH train.transcripts_path=$TRANSCRIPTS_PATH
./bin/main.py:175: UserWarning:
'audio/fbank' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
main()
[2021-07-09 15:36:37,587][kospeech.utils][INFO] - audio:
audio_extension: pcm
sample_rate: 16000
frame_length: 20
frame_shift: 10
normalize: true
del_silence: true
feature_extract_by: kaldi
time_mask_num: 4
freq_mask_num: 2
spec_augment: true
input_reverse: false
transform_method: fbank
n_mels: 80
freq_mask_para: 18
model:
architecture: jasper
teacher_forcing_ratio: 1.0
teacher_forcing_step: 0.01
min_teacher_forcing_ratio: 0.9
dropout: 0.3
bidirectional: false
joint_ctc_attention: false
max_len: 400
version: 10x5
train:
dataset: kspon
dataset_path: /home/suresoft/KoSpeech-1.3/dataset/kspon
transcripts_path: /home/suresoft/KoSpeech-1.3/dataset/kspon/transcripts.txt
output_unit: character
batch_size: 32
save_result_every: 1000
checkpoint_every: 5000
print_every: 10
mode: train
num_workers: 4
use_cuda: true
init_lr_scale: 0.01
final_lr_scale: 0.05
max_grad_norm: 400
weight_decay: 0.001
seed: 777
resume: false
optimizer: novograd
reduction: sum
init_lr: 0.001
final_lr: 0.0001
peak_lr: 0.001
warmup_steps: 0
num_epochs: 10
lr_scheduler: tri_stage_lr_scheduler

[2021-07-09 15:36:37,755][kospeech.utils][INFO] - Operating System : Linux 4.9.201-tegra
[2021-07-09 15:36:37,756][kospeech.utils][INFO] - Processor : aarch64
[2021-07-09 15:36:37,797][kospeech.utils][INFO] - device : NVIDIA Tegra X1
[2021-07-09 15:36:37,797][kospeech.utils][INFO] - CUDA is available : True
[2021-07-09 15:36:37,798][kospeech.utils][INFO] - CUDA version : 10.2
[2021-07-09 15:36:37,798][kospeech.utils][INFO] - PyTorch version : 1.6.0
[2021-07-09 15:36:37,827][kospeech.utils][INFO] - split dataset start !!
[2021-07-09 15:36:41,561][kospeech.utils][INFO] - Applying Spec Augmentation...
[2021-07-09 15:36:45,168][kospeech.utils][INFO] - Applying Spec Augmentation...
Error executing job with overrides: ['model=jasper', 'train=jasper_train', 'train.dataset_path=/home/suresoft/KoSpeech-1.3/dataset/kspon', 'train.transcripts_path=/home/suresoft/KoSpeech-1.3/dataset/kspon/transcripts.txt']
Traceback (most recent call last):
File "./bin/main.py", line 175, in <module>
main()
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/main.py", line 53, in decorated_main
config_name=config_name,
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/_internal/utils.py", line 368, in _run_hydra
lambda: hydra.run(
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/_internal/utils.py", line 371, in <lambda>
overrides=args.overrides,
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/_internal/hydra.py", line 110, in run
_ = ret.return_value
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/home/suresoft/miniforge3/envs/KoSpeech_Py36/lib/python3.6/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "./bin/main.py", line 170, in main
last_model_checkpoint = train(config)
File "./bin/main.py", line 99, in train
epoch_time_step, trainset_list, validset = split_dataset(config, config.train.transcripts_path, vocab)
File "/home/suresoft/KoSpeech-1.3/kospeech/data/data_loader.py", line 306, in split_dataset
audio_extension=config.audio.audio_extension
File "/home/suresoft/KoSpeech-1.3/kospeech/data/data_loader.py", line 67, in __init__
self.shuffle()
File "/home/suresoft/KoSpeech-1.3/kospeech/data/data_loader.py", line 102, in shuffle
self.audio_paths, self.transcripts, self.augment_methods = zip(*tmp)
ValueError: not enough values to unpack (expected 3, got 0)

sooftware · 2021-07-21T16:51:06Z

Hi @Daeyeop-Kim. This repository is archived. Further development is underway here.

pvodopija · 2022-01-14T02:14:46Z

I had the same problem and managed to fix it as follows.
In kospeech/data/data_loader.py, change the shuffle function to:

def shuffle(self):
    """ Shuffle dataset """
    tmp = list(zip(self.audio_paths, self.transcripts, self.augment_methods))
    random.shuffle(tmp)

    # Workaround: write the shuffled items back element by element.
    # This kind of works; the loop simply does nothing when tmp is empty.
    for i, x in enumerate(tmp):
        self.audio_paths[i] = x[0]
        self.transcripts[i] = x[1]
        self.augment_methods[i] = x[2]

    # Original line. It raises the ValueError above when tmp is empty.
    # self.audio_paths, self.transcripts, self.augment_methods = zip(*tmp)
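
For context, "not enough values to unpack (expected 3, got 0)" is what Python raises when the result of zip(*tmp) is unpacked into three names while tmp is empty. That suggests no audio paths or transcripts were loaded for one of the dataset splits. Below is a minimal sketch, separate from the KoSpeech code base, that reproduces the error and shuffles three parallel lists in place while failing early on an empty dataset. The helper name shuffle_in_place and its error message are hypothetical, not part of KoSpeech.

```python
import random


def shuffle_in_place(audio_paths, transcripts, augment_methods):
    """Shuffle three parallel lists together, keeping entries aligned.

    Hypothetical helper for illustration; not part of KoSpeech.
    """
    if not audio_paths:
        # An empty dataset is the case that makes `a, b, c = zip(*tmp)` fail.
        raise ValueError(
            "dataset is empty: check dataset_path and transcripts_path"
        )

    tmp = list(zip(audio_paths, transcripts, augment_methods))
    random.shuffle(tmp)

    # Slice assignment keeps the original list objects instead of
    # rebinding the names to tuples, as plain zip(*tmp) unpacking would.
    audio_paths[:], transcripts[:], augment_methods[:] = (
        list(column) for column in zip(*tmp)
    )


# Reproduce the original error: unpacking zip(*[]) yields no values.
try:
    a, b, c = zip(*[])
except ValueError as err:
    print(err)
```

Note that the loop-based change above avoids the exception because the loop body never runs on empty lists, but the dataset is still empty afterwards. It may be worth checking that the files under dataset_path match the entries in transcripts_path.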

secrisa11 assigned sooftware Jul 8, 2021

roytravel mentioned this issue Jun 28, 2022

Fix value error in dataset module openspeech-team/openspeech#174

Merged
