OSError: Character label file (csv format) doesn`t exist : ../../../data/vocab/aihub_character_vocabs.csv #168

Open
agag8945 opened this issue Nov 7, 2022 · 1 comment

agag8945 commented Nov 7, 2022

python main.py --dataset_path $DATASET_PATH --vocab_dest $VOCAB_DEST --output_unit $OUTPUT_UNIT --preprocess_mode $PREPROCESS_MODE --vocab_size $VOCAB_SIZE
Running the command above successfully generated the transcript.txt and aihub_labels.csv files.
After that, I ran
python ./bin/main.py model=ds2 train=ds2_train train.dataset_path=$DATASET_PATH
to start training, but got the following output:

[2022-11-07 10:56:29,128][kospeech.utils][INFO] - Operating System : Linux 5.10.133+
[2022-11-07 10:56:29,128][kospeech.utils][INFO] - Processor : x86_64
[2022-11-07 10:56:29,129][kospeech.utils][INFO] - CUDA is available : False
[2022-11-07 10:56:29,129][kospeech.utils][INFO] - PyTorch version : 1.12.1+cu113
Error executing job with overrides: ['model=ds2', 'train=ds2_train', 'train.dataset_path=/content/drive/MyDrive/KoreanSpeech_dataset/KoreanSpeech_categori/KsponSpeech_01']
Traceback (most recent call last):
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 126, in load_vocab
    with open(label_path, 'r', encoding=encoding) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../../data/vocab/aihub_character_vocabs.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/drive/MyDrive/kospeech_lastest/bin/main.py", line 162, in main
    last_model_checkpoint = train(config)
  File "/content/drive/MyDrive/kospeech_lastest/bin/main.py", line 85, in train
    output_unit=config.train.output_unit,
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 46, in __init__
    self.vocab_dict, self.id_dict = self.load_vocab(vocab_path, encoding='utf-8')
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 139, in load_vocab
    raise IOError("Character label file (csv format) doesn`t exist : {0}".format(label_path))
OSError: Character label file (csv format) doesn`t exist : ../../../data/vocab/aihub_character_vocabs.csv

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I got the error shown above.
It occurs even though the aihub_character_vocabs.csv file exists.
Is there a way to fix this?

@XEL-Maker

There should be a folder named data inside the cloned repository.
Put the transcript.txt and aihub_labels.csv files there and try again.
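
If it still fails, it may help to check where the relative path from the traceback actually points. A relative path such as ../../../data/vocab/aihub_character_vocabs.csv is resolved against the process's current working directory, which may not be the directory you launched the command from (Hydra can change the working directory, depending on version and configuration). The sketch below is only a diagnostic aid, not part of kospeech; the helper name resolve_from is made up for illustration.

```python
import os

# Relative vocab path copied from the traceback in this issue.
VOCAB_PATH = "../../../data/vocab/aihub_character_vocabs.csv"


def resolve_from(base_dir: str, relative_path: str) -> str:
    """Return the normalized path that `relative_path` refers to when the
    working directory is `base_dir`. (Hypothetical helper, not a kospeech API.)"""
    return os.path.normpath(os.path.join(base_dir, relative_path))


if __name__ == "__main__":
    # Run this from the same directory the training job runs in.
    resolved = resolve_from(os.getcwd(), VOCAB_PATH)
    print("Working directory:", os.getcwd())
    print("Vocab file is expected at:", resolved)
    print("File exists there:", os.path.isfile(resolved))
```

If "File exists there" prints False, either move the csv file to the printed location or pass a path that matches where the file really is.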
