Sidekit ASV Voxceleb 1

To run the recipe:

# Activate your miniconda env
. ./path.sh

# Download dataset to data
./local/data_prep.py --save-path ./data --download

# Create train data (recursive search for wavs). Point `--from` at your own directory if you have already downloaded the wavs
./local/data_prep.py --from ./data --make-train-data # set --filter-dir if your data directory structure differs from the one created by '--download' (e.g.: voxceleb1/wav/)

# Create test data
./local/data_prep.py --from ./data --make-test-data # set --filter-dir if your data directory structure differs from the one created by '--download' (e.g.: voxceleb1_test/wav/)

# Train
./local/train.py  --config configs/...
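
The "recursive search of wavs" combined with `--filter-dir` can be pictured with the sketch below. It is only an illustration of the idea, not the code in `local/data_prep.py`: the helper name `find_wavs` and its substring-matching behaviour are assumptions.

```python
from pathlib import Path


def find_wavs(root, filter_dir=""):
    """Recursively list .wav files under root (hypothetical helper).

    If filter_dir is given, keep only files whose path relative to root
    contains that sub-path, e.g. "voxceleb1/wav/".
    """
    root = Path(root)
    wavs = sorted(root.rglob("*.wav"))
    if filter_dir:
        needle = filter_dir.strip("/")
        wavs = [p for p in wavs if needle in p.relative_to(root).as_posix()]
    return wavs
```

For example, `find_wavs("./data", "voxceleb1/wav/")` would keep only the files stored below a `voxceleb1/wav` directory.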

Results train-Voxceleb 1 (fbanks)

Test                Voxceleb-0               Exp                                Config
EER / min cllr      2.593 ± 0.0   / 0.106    exp/asv_eval_vox1_ecapa_tdnn       configs/ecapa_tdnn
EER / min cllr      2.089 ± 0.408 / 0.105    exp/asv_eval_vox1_ecapa_tdnn_ft    configs/ecapa_tdnn_fine_tune
EER / min cllr      2.413 ± 0.101 / 0.101    exp/asv_eval_vox1_resnet           configs/resnet

Note: On VCTK, the resnet model seems to be better.
Note: ecapa_tdnn converges faster.
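
For readers unfamiliar with the EER metric reported above, here is a minimal pure-Python sketch of how an equal error rate can be computed from trial scores. It assumes higher scores mean "same speaker"; the function name `compute_eer` is hypothetical, and the recipe's own scoring code may compute the value differently (for instance with interpolation).

```python
def compute_eer(target_scores, nontarget_scores):
    """Return the equal error rate as a fraction in [0, 1]."""
    best_gap, eer = None, None
    # Try every observed score as the accept/reject threshold.
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        # Keep the threshold where the two error rates are closest.
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2
    return eer


eer = compute_eer([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.7, 0.2])
print(f"EER: {100 * eer:.3f} %")  # prints "EER: 25.000 %"
```

The function returns a fraction, so it is multiplied by 100 here on the assumption that the table reports EER in percent.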

JIT model

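import torch
import torchaudio

# Load one LibriSpeech utterance as example input (downloads dev-clean to /tmp on first use)
waveform, _, text_gt, speaker, chapter, utterance = torchaudio.datasets.LIBRISPEECH("/tmp", "dev-clean", download=True)[0]

# Load the exported TorchScript model (replace __Exp_Path__ with your experiment directory)
model = torch.jit.load("__Exp_Path__/final.jit")
model = model.eval()

# The model returns a pair; the second element is the speaker embedding (x-vector)
_, x_vector = model(waveform)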
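
Once two x-vectors have been extracted, a common way to compare them is cosine similarity. The sketch below is an assumption about how one might score a trial, not the recipe's scoring backend, which is not described here. The helper `cosine_score` is hypothetical and expects each embedding as a flat sequence of floats, for example obtained with `x_vector.flatten().tolist()`.

```python
import math


def cosine_score(enroll, test):
    """Cosine similarity between two embeddings given as flat float sequences."""
    dot = sum(a * b for a, b in zip(enroll, test))
    norm_enroll = math.sqrt(sum(a * a for a in enroll))
    norm_test = math.sqrt(sum(b * b for b in test))
    return dot / (norm_enroll * norm_test)
```

A score near 1 suggests the two utterances come from the same speaker. The decision threshold has to be tuned on held-out trials.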