Scarfmonster/HiFiPLN

HiFiPLN

Multispeaker Community Vocoder model for DiffSinger

This is the code used to train the "HiFiPLN" vocoder.

A trained model for use with OpenUtau is available for download on the official release page.

Why HiFiPLN?

Because a lot of PLN was spent training this thing.

Training

Python

Python 3.10 or greater is required.

Data preparation

Split your audio into training segments saved to dataset/train:

python dataset-utils/split.py --length 1 -sr 44100 -o "dataset/train" PATH_TO_DATASET

You also need to provide some validation audio files. Save them to dataset/valid, then run:

python preproc.py --path dataset --config "configs/hifipln.yaml"
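
Before running the preprocessing command above, it can help to confirm that both dataset/train and dataset/valid actually contain audio. The following is a minimal sketch and not part of the repository: the helper names are hypothetical, and it assumes the audio is stored as .wav files, so adjust the suffix set if your data differs.

```python
from pathlib import Path

# Assumption: audio is stored as .wav files; extend this set if needed.
AUDIO_SUFFIXES = {".wav"}


def count_audio_files(folder):
    """Count audio files anywhere below `folder` (0 if the folder is missing)."""
    root = Path(folder)
    if not root.is_dir():
        return 0
    return sum(
        1
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )


def check_dataset(base="dataset"):
    """Return the audio file count for the train and valid folders under `base`."""
    return {split: count_audio_files(Path(base) / split) for split in ("train", "valid")}


if __name__ == "__main__":
    for split, n in check_dataset().items():
        status = "ok" if n > 0 else "EMPTY"
        print(f"dataset/{split}: {n} audio file(s) [{status}]")
```

An empty dataset/valid folder is the situation that leads to the "Total length of `Data Loader` across ranks is zero" error during training.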

Train model

python train.py --config "configs/hifipln.yaml"
  • If you see an error saying "Total length of `Data Loader` across ranks is zero" then you do not have enough validation files.
  • You may want to edit configs/hifipln.yaml and change the batch_size value under train (12 in the provided config) to a value that better fits your available VRAM.

Resume

python train.py --config "configs/hifipln.yaml" --resume CKPT_PATH

You may set CKPT_PATH to a log directory (e.g. logs/HiFiPLN); the script will then find the last checkpoint of the last run.
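
If you would rather pass an explicit checkpoint file, the lookup can be approximated by hand. This is a rough sketch under stated assumptions, not the logic train.py actually uses: it assumes checkpoints carry the .ckpt extension and treats the most recently modified one below the log directory as the "last" checkpoint. The function name is hypothetical.

```python
from pathlib import Path


def find_latest_checkpoint(log_dir):
    """Return the most recently modified .ckpt file below `log_dir`, or None."""
    # Assumption: "last checkpoint" is approximated by file modification time.
    candidates = [p for p in Path(log_dir).rglob("*.ckpt") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(find_latest_checkpoint("logs/HiFiPLN"))
```

The printed path can then be passed as CKPT_PATH to --resume or --model.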

Finetuning

Download a checkpoint from https://utau.pl/hifipln/#checkpoints-for-finetuning and save it as ckpt/HiFiPLN.ckpt, then run:

python train.py --config "configs/hifipln-finetune.yaml"
  • Don't finetune for too long, especially on small datasets. Just 2-3 epochs or ~20000 steps should be fine.
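
To relate the epoch and step figures above to your own dataset, the arithmetic can be sketched as follows. It assumes a single GPU, no gradient accumulation, and that the final partial batch of each epoch is kept; the function name and the item count in the example are illustrative only.

```python
import math


def steps_for_epochs(num_items, batch_size, epochs):
    """Number of optimizer steps needed to cover `epochs` passes over the data."""
    # One step consumes one batch; the last batch of an epoch may be partial.
    steps_per_epoch = math.ceil(num_items / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # Illustrative numbers: 100,000 training items with batch_size 12.
    for epochs in (2, 3):
        print(epochs, "epochs ->", steps_for_epochs(100_000, 12, epochs), "steps")
```

With these illustrative numbers, 2 epochs come to 16668 steps and 3 epochs to 25002 steps, which brackets the ~20000 step guideline. A much smaller dataset reaches 2-3 epochs in far fewer steps.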

Exporting for use in OpenUtau

python export.py --config configs/hifipln.yaml --output out/hifipln --model CKPT_PATH

You may set CKPT_PATH to a log directory (e.g. logs/HiFiPLN); the script will then find the last checkpoint of the last run.

Credits