Thank you very much for your contribution. I have trained the model on LJ Speech for 835k steps. However, the results are not as good as the samples you provided at 420k. Could there be a problem with my training? Below you can find the attention plot and a sample audio at 835k. What kind of attention plot signals a good checkpoint for the synthesizer?
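A rough numeric proxy I've been using to compare checkpoints (my own sketch, assuming the plot is rendered from a `(decoder_steps, encoder_steps)` attention-weight matrix normalized over the encoder axis — a healthy checkpoint shows a sharp, mostly monotonic diagonal):

```python
import numpy as np

def attention_sharpness(alignment):
    """Mean of the per-decoder-step maximum attention weight.

    `alignment` is assumed to be a (decoder_steps, encoder_steps)
    array whose rows sum to 1. Values near 1.0 mean each output
    frame attends to a single input character (the sharp diagonal
    of a well-trained checkpoint); values near 1/encoder_steps
    mean diffuse, unaligned attention.
    """
    return float(alignment.max(axis=1).mean())

# Compare an idealized sharp diagonal against fully diffuse attention:
n = 50
sharp = np.eye(n)                   # each frame attends to one character
diffuse = np.full((n, n), 1.0 / n)  # attention spread evenly
print(attention_sharpness(sharp))    # 1.0
print(attention_sharpness(diffuse))  # 0.02
```

This only measures sharpness, not monotonicity, but in my experience checkpoints whose plots look "blurry" score well below ones with a clean diagonal.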
And the progress was like this:
Also, I was wondering if you have any plans to release your trained model.
Another thing: `tf.train.Saver` keeps only the last 5 checkpoints by default, and the wrapper used here (i.e. `tf.train.Supervisor`) does not easily allow changing the `max_to_keep` property of the saver.
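If I remember the TF1 API correctly, `tf.train.Supervisor` accepts an optional `saver` argument, so one workaround may be to construct it with your own `tf.train.Saver(max_to_keep=0)` (0 or `None` disables deletion). As a self-contained sketch of the rotation behavior that `max_to_keep` controls:

```python
from collections import deque

def keep_last_n(checkpoint_paths, max_to_keep=5):
    """Illustrates the checkpoint rotation performed by Saver's
    max_to_keep: once more than max_to_keep checkpoints exist, the
    oldest ones are deleted. max_to_keep=0 disables deletion and
    keeps everything."""
    kept = deque()
    deleted = []
    for path in checkpoint_paths:
        kept.append(path)
        if max_to_keep and len(kept) > max_to_keep:
            deleted.append(kept.popleft())
    return list(kept), deleted

# Eight checkpoints saved every 10k steps; only the last five survive.
ckpts = [f"model.ckpt-{step}" for step in range(0, 80_000, 10_000)]
kept, deleted = keep_last_n(ckpts, max_to_keep=5)
print(kept)     # checkpoints at steps 30k..70k
print(deleted)  # the three oldest were removed
```

With the default of 5, that means the 420k-vs-835k comparison above is only possible if older checkpoints were backed up manually.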
PS: The hyperparameters are kept at their defaults:
```python
# signal processing
sr = 22050          # sample rate
n_fft = 2048        # FFT points (samples)
frame_shift = 0.0125  # seconds
frame_length = 0.05   # seconds
hop_length = int(sr * frame_shift)   # samples
win_length = int(sr * frame_length)  # samples
n_mels = 80         # number of mel banks to generate
power = 1.2         # exponent for amplifying the predicted magnitude
n_iter = 50         # number of inversion iterations
preemphasis = .97   # or None
max_db = 100
ref_db = 20

# model
embed_size = 256    # alias: E
encoder_num_banks = 16
decoder_num_banks = 8
num_highwaynet_blocks = 4
r = 5               # reduction factor
dropout_rate = .5

# training scheme
lr = 0.001          # initial learning rate
logdir = "logdir"
sampledir = "samples"
batch_size = 32
num_iterations = 1000000
```
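For a quick sanity check, here are the frame values those defaults work out to (a sketch that just recomputes the two `int` casts from the config above):

```python
# Derived values from the signal-processing hyperparameters above.
sr = 22050            # sample rate (Hz)
frame_shift = 0.0125  # seconds between successive frames
frame_length = 0.05   # seconds per analysis window

hop_length = int(sr * frame_shift)   # 275 samples between frames
win_length = int(sr * frame_length)  # 1102 samples per window
frames_per_sec = sr / hop_length     # ~80.2 spectrogram frames per second

print(hop_length, win_length)  # 275 1102
```

So one second of audio corresponds to roughly 80 spectrogram frames, or 16 decoder steps with the reduction factor r = 5.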
Attention plot at 835k:
![alignment_835k](https://user-images.githubusercontent.com/5753242/39467554-9a7636ea-4d69-11e8-851d-d271f463b2a7.png)
Training progress:
![problem](https://user-images.githubusercontent.com/5753242/39468696-6731a048-4d6f-11e8-92e4-2817c75bf274.gif)
The samples synthesized from this checkpoint can be found here:
https://www.dropbox.com/sh/n5ld72rn9otxl7a/AAACyplZMtxiYtuUgvWN8OGaa?dl=0
Also, the trained model (checkpoint) is uploaded here:
https://www.dropbox.com/sh/ks91bdputl5ujo7/AABRIqpviRDBgWuFIJn1yuhba?dl=0