NVIDIA Neural Modules 1.16.0

@ericharper released this 08 Mar 04:35
· 1491 commits to main since this release
1631118

Highlights

NeMo ASR

  • ASR Evaluator
  • Multi-channel dereverberation algorithm
  • Hybrid ASR-TTS Models
  • Flashlight Decoder Beam Search
  • FastConformer Encoder with 8x subsampling

NeMo TTS

  • SSL Voice Conversion
  • Spectrogram Enhancer
  • VITS

NeMo Megatron

  • Per microbatch dataloader for GPT and BERT
  • Adapters compatible with Faster Transformer

NeMo Core

  • Nested model support

NeMo Tools

  • NeMo Forced Aligner

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.01

ASR

Changelog

TTS

Changelog
  • [TTS] Update Spanish TTS model to 1.15 by @rlangman :: PR: #5742
  • [TTS][DE] refine grapheme-based tokenizer and FastPitch training recipe on Thorsten's neutral datasets. by @XuesongYang :: PR: #5753
  • No-script TS export, prepared for ONNX export by @borisfom :: PR: #5653
  • Fixing masking in RadTTS bottleneck layer by @borisfom :: PR: #5771
  • Port Riva's mel cepstral distortion w/ dynamic time warping notebook by @redoctopus :: PR: #5778
  • Update RadTTS infer path by @blisc :: PR: #5788
  • [TTS][DE] Augment tokenization/G2P to preserve capitalization of words and mix phonemes with word-level graphemes for an input text. by @XuesongYang :: PR: #5805
  • [TTS] porting VITS implementation by @treacker :: PR: #5600
  • [TTS][DE] updated IPA dictionary and heteronyms by @XuesongYang :: PR: #5860
  • [TTS] GAN-based spectrogram enhancer by @racoiaws :: PR: #5565
  • TTS inference with Heteronym classification model, hc model inference refactoring by @ekmb :: PR: #5768
  • Remove MCD_DTW tarball by @redoctopus :: PR: #5889
  • Hybrid ASR-TTS models by @artbataev :: PR: #5659
  • Moved eval notebook data to AWS by @redoctopus :: PR: #5911
  • [G2P] fixed typos and broken import library. by @XuesongYang :: PR: #5978
  • [G2P] backward compatibility for English tokenizer and bugfix by @XuesongYang :: PR: #5980
  • fix links, add missing file by @ekmb :: PR: #6044
  • [TTS] Spectrogram Enhancer: correct dim for length when loading data by @racoiaws :: PR: #6048
  • [TTS] bugfix for FastPitch German tutorial by @XuesongYang :: PR: #6051
  • [TTS] bugfix for Chinese FastPitch tutorial by @XuesongYang :: PR: #6055
  • Fix enhancer usage by @artbataev :: PR: #6059
  • [TTS] Spectrogram Enhancer: support arbitrary input length by @racoiaws :: PR: #6060
  • Fix enhancer usage in ASR-TTS examples by @artbataev :: PR: #6116
  • [TTS] Spectrogram Enhancer: add option to zero out the initial tensor by @racoiaws :: PR: #6136

NLP / NMT

Changelog
  • Fix P-Tuning Truncation by @vadam5 :: PR: #5663
  • Adithyare/prompt learning seed by @arendu :: PR: #5749
  • Add extra data args to support proper finetuning of HF converted T5 checkpoints by @MaximumEntropy :: PR: #5719
  • Don't add output directory twice when creating shared sentencepiece tokenizer by @pks :: PR: #5737
  • add constraint info on batch size for tar dataset by @yzhang123 :: PR: #5812
  • remove transformer version upper bound by @Zhilin123 :: PR: #5831
  • Adithyare/adapter new placement by @arendu :: PR: #5791
  • Add SSL import functionality for Audio Lexical PNC Models by @trias702 :: PR: #5834
  • validation batch sizing and drop_last controls by @arendu :: PR: #5830
  • Remove ending newlines when encoding strings w/ sentencepiece tokenizer by @pks :: PR: #5739
  • Fix segmenting for pcla inference by @jubick1337 :: PR: #5849
  • RETRO model finetuning by @yidong72 :: PR: #5800
  • Optimizing distributed Adam when running with one work queue by @timmoon10 :: PR: #5560
  • Add option to disable distributed parameters in distributed Adam optimizer by @timmoon10 :: PR: #5685
  • set max_steps for lr decay through config by @anmolgupt :: PR: #5780
  • Fix Prompt text space issue by @aklife97 :: PR: #5983
  • Add batch_size to prompt_learning generate by @aklife97 :: PR: #6091

NeMo Tools

Changelog

Export

Changelog

General Improvements

Changelog