Speech enhancement models using spectrograms as features

Speech-Enhancement-Models

Speech enhancement models: MLP, auto-encoder, and GAN.

Dataset

The dataset is the speech enhancement dataset built by the University of Edinburgh, available on DataShare.

Requirements

  • PyTorch
conda install pytorch torchvision -c pytorch
  • librosa
pip install librosa

Notes:

The audio should be sliced into segments of equal duration. A Short-Time Fourier Transform (STFT) is then applied to each segment, turning it into a 2D matrix (a spectrogram). We then use a CNN to extract features from these matrices.
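
The slicing and STFT steps described in the notes can be sketched roughly as follows. This is only an illustration, not the repository's actual preprocessing code: it uses plain NumPy on a synthetic sine wave so that it runs without any audio files, and the helper names (`slice_audio`, `stft_magnitude`), the 16 kHz sample rate, the one-second segment length, and the `n_fft`/`hop` values are all assumptions made for the example. In practice `librosa.stft` would likely be the more convenient choice.

```python
import numpy as np


def slice_audio(signal, segment_len):
    """Split a 1-D signal into equal-length segments, dropping the remainder."""
    n_segments = len(signal) // segment_len
    return signal[: n_segments * segment_len].reshape(n_segments, segment_len)


def stft_magnitude(segment, n_fft=512, hop=128):
    """Return a magnitude spectrogram of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    # Cut the segment into overlapping, windowed frames.
    frames = np.stack(
        [segment[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Real FFT of each frame, then transpose to (frequency, time).
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for a recording: a 440 Hz tone, slightly over 3 seconds.
sr = 16000  # assumed sample rate
t = np.arange(3 * sr + 1000) / sr
signal = np.sin(2 * np.pi * 440 * t)

segments = slice_audio(signal, segment_len=sr)  # one-second segments (assumed)
specs = np.stack([stft_magnitude(s) for s in segments])

print(segments.shape)  # (n_segments, samples_per_segment)
print(specs.shape)     # (n_segments, freq_bins, time_frames)
```

Each entry of `specs` is a 2D matrix that could be passed to a CNN as a single-channel image, for example after converting it to a tensor with `torch.from_numpy` and adding a channel dimension.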