Various approaches for speech recognition and speaker diarization.

j-schmied/RealTimeSpeechRecognition

Real-Time Speech Recognition

Proofs of concept (PoCs) for speech recognition and speaker diarization.

Working PoCs

  • rtsr_en.py: PoC using the AssemblyAI WebSocket API (English only)
  • rtsr_de.py: PoC using OpenAI Whisper (German, probably multilingual)
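
Both PoCs above consume a live audio stream, which usually means grouping incoming sample chunks into fixed-length windows before handing them to a recognizer. The following is a minimal sketch of that buffering step only, not the code from rtsr_en.py or rtsr_de.py. The class name `AudioWindowBuffer` and the 5-second window are hypothetical; the 16 kHz sample rate is the rate Whisper models expect.

```python
SAMPLE_RATE = 16_000   # Hz; Whisper models expect 16 kHz mono audio
WINDOW_SECONDS = 5     # hypothetical window length, not taken from the PoCs


class AudioWindowBuffer:
    """Collects incoming sample chunks and returns fixed-length windows."""

    def __init__(self, window_size):
        self.window_size = window_size
        self._samples = []

    def push(self, chunk):
        """Add a chunk and return every complete window now available."""
        self._samples.extend(chunk)
        windows = []
        while len(self._samples) >= self.window_size:
            windows.append(self._samples[: self.window_size])
            # Keep the remainder for the next window.
            self._samples = self._samples[self.window_size:]
        return windows


# Usage: feed microphone chunks as they arrive and pass each returned
# window to the recognizer of your choice.
buffer = AudioWindowBuffer(SAMPLE_RATE * WINDOW_SECONDS)
```

Non-overlapping windows keep the sketch short; a real pipeline might overlap windows so that words cut at a boundary are not lost.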

Prototypes

Additionally, a handful of prototypes were created using various technologies:

  • librosa
  • NVIDIA NeMo
  • TensorFlow + Keras model
  • Mel Spectrogram CNN
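
For context on the mel spectrogram prototype: a mel spectrogram places its frequency bands evenly on the mel scale rather than in hertz. The sketch below shows that conversion with the common HTK-style formula, using only the standard library. It is an illustration, not code from this repository; the function names are hypothetical, and audio libraries may use a slightly different mel variant.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 frequencies (in Hz) evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Usage: edges for 40 mel bands between 0 Hz and 8 kHz (the Nyquist
# frequency of 16 kHz audio).
edges = mel_band_edges(0.0, 8000.0, 40)
```

The resulting edges are dense at low frequencies and sparse at high ones, which is the property that makes mel features a common input for speech models.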

Credits