asr

Here are 1,018 public repositories matching this topic...

unnumsykar / knowledge-transfer-GenAI

How to compress a large knowledge base (.mp4, .mp3, .wav) into a short, readable summary for effective knowledge transfer

asr gpt-4 genai-usecase

Updated May 24, 2024

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift

android windows macos linux raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v asr arm32 onnx vits openkylin

Updated May 24, 2024
C++

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated May 24, 2024
Python

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated May 24, 2024
Python

DmitryRyumin / ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Updated May 24, 2024
Python

flozi00 / atra

An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker commands

chatbot speech transformers inference speech-recognition asr llm stable-diffusion

Updated May 23, 2024
Jupyter Notebook

voicegain / platform

Voicegain Enterprise Speech-to-Text Platform (API, Portal, etc.)

deep-neural-networks ivr speech-to-text rtc transcription asr mrcp

Updated May 23, 2024
HTML

Garvys / rustfst

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Updated May 23, 2024
Rust

blip-radar / vatsim-parser

Parser for a variety of VATSIM-related file formats

vatsim euroscope asr sct topsky-plugin

Updated May 23, 2024
Rust

metame-ai / awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

awesome tts music-generation asr audio-generation zero-shot-tts awesome-music-generation

Updated May 23, 2024

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

speech speech-recognition speech-to-text whisper asr speaker-diarization

Updated May 23, 2024
Jupyter Notebook

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

python cpp websocket pytorch speech-recognition transducer asr ctc end-to-end-asr

Updated May 23, 2024
C++

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated May 23, 2024
Python

winstxnhdw / CapGen

A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.

docker caddy automatic-speech-recognition whisper asr fastapi uvicorn-gunicorn huggingface huggingface-spaces ctranslate2

Updated May 23, 2024
Python

junuMoon / review

I read whatever I can get my hands on

research paper asr llm

Updated May 23, 2024

AssemblyAI / assemblyai-java-sdk

The AssemblyAI Java SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

java ai speech-to-text transcription stt asr assemblyai llm

Updated May 23, 2024
Java

deepgram-devs / deepgram-conversational-demo

Deepgram Conversational AI demo

react nextjs tts stt asr deepgram vercel

Updated May 24, 2024
TypeScript

jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!

python cli youtube youtube-video youtube-api captions subtitles transcript subtitle transcripts asr youtube-subtitles youtube-transcripts youtube-captions youtube-transcript translating-transcripts youtube-asr

Updated May 22, 2024
Python

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Updated May 22, 2024
Python

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated May 23, 2024
Python
