Transcribes a podcast given its RSS feed or YouTube link using OpenAI's Whisper and formats the transcript into readable paragraphs using Hugging Face's sentence-transformers.

fgloblek/TranscribeAndPrettifyPodcasts

TranscribeAndPrettifyPodcasts

Transcribes a podcast given its RSS feed or YouTube link using OpenAI's Whisper and formats the transcript into readable paragraphs.
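As a rough illustration of the RSS path, the sketch below pulls episode audio URLs out of a feed and hands a downloaded file to Whisper. It is not the notebook's code: the helper names `audio_urls_from_rss` and `transcribe` are hypothetical, the feed is assumed to use standard `<item>`/`<enclosure>` elements, and the transcription step assumes the `openai-whisper` package and ffmpeg are installed.

```python
import xml.etree.ElementTree as ET


def audio_urls_from_rss(rss_xml: str) -> list[tuple[str, str]]:
    """Return (episode title, audio URL) pairs found in an RSS feed document.

    Hypothetical helper: assumes each episode is an <item> whose audio file
    is referenced by an <enclosure url="..."> element.
    """
    root = ET.fromstring(rss_xml)
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        enclosure = item.find("enclosure")
        # Skip items that carry no audio attachment.
        if enclosure is not None and enclosure.get("url"):
            episodes.append((title, enclosure.get("url")))
    return episodes


def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe a local audio file and return the plain text.

    Assumption: the `openai-whisper` package is installed. The import is
    done lazily so the RSS helper above works without it.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"]
```

Downloading the audio file between the two steps is left out; any HTTP client would do. The model size (`"base"` here) is a placeholder, and larger models are generally slower but more accurate.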

The very clever idea for making paragraphs was taken from this notebook and uses Hugging Face's sentence-transformers.

The example podcast used in the notebook is the Russian book podcast Knizhnyy Bazar.
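One way to turn sentence embeddings into paragraphs is to start a new paragraph wherever two neighbouring sentences are dissimilar. The sketch below shows that general idea only; the referenced notebook may well use a different rule. The function names, the `0.5` threshold, and the `all-MiniLM-L6-v2` model name are all assumptions, and a Russian-language podcast would need a multilingual model instead.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_into_paragraphs(sentences, embeddings, threshold=0.5):
    """Group sentences into paragraphs.

    A new paragraph starts whenever the similarity between a sentence and
    the one before it falls below `threshold` (an assumed, tunable value).
    """
    if not sentences:
        return []
    paragraphs = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(embeddings[i - 1], embeddings[i]) < threshold:
            paragraphs.append([])
        paragraphs[-1].append(sentences[i])
    return [" ".join(p) for p in paragraphs]


def embed(sentences, model_name="all-MiniLM-L6-v2"):
    """Embed sentences with sentence-transformers.

    Assumption: the `sentence-transformers` package is installed; the model
    name is a placeholder. Imported lazily so the splitter runs without it.
    """
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name).encode(sentences)
```

With real transcripts, `split_into_paragraphs(sentences, embed(sentences))` would be the typical call. The threshold usually needs tuning per podcast.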

The other notebook uses Whisper-JAX and needs a TPU; I suggest taking advantage of Kaggle's TPUs. It takes a slightly different approach and saves the transcript as a markdown table of the form

| Timestamp | Text |
| --- | --- |
| ... | ... |
| (12:34) | The spoken text appearing at (12:34) of the episode. Lorem ipsum... |

which I needed to import into an Obsidian knowledge base. The example podcasts are Brains and Gains by Dr. Dave Maconi, processed using its YouTube link, and Where Optimal Meets Practical by Jordan Lips, processed using the RSS feed like before.
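The timestamped table described above could be produced along these lines. This is a minimal sketch, not the notebook's code: it assumes the transcription step has already been reduced to `(start_seconds, text)` pairs, and both function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a start time as (MM:SS), or (H:MM:SS) past the first hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"({hours}:{minutes:02d}:{secs:02d})"
    return f"({minutes:02d}:{secs:02d})"


def segments_to_markdown_table(segments) -> str:
    """Render (start_seconds, text) pairs as a two-column markdown table."""
    lines = ["| Timestamp | Text |", "| --- | --- |"]
    for start, text in segments:
        # Escape pipes so spoken text cannot break the table layout.
        cell = text.strip().replace("|", "\\|")
        lines.append(f"| {format_timestamp(start)} | {cell} |")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [(754.0, "The spoken text appearing at (12:34) of the episode.")]
    print(segments_to_markdown_table(demo))
```

Writing the returned string to a `.md` file gives a note that Obsidian renders as a table.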
