Proof of concept for transcribing podcasts into text using GCP Speech2Text service

emibcn/Podcast2Text

Proof of concept for transcribing podcasts into text using the GCP Speech2Text service, following its Node.js tutorial.

Installation

  1. Download this repo:
git clone https://github.com/emibcn/Podcast2Text.git
  2. Change directory into it:
cd Podcast2Text
  3. Create local directories:
mkdir flac credentials
  4. In GCP IAM, create credentials for the Speech2Text service with at least the Service Usage Consumer role.
  5. Copy the credentials file into the ./credentials directory.
  6. Create a .env file containing GOOGLE_APPLICATION_CREDENTIALS=[CREDENTIALS FILENAME] (the file name only, without the directory).
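
As a rough illustration of the last steps, the local directories and the .env file could be created as in the sketch below. The file name my-service-account.json is a hypothetical placeholder, not a name used by this repo; substitute the name of the key file you downloaded from GCP.

```shell
#!/bin/sh
# Sketch only: run from the root of the cloned repo.
# CREDENTIALS_FILE is a hypothetical placeholder for your own key file name.
CREDENTIALS_FILE="my-service-account.json"

# Create the local directories (-p makes this safe to re-run).
mkdir -p flac credentials

# Write the .env file; the value is the file name only, without the directory.
printf 'GOOGLE_APPLICATION_CREDENTIALS=%s\n' "$CREDENTIALS_FILE" > .env

# Show the result.
cat .env
```

The key file itself still has to be copied into ./credentials by hand.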

Usage

A helper script transcribes any audio file into text. Its syntax is:

./transcode.sh <FILEPATH> [START]
  • FILEPATH: Path (relative or absolute) to podcast audio file
  • START: Optional start position; processing begins at this point in the audio. Uses the same syntax as the ffmpeg -ss option.

The script encodes the supplied file to FLAC in the ./flac directory, sends the encoded file to the GCP Speech2Text service, and prints the resulting transcription on screen.
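
The encoding step described above might look roughly like the sketch below. This is not the contents of transcode.sh: the helper names flac_path and encode_cmd, the output naming scheme, and the -ac 1 (mono) flag are all assumptions made for illustration. The sketch only prints the ffmpeg command it would run, so it works without ffmpeg installed.

```shell
#!/bin/sh
# Sketch only: the real transcode.sh may differ.

# Derive the FLAC output path for an input file.
# Assumption: output is named after the input, minus its extension.
flac_path() {
  name=$(basename "$1")
  printf 'flac/%s.flac\n' "${name%.*}"
}

# Print (dry run) the ffmpeg command that would encode the file.
# $1: input path, $2: optional start position in ffmpeg -ss syntax.
# Note: paths are printed unquoted, so this is for display, not for eval.
encode_cmd() {
  out=$(flac_path "$1")
  if [ -n "${2:-}" ]; then
    # -ss before -i seeks in the input; -ac 1 downmixes to mono (assumption).
    printf 'ffmpeg -ss %s -i %s -ac 1 %s\n' "$2" "$1" "$out"
  else
    printf 'ffmpeg -i %s -ac 1 %s\n' "$1" "$out"
  fi
}

# Hypothetical input file and start position.
encode_cmd "episodes/show-01.mp3" "00:05:30"
```

With the script itself, the equivalent call would be ./transcode.sh episodes/show-01.mp3 00:05:30, where the file path is again only an example.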