📝 WebTranscript

Interactive web tool for automatically ⚙️ transcribing and subtitling videos from URL or file uploads in your chosen language. The transcript appears alongside the video player, complete with embedded subtitles.

Explore this GitHub demo page to interact with a pre-processed video file.

📝 Generate transcripts with subtitles for a video (URL or file upload) in the selected languages 🗨️.

🗨️ Select languages for transcription and subtitling.

💾 Save the video, transcript, subtitles and timestamps in one file.

📂 Load a file containing the video, transcript, subtitles and timestamps.

✗ Skip Mode selects segments of the transcript to be skipped during playback.

✎ Edit Mode lets you edit the transcript.

🔊 Single Tap Mode: tap a word once to play.

🕓 Toggle the timestamps on and off.
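
Skip Mode can be pictured as filtering the timestamped segments before playback. The following is a minimal Python sketch of that idea, not the project's implementation (which may well live in the browser). It assumes each segment is a dict with hypothetical `start`, `end` and `skip` keys, and `playable_ranges` is an illustrative name.

```python
from typing import Dict, List, Tuple


def playable_ranges(segments: List[Dict]) -> List[Tuple[float, float]]:
    """Return merged (start, end) ranges left after dropping skipped segments.

    The segment keys ("start", "end", "skip") are assumptions for illustration.
    """
    ranges: List[Tuple[float, float]] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if seg.get("skip", False):
            continue  # segment was marked in Skip Mode
        if ranges and seg["start"] <= ranges[-1][1]:
            # Touching or overlapping the previous range: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], seg["end"]))
        else:
            ranges.append((seg["start"], seg["end"]))
    return ranges
```

A player could then jump from the end of one returned range to the start of the next instead of playing the skipped material.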

🐳 Build & Launch ( Docker )

Make sure you have Docker installed.

docker build -t transcript-app -f transcript.dockerfile .
# If Redis is already running locally on port 6379, stop it first (or change the host port):
# sudo systemctl stop redis-server
docker run -p 5000:5000 -p 6379:6379 transcript-app
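
The port conflict mentioned above can be checked before starting the container. This is a small Python sketch using only the standard library; `port_in_use` is a hypothetical helper and not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(6379)` and `port_in_use(5000)` before `docker run` shows whether a local Redis or another service already occupies the ports the container wants to publish.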

🐧 Build & Launch ( Linux )

# install apt dependencies: Python (3.8 to 3.11), pip, git, redis and ffmpeg
sudo apt update && sudo apt install python3-pip python3-venv git redis-server ffmpeg

# clone git-repo
git clone git@github.com:LD239/WebTranscript.git && cd WebTranscript

# create virtual environment
python3 -m venv transcript-env
source transcript-env/bin/activate

# install pip-dependencies
python3 -m pip install -r requirements.txt

⌨️ bash 1 [start worker]:

sudo systemctl start redis-server
sudo systemctl status redis-server

source transcript-env/bin/activate
transcript-env/bin/celery -A app.celery worker --purge

⌨️ bash 2 [start webserver]:

source transcript-env/bin/activate
python3 app.py

🔍 Details

🏛️ Overview

+-----------------+                                +-----------------+
|   Frontend      |                                |   Web Server:   |
|  User Interface |                                |      Flask      |
+-------+---------+                                +--------+--------+
        |                                                   |                
        | HTTP Request/Response                             | HTTP Request/Response
        |-------------------------------------------------->|
        |                                                   |
        |                                                   |                
        |                    +------------------------------+   
        |                    |                              |
        |                    |                              |
        |                    |                              |
        |      +-------------v--------------+               |
        |      | task processing:           |               |
        |      | Celery + Redis             |               |
        |      +-------------+--------------+               |
        |                    |                              |
        |                    | Async Tasks                  |
        |                    |                              |
        |      +-------------v--------------+               |
        |      | video/audio download       |               |
        |      | yt_dlp                     |               |
        |      +-------------+--------------+               |
        |                    |                              |
        |                    | Video                        |
        |                    |                              |
        |      +-------------v--------------+               |
        |      | audio handling:            |               |
        |      | ffmpeg                     |               |
        |      +----------------------------+               |
        |                    |                              |
        |                    | Audio                        |
        |                    |                              |      
        |      +-------------v--------------+               |
        |      | transcription:             |               |
        |      | Whisper                    |               |
        |      +----------------------------+               |
        |                    |                              |
        |                    | Transcribed Text             |
        |                    |                              |
        |      +-------------v--------------+               |
        |      | translation:               |               |
        |      | googletrans or NLLB        |               |
        |      +----------------------------+               |
        |                    |                              |
        |<-------------------+                              |
        |    Translated Text with Timestamps + Video        |
        +-------------------------------------------------->|
        |                                                   |
        |<--------------------------------------------------+
        |     Updates (Translated Transcriptions, etc.)     |
        |                                                   |
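
The last stage of the diagram hands text with timestamps back to the frontend, where it is shown as subtitles. To illustrate that hand-off, here is a sketch that turns timestamped segments into WebVTT. The segment shape (`start`, `end` in seconds, plus `text`) is an assumption modeled on Whisper-style output, and `format_timestamp` and `to_webvtt` are hypothetical helpers; the project may format its subtitles differently.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments: List[Dict]) -> str:
    """Build a WebVTT document from segments with start/end/text keys (assumed shape)."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")
        lines.append(seg["text"].strip())
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)
```

The resulting string could be served as a `.vtt` file and attached to an HTML `<video>` element through a `<track>` element.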

📌 Acknowledgements

The software is provided under the ⚖️ MIT licence, but please check the licence terms 📜 of the following essential tools.

Without them, this project wouldn't run smoothly.

🌐 Web Server / REST-API: Flask
📨 Message broker: Redis
🔨 Task Processing: Celery
📥 Video Download: yt_dlp
🎧 Audio Extraction: ffmpeg
📝 Transcription: Whisper or whisper-timestamped
💬 Translation:
- googletrans OR
- NLLB

📝 Next Steps

Add support for audio files.
Dockerize the application using Docker Compose.
Implement APIs for individual components (Download, Audio Extraction, Translation).
Include a backend configuration file for selecting translators and transcribers.
Add front-end selection of APIs (URL + API key).
Integrate summarization with timestamped quotes.
Implement error correction using language-model-based techniques.
