Speech To Clipboard

A tool for recording audio from a microphone, transcribing the recording, and copying the transcription to the clipboard.

Developed by Claus Helfenschneider Interactive Applications.

Features

The transcription is copied to the clipboard for easy pasting into other applications.
Comes with a CLI and a UI, and is also usable as a Python module.
Supports configurable text replacements, similar to the voice recording feature on iOS.
For example, it can replace the text "new line" with an actual new line or "bullet point" with "• ".
If ffmpeg is installed (optional), the audio will be converted to mp3 prior to transcription, for faster uploads when using the OpenAI API.
Configurable via a config file (config.ini), command line arguments (CLI), and a replacements-mapping file (replacements.json).
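
The optional ffmpeg step can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it calls the ffmpeg executable directly through subprocess, and the function names (ffmpeg_available, build_mp3_command, file_to_upload) are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_mp3_command(audio_path: str) -> list[str]:
    """Build an ffmpeg command that writes an mp3 next to the input file."""
    mp3_path = Path(audio_path).with_suffix(".mp3")
    # -y overwrites an existing output file, -i names the input file.
    return ["ffmpeg", "-y", "-i", str(audio_path), str(mp3_path)]


def file_to_upload(audio_path: str) -> str:
    """Return the mp3 path if ffmpeg is installed, else the original file."""
    if not ffmpeg_available():
        return audio_path  # ffmpeg is optional: fall back to the raw recording
    subprocess.run(build_mp3_command(audio_path), check=True, capture_output=True)
    return str(Path(audio_path).with_suffix(".mp3"))
```

Because the conversion is skipped when ffmpeg is missing, the rest of the pipeline only needs to handle "some audio file path" and does not care which format it received.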

Transcription backend

Two transcription backends are supported. You can use either one or both:

Local whisper model. For this, openai-whisper must be installed.
OpenAI's Whisper model via the OpenAI API. For this, an OpenAI API key is required.

Screenshots

[Screenshots: CLI (verbose mode and silent mode) and UI]

Installation

Clone the repository.
(Optional but recommended) Set up a virtual environment. Requires Python 3.11 or higher.
Install the requirements:
- pip install -r requirements.in for the latest versions (recommended), or
- pip install -r requirements.txt for pinned versions.
To build an .exe, install the dev requirements:
- pip install -r requirements-dev.in
Provide your OpenAI API key. Either set the environment variable WHISPER_KEYBOARD_API_KEY (in your global environment or in an .env file), or specify the key in the config.ini file under [openai] api_key.
- Note: An environment variable takes precedence over the value set in the config file.

Usage

CLI (Command Line Interface)

Run the command line interface: python speech_to_clipboard_cli.py
For available options, run python speech_to_clipboard_cli.py --help.

UI (User Interface)

Run the UI: python speech_to_clipboard_ui.pyw
Select your preferred microphone from the dropdown menu.
Press REC to start recording.
Press Stop Recording to end the recording. The audio will be sent to the OpenAI API for transcription, and the result will be copied to your clipboard.

Python Module

from settings import Settings
from core.speech_to_clipboard import SpeechToClipboard

speech_to_clip = SpeechToClipboard(
    audio_file_path=Settings.AUDIO_FILE_PATH,
    config_file_path=Settings.CONFIG_FILE_PATH,
    replacemetns_file_path=Settings.REPLACEMENTS_FILE_PATH,
    openai_api_key_env_var=Settings.OPENAI_API_KEY_ENV_VAR,
)

speech_to_clip.start_recording()
print("Recording...")
input("Press Enter to stop recording...")
speech_to_clip.stop_and_save_recording()
transcription = speech_to_clip.transcribe_recording()
print(transcription)

Create Executable With AutoPyToExe

To build an executable file (.exe on Windows) using AutoPyToExe, follow these steps:

Install the dev requirements: pip install -r requirements-dev.in
Depending on whether you want to build the UI or the CLI app, choose the corresponding configuration file:
- UI: auto-py-to-exe-config_ui.json
- CLI: auto-py-to-exe-config_cli.json
The configuration files contain some absolute paths, which you have to replace with the path to your local project. Alternatively, use the config file only as a reference and adjust the settings in the auto-py-to-exe UI.
Execute auto-py-to-exe -c <YOUR_CONFIG_FILE> with the adjusted config file.
Click Convert .py to .exe.
Note: If you want both the UI and the CLI, build them separately. Afterwards you can move both executables into the same directory, so that they share the same config file and resources, and delete the other, now obsolete, build directory.
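
Replacing the absolute paths by hand can be tedious, so here is a hedged sketch of one way to script it. It makes no assumption about the layout of the config file beyond it being JSON: it walks every value and swaps an old path prefix for a new one. The function names and the example paths are hypothetical.

```python
import json


def replace_path(node, old_prefix: str, new_prefix: str):
    """Recursively replace old_prefix with new_prefix in all string values."""
    if isinstance(node, dict):
        return {k: replace_path(v, old_prefix, new_prefix) for k, v in node.items()}
    if isinstance(node, list):
        return [replace_path(v, old_prefix, new_prefix) for v in node]
    if isinstance(node, str):
        return node.replace(old_prefix, new_prefix)
    return node  # numbers, booleans and null are left untouched


def rewrite_config(src: str, dst: str, old_prefix: str, new_prefix: str) -> None:
    """Read the JSON config at src and write an adjusted copy to dst."""
    with open(src, encoding="utf-8") as fh:
        config = json.load(fh)
    with open(dst, "w", encoding="utf-8") as fh:
        json.dump(replace_path(config, old_prefix, new_prefix), fh, indent=1)
```

Writing to a separate destination file keeps the original configuration intact, so the result can be compared before it is passed to auto-py-to-exe -c.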

Text Replacer

The tool features a simple text replacement system. When enabled via the Replacer checkbox, it can replace certain expressions as follows:

Expression	Replacement
`new line`	`\n`
`bullet point`	`•`
`en dash`	`–`
...	...

Configure or edit the replacements in the resources/replacements.json file.
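
To illustrate the idea, a replacement pass over a transcription could look like the sketch below. It assumes that replacements.json is a flat mapping from expression to replacement and that matching is case-insensitive. Both are assumptions made for this example, and apply_replacements is a hypothetical name, not the project's API.

```python
import json
import re


def load_replacements(path: str) -> dict[str, str]:
    """Load an expression-to-replacement mapping from a JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Replace every expression in text, ignoring case."""
    # Handle longer expressions first so a short one cannot break up a longer one.
    for expression in sorted(replacements, key=len, reverse=True):
        pattern = re.compile(re.escape(expression), re.IGNORECASE)
        replacement = replacements[expression]
        # A function is used as the replacement so backslashes are taken literally.
        text = pattern.sub(lambda _match, r=replacement: r, text)
    return text
```

With a mapping such as {"new line": "\n", "en dash": "–"}, the spoken phrase "pages 1 en dash 5" becomes "pages 1 – 5". Whitespace around a replaced expression is left as it was transcribed.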

Contributions & Feedback

Contributions and feedback are welcome! Please open an issue or submit a pull request.

Credits

By Claus Helfenschneider Interactive Applications @ www.interactive-applications.com

If you enjoy this project, please consider buying me a coffee, checking out my website, or reaching out to me. I'd love to hear from you!

I am open for hire/commissions.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Third-Party Licenses

This project uses the following third-party packages. Please refer to the respective license files for more details.

Package	License	License File
CustomTKinter	MIT	Link
OpenAI	MIT	Link
python-sounddevice	MIT	Link
python-soundfile	BSD-3-Clause	Link
numpy	BSD-3-Clause	Link
pyperclip	BSD-3-Clause	Link
pydub	MIT	Link
humanize	MIT	Link
auto-py-to-exe	MIT	Link

Development dependencies:

Package	License	License File
pylint	GPL-2.0-or-later	Link
yapf	Apache-2.0	Link
