OpenVoice Local LLM Interaction

This project combines OpenVoice TTS (text-to-speech) with a local LLM (large language model) to create an interactive voice assistant. Speak to the assistant through your microphone: it transcribes your speech with Whisper, generates a response with the local LLM, and speaks the response back to you using OpenVoice TTS.

OpenVoice Logo

Table of Contents

  1. Prerequisites
  2. Installation
  3. Usage
  4. Configuration
  5. Tips and Advanced Usage
  6. Windows Installation (VS Code)
  7. License
  8. Acknowledgements
  9. Contributing
  10. Contact

Prerequisites

Before running the code, make sure you have the following dependencies installed:

  • Python 3.7+
  • PyTorch
  • OpenAI Whisper
  • OpenVoice
  • Simpleaudio
  • Sounddevice

Installation

LM Studio

  1. Visit the LM Studio website and follow the installation instructions for your operating system.
  2. Start the local inference server in LM Studio and note its endpoint URL and API key; you will point the script at these (see Configuration).
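
LM Studio exposes an OpenAI-compatible HTTP API, so the script can talk to it with a plain POST request. The sketch below is illustrative only: the port, model name, and helper function names are assumptions (LM Studio's local server defaults to port 1234 and typically accepts any placeholder API key), not the actual code in test_ai.py.

```python
import json
import urllib.request

# Assumed defaults -- adjust to match your LM Studio server settings.
BASE_URL = "http://localhost:1234/v1"
API_KEY = "lm-studio"  # placeholder; the local server usually accepts any key


def build_chat_request(prompt, model="local-model"):
    """Build the JSON payload for the OpenAI-compatible chat completions API."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful voice assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
    }


def ask_local_llm(prompt):
    """POST the prompt to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask_local_llm("Hello!")` returns the model's reply as a string.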

OpenVoice

  1. Clone the repository:

    git clone https://github.com/your-username/openvoice-local-llm.git
  2. Navigate to the project directory:

    cd openvoice-local-llm
  3. Install the required dependencies using pip:

    pip install -r requirements.txt
  4. Install OpenVoice:

    conda create -n openvoice python=3.9
    conda activate openvoice
    git clone [email protected]:myshell-ai/OpenVoice.git
    cd OpenVoice
    pip install -e .
  5. Download and install the necessary models and checkpoints for OpenVoice and Whisper:

    • OpenVoice TTS: Download the checkpoint from here and extract it to the checkpoints folder.
    • Whisper: The model will be automatically downloaded when you run the code for the first time.

Usage

  1. Make sure your microphone is connected and properly configured.

  2. Run the test_ai.py script:

    python test_ai.py
  3. Press Enter to start recording your speech. Speak clearly into the microphone.

  4. The assistant will transcribe your speech, generate a response using the local LLM, and then speak the response back to you using OpenVoice TTS.

  5. To exit the program, type 'quit' and press Enter.
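
Conceptually, the steps above form a simple record-transcribe-respond-speak loop. The sketch below separates the control flow from the heavy dependencies (the recorder, Whisper, the LLM, and OpenVoice), which are passed in as callables; the function names are illustrative, not the actual API of test_ai.py.

```python
def interaction_loop(record, transcribe, generate, speak, get_input=input):
    """Skeleton of the assistant's main loop.

    record()        -- capture audio from the microphone
    transcribe(a)   -- speech-to-text (e.g. Whisper)
    generate(t)     -- query the local LLM for a reply
    speak(r)        -- text-to-speech playback (e.g. OpenVoice)
    """
    while True:
        command = get_input("Press Enter to record (or type 'quit' to exit): ")
        if command.strip().lower() == "quit":
            break
        audio = record()
        text = transcribe(audio)
        reply = generate(text)
        speak(reply)
```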

Configuration

You can customize the following parameters in the test_ai.py script:

  • base_url: The URL of your local LLM server.
  • api_key: The API key for your local LLM server.
  • config_path: The path to the OpenVoice TTS configuration file.
  • checkpoint_path: The path to the OpenVoice TTS checkpoint file.
  • whisper_model: The name of the Whisper model to use for speech recognition.
  • sample_rate: The sample rate for audio recording.
  • output_path: The path where the generated audio files will be saved.
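
Taken together, these parameters can be pictured as a single settings object. The values below are illustrative defaults only, not the actual contents of test_ai.py:

```python
from dataclasses import dataclass


@dataclass
class AssistantConfig:
    """Tunable parameters of the voice assistant (illustrative defaults)."""
    base_url: str = "http://localhost:1234/v1"    # local LLM server URL
    api_key: str = "lm-studio"                    # placeholder API key
    config_path: str = "checkpoints/config.json"  # OpenVoice TTS config
    checkpoint_path: str = "checkpoints/checkpoint.pth"  # OpenVoice checkpoint
    whisper_model: str = "base"                   # Whisper model name
    sample_rate: int = 16000                      # recording sample rate (Hz)
    output_path: str = "outputs/"                 # generated audio files
```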

Tips and Advanced Usage

Flexible Voice Style Control

Please see demo_part1.ipynb for an example usage of how OpenVoice enables flexible style control over the cloned voice.

Cross-Lingual Voice Cloning

Please see demo_part2.ipynb for an example of cross-lingual voice cloning, covering languages both seen and unseen in the MSML training set.

Gradio Demo

We provide a minimalist local Gradio demo. Launch it with python -m openvoice_app --share.

Advanced Usage

The base speaker model can be replaced with any model (in any language and style) that the user prefers. Use the se_extractor.get_se function as demonstrated in the demo to extract the tone color embedding for the new base speaker.
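
A rough sketch of that workflow follows. All paths and variable names here are illustrative, and the import paths may differ depending on your OpenVoice version; treat this as an outline under those assumptions, not the definitive implementation.

```python
import torch
from openvoice import se_extractor
from openvoice.api import ToneColorConverter

# Illustrative paths -- point these at your own downloaded checkpoints.
ckpt_converter = "checkpoints/converter"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tone color converter used for embedding extraction.
tone_color_converter = ToneColorConverter(
    f"{ckpt_converter}/config.json", device=device
)
tone_color_converter.load_ckpt(f"{ckpt_converter}/checkpoint.pth")

# Extract the tone color embedding for the new base speaker from a
# short reference clip of its output (hypothetical file path).
reference_clip = "resources/new_base_speaker.mp3"
source_se, audio_name = se_extractor.get_se(
    reference_clip, tone_color_converter, vad=True
)
```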

Tips to Generate Natural Speech

Many readily available single- and multi-speaker TTS methods can generate natural speech. Simply swap in the base speaker model you prefer to push speech naturalness to the level you need.

Windows Installation (VS Code)

Please use this guide if you want to install and use OpenVoice on Windows.

License

This project is open-source and available under the MIT License.

Acknowledgements

Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

Contact

If you have any questions or inquiries, please contact the project maintainer at [email protected] - Preston McCauley

Happy interacting with your voice assistant!
