build webui

bigsk1 · Jun 21, 2024 · 4effe26
1 parent 3ea068e

Showing 12 changed files with 1,104 additions and 15 deletions.
diff --git a/.env.sample b/.env.sample
@@ -1,13 +1,16 @@
 # Conditional API Usage: Depending on the value of MODEL_PROVIDER, that's what will be used when run.
 # You can mix and match; use local Ollama with OpenAI speech or use OpenAI model with local XTTS, etc.
 
-# Model Provider: openai or ollama
+# Model Provider: openai or ollama. When running the web UI, this cannot be changed in the UI; stop the server and restart it to apply a new value.
+# openai or ollama
 MODEL_PROVIDER=ollama
 
 # Character to use - Options: samantha, wizard, pirate, valleygirl, newscaster1920s, alien_scientist, cyberpunk, detective
 CHARACTER_NAME=wizard
 
-# Text-to-Speech Provider - Options: xtts (local uses the custom character .wav) or openai (uses OpenAI TTS voice)
+
+# Text-to-Speech Provider - Options: xtts (local, uses the custom character .wav) or openai (uses OpenAI TTS voice). When running the web UI, this cannot be changed in the UI; stop the server and restart it to apply a new value.
+# openai or xtts
 TTS_PROVIDER=xtts
 
 # OpenAI TTS Voice - When TTS_PROVIDER is set to openai above, it will use the chosen voice.
@@ -20,9 +23,6 @@ OPENAI_BASE_URL=https://api.openai.com/v1/chat/completions
 OPENAI_TTS_URL=https://api.openai.com/v1/audio/speech
 OLLAMA_BASE_URL=http://localhost:11434
 
-# OpenAI API Key for models and speech (replace with your actual API key)
-OPENAI_API_KEY=sk-proj-1111111111
-
 # Models to use - llama3 works well for local usage.
 # OPTIONAL: For screen analysis, if MODEL_PROVIDER is ollama, llava will be used by default.
 # Ensure you have llava downloaded with Ollama. If OpenAI is used, gpt-4o works well.
@@ -32,6 +32,10 @@ OLLAMA_MODEL=llama3
 # The voice speed for XTTS only (1.0 - 1.5, default is 1.1)
 XTTS_SPEED=1.2
 
+
+# OpenAI API Key for models and speech (replace with your actual API key)
+OPENAI_API_KEY=sk-proj-1111111
+
 # NOTES:
 # List of trigger phrases to have the model view your desktop (desktop, browser, images, etc.).
 # It will describe what it sees, and you can ask questions about it:
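
Note on the `.env.sample` comments above: they say `MODEL_PROVIDER` and `TTS_PROVIDER` stay fixed while the web UI is running. One way to get that behavior is to read and validate both values once at startup. The following is only a minimal sketch under that assumption, not the code in this commit; `ProviderConfig` and `load_provider_config` are hypothetical names, and the allowed values are taken from the comments in `.env.sample`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Allowed values, as listed in the .env.sample comments.
MODEL_PROVIDERS = {"openai", "ollama"}
TTS_PROVIDERS = {"openai", "xtts"}


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical container; frozen to mirror 'fixed until restart'."""
    model_provider: str
    tts_provider: str


def load_provider_config(env: Mapping[str, str] = os.environ) -> ProviderConfig:
    """Read and validate the two provider settings once, at startup."""
    model = env.get("MODEL_PROVIDER", "").strip().lower()
    tts = env.get("TTS_PROVIDER", "").strip().lower()
    if model not in MODEL_PROVIDERS:
        raise ValueError(
            f"MODEL_PROVIDER must be one of {sorted(MODEL_PROVIDERS)}, got {model!r}"
        )
    if tts not in TTS_PROVIDERS:
        raise ValueError(
            f"TTS_PROVIDER must be one of {sorted(TTS_PROVIDERS)}, got {tts!r}"
        )
    return ProviderConfig(model_provider=model, tts_provider=tts)
```

Passing a plain dict instead of `os.environ` makes the function easy to check without touching the real environment, for example `load_provider_config({"MODEL_PROVIDER": "ollama", "TTS_PROVIDER": "xtts"})`.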

diff --git a/README.md b/README.md
@@ -15,6 +15,7 @@ Voice Chat AI is a project that allows you to interact with different AI charact
 - **Analyzes user mood and adjusts AI responses accordingly**: Get personalized responses based on your mood.
 - **You can, just by speaking, have the AI analyze your screen and chat about it**: Seamlessly integrate visual context into your conversations.
 - **Easy configuration through environment variables**: Customize the application to suit your preferences with minimal effort.
+- **WebUI or Terminal usage**: Can be run from either the web UI or the terminal.
 
 
 ## Installation
@@ -72,7 +73,7 @@ Voice Chat AI is a project that allows you to interact with different AI charact
  pip install -r cpu_requirements.txt
  ```
 
-Need to have Microsoft C++ Build Tools for TTS
+On Windows, Microsoft C++ Build Tools are required for TTS:
 [Microsoft Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/)
 
 ### Download Checkpoints
@@ -166,8 +167,17 @@ XTTS_SPEED=1.2
 
 Run the application:
 
+Web UI
 ```bash
-python app.py
+uvicorn app.main:app --host 0.0.0.0 --port 8000
+```
+Then open http://localhost:8000/ in your browser.
+
+
+CLI Only
+
+```bash
+python cli.py
 ```
 
 ### Commands
@@ -211,28 +221,30 @@ You are a wise and ancient wizard who speaks with a mystical and enchanting tone
 }
 ```
 
-For XTTS find a .wav voice and add it to the wizard folder and name it as wizard.wav , the voice only needs to be 6 seconds long. Running the app will automaticly find the .wav when it has the characters name and use it. 
+For XTTS, find a .wav voice sample, add it to the wizard folder, and name it wizard.wav; the sample only needs to be 6 seconds long. When the app runs, it automatically finds the .wav that has the character's name and uses it. If you only use OpenAI speech, a .wav isn't needed.
 
 
 ## Watch the Demos
 
-GPU - 100% local - ollama llama3, xtts-v2
+Web UI - OpenAI and Ollama
 
-[![Watch the video](https://img.youtube.com/vi/WsWbYnITdCo/maxresdefault.jpg)](https://youtu.be/WsWbYnITdCo)
+[![Watch the video](https://img.youtube.com/vi/bgdQkzGltdk/maxresdefault.jpg)](https://youtu.be/bgdQkzGltdk)
 
 
 
-CPU Only mode
+CLI
 
-Alien conversation using openai gpt4o and openai speech for tts.
+GPU - 100% local - ollama llama3, xtts-v2
 
-[![Watch the video](https://img.youtube.com/vi/d5LbRLhWa5c/maxresdefault.jpg)](https://youtu.be/d5LbRLhWa5c)
+[![Watch the video](https://img.youtube.com/vi/WsWbYnITdCo/maxresdefault.jpg)](https://youtu.be/WsWbYnITdCo)
 
 
-Valley girl conversation using ollama llama3, openai tts
 
-[![Watch the video](https://img.youtube.com/vi/HSEFH0UnZEk/maxresdefault.jpg)](https://youtu.be/HSEFH0UnZEk)
+CPU-only mode (CLI)
 
+Alien conversation using openai gpt4o and openai speech for tts.
+
+[![Watch the video](https://img.youtube.com/vi/d5LbRLhWa5c/maxresdefault.jpg)](https://youtu.be/d5LbRLhWa5c)
 
 
 ## License
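
Note on the README text about XTTS voices: it says the app finds the `.wav` named after the character inside that character's folder. The lookup could be sketched as below. This is an illustration under stated assumptions, not the project's implementation: `find_character_wav` is a hypothetical helper, and the location of the parent directory holding the character folders is not shown in this diff, so it is passed in as a parameter.

```python
from pathlib import Path
from typing import Optional


def find_character_wav(characters_dir: Path, character_name: str) -> Optional[Path]:
    """Return <characters_dir>/<name>/<name>.wav if it exists, else None.

    The folder layout follows the README wording (wizard folder,
    wizard.wav); the parent directory is an assumption of this sketch.
    """
    candidate = Path(characters_dir) / character_name / f"{character_name}.wav"
    return candidate if candidate.is_file() else None
```

Returning `None` instead of raising fits the README's remark that a `.wav` is not needed when only OpenAI speech is used; a caller would require the file only when `TTS_PROVIDER` is `xtts`.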

diff --git a/app/__init__.py b/app/__init__.py