Integrate text-to-speech and speech-to-text functionality #44

amakropoulos · 2024-01-22T16:25:01Z

No description provided.

ArEnSc · 2024-02-10T18:37:50Z

Please make this a separate, optional package.

amakropoulos · 2024-02-11T08:17:52Z

Yes certainly, it will be possible to attach STT or TTS to the chat functionality but it will not be enabled by default.
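
To illustrate what "attachable but not enabled by default" could look like, here is a minimal sketch in Python. It is not the LLM for Unity API: the `Chat` class, its `reply`, `stt`, and `tts` fields, and the method names are all hypothetical and exist only to show the opt-in pattern.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Chat:
    """Hypothetical chat wrapper used only to illustrate opt-in speech hooks."""

    reply: Callable[[str], str]                    # stand-in for the LLM call
    stt: Optional[Callable[[bytes], str]] = None   # speech-to-text, off by default
    tts: Optional[Callable[[str], bytes]] = None   # text-to-speech, off by default

    def send_text(self, text: str) -> str:
        # Plain text chat works with no speech component attached.
        return self.reply(text)

    def send_audio(self, audio: bytes) -> str:
        # Audio input is only available once an STT hook has been attached.
        if self.stt is None:
            raise RuntimeError("STT is not attached")
        return self.reply(self.stt(audio))

    def speak(self, text: str) -> Optional[bytes]:
        # Returns synthesized audio if a TTS hook is attached, otherwise None.
        return self.tts(text) if self.tts is not None else None
```

A caller who never sets `stt` or `tts` gets text-only chat, so the speech components can live in a separate package and be wired in only when needed.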

simoninithomas · 2024-03-03T20:17:14Z

Hey there 👋, will you use Sentis for STT and TTS, or do you have another idea?

We have some Sentis models on the Hub that are super fast (Tiny Whisper and Jets).

Tiny Whisper: https://huggingface.co/unity/sentis-whisper-tiny
Jets: https://huggingface.co/unity/sentis-jets-text-to-speech

Demo with Whisper: https://singularite.itch.io/jammo-the-robot-with-unity-sentis-whisper-version

amakropoulos · 2024-03-04T09:19:30Z

Hi, thank you for the suggestions!
I need to do a small exploration first, but yes, I was thinking of starting with your Whisper-Tiny model 🙂.
Ideally I would like to support a range of models, e.g. similar to the whisper.cpp project, but I need to have it working cross-platform in Unity, which is a work in progress (link).

By the way, thanks a lot for your great work on sharp-transformers ⭐!
I'm using it in the other repo, RAGSearchUnity, to build a RAG similarity search system!

siddhant-bharti · 2024-03-09T21:27:16Z

Hi @amakropoulos, I want this functionality for a project I am building! Are you planning to add it soon? I can help raise a PR for it too, if you are fine with that. Looking forward to hearing from you. Thanks!

amakropoulos · 2024-03-22T07:41:16Z

@siddhant-bharti I'm replying here as well :).
This is the next big feature that I'll work on soon.

@simoninithomas I can't use Jets because it has a cc-by-4.0 license.
The Unity Asset Store does not allow packages with licenses that require attribution, and I'd like LLM for Unity to be there as well (P.S. we have been live on the Asset Store since last week 🎉!)

amakropoulos · 2024-04-04T06:14:33Z

This feature is blocked at the moment.
I can't find an open-source library for TTS to integrate that fulfills the following requirements:

- C/C++/C# code without many dependencies
- MIT/Apache 2.0 or any other equivalent license that is open-source and attribution-free
- allows multiple voices

The best solution would be Piper, but at the moment it has a potential license issue due to using espeak (link).
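
The requirements above can be read as a filter over candidate libraries. The sketch below is only an illustration in Python: the `Candidate` fields, the `max_dependencies` threshold (a guess at what "without many dependencies" means), and the two example entries are hypothetical and do not describe any real library. The license of every dependency is checked too, since a dependency can block an otherwise acceptable library.

```python
from dataclasses import dataclass, field

ALLOWED_LANGUAGES = {"C", "C++", "C#"}
# Licenses named in the requirement; "equivalent" ones would have to be added by hand.
ACCEPTED_LICENSES = {"MIT", "Apache-2.0"}


@dataclass
class Candidate:
    """Hypothetical description of a TTS library under evaluation."""

    name: str
    language: str
    license: str
    voices: int
    # Maps each dependency name to its license.
    dependencies: dict = field(default_factory=dict)


def meets_requirements(c: Candidate, max_dependencies: int = 3) -> bool:
    # Collect the library's own license plus those of all its dependencies.
    licenses = [c.license, *c.dependencies.values()]
    return (
        c.language in ALLOWED_LANGUAGES
        and len(c.dependencies) <= max_dependencies
        and all(lic in ACCEPTED_LICENSES for lic in licenses)
        and c.voices > 1
    )


# Made-up entries, for illustration only.
ok = Candidate("example-tts-a", "C++", "MIT", voices=4, dependencies={"libfoo": "Apache-2.0"})
blocked = Candidate("example-tts-b", "C++", "MIT", voices=4, dependencies={"libbar": "GPL-3.0"})
```

Here `meets_requirements(ok)` is true, while `blocked` fails only because of the license of one dependency, which is the kind of situation described for Piper.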

Pipsun · 2024-04-10T08:45:15Z

> This feature is blocked at the moment. I can't find an open-source library for TTS to integrate that fulfills the following requirements:
>
> - C/C++/C# code without many dependencies
> - MIT/Apache 2.0 or any other equivalent license that is open-source and attribution-free
> - allows multiple voices
>
> The best solution would be Piper, but at the moment it has a potential license issue due to using espeak (link).

Hello, I've integrated your project with OpenCV for face tracking, VRoid as the avatar, Vosk for STT, and Piper for TTS. I think the most interesting addition would be an integration with RVC, but I have no time for it. Do you know of any ready-to-use RVC integrations for Unity?

Swiftyos · 2024-04-21T08:22:29Z

Adding TTS and STT functionality would take llamafile to the next level!

amakropoulos changed the title ~~Integrate text-to-speech and speech-to-text functionalities~~ Integrate text-to-speech and speech-to-text functionality Jan 22, 2024

amakropoulos added the enhancement (New feature or request) label Jan 22, 2024

amakropoulos added this to the v1.2.0 milestone Feb 15, 2024

amakropoulos modified the milestones: ~~v1.2.0~~, v1.3.0 Mar 4, 2024

amakropoulos removed this from the v1.3.0 milestone Apr 4, 2024
