using this stuff for a newbie #1540

Open
elemich opened this issue Mar 18, 2024 · 1 comment
elemich commented Mar 18, 2024

Hello, I'm new to speech recognition, Vosk, and Python, but I want to translate the speech in a simple video I downloaded from the internet (and later run TTS into my language, or even do speech-to-speech).
I have tried the listen_in_background function from the example with the Google engine. It works, although I'm not able to reach my goal (word-by-word translation).

With your software, recognize_vosk in the callback keeps giving me the same result, "Please download the model etc...". I have downloaded the model and unzipped it into vosk/model/it (I'm Italian), but I can't get it to work.

I have Python 3.12, pyaudio, and speech_recognition installed, and my IDE for now is plain IDLE. Can you please give me a simple source to begin with?

@elemich elemich closed this as completed Mar 19, 2024
@elemich elemich reopened this Mar 19, 2024
@kakTAKls

I am a newbie, too. Have you built the libvosk.so library, or downloaded a prebuilt one, and put it inside vosk-api/src? You can check in Python IDLE by importing the module. For example, after ">>>" enter: from vosk import Model, KaldiRecognizer, SetLogLevel. If you see no exceptions or errors, the library works.

Then you can look at https://github.com/alphacep/vosk-api/blob/master/python/example/test_simple.py and edit the file to reach your goal. In addition, vosk can use ffmpeg to convert a video to the PCM 16 kHz 16-bit mono audio format (you can use test_ffmpeg.py in the folder).

Although I have tried to learn Python syntax, I would just use the command line because it is easy:
vosk-transcriber -m path-to-the-directory-of-module/vosk-model-it-0.22 -t srt -i input.mp3 -o output.text
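
For anyone who prefers to stay in Python, here is a minimal sketch of the steps described above, loosely following the pattern of test_simple.py. It is not the project's official example. The function names (`ffmpeg_wav_command`, `is_vosk_ready_wav`, `transcribe_wav`) and the file names in the usage note are made up for illustration, and it assumes that `pip install vosk` has succeeded and that ffmpeg is on the PATH.

```python
import json
import wave


def ffmpeg_wav_command(media_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that converts a media file to
    16-bit mono PCM WAV at the given sample rate."""
    return [
        "ffmpeg", "-y", "-i", media_path,
        "-ar", str(sample_rate),   # resample, 16 kHz by default
        "-ac", "1",                # downmix to mono
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM
        wav_path,
    ]


def is_vosk_ready_wav(wav_path):
    """Return True if the WAV file is mono, 16-bit, uncompressed PCM."""
    with wave.open(wav_path, "rb") as wf:
        return (wf.getnchannels() == 1
                and wf.getsampwidth() == 2
                and wf.getcomptype() == "NONE")


def transcribe_wav(wav_path, model_dir):
    """Transcribe a mono 16-bit WAV file with a local Vosk model folder."""
    # Imported here so the helpers above also work without vosk installed.
    from vosk import Model, KaldiRecognizer

    if not is_vosk_ready_wav(wav_path):
        raise ValueError("convert the audio to mono 16-bit PCM first")

    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform is truthy once a full utterance is decoded.
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result()).get("text", ""))
        pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

A possible way to use it, with placeholder file names: run `subprocess.run(ffmpeg_wav_command("video.mp4", "audio.wav"), check=True)` to extract the audio, then call `transcribe_wav("audio.wav", model_dir)`, where `model_dir` is the folder the Italian model was unzipped into.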
