Typescript - incorrect type when using verbose_json as the whisper transcription response_format #702
We hope to add support for this in the coming months.
I just ran this today, and the response I get back includes `language`:

```ts
const transcription = await this.client.audio.transcriptions.create({
  file: fs.createReadStream(tempFileName),
  response_format: 'verbose_json',
  model: 'whisper-1'
});
```

```js
{
  task: "transcribe",
  language: "english",
  duration: 2.0399999618530273,
  text: "Hello World and all the bunnies!",
  segments: [
    {
      id: 0,
      seek: 0,
      start: 0,
      end: 2,
      text: " Hello World and all the bunnies!",
      tokens: [ 50364, 2425, 3937, 293, 439, 264, 6702, 40549, 0, 50464 ],
      temperature: 0,
      avg_logprob: -0.5200682878494263,
      compression_ratio: 0.8421052694320679,
      no_speech_prob: 0.017731403931975365,
    }
  ],
}
```

I'm piggybacking off this issue: I did notice that if I set the response format to 'text', the expected return type is a

Question: are more tokens required/credits used if I request 'verbose_json' vs 'text'?
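On the 'text' vs 'verbose_json' point: the raw API returns a plain string for 'text' and an object for 'verbose_json', so until the SDK's static types distinguish them, a runtime guard can help. A minimal sketch (the `isVerbose` helper and its narrowed shape are my own, not part of the openai SDK):

```typescript
// Hypothetical helper, not part of the openai SDK: narrows an untyped
// transcription result to the verbose_json shape by checking for a
// field that only verbose_json responses carry.
function isVerbose(
  result: unknown
): result is { language: string; duration: number; text: string } {
  return typeof result === 'object' && result !== null && 'language' in result;
}

// 'verbose_json' responses are objects with task/language/duration/segments:
console.log(isVerbose({ task: 'transcribe', language: 'english', duration: 2.04 })); // true
// 'text' responses are plain strings:
console.log(isVerbose('Hello World and all the bunnies!')); // false
```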
…on response_format, fixes openai#702
Hello everyone, I encountered the same issue with the return object. As a temporary workaround in my project, I added an interface based on the documentation to better handle the function's return type. Another approach would be to use a fork of the project and implement the fix there, but that requires staying vigilant for updates and conflicts from the original repository. To address this, I've submitted a Pull Request with the correction, hoping the maintainers will integrate it. Let's wait and see.

@dereckmezquita Regarding whether the cost differs depending on the response type: I'm not certain, but I believe it does not. Billing is based on the data generated, counted in tokens, not on the size of the response. What the documentation does make clear is that requesting the "verbose_json" response with "word"-level segments increases latency.
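For anyone wanting the same stopgap, the hand-written interface described above might look like this; it mirrors the verbose_json payload shown earlier in the thread (the interface names are mine, not exported by the openai package):

```typescript
// Placeholder types based on the verbose_json payload in this thread;
// these are not part of the openai package's public exports.
interface TranscriptionSegment {
  id: number;
  seek: number;
  start: number;
  end: number;
  text: string;
  tokens: number[];
  temperature: number;
  avg_logprob: number;
  compression_ratio: number;
  no_speech_prob: number;
}

interface VerboseTranscription {
  task: string;
  language: string;
  duration: number;
  text: string;
  segments: TranscriptionSegment[];
}

// Usage: cast the SDK's result through `unknown` to the richer type, e.g.
//   const verbose = transcription as unknown as VerboseTranscription;
// Demonstrated here on a sample payload instead of a live API call:
const payload: VerboseTranscription = JSON.parse(
  '{"task":"transcribe","language":"english","duration":2.04,' +
  '"text":"Hello World and all the bunnies!","segments":[]}'
);
console.log(payload.language); // english
```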
Confirm this is a Node library issue and not an underlying OpenAI API issue
Describe the bug
With Whisper, while using the `verbose_json` response_format parameter, `audio.transcriptions.create` returns the type `Transcription`, which does not include the extra details from `verbose_json`.

To Reproduce
See the code snippet
Code snippets
OS
macOS
Node version
18.16.1
Library version
openai v4.28.4