Releases · Macoron/whisper.unity

09 May 08:48

Macoron

1.3.1

30f9e11

1.3.1 Latest

New minor release. It updates whisper.cpp to 1.5.5 and includes bug fixes.

What's Changed

Fixed out of bounds exception during resampling by @Macoron in #74
Add visionOS support by @Macoron in #75
Added missing Accelerate framework by @Macoron in #76
Update README.md with VisionOS support by @yosun in #77
Updated whisper.cpp to 1.5.5 by @Macoron in #84

New Contributors

@yosun made their first contribution in #77

Full Changelog: 1.3.0...1.3.1

Contributors

yosun and Macoron

30 Nov 21:54

Macoron

1.3.0

25c8d26

1.3.0 - GPU Support

This release introduces an update of whisper.cpp to 1.5.1, GPU inference support, and other minor improvements.

Whisper.cpp updated to 1.5.1

whisper.cpp 1.5.1 brings many improvements and bug fixes, including better GPU usage.

Check the original release notes for more information.

GPU Support

Whisper now supports GPU acceleration. This can drastically improve performance on some hardware.

Model	CPU	CUDA
tiny	1188 ms	185 ms
small	8992 ms	517 ms
large-v2	60325 ms	1946 ms

Transcription times for "jfk.wav" on Windows with an Intel Core i5-12400F and an Nvidia GeForce RTX 2070 Super.

Model	CPU	Metal
tiny	1113 ms	189 ms
small	6319 ms	860 ms
large-v2	40608 ms	3888 ms

Transcription times for "jfk.wav" on an Apple M1 Pro.
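To make the two tables easier to compare, the speedup of GPU over CPU can be computed directly from the published timings. This is only arithmetic on the numbers above; the dictionary and function names are illustrative and not part of whisper.unity.

```python
# Timings copied from the two tables above: (CPU ms, GPU ms) for "jfk.wav".
cuda = {"tiny": (1188, 185), "small": (8992, 517), "large-v2": (60325, 1946)}
metal = {"tiny": (1113, 189), "small": (6319, 860), "large-v2": (40608, 3888)}


def speedups(timings):
    """Return {model: cpu_ms / gpu_ms}, rounded to one decimal place."""
    return {model: round(cpu / gpu, 1) for model, (cpu, gpu) in timings.items()}


cuda_speedups = speedups(cuda)
metal_speedups = speedups(metal)
print(cuda_speedups)   # {'tiny': 6.4, 'small': 17.4, 'large-v2': 31.0}
print(metal_speedups)  # {'tiny': 5.9, 'small': 7.3, 'large-v2': 10.4}
```

In these measurements the gain grows with model size, so the larger models benefit most from GPU inference.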

On Windows and Linux you need an Nvidia GPU and an installed CUDA Toolkit (tested with 12.2.0). A Unity project compiled with CUDA enabled expects your end users to have an Nvidia GPU and the CUDA libraries. Running the build without them will result in an error.

On macOS you need an ARM CPU, such as the M1 or newer. iOS Metal inference isn't supported yet. On Intel or older hardware, whisper.cpp should fall back to CPU inference.

To activate GPU inference, go to Project Settings => Whisper => Enable CUDA or Enable Metal. For more information, check the README.

Other

Ubuntu libraries are now compiled on Ubuntu 20.04. This might cause problems on Ubuntu 18.04. If you need support for earlier versions of Ubuntu or other distros, consider recompiling the libraries from source.

A new loop mode for the microphone was added. It creates an endless stream using Unity's built-in circular microphone loop, which is very useful for streaming transcription with whisper. To activate it, set Loop in MicrophoneRecord to "true".
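The main subtlety of a circular recording buffer is that new audio can wrap around the end of the buffer. The sketch below illustrates that general idea in Python; it is not the MicrophoneRecord implementation, and the function and parameter names are hypothetical.

```python
def read_new_samples(ring, last_pos, current_pos):
    """Return the samples written to a circular buffer since last_pos.

    ring        -- the fixed-size circular buffer (a list of samples)
    last_pos    -- index where the previous read stopped
    current_pos -- index the recorder will write to next
    """
    if current_pos >= last_pos:
        # No wrap-around: the new samples form one contiguous slice.
        return ring[last_pos:current_pos]
    # The writer wrapped past the end: take the tail, then the head.
    return ring[last_pos:] + ring[:current_pos]


ring = [0, 1, 2, 3, 4, 5, 6, 7]
print(read_new_samples(ring, 2, 5))  # [2, 3, 4]
print(read_new_samples(ring, 6, 2))  # [6, 7, 0, 1]
```

Note that equal positions are treated as "nothing new", so a reader in this scheme has to poll often enough that the writer never completes a full lap between reads.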

What's Changed

Endless loop microphone and memory leak fix by @Macoron in #55
Updated whisper.cpp to 1.5.0 by @Macoron in #60
Add CUDA support for Windows by @Macoron in #61
Add CUDA support for Linux by @Macoron in #63
Metal support for MacOS by @Macoron in #64
Updated whisper.cpp to 1.5.1 by @Macoron in #65

Full Changelog: 1.2.1...1.3.0

Contributors

Macoron

25 Aug 09:54

Macoron

1.2.1

8725359

1.2.1

This release introduces VAD and some other minor improvements.

Voice Activity Detection (VAD)

Voice Activity Detection (VAD) was added to this project. It lets you check whether the current audio contains any speech. For example, you can stop microphone input when the user stops speaking.

output.mp4

The implementation of the VAD is very basic. It is a direct port of the energy-based VAD from whisper.cpp. Don't expect it to be very robust, but as a proof of concept it should work fine.
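As a rough illustration of what "energy-based" means here, the sketch below compares the average energy of the most recent audio with the average energy of the whole buffer. It is a simplified Python sketch and not the project's code: the function name, parameter names, and default values are assumptions, and any filtering step is omitted.

```python
def speech_ended(samples, sample_rate, last_ms=1000, vad_thold=0.6):
    """Guess whether speech has stopped at the end of `samples`.

    Compares the mean absolute amplitude of the last `last_ms` milliseconds
    with the mean over the whole buffer. All defaults are illustrative.
    """
    n_last = sample_rate * last_ms // 1000
    if n_last <= 0 or n_last >= len(samples):
        return False  # not enough audio to compare against
    energy_all = sum(abs(s) for s in samples) / len(samples)
    energy_last = sum(abs(s) for s in samples[-n_last:]) / n_last
    # A quiet tail relative to the whole buffer suggests speech has ended.
    return energy_last <= vad_thold * energy_all


rate = 16000
loud = [0.5, -0.5] * (rate // 2)   # one second of "speech-like" signal
quiet = [0.0] * rate               # one second of silence
print(speech_ended(loud + quiet, rate))  # True
print(speech_ended(loud + loud, rate))   # False
```

Because the decision is relative to the buffer's own average, steady background noise can look the same as speech, which is consistent with the caveat above about robustness.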

VAD Streaming

Streaming now supports VAD. This should drastically reduce hallucinations that were caused by silent audio regions.

output_novad.mp4

ggml.base.en, VAD disabled

output_vad.mp4

ggml.base.en, VAD enabled

What's Changed

Added VAD and VAD Stop by @Macoron in #44
Better logging by @Macoron in #48
VAD for streaming by @Macoron in #49
New stream events and more documentation by @Macoron in #53

Full Changelog: 1.2.0...1.2.1

Contributors

Macoron

25 Jul 21:05

Macoron

1.2.0

36526a3

1.2.0

New major release with a lot of changes.

whisper.cpp updated to 1.4.2

While 1.4.2 is technically still in beta, it has been available for several months and seems to be stable. The quality of transcription shouldn't have changed, but some results look different compared to previous versions. If this is critical for you, consider using a previous release.

Prompting

Whisper.unity now supports prompting. Prompting helps you "guide" the transcription style, names, or specific terminology. It isn't as powerful as prompting an LLM, but you can get really interesting results with it.

Streaming

output.mp4

The first version of transcription streaming was added. Transcription now updates in real time from a microphone or audio stream. This is mostly a direct port of the original whisper.cpp demo, except for VAD.
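One common way to stream transcription is a sliding window: audio is processed in fixed-size steps, and each step is prefixed with a little of the preceding audio so that words cut at a boundary keep some context. The Python sketch below shows only that windowing idea under those assumptions; the names `step` and `keep` are hypothetical and it is not the project's streaming code.

```python
def sliding_windows(samples, step, keep):
    """Yield windows of `step` new samples, each prefixed by up to `keep`
    samples of earlier audio for context."""
    for start in range(0, len(samples), step):
        begin = max(0, start - keep)
        yield samples[begin:start + step]


windows = list(sliding_windows(list(range(10)), step=4, keep=2))
print(windows)  # [[0, 1, 2, 3], [2, 3, 4, 5, 6, 7], [6, 7, 8, 9]]
```

A larger `keep` gives each window more context at the cost of re-processing more audio per step.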

What's Changed

Update whisper.cpp to 1.4.2 by @Macoron in #30
Add prompting support by @SharafeevRavil in #25
Fixed string conversion error by @Macoron in #34
Add progress callback by @Macoron in #35
setter for modelPath by @achimmihca in #37
Quick-fix of il2cpp by @Macoron in #38
Sliding window streaming support by @Macoron in #40
Fixed some initialize by @Macoron in #41
Samples cleanup by @Macoron in #43

New Contributors

@achimmihca made their first contribution in #37

Full Changelog: 1.1.1...1.2.0

Contributors

Macoron, achimmihca, and SharafeevRavil

04 Jun 10:58

Macoron

1.1.1

9ffa632

1.1.1

Minor release. Adds prebuilt Linux binaries and GitHub Actions tests/builds.

What's Changed

Linux support by @Macoron in #21
Add github actions for test runner by @Macoron in #24
CI for build whisper.cpp libraries by @Macoron in #27
Fix unity test runner by @Macoron in #28

Full Changelog: 1.1.0...1.1.1

Contributors

Macoron

29 Apr 14:36

Macoron

1.1.0

cb7a5c3

1.1.0

This release adds timestamps and confidence data for segments and tokens. It changes the signature of the OnNewSegment event and the WhisperResult class, so make sure to update your code if you use them.

What's Changed

Segments timestamp by @Macoron in #17
Tokens data and new subtitles demo by @Macoron in #18

Demos

Segments timestamps prediction

subtitles.mp4

whisper.tiny in subtitles demo, color shows confidence level for each token

Full Changelog: 1.0.3...1.1.0

Contributors

Macoron

21 Apr 18:08

Macoron

1.0.3

ddec093

1.0.3

What's Changed

Language API by @Macoron in #7
Input selector for microphone demo (+some refactoring) (#11) by @SharafeevRavil in #12
Support for Unity 2019.4 and newer by @Macoron in #15
Set prepare iOS for recording by @Macoron in #16

New Contributors

@SharafeevRavil made their first contribution in #12

Language detection example

Full Changelog: 1.0.2...1.0.3

Contributors

Macoron and SharafeevRavil

12 Apr 21:12

Macoron

1.0.2

dce5dae

1.0.2

What's Changed

Add basic unit testing in #5
Text segments streaming in #6
Minor readme changes

Full Changelog: 1.0.1...1.0.2

Text segment streaming

text-streaming.mp4

08 Apr 10:24

Macoron

1.0.1

c11254d

1.0.1

What's Changed

Expose more whisper parameters in #3
Faster Android inference in #4

Full Changelog: 1.0.0...1.0.1

27 Mar 22:40

Macoron

1.0.0

5275217

1.0.0

First release
