[1.0.1] - 2024-01-28
With push-to-talk being a highly requested feature, I figured it's worth creating a release now that we have two new recording modes — hold-to-record and press-to-toggle! Thanks to @McEsgow for their initial push-to-talk PR (#28). Plus, we're using faster-whisper now (#11)!
Some more shout-outs:
- @uberkael for resolving non-English keyboard issues by migrating to pynput (#10).
- @thfrei for sharing their whisper-writer implementation that has continuous recording and transcribing (#21).
- @danshapiro for sharing their whisper-writer implementation that has push-to-talk (#21).
- @filyp for sharing their whisper-simple-dictation project that has push-to-talk (#25).
- @avi-cenna for sharing their whisper-server project that integrates with Hammerspoon (#14).
Changelog
Added
- New message to identify whether Whisper is being called through the API or run locally.
- Additional hold-to-talk (PR #28) and press-to-toggle recording methods (Issue #21).
- New configuration options to:
  - Choose recording method (defaulting to voice activity detection).
  - Choose which sound device and sample rate to use.
  - Hide the status window (PR #28).
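To make the difference between the two new recording methods concrete, here is a minimal sketch of how activation-key events could map to recording state. The `RecordingMode` and `Recorder` names and the mode strings are hypothetical, chosen for illustration; they are not taken from the project's code or configuration.

```python
from enum import Enum


class RecordingMode(Enum):
    # Hypothetical values; the project's real config strings may differ.
    HOLD_TO_RECORD = "hold_to_record"
    PRESS_TO_TOGGLE = "press_to_toggle"


class Recorder:
    """Tracks whether recording is active, given activation-key events."""

    def __init__(self, mode: RecordingMode) -> None:
        self.mode = mode
        self.recording = False

    def on_key_down(self) -> None:
        if self.mode is RecordingMode.HOLD_TO_RECORD:
            # Record for as long as the key is held.
            self.recording = True
        else:
            # Each press flips between recording and idle.
            self.recording = not self.recording

    def on_key_up(self) -> None:
        if self.mode is RecordingMode.HOLD_TO_RECORD:
            # Releasing the key ends the recording.
            self.recording = False
        # In press-to-toggle mode, releasing the key changes nothing.
```

A real keyboard listener may deliver repeated key-down events while a key is held, so a press-to-toggle handler would likely need to ignore those repeats.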
Changed
- Migrated from `whisper` to `faster-whisper` (Issue #11).
- Migrated from `pyautogui` to `pynput` (PR #10).
- Migrated from `webrtcvad` to `webrtcvad-wheels` (PR #17).
- Changed default activation key combo from `ctrl+alt+space` to `ctrl+shift+space`.
- Changed to using a local model rather than the API by default.
- Revamped README.md, including new Roadmap, Contributing, and Credits sections.
Fixed
- Local model is now loaded only once at start-up, rather than every time the activation key combo is pressed.
- Default configuration now auto-chooses compute type for the local model to avoid warnings.
- Graceful degradation to CPU if CUDA isn't available (PR #30).
- Removed long prefix of spaces in transcription (PR #19).
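The first three fixes can be illustrated together. This is a sketch under stated assumptions, not the project's implementation: `get_model` and `load_with_cpu_fallback` are hypothetical helpers, the model constructor is passed in as `factory`, and the exception types caught on a failed CUDA load are a guess about how the backend reports the problem.

```python
from typing import Any, Callable, Dict

# Module-level cache so each model is built once, not on every key press.
_model_cache: Dict[str, Any] = {}


def load_with_cpu_fallback(factory: Callable[..., Any], model_name: str) -> Any:
    """Try to build the model on CUDA; fall back to CPU if that fails."""
    try:
        # "auto" asks the backend to pick a compute type the device supports.
        return factory(model_name, device="cuda", compute_type="auto")
    except (RuntimeError, ValueError):
        # Assumption: the backend raises one of these when CUDA is unavailable.
        return factory(model_name, device="cpu", compute_type="auto")


def get_model(factory: Callable[..., Any], model_name: str) -> Any:
    """Return the cached model, loading it only on the first call."""
    if model_name not in _model_cache:
        _model_cache[model_name] = load_with_cpu_fallback(factory, model_name)
    return _model_cache[model_name]
```

With faster-whisper installed, `factory` could plausibly be `faster_whisper.WhisperModel`, whose constructor accepts `device` and `compute_type` keyword arguments; calling `get_model` once at start-up and reusing the result avoids reloading on each activation.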
Full Changelog: v1.0.0...v1.0.1