Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve setup issues for macOS (Venture 13.4): pyaudiowpatch -> pyaud… #91

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
61 changes: 45 additions & 16 deletions AudioRecorder.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,15 @@
import custom_speech_recognition as sr
import pyaudiowpatch as pyaudio
import os
from datetime import datetime

try:
import pyaudiowpatch as pyaudio
except ImportError:
if os.name != "nt":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would like a design where we only query the OS once, and inject the OS dependency into the generic code.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that a structure like this:

if os.name != "nt":
    import pyaudiowpatch as pyaudio
else:
    import pyaudio

would look better. But since now the main platform is Windows, I thought that doing a check every time on an OS that is not fully supported would be redundant. And I thought the approach I chose was the best.

However, I don't mind changing that.

import pyaudio
else:
raise

RECORD_TIMEOUT = 3
ENERGY_THRESHOLD = 1000
DYNAMIC_ENERGY_THRESHOLD = False
Expand Down Expand Up @@ -37,23 +45,44 @@ def __init__(self):
self.adjust_for_noise("Default Mic", "Please make some noise from the Default Mic...")

class DefaultSpeakerRecorder(BaseRecorder):
def __init__(self):
with pyaudio.PyAudio() as p:
wasapi_info = p.get_host_api_info_by_type(pyaudio.paWASAPI)
default_speakers = p.get_device_info_by_index(wasapi_info["defaultOutputDevice"])

if not default_speakers["isLoopbackDevice"]:
for loopback in p.get_loopback_device_info_generator():
if default_speakers["name"] in loopback["name"]:
default_speakers = loopback
break
else:

# Different implementations of obtaining the info dict of a default speaker, for different platforms
if os.name == "nt":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is what I mean, dependency injection seems appropriate here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dependency injection at the class level is not the best idea, since their scope will only be at the class declaration level (but not even at the class level). Minimal reproducible example:

class A:
    import os
    print("os_0", os)
    
    def __init__(self):
        print("os_1", os)
        
A()
print("os_2", os)

So it's better to leave the imports at the module level.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sry what I meant was I would like the code to be extendable where we would just have a function at the beginning
SetUpForOS() which handles all the set up, meaning that we wouldn't need to query for the OS 4 times all over the place. Dependency injection is just one technique we could use to get this code generic, which I thought would help here since we only need to override specific functionally for the classes depending on OS.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are talking about monkey patching, but putting it in a separate function is inappropriate. This will confuse the code and complicate its further maintenance. I suggest a more elegant option: group methods in one place, and replace them for a specific OS using a decorator.

MRE for this:

import sys


def os_dependent(target_class):
    current_os = sys.platform
    name_prefix = "_os_"
    name_prefix_len = len(name_prefix)
    
    for os_dep_name in (method_name for method_name in dir(target_class)
                        if method_name.startswith(name_prefix)):
        method_platform_end_index = os_dep_name.find("_", name_prefix_len)
        
        if os_dep_name[name_prefix_len:method_platform_end_index] == current_os:
            # If this method is for the current system, replace.
            setattr(target_class, os_dep_name[method_platform_end_index+1:],
                    getattr(target_class, os_dep_name))
        
    return target_class


@os_dependent
class OSDependentClass:
    def _os_linux_method(self):
        print("_os_linux_method")
        
    def _os_darwin_method(self):
        print("_os_darwin_method")
        
    def method(self):
        # win32 method
        print("method")
        
    def _os_win32_method1(self):
        ...
        
    def method1(self):
        ...
        
print(OSDependentClass.method.__name__)
print(OSDependentClass.method1.__name__)

Or it could be rewritten to apply the decorator not on the class but on os-dependent methods (should be faster, but less convenient)

pros:

  • good readability
  • method signatures are not hidden
  • structure saving
  • comfortably
  • all replacements are made once at the first import

cons:

  • takes a little longer (in processor cycles) than manual "if-else"

It's not just a one-time solution. This won't be the last time you'll need to write OS-specific code. It would be appropriate to put the decorator in a separate file, and use it in other places when necessary.

And to perform OS-dependent actions at the module level, need to use the if-else block(or leave it as it is now - try-catch block). It will be also more appropriate than a standalone function.

def _get_default_speaker(self):
# Requires PyAudioWPatch >= 0.2.12.6
with pyaudio.PyAudio() as p:
try:
# Get loopback of default WASAPI speaker
return p.get_default_wasapi_loopback()

except OSError:
print("[ERROR] Looks like WASAPI is not available on the system.")

except LookupError:
print("[ERROR] No loopback device found.")

else:
def _get_default_speaker(self):
# At the moment, recording from speakers is only available under Windows
# raise NotImplementedError("Recording from speakers is only available under Windows")

# As far as I understand, now the code style does not provide
# for error handling - only print them.
print("[ERROR] Recording from speakers is only available under Windows.")


def __init__(self):
default_speaker = self._get_default_speaker()

if not default_speaker:
print("[ERROR] Something went wrong while trying to get the default speakers.")
super().__init__(source=sr.Microphone(sample_rate=16000), source_name="Speaker")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think its appropriate to duplicate default mic here. Might be better to just have source as None and extend code to allow none sources where we do nothing with that recorder...not sure

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I originally wanted to do, BUT for this it will be necessary to make a number of changes to the main.py, AudioTranscriber.py and possibly in other places, since the current logic does not provide for such interventions.

Again, if you think this is a necessary change, we can discuss it. I thought that rewriting a significant part of the code to create a temporary (until the issue with the speakers is resolved) patch is unnecessary.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think a rewrite is necessary, more of a extension of the code where we also allow no op recorder would be fine right? Only thing is AudioTranscriber init should be able to handle none source. Hopefully code is written where supporting MacOS with no op recording can be extended easily again to supporting MacOS with specific mac recording.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SevaSk Any update here?

return

source = sr.Microphone(speaker=True,
device_index= default_speakers["index"],
sample_rate=int(default_speakers["defaultSampleRate"]),
device_index=default_speaker["index"],
sample_rate=int(default_speaker["defaultSampleRate"]),
chunk_size=pyaudio.get_sample_size(pyaudio.paInt16),
channels=default_speakers["maxInputChannels"])
channels=default_speaker["maxInputChannels"])
super().__init__(source=source, source_name="Speaker")
self.adjust_for_noise("Default Speaker", "Please make or play some noise from the Default Speaker...")
self.adjust_for_noise("Default Speaker", "Please make or play some noise from the Default Speaker...")
12 changes: 10 additions & 2 deletions AudioTranscriber.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,17 @@
import custom_speech_recognition as sr
import io
from datetime import timedelta
import pyaudiowpatch as pyaudio
from heapq import merge

try:
import pyaudiowpatch as pyaudio
except ImportError:
if os.name != "nt":
import pyaudio
else:
raise


PHRASE_TIMEOUT = 3.05

MAX_PHRASES = 10
Expand Down Expand Up @@ -112,4 +120,4 @@ def clear_transcript_data(self):
self.audio_sources["Speaker"]["last_sample"] = bytes()

self.audio_sources["You"]["new_phrase"] = True
self.audio_sources["Speaker"]["new_phrase"] = True
self.audio_sources["Speaker"]["new_phrase"] = True
5 changes: 4 additions & 1 deletion custom_speech_recognition/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,10 @@ def get_pyaudio():
try:
import pyaudiowpatch as pyaudio
except ImportError:
raise AttributeError("Could not find PyAudio; check installation")
if os.name != "nt":
import pyaudio
else:
raise
from distutils.version import LooseVersion
if LooseVersion(pyaudio.__version__) < LooseVersion("0.2.11"):
raise AttributeError("PyAudio 0.2.11 or later is required (found version {})".format(pyaudio.__version__))
Expand Down
5 changes: 3 additions & 2 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ openai-whisper==20230314
Wave==0.0.2
openai==0.27.6
customtkinter==5.1.3
PyAudioWPatch==0.2.12.5
PyAudioWPatch==0.2.12.6; sys_platform == 'win32'
pyaudio==0.2.13; sys_platform != 'win32'
--extra-index-url https://download.pytorch.org/whl/cu117
torch
torch