[♻️] Add tqdm for Progress Bars #176

Open
natmfat wants to merge 1 commit into main

Conversation

@natmfat commented Jan 9, 2024

Addresses #168

I'm not entirely sure which loops should be modified, so:

  1. I introduced the tqdm dependency
  2. Replaced custom timing logic with tqdm in the places I could find it (see the sketch after this list)
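
For illustration, here is a minimal sketch of the kind of replacement in item 2. The directory, the process() helper, and the old timing code are hypothetical stand-ins, not lines from the actual diff:

import os
from tqdm import tqdm

audio_dir = "."  # hypothetical; point this at the directory of audio clips

def process(audio_file):
    """Hypothetical stand-in for the per-file work done in the real loops."""
    pass

# Before (roughly): start = time.time(), then per-file prints of time.time() - start
# After: tqdm wraps the iterable and reports progress, rate, and ETA automatically
for audio_file in tqdm(os.listdir(audio_dir)):
    process(audio_file)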

The notebook runs fine, but the unit tests seem to have failed on the configuration step.

Example of the progress bar in the notebook (screenshot):
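
A related option, in case the bar is meant for Jupyter specifically (not necessarily what this PR does): tqdm.auto typically picks a widget-based bar in notebooks when ipywidgets is available and falls back to the plain console bar elsewhere. A minimal sketch:

from tqdm.auto import tqdm  # notebook-aware: widget bar in Jupyter, text bar in a terminal

for i in tqdm(range(100)):
    pass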

@JacobGlennAyers (Contributor)

Nice! I think the most important place for these would be the os.listdir(path) loops inside generate_automated_labels_tweetynet, generate_automated_labels_microfaune, generate_automated_labels_birdnet, etc., since those are fairly large bottlenecks.

@JacobGlennAyers (Contributor)

Here is an example from some unpushed changes I was playing around with on the template-matching branch. Note that the tqdm is on the loop that iterates through the audio files in a directory:
def generate_automated_labels_template_matching(
        audio_dir,
        isolation_parameters,
        manual_id="template",
        normalized_sample_rate=44100):
    """
    Args:
        audio_dir (string)
            - Path to directory with audio files.

        isolation_parameters (dict)
            - Python Dictionary that controls the various label creation
              techniques.

        manual_id (string)
            - controls the name of the class written to the pandas dataframe.
            - default: "template"

        normalized_sample_rate (int)
            - Sampling rate that the audio files should all be normalized to.

    Returns:
        Dataframe of automated labels for the audio clips in audio_dir.
    """
    logger = logging.getLogger("Template Matching Autogenerated Labels")
    assert isinstance(audio_dir, str)
    assert isinstance(isolation_parameters, dict)
    assert isinstance(manual_id, str)
    assert isinstance(normalized_sample_rate, int)
    assert normalized_sample_rate > 0

    bandpass = False
    b = None
    a = None
    if "cutoff_freq_low" in isolation_parameters.keys() and "cutoff_freq_high" in isolation_parameters.keys():
        bandpass = True
        assert isinstance(isolation_parameters["cutoff_freq_low"], int)
        assert isinstance(isolation_parameters["cutoff_freq_high"], int)
        assert isolation_parameters["cutoff_freq_low"] > 0 and isolation_parameters["cutoff_freq_high"] > 0
        assert isolation_parameters["cutoff_freq_high"] > isolation_parameters["cutoff_freq_low"]
        assert isolation_parameters["cutoff_freq_high"] <= int(0.5 * normalized_sample_rate)

    # initialize annotations dataframe
    annotations = pd.DataFrame()

    # processing the template clip
    try:
        # loading the template signal
        TEMPLATE, SAMPLE_RATE = librosa.load(isolation_parameters["template_path"], sr=normalized_sample_rate, mono=True)
        if bandpass:
            b, a = butter_bandpass(isolation_parameters["cutoff_freq_low"], isolation_parameters["cutoff_freq_high"], SAMPLE_RATE)
            TEMPLATE = filter(TEMPLATE, b, a)

        TEMPLATE_spec = generate_specgram(TEMPLATE, SAMPLE_RATE)
        TEMPLATE_mean = np.mean(TEMPLATE_spec)
        TEMPLATE_spec -= TEMPLATE_mean
        TEMPLATE_std_dev = np.std(TEMPLATE_spec)
        n = TEMPLATE_spec.shape[0] * TEMPLATE_spec.shape[1]

    except KeyboardInterrupt:
        exit("Keyboard Interrupt")
    except BaseException:
        checkVerbose("Failed to load and process template " + isolation_parameters["template_path"], isolation_parameters)
        exit("Can't do template matching without a template")

    # looping through the clips to process
    for audio_file in tqdm(os.listdir(audio_dir)):
        # skip directories
        if os.path.isdir(audio_dir + audio_file):
            continue

        # loading in the audio clip
        try:
            SIGNAL, SAMPLE_RATE = librosa.load(os.path.join(audio_dir, audio_file), sr=normalized_sample_rate, mono=True)
            if bandpass:
                SIGNAL = filter(SIGNAL, b, a)
        except KeyboardInterrupt:
            exit("Keyboard Interrupt")
        except BaseException:
            checkVerbose("Failed to load " + audio_file, isolation_parameters)
            continue

        # generating local score array from clip
        try:
            local_score_arr = template_matching_local_score_arr(SIGNAL, SAMPLE_RATE, TEMPLATE_spec, n, TEMPLATE_std_dev)
        except KeyboardInterrupt:
            exit("Keyboard Interrupt")
        except BaseException:
            checkVerbose("Failed to collect local score array of " + audio_file, isolation_parameters)
            continue

        # passing through isolation technique
        try:
            new_entry = isolate(
                local_score_arr,
                SIGNAL,
                SAMPLE_RATE,
                audio_dir,
                audio_file,
                isolation_parameters,
                manual_id=manual_id,
            )
            if annotations.empty:
                annotations = new_entry
            else:
                annotations = pd.concat([annotations, new_entry])
        except KeyboardInterrupt:
            exit("Keyboard Interrupt")
        except BaseException as e:
            checkVerbose(e, isolation_parameters)
            checkVerbose("Error in isolating bird calls from " + audio_file, isolation_parameters)
            continue

    annotations.reset_index(inplace=True, drop=True)
    return annotations

@JacobGlennAyers (Contributor)

I imagine a message could look something like: "Performing Template Matching on " + dir_name
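
For reference, tqdm can attach such a message through its desc argument. A minimal sketch; dir_name here is just a placeholder for whatever directory variable the function actually uses:

import os
from tqdm import tqdm

dir_name = "."  # placeholder; in the real function this would be audio_dir

# desc= prefixes the progress bar with the suggested message
for audio_file in tqdm(os.listdir(dir_name), desc="Performing Template Matching on " + dir_name):
    pass  # per-clip processing would go here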

@JacobGlennAyers (Contributor)

I'll also add that in the near future we will be adding two new local score array generation techniques: template matching and a foreground-background segmentation technique. Once those are up, it will be easier to add tqdm to their respective loops all at once.

@Sean1572 (Contributor)

@JacobGlennAyers What's the status of this branch? It seems like you wanted to add something before more development was done on it.
