Better handling of multichannel in decibel scaling #1778

bmcfee · 2023-11-08T18:33:03Z

Is your feature request related to a problem? Please describe.

When we implemented multichannel support (#1130), one subtle issue that came up is how the reference amplitude is calculated in decibel scaling. When the reference is given as a function (e.g., ref=np.max), it is applied globally to the entire input to compute a single scalar value:

librosa/librosa/core/spectrum.py, lines 1866 to 1868 in 5ca70f5:

    if callable(ref):
        # User supplied a function to calculate reference power
        ref_value = ref(magnitude)

This makes sense in the single-channel case. In the multichannel case, however, it can break the assumption of channel independence. This was first noticed in the MFCC calculation, which depends on dB scaling: computing MFCCs on a multichannel input can produce different results than computing them independently on each channel. It is also showing up in #1766 by way of onset detection, which again depends on dB scaling.
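
To make the discrepancy concrete, here is a minimal NumPy-only sketch. The `to_db` function is a simplified stand-in for decibel scaling (no `top_db` clipping), not librosa's actual implementation, and the stereo array is made up for illustration.

```python
import numpy as np


def to_db(power, ref=np.max, amin=1e-10):
    """Simplified stand-in for dB scaling: 10 * log10(power / ref)."""
    # A callable reference is applied to the whole input, yielding one scalar.
    ref_value = ref(power) if callable(ref) else np.abs(ref)
    log_spec = 10.0 * np.log10(np.maximum(amin, power))
    return log_spec - 10.0 * np.log10(np.maximum(amin, ref_value))


rng = np.random.default_rng(0)
# Hypothetical stereo power spectrogram: (channel, frequency, time)
S = rng.random((2, 5, 7)) + 0.1
S[1] *= 100.0  # the second channel is much louder than the first

joint = to_db(S)                              # one global reference
separate = np.stack([to_db(ch) for ch in S])  # one reference per channel

# The loud channel sets the global maximum, so it is unaffected ...
same_loud = np.allclose(joint[1], separate[1])
# ... but the quiet channel is shifted by the level difference between channels.
same_quiet = np.allclose(joint[0], separate[0])
```

Here the quiet channel's dB values depend on the content of the loud channel, which is the channel-dependence described above.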

Describe the solution you'd like
Instead of aggregating globally, we could aggregate only over the trailing dimensions. Using keepdims=True should result in channel-wise independent processing. This would be a breaking change, but I expect it would be more in line with expected behavior so it could be worth doing.
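
As a rough sketch of this idea (assuming the last two axes are frequency and time, and using a hypothetical helper name `channelwise_max`), the reference can be reduced over the trailing axes with `keepdims=True` so that it broadcasts back against the input:

```python
import numpy as np


def channelwise_max(power):
    """Hypothetical helper: one reference value per leading (channel) index."""
    return np.max(power, axis=(-2, -1), keepdims=True)


amin = 1e-10
rng = np.random.default_rng(0)
# Made-up stereo power spectrogram: (channel, frequency, time)
S = rng.random((2, 5, 7)) + 0.1
S[1] *= 100.0

ref_value = channelwise_max(S)  # shape (2, 1, 1), broadcasts against S
db = 10.0 * np.log10(np.maximum(amin, S)) - 10.0 * np.log10(np.maximum(amin, ref_value))

# Reference result: scale each channel on its own with a scalar maximum.
per_channel = np.stack(
    [10.0 * np.log10(np.maximum(amin, ch)) - 10.0 * np.log10(np.maximum(amin, ch.max())) for ch in S]
)
```

With a per-channel reference, each channel peaks at 0 dB regardless of what the other channel contains.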

There is a subtlety here, though: it's not always clear how many trailing dimensions should be included. Most of the time it would be two (frequency and time); sometimes it could be more (e.g., when patch-processing spectra) or fewer (e.g., dB-scaling single-channel measurements). We would probably need some API to expose the aggregation dimensions, and that would require some thought.
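
One possible shape for such an API, purely as an illustration: the `ref_axes` parameter below is hypothetical and not part of librosa, and the sketch assumes the reference function accepts `axis` and `keepdims` arguments, as `np.max` and `np.median` do.

```python
import numpy as np


def to_db(power, ref=np.max, ref_axes=None, amin=1e-10):
    """Hypothetical dB scaling with configurable reference axes.

    ref_axes=None reduces over the whole input (the global behavior described
    in this issue); otherwise ref is called with axis=ref_axes, keepdims=True.
    """
    power = np.asarray(power)
    if callable(ref):
        if ref_axes is None:
            ref_value = ref(power)
        else:
            ref_value = ref(power, axis=ref_axes, keepdims=True)
    else:
        ref_value = np.abs(ref)
    log_spec = 10.0 * np.log10(np.maximum(amin, power))
    return log_spec - 10.0 * np.log10(np.maximum(amin, ref_value))


rng = np.random.default_rng(0)
spec = rng.random((2, 5, 7)) + 0.1        # (channel, frequency, time)
patches = rng.random((2, 4, 5, 3)) + 0.1  # (channel, patch, frequency, time)
levels = rng.random((2, 9)) + 0.1         # (channel, measurement)

db_spec = to_db(spec, ref_axes=(-2, -1))            # the common case: two axes
db_patches = to_db(patches, ref_axes=(-3, -2, -1))  # more axes for patches
db_levels = to_db(levels, ref_axes=-1)              # fewer axes for 1-D data
db_global = to_db(spec)                             # global reference
```

Keeping `None` as the default would preserve the existing global behavior, while an explicit tuple makes the choice of aggregation axes visible at the call site. Whether that default should change is the breaking-change question raised above.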

bmcfee added the enhancement and API change labels on Nov 8, 2023
