Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling of multichannel in decibel scaling #1778

Open
bmcfee opened this issue Nov 8, 2023 · 0 comments
Open

Better handling of multichannel in decibel scaling #1778

bmcfee opened this issue Nov 8, 2023 · 0 comments
Labels
API change Does this change the behavior of existing API? enhancement Does this improve existing functionality?

Comments

@bmcfee
Copy link
Member

bmcfee commented Nov 8, 2023

Is your feature request related to a problem? Please describe.

When we implemented multichannel support #1130 , one subtle issue that pops up is the calculation of a reference amplitude in decibel scaling. This is because when reference is given as a function (eg, ref=np.max), it is applied globally to the entire input to compute a scalar value:

if callable(ref):
# User supplied a function to calculate reference power
ref_value = ref(magnitude)

This makes sense in the single channel case. However, in the multichannel case, it can break assumptions of channel independence. This was first noticed in the mfcc calculation, where deriving mfccs (which depends on db scaling) can produce different results on a multichannel input than if it was applied independently to each channel. It's also popping up in #1766 by way of onset detection (again, depending on db scaling).

Describe the solution you'd like
Instead of aggregating globally, we could aggregate only over the trailing dimensions. Using keepdims=True should result in channel-wise independent processing. This would be a breaking change, but I expect it would be more in line with expected behavior so it could be worth doing.

There is a subtlety here though. It's not always clear how many trailing dimensions should be included. Most of the time it would be two (time and frequency); sometimes it could be more (eg if patch processing spectra) or less (db scaling single-channel measurements). Probably we'd need some API to expose the aggregation dimensions here, and that would require some thought.

@bmcfee bmcfee added enhancement Does this improve existing functionality? API change Does this change the behavior of existing API? labels Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Does this change the behavior of existing API? enhancement Does this improve existing functionality?
Development

No branches or pull requests

1 participant