Defaults for Multiscale STFT loss #38

Open
turian opened this issue Sep 12, 2022 · 2 comments

@turian
Contributor

turian commented Sep 12, 2022

        fft_sizes=[1024, 2048, 512],
        hop_sizes=[120, 240, 50],
        win_lengths=[600, 1200, 240],

These are the defaults provided. What sample rate are they intended for?

(Just curious, how did you choose them? But desired sample rate is more important for me.)

@csteinmetz1
Owner

This is a good question, and the answer should probably be added to the docstring.

These are the values from the paper the implementation is based on, https://arxiv.org/abs/1910.11480. According to the paper, they are meant for audio at 24 kHz. I generally do not use these defaults, since most of my setups run at a higher sample rate. DDSP opted for a larger number of window and frame sizes, which perhaps somewhat mitigates the variability across sample rates.
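
For anyone working at another sample rate, one possible heuristic (an assumption for illustration, not something the paper or this library prescribes) is to keep each window and hop at roughly the same duration in seconds as it has at 24 kHz. The sketch below does that; the helper names `next_pow2` and `scale_stft_params` are hypothetical, and rounding the FFT size up to a power of two is a convenience choice rather than a requirement.

```python
# Defaults quoted in this issue; per the reply above they target 24 kHz audio.
REF_SR = 24_000
REF_FFT_SIZES = [1024, 2048, 512]
REF_HOP_SIZES = [120, 240, 50]
REF_WIN_LENGTHS = [600, 1200, 240]


def next_pow2(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def scale_stft_params(target_sr: int, ref_sr: int = REF_SR) -> dict:
    """Hypothetical helper: rescale the 24 kHz defaults to target_sr.

    Hop and window lengths are scaled by target_sr / ref_sr so that they
    cover about the same duration in seconds. Each FFT size is rounded up
    to a power of two and is never smaller than its window length.
    """
    hops = [max(1, round(h * target_sr / ref_sr)) for h in REF_HOP_SIZES]
    wins = [max(1, round(w * target_sr / ref_sr)) for w in REF_WIN_LENGTHS]
    ffts = [
        max(next_pow2(round(n * target_sr / ref_sr)), next_pow2(w))
        for n, w in zip(REF_FFT_SIZES, wins)
    ]
    return {"fft_sizes": ffts, "hop_sizes": hops, "win_lengths": wins}


if __name__ == "__main__":
    for sr in (24_000, 44_100, 48_000):
        print(sr, scale_stft_params(sr))
```

The returned dictionary uses the same three keyword names quoted at the top of this issue, so it could be unpacked into the loss constructor. Whether equal-duration windows are actually the right choice for a given task is an open question; this only removes the implicit 24 kHz assumption.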

@turian
Contributor Author

turian commented Sep 13, 2022

Yeah. I guess I take a more hardcore mindset here and believe that NO defaults should be provided, and the docstring should give a few examples (with associated SRs) and their cites. The way it is now, it's a bit easy to footgun yourself I think?
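
A minimal sketch of what "no defaults" could look like, assuming a hypothetical helper `describe_stft_config` that is not part of the library: every argument is keyword-only and required, including the sample rate, and the helper reports each resolution in milliseconds so that the intended time scale is explicit. The only check it makes is that each window fits inside its FFT size.

```python
from typing import Sequence


def describe_stft_config(
    *,
    sample_rate: int,
    fft_sizes: Sequence[int],
    hop_sizes: Sequence[int],
    win_lengths: Sequence[int],
) -> dict:
    """Hypothetical helper with no defaults: the caller must state everything.

    Returns the window and hop durations in milliseconds for each resolution.
    """
    if not (len(fft_sizes) == len(hop_sizes) == len(win_lengths)):
        raise ValueError("fft_sizes, hop_sizes and win_lengths must have equal length")
    for n_fft, win in zip(fft_sizes, win_lengths):
        if win > n_fft:
            raise ValueError(f"win_length {win} exceeds fft_size {n_fft}")
    return {
        "sample_rate": sample_rate,
        "win_ms": [1000 * w / sample_rate for w in win_lengths],
        "hop_ms": [1000 * h / sample_rate for h in hop_sizes],
    }


if __name__ == "__main__":
    # The values quoted in this issue, with the 24 kHz rate stated explicitly.
    print(
        describe_stft_config(
            sample_rate=24_000,
            fft_sizes=[1024, 2048, 512],
            hop_sizes=[120, 240, 50],
            win_lengths=[600, 1200, 240],
        )
    )
```

With the values from this issue and a 24 kHz sample rate, the windows come out at 25, 50 and 10 ms. Calling the helper with any argument missing raises a `TypeError`, which is the behaviour this comment argues for.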
