Add surround to FMOD plugins #329

Merged
merged 13 commits into from
May 20, 2024


Schroedingers-Cat
Contributor

This PR adds surround (up to 7.1) to the Steam Audio FMOD plugins (spatializer, reverb, return). It has been tested on Windows 10 using FMOD 2.2.20, VS2022 and Unity 2022.3. There was also a compile issue due to an incomplete name. Generally, the process functions were capable of handling surround apart from the initialization.

FMOD supports some formats that these plugins are not yet prepared to handle. If the input format is 7.1.4, the plugins fall back to 7.1. For formats like raw or default, the plugins ask FMOD not to send the process audio callback (effectively bypassing the plugin).
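
As a rough illustration of the fallback described above, the following minimal sketch maps a speaker mode to the channel count the plugin would render. The `SpeakerMode` enum and `renderChannelsFor` are hypothetical stand-ins (loosely mirroring FMOD's speaker modes) so the example compiles without the FMOD SDK; the handling of quad and 5.0 is an assumption, not something this PR states.

```cpp
#include <cassert>

// Hypothetical stand-in for FMOD's speaker mode enum (assumption: not the real type).
enum class SpeakerMode
{
    Default, Raw, Mono, Stereo, Quad, Surround5_0, Surround5_1, Surround7_1, Surround7_1_4
};

// Returns the number of channels to render, or 0 when the format is
// unsupported and the plugin should ask the host to skip processing.
inline int renderChannelsFor(SpeakerMode mode)
{
    switch (mode)
    {
    case SpeakerMode::Mono:          return 1;
    case SpeakerMode::Stereo:        return 2;
    case SpeakerMode::Quad:          return 4; // assumed supported
    case SpeakerMode::Surround5_0:   return 5; // assumed supported
    case SpeakerMode::Surround5_1:   return 6;
    case SpeakerMode::Surround7_1:   return 8;
    case SpeakerMode::Surround7_1_4: return 8; // fall back to 7.1
    case SpeakerMode::Default:
    case SpeakerMode::Raw:           return 0; // bypass
    }
    assert(false && "unhandled speaker mode");
    return 0;
}
```

A caller would treat a return value of 0 as "do not process" and any other value as the channel count to allocate and render.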

This partially fixes #311. At the time of writing, I was under the impression that this was happening on non-Windows platforms which is why the issue is also about the Steam Audio plugin being silent on Linux.

@Schroedingers-Cat
Contributor Author

@lakulish let me know if you want any changes.

@lakulish
Collaborator

@Schroedingers-Cat Thanks for putting together this PR. I'm testing it, and I think there's a little more work needed and some questions to think about:

  • This change makes it so the output format of (say) the spatializer effect is always the same as the input format. So if I have a mono event, the output is always mono, unless I right-click on the meters at the very left of the signal chain and set the format to Surround 5.1 or something else. This will then make it so the input and output format of the spatializer effect is Surround 5.1, which is probably not what we want. I think what we want is for the output format of the spatializer effect to be set to the current output format of the final mix.
  • If I add a spatializer effect to an event, with the effect's input format set to stereo, play the event, then stop the event, and then change the input format to 5.1 and play the event, the plugin crashes. This is likely because it needs to reallocate various intermediate buffers (effect->inBuffer, effect->directBuffer, etc.)
  • There's a more fundamental question here. Suppose we arrange for the input format to be mono and the output format to be 5.1. In that case, should we a) output binaural audio to the first two channels of the 5.1 output buffer, or b) pan the audio to all 6 channels of the 5.1 output buffer? And how should we interpret the "direct binaural" setting of the spatializer effect? Should it mean a) "if direct binaural is on, always render 2 channels of binaural audio regardless of output format", or b) "if direct binaural is on, render 2 channels of binaural audio if the output format is stereo, otherwise pan to the output format"?

Feedback welcome! I don't think there's a single correct answer here, so we probably want to do the most flexible or most widely useful thing.

@Schroedingers-Cat
Contributor Author

@lakulish thanks a lot for your feedback!

I didn't notice the crash at first because my test project is 7.1. The effects were initialized with 8 channels, so switching to formats with fewer channels did not crash the plugin. I've pushed some changes to the initialization functions of the plugins which fix the crash. The changes also reset the processing effects so that they continue to work after a format switch.
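
The idea behind this kind of fix can be sketched as follows: compare the requested format with the one the buffers were allocated for, and reallocate (growing or shrinking) plus flag a reset whenever they differ. `EffectState` and `ensureFormat` are hypothetical names for illustration only; the actual plugin holds Steam Audio buffers such as `effect->inBuffer` and `effect->directBuffer` rather than `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-instance state (assumption: simplified from the real plugin).
struct EffectState
{
    int numChannels = 0;
    int frameSize = 0;
    std::vector<float> inBuffer;
    std::vector<float> directBuffer;
    bool needsReset = false; // set when processing effects must be re-created
};

// Ensures the intermediate buffers match the requested format.
// Returns true if the buffers were reallocated.
inline bool ensureFormat(EffectState& state, int numChannels, int frameSize)
{
    assert(numChannels > 0 && frameSize > 0);

    if (state.numChannels == numChannels && state.frameSize == frameSize)
        return false; // format unchanged, nothing to do

    const std::size_t samples =
        static_cast<std::size_t>(numChannels) * static_cast<std::size_t>(frameSize);

    // assign() both grows and shrinks, and zero-fills the new contents.
    state.inBuffer.assign(samples, 0.0f);
    state.directBuffer.assign(samples, 0.0f);

    state.numChannels = numChannels;
    state.frameSize = frameSize;
    state.needsReset = true;
    return true;
}
```

Calling `ensureFormat` at the start of each playback (or whenever the host reports a new format) would cover the stereo-to-5.1 switch that triggered the crash, as well as switches to fewer channels.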

I also agree that Steam Audio's output format should match its track's output format. However, when I set all properties of outputbuffer to match its speaker layout, the Steam Audio Spatializer on a mono-in, 7.1-out track becomes a mono-in, mono-out effect:
[screenshot]

Since the FMOD plugin API docs mention in several places that the output format should be set with regard to the input format, I made the effect's output format follow FMOD's input format and that made it work in non-stereo formats like 5.1.

But a mono input (super common in sound design) should be playable as non-mono output. And even FMOD's built-in spatializer plugin clearly follows the track's output format. I see two ways forward:

  1. Add a hack so that Steam Audio never outputs mono, but at least stereo:
  • this will cover the vast majority of VR use cases, where the output format is usually stereo with HRTFs and the input format is mostly mono or stereo
  • any other setups can work around this by creating their audio events on a dedicated audio track and making the input format of the event's master track match their output format, so that the dedicated audio track will be down/upmixed automatically by FMOD
  2. Find out how a spatializer plugin can follow FMOD's master track output format:
  • maybe you know how it works?
  • I asked around the FMOD forums and am waiting for a response
  • I'll try to read the docs again (I already did a thorough read of the process plugin API and the types involved)

Suppose we arrange for the input format to be mono and the output format to be 5.1. In that case, should we a) output binaural audio to the first two channels of the 5.1 output buffer, or b) pan the audio to all 6 channels of the 5.1 output buffer?

I'd say for a mono input and 5.1 output, the user's intent is non-HRTF, so panning to all six channels would be the best choice. A 5.1 output is likely a speaker setup, and I personally find that HRTF-filtered material played back over speakers rather than headphones does not sound very pleasant. In the case of 5.1 headphones, the device will do its own HRTF processing, so the signal should not already be HRTF processed at that point.

And how should we interpret the "direct binaural" setting of the spatializer effect? Should it mean a) "if direct binaural is on, always render 2 channels of binaural audio regardless of output format", or b) "if direct binaural is on, render 2 channels of binaural audio if the output format is stereo, otherwise pan to the output format"?

My take would be b), because the output format is the last element in the audio path and thus should take priority when determining how to best represent the sound's position. Since I cannot think of a use case for HRTFs on non-stereo setups, rendering a binaural two-channel signal to a six-channel track would be contradictory and suboptimal for the target format.

Additionally, it might be beneficial to have a global bool parameter that enables HRTF rendering for all Steam Audio Spatializer instances that have been marked to use HRTF (for processing paths like direct, reflections etc.) if the current output format is stereo. This way, FMOD would automatically select the best spatialization technique with regard to the sound designer's intent and the output format. To make this backwards compatible with previous versions, the parameter could be inverted: if it's false or undefined, HRTFs will be enabled for stereo output (that's your suggested option b); if it evaluates to true, HRTFs will be off for every format.

Let me know what you think!

@Schroedingers-Cat
Contributor Author

@lakulish I got a response from an FMOD developer regarding the event's output format issue. The FMOD spatializer can follow an event's output format because it uses an internal API that the developer thinks won't be easy to make publicly accessible. So it isn't feasible for our plugins to follow the event's output format.

But there's an alternative approach. We can retrieve the output format of the final mix (not the master track of an event, but the master output of the mixer), which opens up possibilities for improvement.

My suggestion is to introduce a new UI option for the spatializer (and other plugins) named Determine Output Format. This option, represented as an enum with From Input Format and From Final Output choices, would enable users to control the spatializer's output format behavior.

By defaulting to From Final Output, we ensure backward compatibility with existing FMOD VR projects (which typically use stereo output) while enhancing rendering for projects supporting non-stereo layouts. This solution should accommodate a wide range of scenarios and improve spatialization across different speaker setups.

Would love to hear your take on this!

@Schroedingers-Cat
Contributor Author

@lakulish I added a dropdown control to the UIs that allows users to control whether the input, mixer or final out defines the format for the Steam Audio effects output (7458e70). It defaults to From Mixer which should make sense in most situations and be backwards compatible with existing VR projects. Feedback is welcome!
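
To make the dropdown's effect concrete, here is a minimal sketch of how such a setting could select the effect's output channel count. The names `OutputFormatSource`, `ChannelCounts` and `resolveOutputChannels` are hypothetical, and the sketch assumes the three channel counts have already been queried from FMOD; how the plugin actually obtains them is not shown here.

```cpp
#include <cassert>

// Hypothetical mirror of the dropdown choices (assumption: not the plugin's real enum).
enum class OutputFormatSource
{
    FromInput,
    FromMixer,   // the default described above
    FromFinalOut
};

// Channel counts assumed to have been queried from the host beforehand.
struct ChannelCounts
{
    int input;
    int mixer;
    int finalOut;
};

// Picks the output channel count according to the user's setting.
inline int resolveOutputChannels(OutputFormatSource source, const ChannelCounts& counts)
{
    assert(counts.input > 0 && counts.mixer > 0 && counts.finalOut > 0);

    switch (source)
    {
    case OutputFormatSource::FromInput:    return counts.input;
    case OutputFormatSource::FromMixer:    return counts.mixer;
    case OutputFormatSource::FromFinalOut: return counts.finalOut;
    }
    return counts.mixer; // unreachable for valid enum values; mirrors the default
}
```

With a mono event in a stereo VR project, the default choice yields two output channels, which is what keeps existing projects behaving as before.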

I also changed the buffer assignment logic to allow decreasing the buffers when the channel number decreases (cc38c38).

So we should have the first two points of your original post covered unless you have more feedback to those.

At the moment, enabling an HRTF option on the spatializer will still render two channels only even if the speaker format is something like 5.1 at the spatializer's output (third point of your original post). My preferred option would be to ignore the HRTF settings on non-stereo output formats, so that the channels you see at the output of the plugin always correspond to the format Steam Audio is rendering to.

Regarding my suggestion to have a project-wide parameter controlling the HRTF rendering of events with the option activated: I didn't yet find a way to realize this in an acceptable way. FMOD Studio parameters cannot be accessed from the context of a spatializer/DSP plugin. While the DSP parameters could be accessed from the game engine integrations, it would require iterating over every DSP of every track of an event and comparing the effect's name to identify the plugin. Also the HRTF parameter cannot be automated due to an FMOD limitation.

@lakulish
Collaborator

lakulish commented Apr 8, 2024

@Schroedingers-Cat Thanks for your response, and for looking into this in detail. I agree that offering users a new parameter for controlling the output channel format of the spatializer effect is a good idea. Thanks for implementing this! As for what to do when going from (say) mono to 5.1, I also find it more intuitive to just use 5.1 panning instead of binaural rendering into the first two channels, so I agree on that front too.

One way to handle the case where Direct Binaural is enabled and the output format is 5.1, would be to replace the Direct Binaural boolean parameter with an enum-valued parameter, which lets users choose between "use panning regardless of output format", "use binaural if output format is stereo, but panning otherwise", and "always use binaural, even if the output format is not stereo". Maybe this isn't necessary, though, so I would perhaps wait to see if there is an actual need for this before implementing it.
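
For reference, the three-valued parameter suggested above could look roughly like the following sketch. `BinauralMode` and `useBinaural` are hypothetical names for illustration; nothing like this is part of the PR.

```cpp
#include <cassert>

// Hypothetical replacement for the Direct Binaural boolean (assumption).
enum class BinauralMode
{
    AlwaysPan,        // use panning regardless of output format
    BinauralIfStereo, // binaural for stereo output, panning otherwise
    AlwaysBinaural    // binaural even if the output is not stereo
};

// Decides whether the direct path is rendered binaurally.
inline bool useBinaural(BinauralMode mode, int outputChannels)
{
    assert(outputChannels > 0);

    switch (mode)
    {
    case BinauralMode::AlwaysPan:        return false;
    case BinauralMode::BinauralIfStereo: return outputChannels == 2;
    case BinauralMode::AlwaysBinaural:   return true;
    }
    return false; // unreachable for valid enum values
}
```

With `BinauralIfStereo`, a 5.1 output (six channels) is panned, while a stereo output is rendered binaurally.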

I'll run through some more tests, but otherwise this PR looks reasonable. Thanks again!

@Schroedingers-Cat
Contributor Author

One way to handle the case where Direct Binaural is enabled and the output format is 5.1, would be to replace the Direct Binaural boolean parameter with an enum-valued parameter, which lets users choose between "use panning regardless of output format", "use binaural if output format is stereo, but panning otherwise", and "always use binaural, even if the output format is not stereo". Maybe this isn't necessary, though, so I would perhaps wait to see if there is an actual need for this before implementing it.

@lakulish Good idea, that would prevent the confusing situations for users I described earlier. I also agree with your point that there might not be a need for rendering binaural to non-stereo formats. I personally cannot think of a reason why you'd want to do that. As long as I have been working on binaural mixes, the final output was always stereo.
So I'd suggest evaluating the HRTF bools only if the effect's output is stereo. We could rename the option to "HRTF on Stereo Out" or something similar to make it clearer what it does. If you want to discuss this with your colleagues first, I can hold off on the implementation.

I still believe there's a need for a simple mechanism to globally control the HRTF rendering of HRTF-enabled spatializer instances, to differentiate between users on headphones and users on a stereo speaker setup. I asked on the FMOD forums but haven't received a response yet. The only ideas that I think could work are the following:

  • Attach an HRTF bool to the FMOD core system object (via getUserData and setUserData), which should be accessible from the DSP plugin context. However, I'm unsure how users of the Steam Audio library will react to Steam Audio taking over the user data.
  • Create an additional float parameter in the UI that Steam Audio interprets as stereo-headphone for values <= 0.5 and as stereo-speakers for values > 0.5. Users could then automate it from a global parameter (more work for the sound designer, but less intrusive than the other option).

Any feedback on this?

I'll run through some more tests, but otherwise this PR looks reasonable. Thanks again!

Happy to help!

+ Implement automatic global HRTF setting for FMOD plugins
+ Only render binaural when output channels are stereo
@Schroedingers-Cat
Contributor Author

@lakulish I've made some more changes to address the HRTF switching concern.

I introduced a new boolean Steam Audio setting to globally turn off HRTFs. This setting is communicated to the global state of the Steam Audio plugins and is evaluated by each plugin variant to decide whether to render binaural audio or not. The new setting defaults to false, making it backwards compatible with the previous behavior of Steam Audio.

The new Steam Audio setting can be easily exposed to a game's options menu or automatically set if enough information about the user's playback system is available.

Also, binaural rendering is now active only when the output channels are two, as we discussed earlier.
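
The combined decision described in this comment can be sketched as a single predicate. `GlobalState`, its `hrtfDisabled` field and `shouldRenderBinaural` are hypothetical names; the real plugins keep their own global state and per-effect binaural flags.

```cpp
#include <cassert>

// Hypothetical global plugin state (assumption: simplified).
struct GlobalState
{
    bool hrtfDisabled = false; // defaults to false for backwards compatibility
};

// Binaural rendering is used only when the effect asks for it, the global
// switch has not disabled HRTFs, and the output is exactly two channels.
inline bool shouldRenderBinaural(bool effectWantsBinaural,
                                 int outputChannels,
                                 const GlobalState& global)
{
    assert(outputChannels > 0);
    return effectWantsBinaural && !global.hrtfDisabled && outputChannels == 2;
}
```

A game's options menu could flip `hrtfDisabled` when the player selects stereo speakers instead of headphones, and every effect instance would fall back to panning on its next evaluation.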

All changes have been made available to the Unity FMOD plugin as well as the Unity Audio plugin.

Let me know what you think!

@lakulish
Collaborator

@Schroedingers-Cat Looking at these changes a little more, I wonder if we should decouple the "allow panning to surround" part and the "globally disable binaural" part. Would you mind splitting this into two PRs? I think we're ready to merge the "allow panning to surround" part right away, but might want to think through the "globally disable binaural" part some more. Thanks!

@Schroedingers-Cat
Contributor Author

@lakulish no problem, I removed the global HRTF option code from this PR and brought it back for PR #356.

@lakulish lakulish merged commit 4d76d72 into ValveSoftware:master May 20, 2024
Development

Successfully merging this pull request may close these issues.

FMOD: Steam Audio Spatializer plugin is stereo only and silent on Linux
2 participants