Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SoundWire: start adding BPT/BRA support #4679

Draft
wants to merge 16 commits into
base: topic/sof-dev
Choose a base branch
from

Conversation

plbossart
Copy link
Member

@plbossart plbossart commented Oct 31, 2023

replacement and extension of PR ##4676

@bardliao @ujfalusi @shumingfan @jack-cy-yu @oder-chiou @charleskeepax @rfvirgil comments welcome.

@plbossart plbossart force-pushed the sdw/start-BRA-support branch 6 times, most recently from db68142 to 323f5fb Compare November 7, 2023 04:11
@charleskeepax
Copy link

Still going through things but overall looks pretty sensible to me. Some initial thoughts/questions would be:

  1. Do we really want to bypass all the frame allocation stuff? Feels like it might be nice in the future to be able to support BRA whilst audio streams are running, for example downloading a large block of filter coefficients, or algorithm module to a device. There are questions about how one would decide how much frame to allocate, although I guess a good start might just be whatever is not used by audio.
  2. Regarding the debugfs interface, I don't actually mind skipping regmap for this, the current regmap is very hard to implement bulk transfers into. However, would it make sense just to have a binary file that you read and write for transferring the actual data rather than the weird hoops with a separate command file and using request firmware to load data for writes?

@plbossart
Copy link
Member Author

Still going through things but overall looks pretty sensible to me. Some initial thoughts/questions would be:

  1. Do we really want to bypass all the frame allocation stuff? Feels like it might be nice in the future to be able to support BRA whilst audio streams are running, for example downloading a large block of filter coefficients, or algorithm module to a device. There are questions about how one would decide how much frame to allocate, although I guess a good start might just be whatever is not used by audio.

Yes we could enable a BPT/BRA stream on top of an existing audio transfer. We can use the remaining bandwidth.

The problem is the other way around. When you have an existing BPT/BRA transfer and you want to start *another * audio stream, then the audio stream would be stuck until the BPT/BRA transfer finishes. This is a less-than-desirable solution, in that we cannot adjust the bandwidth of BRA on the fly. Once BPT/BRA has allocated all the frame for control/tables, then there's no way to walk back and leave enough space for other audio streams.

This is even more important now with devices that have multiple functions on the same link. All the firmware download stuff for SDCA is function specific, so it's really not that crazy to see concurrency of audio and BRA happening at some point for different functions.

  1. Regarding the debugfs interface, I don't actually mind skipping regmap for this, the current regmap is very hard to implement bulk transfers into. However, would it make sense just to have a binary file that you read and write for transferring the actual data rather than the weird hoops with a separate command file and using request firmware to load data for writes?

Oh the debugfs stuff is just to test without waiting for codec drivers to change.

The goal is that the codec driver can use its own binary file or ACPI table and directly do the open/send/wait/close sequence. With SDCA it's even going to be a handshake where the hardware reports it needs firmware download or not, that part has to be entirely handled in the codec driver.

@charleskeepax
Copy link

Yes we could enable a BPT/BRA stream on top of an existing audio transfer. We can use the remaining bandwidth.

The problem is the other way around. When you have an existing BPT/BRA transfer and you want to start *another * audio stream, then the audio stream would be stuck until the BPT/BRA transfer finishes. This is a less-than-desirable solution, in that we cannot adjust the bandwidth of BRA on the fly. Once BPT/BRA has allocated all the frame for control/tables, then there's no way to walk back and leave enough space for other audio streams.

Although you have the same problem anyway right? If you block out the whole bus for the BRA, the audio cant start till it is done either, but I guess this way it does at least force people to only BRA when audio isn't going to be happening. I guess yeah the ideal behaviour would be when a new audio stream starts you would shrink the bandwidth allocated to BRA, but as you say its not obvious how one would implement that. Will have a little meditation on this.

This is even more important now with devices that have multiple functions on the same link. All the firmware download stuff for SDCA is function specific, so it's really not that crazy to see concurrency of audio and BRA happening at some point for different functions.

Perhaps there might be some options to reserve a portion of the bus for BRA, but it would undoubtedly end up being system specific and certainly you could imagine there being times it would be more ideal to use the whole bus.

Oh the debugfs stuff is just to test without waiting for codec drivers to change.

cool cool, will ignore that bit for now.

@plbossart plbossart force-pushed the sdw/start-BRA-support branch 3 times, most recently from 2992436 to 19d606a Compare November 15, 2023 23:28
@keqiaozhang
Copy link
Collaborator

SOFCI TEST

@plbossart plbossart force-pushed the sdw/start-BRA-support branch 3 times, most recently from 5f48d4d to 1a38147 Compare December 7, 2023 20:38
@plbossart plbossart force-pushed the sdw/start-BRA-support branch 2 times, most recently from b03addd to 5ee39c5 Compare December 7, 2023 22:04
@plbossart
Copy link
Member Author

updates based on upstream comments

Split Cadence documentation to different file
Simplified API to use send_async/wait
Moved alignment requirements to platform-specific drivers
Reduced alignment on Intel platforms to 32-bytes (was 128)
Moved crc8 table to Cadence driver
Various error handling improvements

The Bulk Register Access protocol was left as a TODO topic since
2018. It's time to document this protocol and the design of its Linux
support.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The register definitions are missing a BULK_ENABLE bitfield which must
be set for DP0.

In addition, the existing mapping from PDI to Data Port is 1:1. That's
fine for PCM streams which are by construction in one direction
only. The BTP/BRA protocol is bidirectional and relies on DP0 only,
which breaks the 1:1 mapping. DP0 MUST be mapped to both PDI0 and
PDI1, with PDI1 taking care of the TX direction and PDI1 of the RX
direction.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
In the existing definition of sdw_stream_runtime, the 'type' member is
never set and defaults to PCM. To prepare for the BPT/BRA support, we
need to special-case streams and make use of the 'type'.

No functional change for now, the implicit PCM type is now explicit.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
BPT/BRA need to be special cased, i.e. there's no point in using the
bandwidth allocation since the entire frame can be used.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
For BPT support, we want to allocate the entire audio payload and
bypass the allocation based on PCM/PDM parameters.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
DP0 (Data Port 0) is very similar to regular data ports, with minor
tweaks we can reuse the same code.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add definitions and helpers for the BPT/BRA protocol. Peripheral
drivers (aka ASoC codec drivers) can use this API to send bulk data
such as firmware or tables. The design intent is however NOT to
directly use this API but to rely on an intermediate regmap layer.

The API is only available when no other audio streams have been
allocated, and only one BTP/BRA stream is allowed per link.  To avoid
the addition of yet another lock, the refcount tests are handled in
the stream master_runtime alloc/free routines where the bus_lock is
already held. Another benefit of this approach is that the same
bus_lock is used to handle runtime and port linked lists, which
reduces the potential for misaligned configurations.

In addition to exclusion with audio streams, BPT transfers have a lot
of overhead, specifically registers writes are needed to enable
transport in DP0. Most DMAs don't handle too well very small data sets
and they may have alignment limitations.

The size and alignment requirements are for now not handled by the
core but must be checked by platform-specific drivers.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add a convenience pointer to the 'sdw_bus' structure. BPT is a
dedicated stream which will typically not be handled by DAIs or
dailinks. Since there's only one BPT stream per link, storing the
pointer at the link level seems rather natural.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The Cadence IP expects a specific format (detailed in the
Documentation). Add helpers to copy the data into the DMA buffer.

The crc8 table is for now only used by the Cadence driver. This table
might be moved to a common module at a later point if needed by other
controller implementations.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Mirror abstraction added for master ops.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add SoundWire BPT DMA helpers as a separate module to avoid circular
dependencies.

For now this assumes no link DMA, only coupled mode.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
This is needed to be shared between open/send_async/close.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add support for BTP API using Cadence and hda-sdw-bpt helpers.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add code to show what codec drivers will need to do to enable BPT/BRA
transfers. The only difference is to set the 'command_type' file to
'1'. A zero-value will rely on regular read/write commands in Column0.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
DP0 is required for BPT/BRA transport.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
When the firmware is involved, the data can be transferred with a
CHAIN_DMA on LNL+.

The CHAIN_DMA needs to be programmed before the DMAs per the
documentation. The states are not exactly symmetrical, on stop we must
do a PAUSE and RESET.

The FIFO size of 10ms was determined experimentally. With the minimum
of 2ms, errors were reported by the codec, likely because of xruns.

The code flow deals with the two TX and RX CHAIN_DMAs in symmetrical
ways, i.e.
alloc TX
alloc RX
enable TX
enable RX
disable RX
disable TX
free RX
free TX

Signed-off-by: Pierre-Louis Bossart <[email protected]>
@plbossart
Copy link
Member Author

Verified on LNL w/ and w/o the DSP, time for reviews @bardliao @ujfalusi

@plbossart
Copy link
Member Author

back to draft since we need to retest/rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants