
[BUG] SoundWire BPT/BRA w/ CHAIN_DMA fails with DMA buffer-alloc failure #8751

Closed · plbossart opened this issue on Jan 17, 2024 · 5 comments
Labels: bug (Something isn't working as expected)
Milestone: v2.9

plbossart (Member) commented on Jan 17, 2024

Describe the bug
When running stress tests on the SoundWire BPT/BRA protocol with a CHAIN_DMA, I see a buffer allocation failure after exactly 60 iterations.

To Reproduce
Needs the dedicated kernel branch thesofproject/linux#4679.

Reproduction Rate
100%

Expected behavior
no buffer allocation issues.

Impact
Showstopper for the feature, but not an immediate P1.

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).

  2. Name of the topology file
    n/a

  3. Name of the platform(s) on which the bug is observed.
    LNL

Screenshots or console output
chain-dma-alloc-fail.txt

The alloc failure seems to happen at exactly the same iteration every time, so this looks like a memory leak of sorts.

root@jf-lnlm-rvp-sdw-1:~# bash loop.sh 
0
1
2
...
57
58
59
60
write.sh: line 9: echo: write error: Invalid argument

plbossart added the 'bug' label on Jan 17, 2024
lgirdwood (Member) commented:

@mwasko @marcinszkudlinski FYI - looks like a memory leak in the chain DMA FW driver?

lgirdwood added this to the v2.9 milestone on Jan 17, 2024
plbossart (Member, Author) commented:

I changed the kernel code to make sure the two TX and RX chain DMAs are handled in a symmetrical way:

  • alloc TX
  • alloc RX
  • enable TX
  • enable RX
  • disable RX
  • disable TX
  • free RX
  • free TX

and I still see the failure.

mtrace-alloc-sequence.log
mtrace.log

This points to a memory leak somewhere rather than a fragmentation issue.

Not good news, this makes the issue even harder to root-cause. Gah.

plbossart (Member, Author) commented on Jan 19, 2024

What's super weird is that the heap max_allocated value progresses by two steps of 24 and never reaches a stable point.

See, e.g., these transitions:

87280
87304
87328

allocation.log
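
A hedged back-of-the-envelope reading of those numbers, assuming the 24-byte step is the leaked 20-byte allocation rounded up to the heap's alignment:

/*
 * Rough estimate, not a measurement:
 *
 *   leak per chain DMA instance   ~24 bytes (observed step in max allocated)
 *   chain DMA instances per loop   2        (TX + RX)
 *   leak per loop iteration       ~48 bytes
 *   after 60 iterations           ~2880 bytes
 *
 * That is tiny compared to the ~900 kB reported free, so the eventual DMA
 * buffer allocation failure (61440 bytes in the traces below) would come
 * from alignment/fragmentation of the buffer region rather than from
 * running out of heap overall.
 */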

plbossart (Member, Author) commented:

Bumping the FIFO size to 16 ms (from 10 ms) surprisingly has a positive effect in that I can run 295 iterations (instead of 60), but the same failed memory allocation eventually occurs.
The max allocated value also keeps increasing in steps of 24, as in the 10 ms case, clearly pointing to a memory leak that is not reflected in the 'free' stat reports.

[  510.473853] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 39480 free: 933744 max allocated: 133600
[  510.473860] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  510.473870] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 39672 free: 933536 max allocated: 133600
[  510.473878] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70392 free: 902800 max allocated: 133600
[  510.473925] <err> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB FIFO ask 30720 buff_size 30720 total allocated 30720
[  510.473940] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70416 free: 902760 max allocated: 133600
[  510.473945] <inf> chain_dma: chain_task_start: comp:128 0x80 chain_task_start(), host_dma_id = 0x00000000
[  510.473956] <inf> chain_dma: chain_host_start: comp:128 0x80 chain_host_start(): dma_start() host chan_index = 0
[  510.473966] <inf> chain_dma: chain_link_start: comp:128 0x80 chain_link_start(): dma_start() link chan_index = 0
[  510.473975] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70440 free: 902720 max allocated: 133600
[  510.473980] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0104b50 0x20510U priority 0 flags 0x0
[  510.473998] <inf> ll_schedule: zephyr_domain_register: zephyr_domain_register domain->type 1 domain->clk 5 domain->ticks_per_ms 38400 period 1000
[  510.474811] <inf> ipc: ipc_cmd: rx	: 0xe030909|0x3000
[  510.474825] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70592 free: 902552 max allocated: 133600
[  510.474835] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70912 free: 902216 max allocated: 133600
[  510.474845] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 71440 free: 901672 max allocated: 133600
[  510.474851] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  510.474861] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 71968 free: 901128 max allocated: 133600
[  510.474868] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  510.474878] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 72160 free: 900920 max allocated: 133600
[  510.474886] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 133600 free: 839464 max allocated: 133600
[  510.474970] <err> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB FIFO ask 61440 buff_size 61440 total allocated 92160
[  510.474985] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 133624 free: 839424 max allocated: 133624
[  510.474990] <inf> chain_dma: chain_task_start: comp:137 0x89 chain_task_start(), host_dma_id = 0x00000100
[  510.475000] <inf> chain_dma: chain_link_start: comp:137 0x89 chain_link_start(): dma_start() link chan_index = 0
[  510.475011] <inf> chain_dma: chain_host_start: comp:137 0x89 chain_host_start(): dma_start() host chan_index = 0
[  510.475021] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 133648 free: 839384 max allocated: 133648
[  510.475026] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0195610 0x20510U priority 0 flags 0x0
[  511.497050] <inf> ll_schedule: do_task_run: perf_cycle task 0xa0104b50, 0x20510U cpu avg 1505 peak 7301
[  511.497065] <inf> ll_schedule: zephyr_domain_thread_fn: ll core 0 timer avg 447, max 1117, overruns 0
[  511.498053] <inf> ll_schedule: do_task_run: perf_cycle task 0xa0195610, 0x20510U cpu avg 1434 peak 2303
[  511.846623] <inf> ipc: ipc_cmd: rx	: 0xe010909|0x0
[  511.846636] <inf> chain_dma: chain_link_stop: comp:137 0x89 chain_link_stop(): dma_stop() link chan_index = 0
[  511.846643] <inf> chain_dma: chain_host_stop: comp:137 0x89 chain_host_stop(): dma_stop() host chan_index = 0
[  511.847058] <inf> ll_schedule: zephyr_ll_task_done: task complete 0xa0195610 0x20510U
[  511.847061] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 2 total_num_tasks 2
[  511.847868] <inf> ipc: ipc_cmd: rx	: 0xe010000|0x0
[  511.847881] <inf> chain_dma: chain_host_stop: comp:128 0x80 chain_host_stop(): dma_stop() host chan_index = 0
[  511.847888] <inf> chain_dma: chain_link_stop: comp:128 0x80 chain_link_stop(): dma_stop() link chan_index = 0
[  511.848055] <inf> ll_schedule: zephyr_ll_task_done: task complete 0xa0104b50 0x20510U
[  511.848058] <inf> ll_schedule: zephyr_ll_task_done: num_tasks 1 total_num_tasks 1
[  511.848070] <inf> ll_schedule: zephyr_domain_unregister: zephyr_domain_unregister domain->type 1 domain->clk 5
[  511.855191] <inf> ipc: ipc_cmd: rx	: 0xe000909|0x0
[  511.855203] <err> chain_dma: chain_task_free: comp:137 0x89 chain_task_free(): PLB releasing 61440 total allocated 30720
[  511.855213] <inf> dma: dma_put: dma_put(), dma = 0x400da870, sref = 0
[  511.855221] <inf> dma: dma_put: dma_put(), dma = 0x400da900, sref = 0
[  511.855973] <inf> ipc: ipc_cmd: rx	: 0xe000000|0x0
[  511.855988] <err> chain_dma: chain_task_free: comp:128 0x80 chain_task_free(): PLB releasing 30720 total allocated 0
[  511.855996] <inf> dma: dma_put: dma_put(), dma = 0x400da8b8, sref = 0
[  511.856005] <inf> dma: dma_put: dma_put(), dma = 0x400da948, sref = 0
[  511.882575] <inf> ipc: ipc_cmd: rx	: 0xe030000|0x3000
[  511.882595] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 38344 free: 934896 max allocated: 133648
[  511.882605] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 38664 free: 934560 max allocated: 133648
[  511.882615] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 39096 free: 934112 max allocated: 133648
[  511.882621] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  511.882636] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 39528 free: 933664 max allocated: 133648
[  511.882643] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  511.882653] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 39720 free: 933456 max allocated: 133648
[  511.882661] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70440 free: 902720 max allocated: 133648
[  511.882708] <err> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB FIFO ask 30720 buff_size 30720 total allocated 30720
[  511.882723] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70464 free: 902680 max allocated: 133648
[  511.882728] <inf> chain_dma: chain_task_start: comp:128 0x80 chain_task_start(), host_dma_id = 0x00000000
[  511.882740] <inf> chain_dma: chain_host_start: comp:128 0x80 chain_host_start(): dma_start() host chan_index = 0
[  511.882748] <inf> chain_dma: chain_link_start: comp:128 0x80 chain_link_start(): dma_start() link chan_index = 0
[  511.882758] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70488 free: 902640 max allocated: 133648
[  511.882763] <inf> ll_schedule: zephyr_ll_task_schedule_common: task add 0xa0104b50 0x20510U priority 0 flags 0x0
[  511.882781] <inf> ll_schedule: zephyr_domain_register: zephyr_domain_register domain->type 1 domain->clk 5 domain->ticks_per_ms 38400 period 1000
[  511.883631] <inf> ipc: ipc_cmd: rx	: 0xe030909|0x3000
[  511.883646] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70640 free: 902472 max allocated: 133648
[  511.883656] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 70960 free: 902136 max allocated: 133648
[  511.883666] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 71488 free: 901592 max allocated: 133648
[  511.883673] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  511.883683] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 72016 free: 901048 max allocated: 133648
[  511.883690] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[  511.883700] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 72208 free: 900840 max allocated: 133648
PLB c0 0 alloc_chunk failed
[  511.883713] <inf> mem_allocator: heap_alloc_aligned: PLB alloc failed for 61440 bytes
[  511.883716] <inf> mem_allocator: heap_alloc_aligned: heap allocated: 72208 free: 900840 max allocated: 133648
[  511.883725] <err> buffer: buffer_alloc: buffer_alloc(): could not alloc size = 61440 bytes of type = 32
[  511.883730] <err> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): failed to alloc dma buffer
[  511.883738] <inf> dma: dma_put: dma_put(), dma = 0x400da870, sref = 0
[  511.883746] <inf> dma: dma_put: dma_put(), dma = 0x400da900, sref = 0
[  511.883758] <err> ipc: ipc_cmd: ipc4: FW_GEN_MSG failed with err 6


plbossart (Member, Author) commented:

Additional traces show there's a memory leak in the firmware!

[    0.000000] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 20 new ptr 0x400f8480
[    0.000000] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 20 new ptr 0x400f8480
[    0.000000] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 24 new ptr 0xa00f84c0
[    0.000000] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 24 new ptr 0xa00f84c0
[   77.457091] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 20 new ptr 0x40102500
[   77.457096] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 20 new ptr 0x40102500
[   77.462550] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 148 new ptr 0x40102540
[   77.462553] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 148 new ptr 0x40102540
[   77.462560] <inf> chain_dma: chain_task_create: comp:0 0x0 chain_task_create(): PLB start
[   77.462571] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 320 new ptr 0xa0102600
[   77.462575] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 320 new ptr 0xa0102600
[   77.462580] <inf> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB start
[   77.462591] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 528 new ptr 0x40102780
[   77.462598] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 528 new ptr 0x40102780
[   77.462616] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 528 new ptr 0x401029c0
[   77.462621] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 528 new ptr 0x401029c0
[   77.462638] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 168 new ptr 0xa0102c00
[   77.462643] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 168 new ptr 0xa0102c00
[   77.462655] <inf> mem_allocator: rballoc_align: PLB rballoc_align: bytes 16384 new ptr 0xa0102d00
[   77.462681] <err> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB FIFO ask 16384 buff_size 16384 total allocated 32768
[   77.462690] <inf> chain_dma: chain_task_init: comp:0 0x0 chain_task_init(): PLB done
[   77.462695] <inf> chain_dma: chain_task_create: comp:0 0x0 chain_task_create(): PLB done

[   77.462698] <inf> ipc: ipc4_add_comp_dev: ipc_add_comp_dev: PLB: before adding ICD stuff 20 bytes
[   77.462710] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 20 new ptr 0x40106d40 <<< NEVER RELEASED!!!
[   77.462715] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 20 new ptr 0x40106d40
[   77.462718] <inf> ipc: ipc4_add_comp_dev: ipc_add_comp_dev: PLB: after adding ICD stuff 20 bytes

[   77.462756] <inf> mem_allocator: rmalloc: PLB rmalloc: bytes 20 new ptr 0x40106d80
[   77.462761] <inf> mem_allocator: rzalloc: PLB rzalloc: bytes 20 new ptr 0x40106d80
[   78.835185] <inf> mem_allocator: rfree: PLB rfree: ptr 0x40106d80
[   78.836210] <inf> mem_allocator: rfree: PLB rfree: ptr 0x40102500
[   78.840783] <inf> chain_dma: chain_task_free: comp:137 0x89 chain_task_free(): PLB start
[   78.840786] <err> chain_dma: chain_task_free: comp:137 0x89 chain_task_free(): PLB releasing 16384 total allocated 16384
[   78.840791] <inf> mem_allocator: rfree: PLB rfree: ptr 0x40102780
[   78.840810] <inf> mem_allocator: rfree: PLB rfree: ptr 0x401029c0
[   78.840828] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa0102d00
[   78.840838] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa0102c00
[   78.840845] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa0102600
[   78.840853] <inf> mem_allocator: rfree: PLB rfree: ptr 0x40102540
[   78.840861] <inf> chain_dma: chain_task_free: comp:137 0x89 chain_task_free(): PLB done
[   78.841503] <inf> chain_dma: chain_task_free: comp:128 0x80 chain_task_free(): PLB start
[   78.841508] <err> chain_dma: chain_task_free: comp:128 0x80 chain_task_free(): PLB releasing 16384 total allocated 0
[   78.841511] <inf> mem_allocator: rfree: PLB rfree: ptr 0x400fe000
[   78.841523] <inf> mem_allocator: rfree: PLB rfree: ptr 0x400fe1c0
[   78.841541] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa00fe480
[   78.841553] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa00fe380
[   78.841560] <inf> mem_allocator: rfree: PLB rfree: ptr 0xa00fde80
[   78.841568] <inf> mem_allocator: rfree: PLB rfree: ptr 0x400fddc0
[   78.841576] <inf> chain_dma: chain_task_free: comp:128 0x80 chain_task_free(): PLB done

The use of the component is just wrong:

#if CONFIG_COMP_CHAIN_DMA
int ipc4_chain_manager_create(struct ipc4_chain_dma *cdma)
{
	const struct sof_uuid uuid = {0x6a0a274f, 0x27cc, 0x4afb, {0xa3, 0xe7, 0x34,
			0x44, 0x72, 0x3f, 0x43, 0x2e}};
	const struct comp_driver *drv;
	struct comp_dev *dev;

	drv = ipc4_get_drv((uint8_t *)&uuid);
	if (!drv)
		return -EINVAL;

	dev = drv->ops.create(drv, NULL, cdma);
	if (!dev)
		return -EINVAL;

	/* differentiate instance by unique ids assignment */
	const uint32_t comp_id = IPC4_COMP_ID(cdma->primary.r.host_dma_id
					      + IPC4_MAX_MODULE_COUNT, 0);
	dev->ipc_config.id = comp_id;
	dev->ipc_config.pipeline_id = cdma->primary.r.host_dma_id
				      + IPC4_MAX_MODULE_COUNT;

	return ipc4_add_comp_dev(dev); /* <<<< MEMORY ALLOCATED */
}

int ipc4_chain_dma_state(struct comp_dev *dev, struct ipc4_chain_dma *cdma)
{
	const bool allocate = cdma->primary.r.allocate;
	const bool enable = cdma->primary.r.enable;
	int ret;

	if (!dev)
		return -EINVAL;

	if (allocate) {
		if (enable)
			ret = comp_trigger(dev, COMP_TRIGGER_START);
		else
			ret = comp_trigger(dev, COMP_TRIGGER_PAUSE);
	} else {
		if (enable)
			return -EINVAL;

		/* remove chain part */
		ret = comp_trigger(dev, COMP_TRIGGER_PAUSE);
		if (ret < 0)
			return ret;
		comp_free(dev); /* <<<<< MEMORY NOT FREED!!!! */
	}
	return ret;
}
#endif
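
For reference, the state handling in ipc4_chain_dma_state() above reduces to the following decision table (reconstructed from the code shown here, not from IPC4 documentation):

/*
 * allocate=1, enable=1  ->  comp_trigger(dev, COMP_TRIGGER_START)
 * allocate=1, enable=0  ->  comp_trigger(dev, COMP_TRIGGER_PAUSE)
 * allocate=0, enable=1  ->  -EINVAL
 * allocate=0, enable=0  ->  comp_trigger(dev, COMP_TRIGGER_PAUSE), then comp_free(dev)
 *
 * comp_free() releases the component itself, but the ~20-byte ipc_comp_dev
 * entry that ipc4_add_comp_dev() linked into the IPC component list in
 * ipc4_chain_manager_create() is never removed from that list or freed;
 * that entry is the "NEVER RELEASED" allocation in the trace above.
 */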

plbossart added a commit to plbossart/sof that referenced this issue Jan 22, 2024
The CHAIN_DMA code unfortunately relies on the component code without
symmetry between memory allocation and release.

ipc4_chain_manager_create() relies on the ipc4_add_comp_dev() helper,
which internally allocates 20 bytes for an icd item that is added to a list.

ipc4_chain_dma_state() directly releases the component, without
removing the icd item from the list.

This slow memory leak, combined with alignment requirements, gradually
prevents a DMA buffer from being allocated.

This patch adds a search in the list and frees the item. With this
modification, the CHAIN_DMA stress tests can loop forever.

Closes: thesofproject#8751
Signed-off-by: Pierre-Louis Bossart <[email protected]>
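
A minimal sketch of that "search in the list and free the item" step, assuming the usual SOF helpers (ipc_get(), ipc_get_comp_by_id(), list_item_del(), rfree()); the merged patch may implement the lookup differently:

/* Sketch only: release the ipc_comp_dev list entry that
 * ipc4_add_comp_dev() created for the chain DMA component.
 * The helper below is hypothetical and for illustration; it is not
 * the actual patch.
 */
static void chain_dma_release_icd(struct comp_dev *dev)
{
	struct ipc *ipc = ipc_get();
	struct ipc_comp_dev *icd;

	icd = ipc_get_comp_by_id(ipc, dev->ipc_config.id);
	if (!icd)
		return;

	list_item_del(&icd->list);
	rfree(icd);
}

Called from the !allocate path of ipc4_chain_dma_state(), right before comp_free(dev), this would release the component and its IPC list entry together.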