-
Hi Matthias, the short answer is sadly no. The longer answer is more complicated, so let me explain. Depending on your use case, you could hack your way around some of the limitations, and if you are brave you could try iceoryx2, which would remove these hacks or at least make them less hacky.
Unfortunately, there is currently no way to block on allocation if no chunks are available. The only option would be to periodically "poll" by retrying to allocate chunks. The terminal will be flooded with error messages, though.
In case your apps do not need to be connected all the time, there is an experimental API which allows the apps to disconnect and safely restart RouDi with a different configuration. If this would help, I can tell you a bit more about it.
Being able to connect to multiple RouDi instances from the same application will be quite some work, since the architecture of the current C++ based iceoryx wasn't designed with such use cases in mind.
If your project is more of a spare-time project, iceoryx2 might already be interesting for you, especially if you could use Rust. We should soon have C bindings to make it work from other languages as well. It is not yet at feature parity with the current C++ based iceoryx, and we are still working to smooth the rough edges. But it does not require a central daemon, and the memory can be configured at runtime when the service is created. It still has a quite static memory config, but we are planning to make it more dynamic, with the downside of potentially having to deal with memory fragmentation. When we are done, you should be able to select between multiple allocation strategies, depending on your needs and the safety requirements of your domain.
Btw, can you share more details about your project? Is it a spare-time project or more of a day-job one?
Best,
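To make the "poll by retrying" workaround concrete, here is a minimal sketch. `ChunkPool` is a made-up, single-threaded stand-in for a mempool (iceoryx's real loan API looks different); only the retry-with-backoff pattern is the point:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <thread>

// Hypothetical stand-in for a fixed-size mempool. The real iceoryx
// API differs; this only models "allocation fails when no chunk is free".
class ChunkPool
{
  public:
    explicit ChunkPool(std::size_t chunkCount) : m_freeChunks(chunkCount) {}

    // Non-blocking, like iceoryx: fails immediately when the pool is empty.
    std::optional<std::size_t> tryLoan()
    {
        if (m_freeChunks == 0) { return std::nullopt; }
        --m_freeChunks;
        return m_freeChunks; // dummy chunk id
    }

    void release() { ++m_freeChunks; }

  private:
    std::size_t m_freeChunks;
};

// "Poll" by retrying the allocation with a small back-off instead of
// hammering the pool (and flooding the log) in a tight loop.
std::optional<std::size_t> loanWithRetry(ChunkPool& pool,
                                         std::uint32_t maxRetries,
                                         std::chrono::milliseconds backoff)
{
    for (std::uint32_t i = 0; i <= maxRetries; ++i)
    {
        if (auto chunk = pool.tryLoan()) { return chunk; }
        std::this_thread::sleep_for(backoff);
    }
    return std::nullopt; // give up; the caller decides whether this is data loss
}
```

The back-off keeps the CPU cost of polling bounded, at the price of added latency when a chunk frees up just after a sleep starts.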
-
Hi Mathias! Thank you for the reply! :-)
The project is a day-job one, but an academic project, and also open-source. It's called Syntalos and is a system used for scientific data acquisition and synchronization in life sciences in our lab and others in our department and wider area (and hopefully in a lot more places once it's published, which is where Iceoryx comes in!). The system is based on a graph of modules with input and output ports, like in audio processing, but instead of only handling audio, arbitrary scientific data can be shoveled between modules. This ranges from electrophysiological recordings to image scans, pressure-sensor data, and streamed video. Data flows from a source through processing nodes to (usually) some kind of sink module that records it to disk.
People can program some processing in Python and other languages (Java and MATLAB are demanded, urgh...), and experience has shown that in science, people either do the craziest stuff or want to do some kind of crazy stuff - so for some modules, process isolation is extremely beneficial. Giving users the ability to crash the whole data acquisition pipeline is not a great thing, and it has happened in the past that people just locked up the system with a simple
So, no iceoryx2 for me yet :-/ (combining Rust with C++ has also been a major pain for me in the past, but I guess you might have solved all that already :-D). According to the docs, iceoryx2 also doesn't support req/rep patterns for communication yet, only pub/sub - and we'd need both...
All that would be needed is to either re-execute RouDi, or at a later time have a private RouDi for Syntalos, so other applications could use Iceoryx in peace without conflicting with Syntalos (fortunately, nobody in our space uses Iceoryx yet, so there are no collisions right now - but ROS uses it, which could become an issue at some point).
I thought about doing exactly that, but it felt like an insane hack, at which point I decided I'd better ask you first in case there was a better way ;-)
So, I think this would actually be the solution I need! Syntalos knows all the modules it has in advance, and can also guess fairly well how much data is in flight at a time and how big it is. Both the module processes as well as the master process also have queues, so using the wait-for-publisher/consumer options of Iceoryx, I could ensure that I never run out of memory. Restarting the separate module processes or asking them to reattach is also a possibility.
So, in other words: if I could detach the main application from RouDi, calculate sensible mempool sizes, re-exec RouDi and then reattach to it before a data acquisition pipeline is launched, it would actually solve my problem without any drawbacks - at least none that I can currently see :-)
Okay, that was quite a verbose reply... But I guess you know very well now what I am trying to do :-D
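For the "calculate sensible mempool sizes" step, a back-of-the-envelope sketch of how an application like Syntalos could derive chunk counts from what it knows about the pipeline. All names and the worst-case formula here are illustrative assumptions, not an iceoryx API:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of one data stream, known in advance.
struct StreamSpec
{
    std::size_t payloadBytes;    // e.g. one video frame
    std::uint32_t subscribers;   // consumers of this stream
    std::uint32_t queueCapacity; // per-subscriber queue depth
};

// Round the payload up to the pool's chunk granularity (e.g. page size).
std::size_t roundUpTo(std::size_t value, std::size_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

// Assumed worst case: every subscriber queue is full, every subscriber
// additionally holds one sample it is still processing, and the
// publisher holds one chunk it is currently filling.
std::uint32_t requiredChunks(const StreamSpec& s)
{
    return s.subscribers * (s.queueCapacity + 1) + 1;
}
```

With figures like these, the app can write out a mempool config per distinct chunk size before (re-)starting RouDi, instead of over-provisioning gigabytes up front.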
-
We are always happy to help academic projects :)
The experimental API on master introduced a domain id, which makes it possible to run multiple RouDi instances in parallel. There would even be the option to make
Depending on how fast you want to react when memory becomes available, there is another solution. We publish some introspection data every second. You could subscribe to the memory topic and wait until the specific mempool has some free chunks again. You probably won't win much by this, except that the terminal is not flooded with error messages.
Yes, this could work although you might need to hold quite a lot of chunks in reserve without needing them since the blocking will only kick in if the queue is full and the subscriber can cache additional
You do not need to restart RouDi; you can also run it as part of your main application. This could simplify a few things, but you have to take care of a few others. It is also important that no application is still registered at RouDi when it shuts down. If you cannot guarantee this, an option would be to extend RouDi to shut down when the last app unregisters. You could then just start a second RouDi and forget about the first one. The only problem would be how to communicate the latest domain id to all the applications trying to connect to RouDi.
The experimental API currently also supports only pub-sub, but it isn't difficult to add req-resp; I just did not have time yet. If you want, you could try to create a patch. I could give you some hints on what needs to be changed, and pub-sub is a good place to pick up some ideas for the implementation. I'd like to have it in the 3.0 release but was quite busy with the founding of a company in the last weeks. Luckily, the bureaucracy is mostly done and there is more time for coding again. Funnily enough, someone had a similar request. These two comments should help you with the experimental API and RouDi. If you need more help, don't hesitate to ask.
-
That sounds very neat! We are building our own RouDi anyway and launch and terminate it with the main application. So compiling in a prefix would be fine...
That would be nice in theory, but if we have a camera acquiring images at 600 fps, even one second of delay is pretty bad and could trigger an internal safety check that would stop the data acquisition run (occasionally, some modules are simply not fast enough to handle the requested volume of data, and that's okay. It's only an issue if the delay is caused not by the module itself, but by Iceoryx not being able to dole out the requested chunks fast enough because e.g. they weren't configured with the appropriate size, or another module is using them all up). Furthermore, the appeal of Iceoryx is pretty much the low latency (among some other things), so waiting for a second would be a bit sad...
Oh no, that sounds like an issue... But I am not sure if I understand this correctly: Assume both pub and sub are blocking and have a queue size of 1 set, and I have 4 chunks available. Pub loans a chunk (3 left), fills it, and sub receives it, but is slow processing it. Pub loans another chunk (2 left) and sends it, but sub is still not done, so we now have sub processing one chunk and one chunk in-flight in the queue. If pub now tries to get another chunk, it should wait, shouldn't it? So I would need
What I want to avoid at all cost is silently losing data, which seems to be a harder issue than I anticipated, actually...
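The behaviour described in this scenario, a publisher that blocks instead of dropping once the queue is full, can be modelled with a generic bounded blocking queue. This is plain C++ and purely illustrative; it is not iceoryx's shared-memory implementation:

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Generic bounded queue: push() blocks when full, pop() blocks when
// empty, so no sample is ever silently dropped between pub and sub.
template <typename T>
class BlockingQueue
{
  public:
    explicit BlockingQueue(std::size_t capacity) : m_capacity(capacity) {}

    void push(T value)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notFull.wait(lock, [&] { return m_queue.size() < m_capacity; });
        m_queue.push_back(std::move(value));
        m_notEmpty.notify_one();
    }

    T pop()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_notEmpty.wait(lock, [&] { return !m_queue.empty(); });
        T value = std::move(m_queue.front());
        m_queue.pop_front();
        m_notFull.notify_one();
        return value;
    }

  private:
    std::size_t m_capacity;
    std::deque<T> m_queue;
    std::mutex m_mutex;
    std::condition_variable m_notFull;
    std::condition_variable m_notEmpty;
};
```

The trade-off is exactly the one discussed here: lossless delivery means back-pressure, so a slow (or malicious) consumer can stall the producer.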
You mean in the same process? Having RouDi as a child process of the main app and then having the app connect to it is quite neat right now, as in case of any kind of crash, RouDi can always shut down cleanly and remove the shared resources.
I can most likely guarantee that all processes connected to RouDi are shut down, except for the main app; that one has to be kept running (and would have to detach).
I would have to get proficient in Rust first - I have only played with it for a few toy projects, and this would be way more complicated than what I did before...
Neat! Good luck to you and ekxide :-) I hope I can find the time to look at the experimental RouDi API soon! Maybe it solves this issue without me having to give my users a high-stakes "guess the right chunk size!" game.
-
Indeed, this would kill your latency. In this case, the "polling" would be the only option. I think it would be possible to add a notification mechanism to the MemPools to be able to use a
Since we want to become more flexible with iceoryx2, we are looking for use cases, and I'd like to know if this is something that would help you:
This would be pretty memory efficient and would not suffer from fragmentation but a bad actor could block the whole chain.
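As a rough idea of what a notification mechanism on the mempool could look like: the sketch below uses `std::condition_variable`, whereas the actual iceoryx mempools live in shared memory and would need a cross-process primitive, so this shows only the shape of the idea, not an implementation:

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical pool that signals waiters when a chunk is returned,
// so consumers can block with a timeout instead of polling.
class NotifyingPool
{
  public:
    explicit NotifyingPool(std::size_t chunks) : m_free(chunks) {}

    // Blocks up to 'timeout' for a chunk to become free.
    bool allocate(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        if (!m_chunkFreed.wait_for(lock, timeout, [&] { return m_free > 0; }))
        {
            return false; // timed out; the caller can report the overload
        }
        --m_free;
        return true;
    }

    void release()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            ++m_free;
        }
        m_chunkFreed.notify_one();
    }

  private:
    std::size_t m_free;
    std::mutex m_mutex;
    std::condition_variable m_chunkFreed;
};
```

The timeout matters for the 600 fps use case above: it bounds the wait so a starved allocation surfaces as an explicit error instead of an indefinite stall.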
I think I was not fully clear in my answer. In this case pub would still be able to get a chunk but would be blocked in
It is per connection. You can have other pub-sub connections which would still be able to take samples, or a req-res pair. The
With the experimental API it would be possible to detach. This would at least give you an option. With RouDi as child process you do not even need to create a toml config but can create it programmatically.
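Either way, the mempool layout can be computed by the app itself. As a minimal sketch, here is a helper that collects pool entries and emits the equivalent of RouDi's TOML config format (`[[segment.mempool]]` blocks with `size` and `count`); the actual programmatic iceoryx API differs, this just shows the idea of generating the config from runtime knowledge:

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One mempool entry, mirroring the size/count pair in RouDi's TOML config.
struct MemPoolEntry
{
    std::uint64_t chunkSize;  // payload size in bytes
    std::uint32_t chunkCount; // number of chunks of that size
};

// Render the computed pools in RouDi's TOML config layout.
std::string toToml(const std::vector<MemPoolEntry>& pools)
{
    std::ostringstream out;
    out << "[general]\nversion = 1\n\n[[segment]]\n";
    for (const auto& p : pools)
    {
        out << "\n[[segment.mempool]]\n"
            << "size = " << p.chunkSize << "\n"
            << "count = " << p.chunkCount << "\n";
    }
    return out.str();
}
```

When RouDi is re-executed rather than embedded, writing this string to the config file path before launch would achieve the same "sized to the actual pipeline" effect.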
Right. In this case the introspection might help you to determine if all processes are detached.
I was unclear again. This would be a patch for the C++ based iceoryx. The Rust based iceoryx2 does not use a static runtime and also no central daemon.
Thanks :)
This is also possible with the current main branch. Additionally, you could easily hook into the logger and forward it to whatever fits your needs :) Another sweet improvement of the experimental API is the graceful handling of runtime errors. If the runtime (hidden behind the
-
Hi!
Thank you for making Iceoryx, it is a really nice tool :-)
I have the following issue that I could not find any solution for in the documentation or code yet: I have an Iceoryx system with a data producer, which provides camera data from a wide array of data acquisition devices, and some consumers of the data. If the system were static, I could define a memory pool that has a payload size of the expected camera video frame, and set a chunk count that makes it work for the expected framerate(s).
Unfortunately though, I have no idea which camera the user will attach, and how big the individual frames will be. Worse, I do not even know how many cameras will be attached. And if I load() a memory block and none is available, Iceoryx also seems to fail instead of waiting for an available memory block, which makes the system simply lose data (when the GUI is used, even "silently" to a user).
I could paper over this issue by just setting a large chunk size with a decently sized count, but then RouDi will hold gigabytes of shared memory which the OS could actually use a lot better for other things.
The primary application / data producer does know the expected resource demand, though, and in theory it could tell RouDi in advance. So:
Users could fiddle with the pool sizes manually until everything works, but dynamically altering them at runtime would be a lot nicer. The application has all information available for that, and RouDi is not used by anything else on the system but that application and the processes it shares data with (which is why I am looking forward to the multiple-RouDi support :-D).
Maybe you have an answer for this, even knowing that it isn't possible would already be valuable :-)
Best,
Matthias