Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qt Thread data exchanging broken #7268

Open
marcusmueller opened this issue Apr 15, 2024 · 3 comments
Open

Qt Thread data exchanging broken #7268

marcusmueller opened this issue Apr 15, 2024 · 3 comments

Comments

@marcusmueller
Copy link
Member

What happened?

We have a problem with getting events into QThreads; essentially, it seems that we can't just exchange data from non-QThread threads into QThreads. What does happen when we do this frequently enough is that we corrupt, probably the stack, enough for malloc tables to get hit. ouch!

I've been trying to slightly read up on the topic, but a lot of info on data exchange into QThreads seems outdated (Qt3-era); some seems wrong. The issue seems to be that Qt's object life time control falls flat if you EMIT a signal from a non-QThread. I can't believe it's impossible to get data into Qt, so this warrants further consideration.

Right now, however, I'm stumped how to do data exchange correctly; the classical "you've been doing it wrong" articles

are not conclusive on that (as far as I can tell). However, what is sure is that we can't just call arbitrary Qt setters from the context of the main (python) thread (or something like the xmlrpc server thread).

I suspect the answer here is work-intense:

  • wrap all setters on our QObjects (that is, all our xyz_forms, e.g. freqdisplayform) in slots
  • replace all setter usage in our GR sinks with signalling of these slots
  • potentially, not sure: wrap complex objects that need to be passed this way into QObjects themselves, so that Qt handles the life times without copy?

System Information

n/a

GNU Radio Version

3.11-git (main)

Specific Version

No response

Steps to Reproduce the Problem

see #6766 (as linked above)

Generally, hammering a qtgui sink's set_something function from a Python thread seems to do the job.

Relevant log output

No response

@willcode
Copy link
Member

True.

@marcusmueller
Copy link
Member Author

After quite a bit of rubber ducking, gdb stepping and source code reading:

Passing of complex large objects works by registering a creator function and the type. I suspect one would want to use a type wrapping a refcounting data holder. Then again, doing multi-thread-safe refcounting requires barriers that might be more expensive than just copying say a 16k element FFT window or similar.

@marcusmueller
Copy link
Member Author

The problem

  • Good news: the payload data transport isn't broken. It correctly wraps the data into a Qt Signal on the GR block side and sends it over to be handled within the Qt main GUI thread.
  • This isn't true for everything else that we exchange with the GUI thread, though
  • Our block setters "illegally" modify GUI state while it's being used

Why did that not occur to us before?

  • Classically, all the qt block interactions were very slow, which for a start reduced the likelihood of anything going wrong
  • In fact, most interactions with Qt GUI were done from within the Qt GUI main thread! You click on a widget, that leads to the even loop in the GUI main thread sending a signal, which then gets handled later in the same thread
    • even changing our Python flowgraph variables from say a range widget is calling a Python setter (which needs to acquire python lock) from within the GUI thread!
  • Now, we have things like people use XMLRPC to change Qt properties, and that actually triggers these bugs! But they've been there before as sporadic crashes, is my guess.
  • I think a lot of single-threaded scheduler ideas are also baked into the architecture; this might become a problem later on, as we fix the above issues

The approach

  • Setters need to be signal emitters in the Qt sense
    • A "signal" is just a function declaration in the C++ header that Qt's moc fills with code, which contains the logic of serializing / type erasing all the argument, putting them into Qt's equivalent of PMTs, and delivering them to the target's event queue.
    • a slot, conversely, is really just a normal function; nothing special about that, but Qt generates the code to de-serialize and thus type-restore the arguments and calling the function we wrote.
    • Which means, for custom objects, there needs to be a constructor that Qt can use for that
  • There are a lot of setters in gr-qtgui code that's public GNU Radio API (and not just public Qt API), and some of them correctly use setters, some don't
  • Some do "complex" things, requiring multiple signals to be sent
    • the functionality of these might need to get moved from the GNU Radio block, sending multiple signals to the Qt form/widget, to the target in some cases, to avoid situations where competing calls (e.g. from the XMLRPC thread) collide, but that is getting into the weeds of corner cases
    • the interesting part is where a block function (setter) needs to get some information from the GUI thread first to then interact with the GUI thread, see next

The getter sub-problem

  • Luckily, we haven't seen a crash on a getter yet – it stands to reason that an SDR application just isn't very prone to asking a sink which color its line markers are very often
  • It's still a problem, because we can't fix it with simple Qt Signal/slots: that's a one-way fire and forget mechanism, not RPC; Qt offers no good way to get data back from a slot
    • QConcurrency doesn't apply to non-QThreads, and GNU Radio doesn't run in a QThread
    • "Fire a signal, wait for reply signal" requires the block side to implement slots on the non-Qt side, and a state machine to continue whatever operation was started; also, doesn't solve the issue of what to do when the same getter was called twice before the reply from the main thread comes back
    • "Fire a signal, put the address of a ref-counted condition variable in, let the slot on the Qt GUI thread side hand back the result through that condvar" is at least thread safe, but it shoehorns the publisher / multi-subscriber model that Qt Signal/slots are into a publish/let the first responder define the value.

Recommended Solution

  • Get the setters consistently ported to Qt signals/slots.
  • The mechanics of that is typically, for each non-Qt class in gr-qtgui (that is, anything ending in _sink.*.h)
    • finding all QWidgets and QPointer<SomethingForm> held as members of the class
    • find all usages of d_found_member->method() or d_found_member.field in the impl.cc
    • if calling into a SomethingForm, check whether that itself emits a single signal.
      • If single signal, no direct meddling with the Qt thread data, fine.
      • If single direct call into what needs to be done from the GUI main thread, convert to signal (might require converting the setter to a signal that can be emitted!)
      • If (multiple calls or signals) that need to happen in context in one go, check whether you need to acquire a lock from the Qt main thread, or move the complex operation into the slot you're calling
  • ignore the getters for now, or use one of the two solutions sketched above

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants