-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Socket read race condition with URC #175
Comments
So the main issue I think is that a socket can be closed by the modem, but this does not mean all data should be "freed" (in |
Sigh, I quickly tried to put a return statement in The modem does not support reading from a socket that was already closed. Meaning, the actual race condition is in the modem.. The MCU side must somehow read out the data faster than the modem receives the socket close from the server? Yet on AT, this is even at the -same time-, meaning it is impossible to read out this data (unless the server does not close the connection immediately afterwards?) |
Hmmm, I did wonder if this might be the case. Let me talk to the relevant people internally to see how they anticipated this would work. |
While I'm checking that, I guess this is what the "shutdown" state of a TCP socket is for, but the AT interface does not export the shutdown state of a socket out as far as the MCU. So, if you do NOT have control over the speed at which the remote server is closing the TCP connection, it might be that you must use PPP mode. I can't recall which MCU platform you are using but, if it is ESP-IDF, there is a preview branch of our proposed integration with PPP mode, on ESP-IDF, which passes internal testing, here: https://github.com/u-blox/ubxlib/tree/preview_cell_ppp_espidf_rmea If you are using ESP-IDF and you fancy going that way, instructions on how to build it can be found here: ...and an example of using it can be found here: ubxlib/example/sockets/main_ppp_espidf.c Line 189 in 8549e3e
Note that we've not release this yet [i.e. it remains subject to change, and further testing] as we want to get it working with Zephyr first, and Zephyr PPP support is still experimental (it seems to be fighting back). |
Well, that sort of defeats the purpose of having a network stack in the module, which is a huge benefit. If the module supports sockets over AT, I would expect them to be usable :) |
Understood: I'm trying to determine how this is meant to work within the module. |
Some other info:
However, on the LENA-R8, this seems to simply be refused. (Again, this is less relevant for me, as it should not even fix the issue, but .. it is also unexpected, so I thought I'd mention it) |
Are you doing this through
...or directly with your own AT commands? If the latter than I think it should be |
Understood. FYI, LENA-R8 is, internally, utterly different from SARA-R422 (in terms of SW as well as in terms of HW); when you tried this with SARA-R422 did it appear to hold-off sending the |
Yes, I think so. I now swapped hardware so I can double check later if I swap back, but if you look at older posts I made, like #165 (comment) You can see in the logs
It shows how the SOCL indeed comes just after the last outstanding byte was read + the connection was closed by the remote in the meantime. |
In that case I think the best thing to do is for me to report this as a LENA-R8 bug. However, that doesn't help you as the FW release cycles for our cellular modules are loooong (given certification requirements etc.). Do you have any influence over the behaviour of the server, any way to delay the socket close? |
Yes, I think this is indeed a modem issue, and please do report it:) I already noticed something in your release notes that the LENA-R8 did not support the httplib correctly. I think now manually doing HTTP over sockets won't be that trivial either, at least when the server closes the socket upon replying. But indeed, I also figured that waiting for the release was not going to be a fast path. |
FYI, not sure if it matters in this use-case or not, but one reason to have the server not close a connection immediately is, if this is with HTTPS, then the device has the opportunity to send further requests on the same connection (or alternatively close the connection) without having to go through all of the TLS security stuff again. In other words, putting the connection-closure responsibility on the device, with just a timeout-type connection closure on the server, gives that little but more flexibility/efficiency. Of course, if this is purely your FOTA server and the device is not going to use the connection any further, I can see why you do what you do, just thought I'd mention it. |
Hi Arnout: I can confirm that this has been reported as a LENA-R8 bug and a fix has been made. When a release with that fix in might be public I've no idea I'm afraid, it will depend on approval cycles and the like: would you like me to leave this open until then? |
Thanks for the feedback! |
OK: I will close this issue but, should I find out that a release is scheduled, I will update this issue anyway. |
Hi,
This is somewhat related to #168, but in short, I am seeing the following race condition:
Scenario:
The scenario is that the following happens on AT:
Meaning that we write to a socket, and immeidiately afterwards, the modem sends an URC with new data length + socket closes (because server has sent all the data)
Issue:
This results in the following race condition:
In
uCellSockRead
:The read code, first checks if the URC has executed, and has a length.
But as you can see, during the lock, the URC parsing came in, and only then determined the length.
What's worse:
The SOCL URC is parsed immeidately afterwards.
Which then executes, still while the other thread is yielding in atcallbacks
The result of this though, is that
The socket handle is set to -1, causing the library to poll an invalid socket for data (AT+USORD=-1,0), and the socket API fails to deliver the data.
(FYI, I noticed this while testing on the LENA-R8, while the SARA-R4 did not show this behavior. Presumably because the reply timing is sutbly different)
Kind regards,
Arnout
The text was updated successfully, but these errors were encountered: