R410 socket drops on MQTT connections

Hi Everyone,

First, thank you for taking the time to read.

I am using a Sara R410M connected to an ESP32 module. I am very experienced with ESP32 but am new to cellular in embedded applications.

My application is a gateway device which uses the ESP32 as the main processor and the Sara R410 as the connectivity method to the Internet. We’ll be communicating data to various cloud services (Azure, AWS IoT, Losant, etc.) over MQTT sockets. We have already developed this product with WiFi connectivity (ESP32 standalone) but now wish to make a cellular variant.

I have written an ESP32-compatible library for the Sara R410 which implements the functionality of WiFiClient and WiFiClientSecure. The application instantiates an object of the library and then passes it to the MQTT PubSub client, which then handles connectivity and data transmissions to the cloud. Everything is working perfectly except that the MQTT socket connection keeps dropping out. I establish a connection to the cloud via MQTT, then read the connection status of the socket via the AT+USOCTL=0,10 command in a loop. It returns 4 (connection established) for about 23 seconds, then returns 9 (LAST_ACK status), so the application has to connect again.
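The status polling described above can be sketched as a small parser for the modem's reply. This is a minimal sketch, assuming the reply has the form `+USOCTL: <socket>,10,<status>`. Only codes 4 (established) and 9 (LAST_ACK) are confirmed in this thread; the other entries in the table are an assumption based on the usual TCP state numbering and should be checked against the u-blox AT commands manual. The names `parse_usoctl_tcp_status` and `needs_reconnect` are hypothetical helpers, not part of any library.

```python
import re

# TCP socket states reported by AT+USOCTL=<socket>,10.
# 4 and 9 come from this thread; the remaining codes are ASSUMED
# and should be verified against the AT commands manual.
TCP_STATES = {
    0: "INACTIVE", 1: "LISTEN", 2: "SYN_SENT", 3: "SYN_RCVD",
    4: "ESTABLISHED", 5: "FIN_WAIT_1", 6: "FIN_WAIT_2",
    7: "CLOSE_WAIT", 8: "CLOSING", 9: "LAST_ACK", 10: "TIME_WAIT",
}

# Matches e.g. "+USOCTL: 0,10,4" anywhere in the raw modem reply.
_USOCTL_RE = re.compile(r"\+USOCTL:\s*(\d+),\s*10,\s*(\d+)")


def parse_usoctl_tcp_status(reply):
    """Return (socket_id, status_code, status_name), or None if no match."""
    match = _USOCTL_RE.search(reply)
    if match is None:
        return None
    socket_id = int(match.group(1))
    code = int(match.group(2))
    return socket_id, code, TCP_STATES.get(code, "UNKNOWN")


def needs_reconnect(reply):
    """True when the reply is unparseable or the socket is not established."""
    parsed = parse_usoctl_tcp_status(reply)
    return parsed is None or parsed[1] != 4
```

Treating an unparseable reply as "reconnect" is a conservative choice; an application might prefer to retry the AT command once before tearing the socket down.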

Can anyone give insight as to why the socket keeps disconnecting after 23 seconds on the Sara R410? I am using a Hologram SIM with the data set to unlimited. I have tried this with 3 different cloud services, and each time the connection drops after 23 seconds or so.

Thank you

Update.

I have determined that the cause of the dropped connection is the MQTT Keep Alive timeout. The server (AWS in this case) is closing the socket because no data or Ping Request has been received within the timeout period specified by the client (ESP32 + R410), which was set to 30 seconds. I have determined that Ping Requests are being sent from the hardware to AWS, but a response is not always received. There seems to be a reliability issue over the cellular connection, but I am still trying to work that out.

Hi IOTrav, sorry to bother you. I’m working on a product that embeds a cellular modem as an alternative to the Ethernet one.
I’m in touch with several mobile operators in order to understand what is, for me, a new world.
The two things I’m striving to get information about are:

  • NAT translation table entries timeout
  • PDP Context timeout

Because both of them lead to a broken TCP connection, I’m working to understand the maximum ping period that I can use to keep the connection alive. This is directly related to the idle traffic generated just to keep the MQTT connection alive and working.
With European or, even worse, international roaming SIMs, the variety of policies used by the operators can force a very short ping period in order to make the application reliable. Up to now, the most restrictive operator that I’ve met is Vodafone, which requires a ping period of less than 60 seconds. With MQTT over TLS this can produce a monthly traffic that can be as high as ~14 MiB.
I’m wondering if your problem was related to the NAT or PDP timeout? Did you solve the problem by lowering the MQTT ping period?
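As a rough cross-check of the traffic figure above, the monthly keep-alive cost can be estimated from the ping period and the size of one ping exchange. This is only a sketch: the per-packet sizes below (40 bytes of IPv4 + TCP headers without options, 29 bytes of TLS record overhead, 2-byte MQTT PINGREQ/PINGRESP, one pure ACK in each direction) are assumptions, and real numbers depend on the cipher suite, TCP options, and how the operator counts bytes.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def pings_per_month(ping_period_s):
    """Number of keep-alive exchanges in a 30-day month."""
    return SECONDS_PER_MONTH // ping_period_s


def bytes_per_ping_exchange(ip_tcp_header=40, tls_overhead=29, mqtt_packet=2):
    """Bytes on the wire for one PINGREQ/PINGRESP round trip.

    All three default sizes are ASSUMPTIONS for illustration.
    """
    data_segment = ip_tcp_header + tls_overhead + mqtt_packet
    pure_ack = ip_tcp_header
    # One data segment and one ACK in each direction.
    return 2 * data_segment + 2 * pure_ack


def monthly_keepalive_mib(ping_period_s, **sizes):
    """Estimated monthly keep-alive traffic in MiB."""
    total = pings_per_month(ping_period_s) * bytes_per_ping_exchange(**sizes)
    return total / (1024 * 1024)


if __name__ == "__main__":
    print(pings_per_month(60))                   # 43200 exchanges
    print(round(monthly_keepalive_mib(60), 2))   # about 9.15 MiB
```

With these assumed sizes a 60-second period comes to roughly 9 MiB per month, the same order of magnitude as the ~14 MiB quoted above; larger TCP options or TLS overhead would push the estimate higher.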

Best regards
Mirko

Hello IOTrav,

I have a similar problem. I have a custom board with an ESP32 connected to a u-blox SARA-R410M-02B (firmware version L0.0.00.00.05.08 [Apr 17 2019 19:34:02]). I would like to check whether you were able to send data to AWS.
If so, how did you handle the data and the certificates? Did you store the certificates on the u-blox module or on the ESP32?
Thanks.

@Anna
I am using the R410M with a Raspberry Pi that connects to an MQTT agent on AWS. I am using a default Mosquitto agent on AWS (ubuntu), with no special or additional configuration. I configured AWS to allow incoming traffic on the default MQTT ports. While I have experienced the 23 second time-out mentioned above, I have had no problem sending to AWS.

@IOTrav
I looked into the 23 second issue last year and, if memory serves, it turns out that 23 is approximately 1.5 times the default keep-alive timeout (15 seconds). I believe the MQTT client algorithm will drop the connection after 23 (≈ 1.5 x 15) seconds without a ping response. I think this was due to either a super-busy agent or a weak link in the communications that introduced too much latency (probably the latter). In any case, the problem was transient and went away after a short time.
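The 1.5x arithmetic above can be written out as a short sketch. It assumes the grace window is one and a half times the keep-alive value, which is also the window the MQTT 3.1.1 specification gives a broker before it must close an idle connection. The function names are hypothetical.

```python
def keepalive_deadline_s(keep_alive_s, factor=1.5):
    """Seconds of silence tolerated before the connection is dropped."""
    return keep_alive_s * factor


def connection_expired(seconds_since_last_packet, keep_alive_s):
    """True once the silence exceeds the 1.5x keep-alive window."""
    return seconds_since_last_packet > keepalive_deadline_s(keep_alive_s)


if __name__ == "__main__":
    print(keepalive_deadline_s(15))    # 22.5, close to the ~23 s observed
    print(connection_expired(23, 15))  # True
    print(connection_expired(20, 15))  # False
```

A 15-second keep-alive gives a 22.5-second window, which matches the drop at about 23 seconds; a 30-second keep-alive would give 45 seconds instead.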

Thank you very much @davidm.