Nova socket closed automatically?


#1

The following code snippet works fine… but only for a few hours… then I get the dreaded “Socket error: timed out” in the console:

hologram = HologramCloud(dict(), network='cellular')
result = hologram.network.connect()
if result == False:
    print 'Failed to connect to cell network'
else:
    try:
        hologram.openReceiveSocket()
        while True:
            i += 1
            ts = time.time()
            timestamp = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
            receivedMsg = hologram.popReceivedMessage()
            connect_status = str(hologram.network.getConnectionStatus())
            print str(PID) + ': ' + connect_status + ' ' + timestamp + " received: " + str(receivedMsg)
            if len(str(receivedMsg))>4:
                result = parseSMS(receivedMsg)
                # we never get here, console says socket timed out
                # even though the connect_status = CLOUD_CONNECTED

If I restart the Python script, all is well again and data is received immediately.

Any thoughts on a) the nature of the problem, and b) how to ensure the socket is still open?


#2

More data. While the script above is running (and getting socket errors on the dashboard), if I run the bash command
sudo hologram modem connect -v
then the code above starts working again, meaning that it receives data from the cloud.
So two points:

  • why does it time out…
  • and why is network.getConnectionStatus() reporting CLOUD_CONNECTED when it clearly isn’t?

#3

Hello Mixie,

It looks like there might be a few things at play here. First, the aging time for a connection is 60 minutes, so you need to make sure that data is transmitted or received within that time frame to keep the connection alive. If the connection is not kept alive, the network forces a disconnect, which leads to the issue you describe: you can’t connect, but the status still shows as connected. We are working on fixing this issue.

Best,
Maiky


#4

Hello Maiky,

I wasn’t aware of the 60 minutes timeout. I need to listen 24x7.
Until a fix is available, is there a code option to increase the timeout?
Or do I need to disconnect and reconnect every, say, 55 minutes?
Or is there another way to detect if the socket is truly alive?

Thank you,
Francois


#5

Hey Francois.

The aging time is based on our partner carrier’s specs, so we cannot modify it. However, your session will only be closed if no data is transmitted or received. The simplest way around this would be to send a heartbeat before the 60th minute.

To clarify, the bug here is the wrong status indicator which is being worked on.
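
For illustration, the heartbeat idea could be sketched roughly as below. This is only a sketch under assumptions: the `Heartbeat` class, its method names, and the 55-minute interval are made up for this example and are not part of the Hologram SDK. The `send` callable stands in for whatever actually transmits data, for example a one-byte `hologram.sendMessage(...)`.

```python
import time

# Assumption: 55 minutes leaves some margin under the 60-minute aging time.
HEARTBEAT_INTERVAL_S = 55 * 60


class Heartbeat(object):
    """Calls send() once the link has been idle for interval_s seconds."""

    def __init__(self, send, interval_s=HEARTBEAT_INTERVAL_S, clock=time.time):
        self._send = send              # hypothetical callable that transmits data
        self._interval_s = interval_s
        self._clock = clock            # injectable so the timing can be tested
        self._last_activity = clock()

    def note_activity(self):
        """Record that real traffic was just sent or received."""
        self._last_activity = self._clock()

    def tick(self):
        """Send a heartbeat if the idle time has run out. Returns True if sent."""
        if self._clock() - self._last_activity < self._interval_s:
            return False
        self._send()
        self._last_activity = self._clock()
        return True
```

In a receive loop this might be wired up as `hb = Heartbeat(lambda: hologram.sendMessage("!", topics=["heartbeat"]))`, calling `hb.tick()` on every pass and `hb.note_activity()` whenever a real message goes through.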

Best,
Maiky


#6

Hello Maiky,

Ah, the carriers… I understand the bug is about incorrect status indicator.
So I should just do a sendMessage with one byte every 55 minutes? Or you meant something else for the heartbeat?

Best,
Francois


#7

Hello Maiky,

As far as I can tell, it is not every 60 minutes.

At 06:58 I sent
recv = hologram.sendMessage("!", topics=["heartbeat","pigate"])
At 07:34 (36 minutes later)
the message was not sent to the Nova.

The Dashboard console says: Last active 8 hours ago on AT&T Mobility - CG
Which is incorrect because a message was received at 06:58

What do you suggest?
Best,
Francois


#8

Hey Francois,

What caused you to get the Socket error at 07:34? Did you send a message from or to the Pi? Also, was this an SMS message or something sent through TCP?

Best,
Maiky


#9

Hi Maiky
At 07:34 I ran my code (that normally works) that does a POST with a JSON message.
$url = "https://dashboard.hologram.io/api/1/devices/messages/$therealdevice/$mydevicekey";
$jsonData = array(
    'deviceid' => $deviceid,
    'protocol' => 'TCP',
    'port' => 4010,
    'data' => $mydata
);
$jsonDataEncoded = json_encode($jsonData);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $jsonDataEncoded);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
$result = curl_exec($ch);

I can see the message posted on the Dashboard console.

Thanks,
Francois


#10

Update:
A test on 3/14 at 7:54am worked.
After the test I am now sending a packet from the device every 29 minutes. I see all those packets in the Dashboard Console.
I tried a test today, 8 minutes after the last packet sent.
–> Socket error: timed out (see image below)
So it looks like

  1. Neither 60 minutes nor 29 minutes is the carrier’s socket timeout.
    –> is it some other number?
  2. The Dashboard Console is incorrect: how come it shows inbound traffic today at 10:38am yet shows that my Nova was “Last active 2 days ago on AT&T Mobility - CG” (see second screenshot)
    –> so please allow me to be skeptical that the disconnected socket is on AT&T’s side.

Any idea?



#11

Here’s something even more interesting: the device reliably sends packets but cannot receive them.
So how could the socket be in a state of “timed out”?


#12

Used a brand new Nova with a different SIM. Same result.


#13

Hey Mixie,

There is a documented lag on the Dashboard when it comes to usage data, so I wouldn’t read too much into the “Last active” value when working out what is going on here. We are working on getting this as close to real time as possible.

Could you please provide all your code so we can take a look into this?

Best,
Maiky


#14

Hey Maiky,

Gist Nova – you should be able to make it run easily with minor changes (IDs, etc.).
I don’t think the issue is related to the delay in the Dashboard.
Fundamentally I don’t get why the Nova can send data to the Dashboard but when sending to the Nova a few minutes later the socket is timed out.
The problem only occurs after a few hours of inactivity. I can’t tell how long because the connection status call is not working :wink:

Regards,
Francois


#15

Hey Francois,

The first thing I would recommend here is to make sure you update your SDK as frequently as possible to make sure you are on the latest version. We are constantly updating it, so that will make sure you have the most streamlined experience.

As far as your code goes, I don’t see anything wrong. However, as a fail-safe I would recommend building in a timed or SMS-based reconnect feature to make sure your Nova attempts to reconnect without you having to be physically near it.
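
A timed reconnect along those lines might look something like the sketch below. It is a rough illustration, not SDK code: the function name, the retry count, and the delay are assumptions. It only relies on `connect` returning a true value on success, which is how `hologram.network.connect()` is used earlier in this thread.

```python
import time


def reconnect_with_retries(disconnect, connect, attempts=3, delay_s=5,
                           sleep=time.sleep):
    """Tear the session down, then retry connect() until it succeeds.

    Returns True on success, False if every attempt failed.
    """
    try:
        disconnect()
    except Exception as exc:
        # A failed teardown should not stop us from trying to reconnect.
        print("disconnect failed: %s" % exc)
    for attempt in range(1, attempts + 1):
        if connect():
            return True
        if attempt < attempts:
            sleep(delay_s)  # wait before the next attempt
    return False
```

It could be called on a timer as `reconnect_with_retries(hologram.network.disconnect, hologram.network.connect)`. Passing the two methods in as callables keeps the retry logic testable without a modem attached.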

With regards to sending and receiving messages with the Nova, these are two separate processes so I wouldn’t assume one should work if the other one is working.

I showed this post to Dom, one of our full stack engineers, who might be able to add a little more context to this.


#16

Regarding what I talked about with Maiky:

  1. The send code is completely independent of whether or not a PPP connection has been established, so the fact that you are able to send messages and not receive them makes sense in that regard. If you have established a PPP connection, the send will use the PPP session; otherwise it uses AT commands on the modem to create a temporary socket connection and send the message. This saves on message overhead, so your data usage is lower for sending messages.

  2. We are aware of the disconnection issue and a fix is in internal review/QA. Hopefully this will make it into the next SDK release.

  3. The SMS solution Maiky mentioned works because a data connection is not required for the Nova/modems to be notified by the tower that a message is in the queue. So long as you have a signal and SMS is supported by your device/country, you should be able to receive SMS. https://hologram.io/docs/reference/cloud/python-sdk/#-popreceivedsms- You could use this as a flag that a message is awaiting the device online, or as a way to read in a message so long as the length is under the max.
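
As a rough sketch of the "SMS as a flag" idea in point 3: the helper below drains whatever SMS are queued and returns them, so the caller can treat a non-empty result as a wake-up signal. The helper name and the `max_messages` cap are made up for this example, and it assumes the pop function returns None when the queue is empty, which should be checked against the SDK documentation linked above.

```python
def drain_sms(pop_sms, max_messages=50):
    """Pop queued SMS until the queue reports empty (None).

    pop_sms is any zero-argument callable, for example the SDK's
    popReceivedSMS method (assumed here to return None when empty).
    Returns the popped messages in arrival order.
    """
    messages = []
    while len(messages) < max_messages:  # cap guards against an endless loop
        sms = pop_sms()
        if sms is None:
            break
        messages.append(sms)
    return messages
```

A main loop might then do something like `if drain_sms(hologram.popReceivedSMS): holo_reconnect()`, using the arrival of any SMS as the cue to bring the data connection back up.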


#17

Hey Dom,

  1. That makes sense. Send works independently from receiving.
  2. I think this fix is critical for devices that primarily receive before doing something, in particular because getConnectionStatus is broken.
  3. We are not using SMS. And network.reconnect does not work – the hologram object is destroyed already.
    The answer in code (so far…) is to try closeReceiveSocket and disconnect every 52 minutes and, if that doesn’t work (because the hologram object is None), find the pppd process, kill it, then reconnect.
    Kinda ugly…

def holo_reconnect():
    try:
        # is Hologram still here?
        print "about to do closeReceived"
        hologram.closeReceiveSocket()
        print "closeReceived"
        print "about to do network.disconnect "
        hologram.network.disconnect()
        print "network.disconnect"
        time.sleep(5)
    except NameError:
        # we have to kill the PID
        for process in psutil.process_iter(attrs=['name']):
            if process.info['name'] == 'pppd':
                print "Process pppd found. Terminating it."
                process.terminate()
    print "about to instantiate hologram"
    hologram = HologramCloud(dict(), network='cellular')
    print "instantiated"
    print "about to connect"
    recv = hologram.network.connect()
    print "connected"
    return recv


#18

Regarding #2, that will be fixed in the next SDK release. The patch has been applied internally so that a disconnection from the tower is handled properly. It will also broadcast an event, so you will know when the disconnect happens and can have your code handle the teardown or whatever else is needed. It should also attempt to reconnect with the tower.

For #3, that was more of a suggested alternative, since you can send SMS from our API to your device and that doesn’t require a persistent connection.

That is an interesting bug you bring up. I will have to raise it with the SDK team to see why the hologram object would possibly be None there. A disconnect should not destroy the object, and you should not have to create a new hologram object.


#19

Getting a broadcast event would be great – please provide a Gist to illustrate how to implement properly.
Is there a way to get notified when the next SDK will be available?


#20

If you star or follow the repo on GitHub, it will notify you when we release a new version, etc.

As for the event system, it has been there for a few releases at least, and if you want to see how it’s implemented you can look here:

To do that locally you could probably just do something like:

from Hologram.Event import Event

# define the handler before subscribing so the name exists
def handleDisconnect():
    # do some stuff here
    pass

event = Event()
event.subscribe('cellular.disconnected', handleDisconnect)

I wrote that off the top of my head, so you might have to change the import and the event object creation. Also, the new event should be cellular.forced_disconnect, but I just wanted to show you how the event system should work.
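
To show the subscribe/broadcast pattern without depending on the SDK at all, here is a minimal stand-in. The `EventBus` class is hypothetical and written only for this illustration; it is not the SDK's event object, whose import path and behaviour may differ. The handler just sets a flag that a main loop could check before deciding to tear down and reconnect.

```python
class EventBus(object):
    """Hypothetical stand-in for an event object, for illustration only."""

    def __init__(self):
        self._handlers = {}  # event name -> list of callbacks

    def subscribe(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def broadcast(self, name):
        # Call every handler registered for this event name.
        for handler in self._handlers.get(name, []):
            handler()


state = {"needs_reconnect": False}


def handle_disconnect():
    # Keep the handler tiny: record the event, let the main loop react.
    state["needs_reconnect"] = True


bus = EventBus()
bus.subscribe("cellular.forced_disconnect", handle_disconnect)

# Simulate the disconnect notification arriving.
bus.broadcast("cellular.forced_disconnect")
```

Setting a flag rather than reconnecting inside the handler keeps the slow work (killing pppd, reconnecting) in one place in the main loop.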

I have a hunch about your hologram object being None. Is your hologram object a global variable? How is it being passed to that function? If it’s an instance variable, it would make sense that it is None. If it is global, make sure you mark it as global in that function with global hologram.
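
The scoping rule behind that hunch can be shown with plain Python and no modem. Assigning to a name anywhere inside a function makes that name local to the whole function, so reading it before the assignment raises UnboundLocalError. Because UnboundLocalError is a subclass of NameError, an `except NameError` block will catch it, which can make a perfectly healthy global object look as if it had vanished. In this sketch `conn` is just a string standing in for the module-level hologram object; whether this is what actually happens in the posted function cannot be confirmed from the thread alone.

```python
conn = "connected"  # stands in for a module-level hologram object


def reconnect_broken():
    try:
        conn.upper()  # conn is local here because of the assignment below
        outcome = "no error"
    except NameError:  # also catches UnboundLocalError
        outcome = "caught NameError"
    conn = "reconnected"  # this assignment is what makes conn local
    return outcome


def reconnect_fixed():
    global conn  # refer to the module-level name instead
    try:
        conn.upper()
        outcome = "no error"
    except NameError:
        outcome = "caught NameError"
    conn = "reconnected"
    return outcome


broken_result = reconnect_broken()
fixed_result = reconnect_fixed()
```

With the `global` declaration in place, the function both sees the existing object and updates the module-level name when it assigns a new one.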