Messages reaching network, not reaching server AGAIN

longtimer · July 16, 2020, 4:11pm

Last year, I raised the issue that one of our devices was sending data but nothing was delivered to our backend. The related issue was this one (Add tool to grab bytes sent through service).

We are currently encountering the same issue with a different device. The device has been sending messages for more than a year, but suddenly its messages are not reaching the backend, though we can see that they are reaching the network. Nothing has been delivered to our server from that device for 5 days. Our devices are not updated in the field and our service has not changed on the backend. Other identical devices are working properly and their messages are being delivered.

I ask again if there is any way to debug this as my previous support request was supposed to be passed to customer support but I heard nothing further and it was automatically closed.

Reuben · July 16, 2020, 4:40pm

When you say that you can see they are reaching the network, does that mean you are seeing on the dashboard that the device is connected and transferring data? If that’s the case, it doesn’t sound like a network issue.

Can the device ping any other servers?

How do you know the device isn’t reaching your server? Are you doing a packet trace and not seeing anything come through or is it just not being recorded by the application? Is it possible there’s an authentication issue there? If you are using TLS, have you confirmed the clock is set right on the device? Clock skew could cause a sudden failure like that.

longtimer · July 16, 2020, 4:56pm

I am seeing on the dashboard that the device is sending according to the results in the “Recent Data Sessions”. The number of bytes sent is consistent with what was sent previously when the device data was delivered.

The device is super simple and simply detects events at a site, sending the data over the network to our servers. It does not maintain a network presence aside from when it decides to send data.

I know that the data from that device isn’t reaching the server because we have the CoAP stack dumping out any messages that it receives and generating errors for anything it does not like. I have a client application and can send messages to the CoAP server to trigger either scenario. The device is simple, so no authentication is performed prior to receipt of the message. We validate the message and source after it is processed by the CoAP stack.

Are you able to confirm that the infrastructure from the packet network is sending out a message to the internet? All the dashboard seems to show is the connection from the device to the network and not any gateway communication. Even having this information would be helpful in debugging.

Reuben · July 16, 2020, 11:06pm

For security and resourcing reasons, we mainly just have access to session metadata and not actual content of messages. If you want to PM me your device ID I can take a look and see if there’s any extra information that might be helpful, but we mostly can see what you can see on the dashboard.

By the way, is this a custom server application or are you using something like AWS IoT?

longtimer · July 16, 2020, 11:21pm

We are using the Californium CoAP stack embedded within our own server.

I understand that you cannot see the message content, but what I see as missing functionality is some indication that the message has been sent to the internet. Right now, only half of the networking information is present in the dashboard in terms of knowing that the device has sent a message. I have worked on telecom to IP gateway software in the past and indicating that the message had been sent was an important part of the operations, administration and maintenance visibility. Without that feedback, the message could be dropped at the gateway and there is no feedback.

Reuben · July 17, 2020, 4:02pm

Thanks for sending the device ID over. We’ll see if we can spot anything on our end and get back to you.
One thing to note is that our network is currently built to be basically protocol/application agnostic, so from our end we don’t even really know if you’re sending a message via CoAP or making HTTP requests or whatever. All we see are bytes.
But yeah, let me see if there’s any useful information on the network side.

longtimer · July 17, 2020, 4:57pm

Thanks for your help. Can you tell me which part of the path the bytes you are showing represent? Is it the path from the device to the core network, from the device to the cell tower, etc? Knowing this and showing this in the dashboard would provide better understanding.

longtimer · July 21, 2020, 4:51pm

Hi Reuben,

While the network detail is being investigated, are you able to answer my last question?

Thanks,

Jason

Reuben · July 21, 2020, 4:56pm

It is bytes transferred over the local carrier’s network that are then reported to us.

Reuben · July 21, 2020, 5:01pm

So I’m actually looking into this right now. There doesn’t appear to be anything wrong with that SIM from a network perspective. We’re also seeing bytes go in both directions. The one thing you wouldn’t see on our dashboard yet is how the traffic splits up.
Your last session was 904 bytes on AT&T. 140 were upstream and 764 were downstream. The fact that bytes are going in both directions makes me think it’s getting some sort of response back.

longtimer · July 21, 2020, 5:15pm

Thanks for looking. If I understand correctly, this could support my hypothesis that the gateway gets the message but fumbles on it somehow. Do you have any visibility into the provider and packet gateway messaging? We are connecting with the hologram provider, but I am not sure what view you have there.

If you look at some of the smaller messages sent, how do they break down between upstream and downstream? From our standpoint, we would expect all to be the same.

Reuben · July 21, 2020, 5:27pm

Yes, that same breakdown happens a few times. You have a few that are 140/764 and then a couple that are smaller at 70/382. Maybe those only sent one message and the bigger ones are two?
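A quick check of the numbers quoted above is consistent with that guess. This is only arithmetic on the byte counts mentioned in this thread; the variable names are made up for illustration.

```python
# (upstream, downstream) byte counts per session, as quoted in this thread.
large_session = (140, 764)
small_session = (70, 382)

# The larger session's total should match the 904 bytes reported for the last session.
total_large = sum(large_session)

# If both directions scale by the same factor, the larger sessions look like
# the smaller exchange repeated that many times.
up_ratio = large_session[0] / small_session[0]
down_ratio = large_session[1] / small_session[1]

print(total_large, up_ratio, down_ratio)  # 904 2.0 2.0
```

Both directions are exactly double, which fits one message in the small sessions and two in the large ones, though byte counts alone cannot prove it.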

We can request some SIM diagnostics on the gateway but this usually involves pulling in a network engineer and tends to be more of a premium support feature. I’ll see what our customer success team thinks.

Have you run any packet traces on your server to see if anything is actually reaching it from the device?
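If a full packet capture tool isn't handy, a bare UDP listener can at least show whether any datagram arrives from the device's source address. The following is a rough sketch, not anything specific to our platform or to Californium: the function names are hypothetical, 5683 is the standard CoAP port and an assumption about your setup, and the port has to be free, so the real CoAP server would need to be stopped or the device pointed at a spare test port.

```python
import socket


def open_listener(host="0.0.0.0", port=5683):
    """Bind a plain UDP socket (5683 is the default CoAP port; adjust as needed)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture_datagrams(sock, max_packets, timeout_s=5.0):
    """Collect (source_ip, source_port, size, hex_payload) for up to max_packets datagrams.

    Stops early if nothing arrives within timeout_s seconds.
    """
    sock.settimeout(timeout_s)
    seen = []
    for _ in range(max_packets):
        try:
            data, (ip, port) = sock.recvfrom(65535)
        except socket.timeout:
            break
        seen.append((ip, port, len(data), data.hex()))
    return seen


if __name__ == "__main__":
    listener = open_listener()
    for entry in capture_datagrams(listener, max_packets=10, timeout_s=60.0):
        print(entry)
    listener.close()
```

Because this sits below the CoAP stack, it records datagrams the stack would reject or silently drop, which helps separate "never arrived" from "arrived but was discarded".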

Reuben · July 21, 2020, 5:37pm

Spoke with the team. Can you email success@hologram.io and someone can help you out with getting those gateway diagnostics?

longtimer · July 21, 2020, 5:38pm

Thanks for the help, will do.

longtimer · July 21, 2020, 6:28pm

The message doubling makes sense. There are a few situations where we would send the message twice such as communications being slow or an unrecognized response. We do have packet tracing on the server port because we were not getting messages from the one device.
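For anyone reading raw bytes out of a trace like this, the fixed 4-byte CoAP header defined in RFC 7252 can be decoded by hand. Below is a minimal sketch, independent of Californium; the function name and the sample bytes are made up for illustration and are not taken from our device.

```python
import struct

# Message types from RFC 7252: confirmable, non-confirmable, acknowledgement, reset.
COAP_TYPES = {0: "CON", 1: "NON", 2: "ACK", 3: "RST"}


def parse_coap_header(datagram: bytes) -> dict:
    """Decode the fixed 4-byte CoAP header at the start of a UDP payload."""
    if len(datagram) < 4:
        raise ValueError("a CoAP message needs at least a 4-byte header")
    # Byte 0: version (2 bits), type (2 bits), token length (4 bits).
    # Byte 1: code as class (3 bits) and detail (5 bits).
    # Bytes 2-3: message ID, big-endian.
    first, code, message_id = struct.unpack("!BBH", datagram[:4])
    return {
        "version": first >> 6,
        "type": COAP_TYPES[(first >> 4) & 0x03],
        "token_length": first & 0x0F,
        "code": f"{code >> 5}.{code & 0x1F:02d}",
        "message_id": message_id,
    }


# Illustrative sample: a confirmable GET with message ID 0x1234 and no token.
print(parse_coap_header(bytes.fromhex("40011234")))
```

If a repeated send is a CoAP-level retransmission of a confirmable message, both copies should carry the same message ID, so comparing IDs in the trace is one way to tell a retransmission from a fresh send by the application.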