pppd fails when running early in boot process

Hello,

I’ve been running into a problem lately that’s been a bit elusive. I’m using a Nova with a Raspberry Pi and trying to configure the system to establish a cellular connection on boot using systemd. Unfortunately, when the connection is attempted during the boot process, it fails with “timeout sending Config-Requests” error messages. If I disable the systemd service unit, however, and then start the service manually after SSHing into the Pi, the connection works as expected. It seems like some dependency isn’t ready at boot, but I can’t figure out what it is. Does anyone have any pointers?

systemd service description:

[Unit]
Description=Cellular Network through PPP
ConditionPathExists=/dev/ttyUSB1
After=dev-ttyUSB1.device network.target

[Service]
ExecStart=/usr/bin/pon nova-m
WorkingDirectory=/home/pi
StandardOutput=inherit
StandardError=inherit
User=root
Type=forking

[Install]
WantedBy=multi-user.target

Error messages at end of logs during failure:

Feb 18 19:13:56 raspberrypi pppd[550]: Serial connection established.
Feb 18 19:13:56 raspberrypi pppd[550]: Using interface ppp0
Feb 18 19:13:56 raspberrypi pppd[550]: **Connect: ppp0 <--> /dev/ttyUSB1**
Feb 18 19:13:58 raspberrypi pppd[550]: **PAP authentication succeeded**
Feb 18 19:14:42 raspberrypi pppd[550]: **CCP: timeout sending Config-Requests**
Feb 18 19:14:42 raspberrypi pppd[550]: **IPCP: timeout sending Config-Requests**
Feb 18 19:14:48 raspberrypi pppd[550]: **Connection terminated.**
Feb 18 19:14:49 raspberrypi pppd[550]: **Modem hangup**

Chatscript and peers files were pulled from the hologram-io/hologram-tools repository on GitHub.

I have run into this, and what I end up doing is adding a delay before my code executes. I believe it has to do with the order in which the system starts these services, and the most consistent fix we’ve found is simply waiting a few seconds.
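For anyone who wants something slightly more targeted than a fixed sleep, here is a minimal sketch of the same idea: wait until the modem's device node exists, then pause a little longer before dialing. The helper name `wait_for_path` and all the timing values are my own assumptions, not something taken from the Hologram tools.

```python
import os
import time


def wait_for_path(path, timeout=60.0, interval=0.5, settle=0.0):
    """Block until `path` exists, then sleep `settle` more seconds.

    Returns True if the path showed up before `timeout` elapsed,
    False otherwise. All values are in seconds.
    """
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    # Extra grace period: the device node can appear before the modem
    # is actually ready to negotiate. The right length is a guess.
    time.sleep(settle)
    return True
```

A wrapper script could call something like `wait_for_path("/dev/ttyUSB1", timeout=60.0, settle=10.0)` and only run `pon nova-m` when it returns True. The 10 second settle time is a starting point to tune, not a measured value.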

Yes, starting too early causes ppp sessions to fail. I use a root crontab to start ppp and to verify that it is running and has a viable network connection.

Edit the root crontab schedule with $sudo crontab -e
*/5 * * * * /home/your account/ppLog/verify_ppp_active.

Every 5 minutes cron launches the verify program, which tests whether a ppp session is active and whether there is connectivity.

If all is true it exits.

If the ppp session is not present in the network interface list, the program invokes the sudo hologram Modem network connect or the older hologram modem connect.

If the ppp session is present but the connection fails to reach the internet, the current ppp session is terminated and a new connect cycle starts. Then the service simply exits.
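The three cases above boil down to a small decision table. A rough sketch, assuming the two checks (interface present, internet reachable) are done elsewhere and passed in as booleans; the `Action` names are made up for illustration and are not from the actual verify program:

```python
from enum import Enum


class Action(Enum):
    NONE = "none"            # everything is fine, just exit
    CONNECT = "connect"      # no ppp session, start one
    RECONNECT = "reconnect"  # session exists but is dead: tear down, redial


def decide_action(ppp_present, internet_reachable):
    """Map the two health checks to what the cron job should do next."""
    if not ppp_present:
        return Action.CONNECT
    if not internet_reachable:
        return Action.RECONNECT
    return Action.NONE
```

Keeping the decision separate from the commands that act on it makes it easy to test the logic without a modem attached.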

This allows recovery from failed network and tower issues for persistent connections. Sometimes those connections can be down for hours. I have devices throughout the US. I have used this for 3 years now and there was only one failed state that the service could not recover from on the Pi-2. This is when the cell modem gets locked up for some reason and only a power cycle can fix it. For the Pi-3 it is possible to power cycle the USB port only and then reconnect once the modem has come back up.

Thanks @CLO2! It sounds like you have a lot of in-the-field experience that is helpful here.

Would you mind sharing how you’re terminating the ppp session? I’ve seen some cases where I couldn’t get the ppp connection to connect after a failure (without a power reset), and so I’m wondering if maybe I wasn’t properly terminating the first session.

My alive process first searches the network interfaces using getifaddrs(&addrs), then walks down the list comparing the interface names, looking for “ppp0”. If that interface is not found, the ppp0 session ended for some unknown reason.
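For readers not working in C, a similar check can be sketched in Python. This is not the poster's code: it uses the standard library's `socket.if_nameindex()` instead of getifaddrs(), which lists every interface whether or not it has an address yet, so it may report ppp0 slightly earlier than a getifaddrs()-based check would. The function name and the `names` parameter (there only to make testing possible) are assumptions.

```python
import socket


def interface_present(name, names=None):
    """Return True if an interface called `name` exists.

    `names` can be supplied for testing; by default the live interface
    list is read from the operating system.
    """
    if names is None:
        # if_nameindex() yields (index, name) pairs for each interface.
        names = [ifname for _index, ifname in socket.if_nameindex()]
    return name in names
```

On the Pi, `interface_present("ppp0")` would stand in for the list walk described above.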

The alive process issues “$sudo hologram modem disconnect” and then reattempts “$sudo hologram modem connect” (network connect).

If that does not work, as you have pointed out, my service is a bit more aggressive.
It captures the task list with “$ps ax > task.txt”.

Then it searches the “task.txt” file for any task that contains both “pppd connect” and “/Hologram”. When a task matches both, it extracts the PID for the next step.

For every task that matches both, the program issues “$sudo kill -9 [PID extracted above]”.
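The PID extraction step might look roughly like the sketch below. It assumes the default `ps ax` layout, where the PID is the first whitespace-separated field on each line. The function name is hypothetical, and the two match strings are simply the ones quoted above.

```python
def find_pppd_pids(ps_output, needles=("pppd connect", "/Hologram")):
    """Return the PIDs of `ps ax` lines that contain every needle."""
    pids = []
    for line in ps_output.splitlines():
        if all(needle in line for needle in needles):
            fields = line.split(None, 1)
            # Skip anything that does not start with a numeric PID,
            # such as the column header line.
            if fields and fields[0].isdigit():
                pids.append(int(fields[0]))
    return pids
```

Each returned PID could then be passed to `os.kill(pid, signal.SIGKILL)`, which is the Python equivalent of `kill -9` and needs the same root privileges.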

After it has terminated all those tasks, it waits 30 seconds and then attempts to reattach.

If the attachment fails, it logs the failure in its own log and exits.

The task is run again by crontab 5 minutes later and attempts to reattach again. If that also fails, then even though crontab launches the task every 5 minutes, the task reads its log, determines that not enough time has elapsed, and exits without attempting to reattach until 30 minutes have passed since the last failed attempt. That process continues until it reconnects, which it will do at some point. The Pi3 is better: upon the second failure the USB reset is issued, and it will normally attach within a couple of minutes.
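The retry timing described here (one quick retry, then hold off for 30 minutes after each further failure) can be sketched as a small function. It assumes the log can be reduced to a list of timestamps, in seconds, for the consecutive failed attempts. The names and the `free_retries` parameter are illustrative, not taken from the real service.

```python
BACKOFF_SECONDS = 30 * 60  # 30 minutes, as described above


def should_attempt(now, failure_times, backoff=BACKOFF_SECONDS, free_retries=1):
    """Decide whether this cron run should try to reconnect.

    `failure_times` holds the timestamps of consecutive failed attempts,
    oldest first. The first `free_retries` failures do not trigger the
    backoff, so the next 5 minute cron run retries straight away.
    """
    if len(failure_times) <= free_retries:
        return True
    return now - failure_times[-1] >= backoff
```

With cron firing every 5 minutes, most runs during an outage exit immediately, and one real attempt is made roughly every 30 minutes.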

The log has shown me that sometimes the reconnection process can take hours. It would be much better if we could reset the cell modem on the Pi2 as we can on the Pi3. There are some cases where it needs a hard power down to reconnect.

I have had that happen in factories that have a lot of electrical noise and a weak cell connection. Issuing a reboot does not help in those situations; the cell modem needs a hard power down. So I deploy Pi3s, as their USB reset ability allows my control software to keep executing its primary tasks regardless of cell connectivity.

I hope that helps.
