What is a watchdog?
A watchdog is an electronic timer used for monitoring hardware and software functionality. Software uses a watchdog timer to detect and recover from fatal failures.
In this tutorial, we will set up watchdog software on a Raspberry Pi. The same commands also work on other Linux systems, such as Debian or Ubuntu.
Why use a watchdog?
We use a watchdog to make sure we have a functional computer. If a problem comes up, the computer should be able to recover itself back to a functional state.
The purpose of the Meazurem Gateway is to listen for sensor data and upload the data to the cloud. So it’s crucial that the gateway is running and stays connected to the Internet.
In this setup, we are interested in network and process failures. We will configure the board to reboot if the network is down for too long, or if a specific process is no longer running.
Install the watchdog software
We are using the latest official Raspberry Pi Linux distribution called Raspbian. At the time of writing, the latest version is Raspbian Buster.
First, log in and configure Internet connectivity on your Raspberry Pi.
The next step is to install the watchdog software:
sudo apt install watchdog
Configure the watchdog to monitor a process
Most Linux software daemons write a PID (.pid) file. The file contains a process ID (PID) which identifies the running process. The PID can be used for killing the process or, in the case of the watchdog, for checking that the process is still running.
Starting from Meazurem Gateway version 1.4.2, the software writes the PID to ~/.meazurem/meazurem.pid. Make sure you start the gateway automatically at boot if you use process monitoring! See Meazurem Setup for instructions.
Let’s configure the watchdog to monitor the Meazurem Gateway daemon. Open the configuration file
sudo nano /etc/watchdog.conf
and add the following lines
# monitor Meazurem Gateway daemon by PID
pidfile = /home/pi/.meazurem/meazurem.pid
Once the watchdog is started, it will check that the Meazurem Gateway daemon is running. If the process dies, the watchdog will reboot the device by default.
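Before relying on the watchdog, it can help to confirm by hand that the PID file really points at a live process. The following is a minimal sketch, not part of the watchdog package: the function name `pid_alive` is hypothetical, and the check only approximates what the daemon does. Run it as the same user as the gateway (or as root), because signalling another user's process is refused.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process named in a PID file is alive.
pid_alive() {
    pidfile="$1"
    # The PID file must exist and be readable.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only tests that the process exists; nothing is killed.
    kill -0 "$pid" 2>/dev/null
}

# Default path taken from the tutorial; pass another path as the first argument.
if pid_alive "${1:-/home/pi/.meazurem/meazurem.pid}"; then
    echo "gateway is running"
else
    echo "gateway is NOT running"
fi
```

If this reports that the gateway is not running while it clearly is, check the path and the user before starting the watchdog, since a wrong `pidfile` entry would trigger a reboot.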
Configure the watchdog for network ping
Checking network connectivity is just as important as process monitoring. This is done by pinging an IP address.
First, check which network interface you want to monitor:
ifconfig
For the Wi-Fi interface, we will get something like
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.11  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 aa11::bb22:cc33:dd44:ee55  prefixlen 64  scopeid 0x20<link>
        ether aa:bb:cc:dd:ee:ff  txqueuelen 1000  (Ethernet)
        RX packets 395  bytes 116061 (113.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 345  bytes 50275 (49.0 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
It’s also possible to check any other interface, such as a 3G/4G modem. This is useful if you have the gateway at a remote location with wireless broadband.
Now, configure the watchdog
sudo nano /etc/watchdog.conf
with something like this, for example:
# use the correct interface from the previous step
interface = wlan0
# an internet or local address to test, for example 192.168.0.1
ping = 18.104.22.168
In this example, the watchdog monitors the wlan0 interface (used for Wi-Fi connectivity) and tests Internet connectivity by pinging the address given in the ping setting.
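It is worth confirming that the chosen address actually answers pings before the watchdog depends on it. The sketch below is a manual check only and is not how the daemon performs its tests internally: the function name `check_link` is hypothetical, and the `-c`, `-W` and `-I` options assume the Linux iputils version of `ping`.

```shell
#!/bin/sh
# Hypothetical helper: ping a target through one interface by hand.
check_link() {
    iface="$1"
    target="$2"
    # -c 3: send three probes; -W 2: wait at most 2 s for each reply;
    # -I: send through the given interface (Linux iputils ping).
    if ping -c 3 -W 2 -I "$iface" "$target" >/dev/null 2>&1; then
        echo "$target reachable via $iface"
    else
        echo "$target NOT reachable via $iface"
        return 1
    fi
}

# Example: sh check_link.sh wlan0 192.168.0.1
if [ "$#" -eq 2 ]; then
    check_link "$1" "$2"
fi
```

An address that does not reply reliably makes a poor ping target, because every missed reply counts as a failed test.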
Start and check the watchdog status
By default, the watchdog on Raspbian Buster uses a timeout of 60 seconds before rebooting, and it runs the configured tests once per second.
Our process is not that sensitive, and we want to cause a bit less network traffic. Let’s configure the watchdog in /etc/watchdog.conf to run the tests once every 30 seconds, and to reboot if a test keeps failing for longer than 300 seconds.
# timeout [sec] until reboot
retry-timeout = 300
# interval [sec] of testing
interval = 30
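As a rough sanity check on these numbers, the count of consecutive failed checks that fit into the retry window can be estimated with shell arithmetic. This is only an approximation of the daemon's behaviour, and the variable names are illustrative; they are not read from the configuration file.

```shell
#!/bin/sh
# Values copied from the example configuration (illustrative variables).
interval=30        # seconds between test runs
retry_timeout=300  # seconds a test may keep failing before reboot

# Integer division: how many test runs fit into the retry window.
checks=$((retry_timeout / interval))
echo "about $checks failed checks before reboot"
```

With these values a single dropped ping does not reboot the board; the failure has to persist across roughly ten test runs.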
Finally, start the watchdog service with
sudo service watchdog start
That’s it. You can check the status of the watchdog anytime by running
sudo service watchdog status
Test the watchdog functionality
It’s important to test the watchdog functionality after the setup. This is to make sure you configured it correctly and it works the way you intended!
Remember to do this step every time you change the configuration!
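One way to exercise the process test is to stop the monitored process on purpose and watch whether the board reboots once the retry timeout has passed. The sketch below assumes the gateway is not restarted automatically by another service manager in the meantime; the function name `stop_monitored` is hypothetical. Only try this on a device you can reach physically, in case the reboot does not happen as expected.

```shell
#!/bin/sh
# Hypothetical helper: stop the process named in a PID file so that the
# watchdog's pidfile test starts failing.
stop_monitored() {
    pidfile="$1"
    [ -r "$pidfile" ] || { echo "cannot read $pidfile"; return 1; }
    pid=$(cat "$pidfile")
    # Refuse to act on empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) echo "no valid PID in $pidfile"; return 1 ;;
    esac
    # Plain kill sends SIGTERM, asking the process to exit.
    kill "$pid" || return 1
    echo "sent SIGTERM to $pid at $(date)"
}

# Example: sh stop_monitored.sh /home/pi/.meazurem/meazurem.pid
if [ "$#" -eq 1 ]; then
    stop_monitored "$1"
fi
```

Note the printed time and compare it with the time of the reboot to see whether the delay matches the configured retry timeout.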
Reference: watchdog(8) - Linux man page, https://linux.die.net/man/8/watchdog