Difference between revisions of "Watchdog"

From wiki.netio-products.com
(12 intermediate revisions by 3 users not shown)
Latest revision as of 19:01, 19 June 2023

Watchdog diagram

Watchdog is a function that periodically pings one defined IP address or URL and checks for a reply (ICMP). You can use it to monitor the physical presence of an IP device or Internet connectivity. Several Watchdog functions can run in parallel.

Based on each Watchdog state, one or several Rules can be executed. Each Rule can perform several actions: Set Output, Short Off (restart) output, Toggle output, or send an Alarm state to the NETIO Cloud service. Based on this Alarm state, NETIO Cloud can send an email to a defined recipient. All Watchdogs are listed in the JSON protocol with their current states, so they can be used by 3rd party software.

The state of each Watchdog function is exposed as the FAIL variable.

  • Watchdog Fail = FALSE = ping answer is OK
  • Watchdog Fail = TRUE = ping answer not received
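
The way FAIL follows the ping results can be sketched in a few lines of Python. This is only an illustration, not the firmware's implementation: the function name `evaluate_fail` is hypothetical, and the sketch assumes that the failed pings counted against `maxTimeouts` (see Structure) must be consecutive and that a single successful ping resets FAIL to FALSE.

```python
def evaluate_fail(ping_results, max_timeouts):
    """Return the FAIL value after each ping.

    ping_results: iterable of booleans, True = reply received.
    max_timeouts: number of failed pings needed to declare FAIL=TRUE.
    Assumption: failures must be consecutive; one reply clears FAIL.
    """
    consecutive_failures = 0
    fail = False
    states = []
    for reply_received in ping_results:
        if reply_received:
            # A successful ping clears the failure counter and the FAIL state.
            consecutive_failures = 0
            fail = False
        else:
            consecutive_failures += 1
            if consecutive_failures >= max_timeouts:
                fail = True
        states.append(fail)
    return states


# One reply, two missed replies, then a reply again (maxTimeouts = 2).
print(evaluate_fail([True, False, False, True], max_timeouts=2))
```

With `maxTimeouts` set to 2, FAIL becomes TRUE only on the second missed reply and returns to FALSE on the next successful ping.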

FW 4.0.0+: When the network connection is not active, the Watchdog is not in operation. To detect these states, use Connectivity events in Rules.

Structure

Variable | Value | Description
target | IP / URL | Monitored address
pingInterval | int [s] | Time interval between pings
timeout | int [s] | Time to wait for a reply
maxTimeouts | int | Number of failed pings required to evaluate to FAIL=TRUE
timeToReboot | int [s] | Time the Watchdog waits after declaring an error condition before starting a new cycle
maxRestarts | int | Maximum number of restarts when an error condition is declared. After this limit is reached, the Watchdog remains in the FAIL=TRUE state until the next successful ping. If set to 0, the Watchdog restarts the output (triggers the rule) each time FAIL=TRUE is declared, not only when the FAIL status changes; this can cause indefinite restarts while the ping remains unsuccessful. If set to -1, the Watchdog restarts the output (triggers the rule) only once and then remains in the FAIL=TRUE state until the next successful ping.
startDelay | int [s] | Delay before the Watchdog starts after it is enabled or the device powers up (FW 4.0.0+)
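
The three cases of maxRestarts (positive limit, 0, and -1) can be compared with a small Python model. It is a sketch under stated assumptions rather than the device's actual logic: the helper names `restart_allowed` and `restarts_triggered` are hypothetical, and the model assumes the restart counter only grows while the ping keeps failing.

```python
def restart_allowed(max_restarts, restarts_done):
    """Decide whether another restart (rule trigger) may be issued.

    max_restarts > 0: at most that many restarts per error condition.
    max_restarts == 0: restart on every FAIL=TRUE declaration (no limit).
    max_restarts == -1: restart only once.
    """
    if max_restarts == 0:
        return True
    if max_restarts == -1:
        return restarts_done < 1
    return restarts_done < max_restarts


def restarts_triggered(max_restarts, fail_declarations):
    """Count restarts over a run of FAIL=TRUE declarations with no successful ping."""
    restarts = 0
    for _ in range(fail_declarations):
        if restart_allowed(max_restarts, restarts):
            restarts += 1
    return restarts


# Five FAIL=TRUE declarations in a row, with no successful ping in between.
for setting in (3, 0, -1):
    print(setting, restarts_triggered(setting, fail_declarations=5))
```

In this model, a limit of 3 stops after three restarts, 0 restarts on every declaration, and -1 restarts once.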

Example:

{
 "target": "192.168.101.130",
 "pingInterval": 5,
 "timeout": 1,
 "maxTimeouts": 2,
 "timeToReboot": 10,
 "maxRestarts": -1,
 "startDelay": 0
}
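
As a rough aid for choosing values, the example configuration can be loaded and checked in Python, and the time needed to declare FAIL=TRUE estimated. The function names and the timing formula are assumptions made for illustration: the estimate supposes that pings are sent every pingInterval seconds, that each missed reply is detected after timeout seconds, and that maxTimeouts counts consecutive failures. The device's real timing may differ.

```python
import json

EXAMPLE = """
{
 "target": "192.168.101.130",
 "pingInterval": 5,
 "timeout": 1,
 "maxTimeouts": 2,
 "timeToReboot": 10,
 "maxRestarts": -1,
 "startDelay": 0
}
"""

REQUIRED_INT_FIELDS = ("pingInterval", "timeout", "maxTimeouts",
                       "timeToReboot", "maxRestarts")


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def load_watchdog_config(text):
    """Parse a Watchdog configuration and check the field types from the table."""
    cfg = json.loads(text)
    if not isinstance(cfg.get("target"), str):
        raise ValueError("target must be a string (IP address or URL)")
    for name in REQUIRED_INT_FIELDS:
        if not _is_int(cfg.get(name)):
            raise ValueError(f"{name} must be an integer")
    # startDelay exists only on FW 4.0.0+, so it is treated as optional here.
    if "startDelay" in cfg and not _is_int(cfg["startDelay"]):
        raise ValueError("startDelay must be an integer")
    return cfg


def estimated_detection_time(cfg):
    """Seconds from the first unanswered ping to FAIL=TRUE (assumed model)."""
    return (cfg["maxTimeouts"] - 1) * cfg["pingInterval"] + cfg["timeout"]


config = load_watchdog_config(EXAMPLE)
print(estimated_detection_time(config))
```

Under these assumptions, the example above would declare FAIL=TRUE about 6 seconds after the first unanswered ping was sent: one further ping interval of 5 s plus the 1 s timeout of the second ping.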