Another day, another Check_MK plugin!
This one is inspired by smokeping, but it’s different because it doesn’t need smokeping to be installed. It does need the tool formerly known as Matt’s TraceRoute, aka mtr. It’s installed on all my machines by default and easily available in all distros that are worthy. Even pokemon OS has it 😉
The reason I wanted to build this plugin was first of all because of pretty graphs (of course!). The second reason was that my girlfriend had some network issues to figure out, and ping and DNS resolve times alone don’t paint a complete picture. This plugin makes some graphs that hopefully fill that void a bit 🙂
Now that you’ve skipped the last 2 paragraphs, here are some example graphs that I made while testing the plugin:
This is the plugin status per host on the service overview page of Check_MK. As you can see I configured multiple hosts.
Configuration is simple: install the agent plugin on the machine you want to run the pings from. Next, copy the example mtr.cfg to your Check_MK agent’s configuration directory, which in my case ends up as /etc/check_mk/mtr.cfg. In this file you make a section per host you want to ping, so for example:
# Options here, see the example for the default options
interval = 2
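To make the "one section per host" idea a bit more concrete, here is a minimal sketch of how such a file could be read. It assumes mtr.cfg is a plain INI-style file that Python's configparser can handle, and the host names below are placeholders I made up for the example. Check the example mtr.cfg shipped with the plugin for the real option names and defaults.

```python
import configparser

# Hypothetical mtr.cfg content: the section names are placeholder hosts,
# and only options mentioned in this post (interval, time) are used.
EXAMPLE_CFG = """
[www.example.com]
interval = 2

[192.0.2.1]
time = 300
"""


def read_hosts(text):
    """Return {section name: {option: value}} for every host section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Values stay strings here; convert them where they are used.
    return {host: dict(parser[host]) for host in parser.sections()}


hosts = read_hosts(EXAMPLE_CFG)
for host, options in hosts.items():
    print(host, options)
```

Every section becomes one host to ping, and anything not set in a section would fall back to the plugin's defaults.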
Once mtr.cfg is in place the plugin will automatically start mtr for every host the next time the agent is called. Because MTR takes a relatively long time to run (even though by default it ‘only’ does 10 pings), the runs are started in the background. Once a run is complete, its report is parsed the next time check_mk_agent is called, after which a new MTR run is started (unless you set time=300, in which case the plugin waits until 5 minutes have passed since the last run).
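The decision of whether to kick off a new run can be sketched roughly like this. This is not the plugin's actual code, just an illustration of the rule described above. The function name and its arguments are made up for the example, and I'm assuming the time option counts seconds since the previous run finished.

```python
import time


def should_start_run(running, last_finished, min_wait, now=None):
    """Decide whether a new background mtr run may start (illustrative only).

    running       -- True while a previous mtr process is still busy
    last_finished -- UNIX timestamp of the last completed run, or None
    min_wait      -- seconds to wait between runs (the `time` option), 0 = no wait
    """
    if now is None:
        now = time.time()
    if running:
        # Never start a second mtr for the same host while one is busy.
        return False
    if last_finished is None:
        # First call after mtr.cfg was put in place.
        return True
    return now - last_finished >= min_wait


# With time=300, a run that finished 100 seconds ago is too recent:
print(should_start_run(False, last_finished=900, min_wait=300, now=1000))
```

With the wait set to 0 a new run starts as soon as the previous report has been picked up, which matches the default behaviour described above.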
When the results come in the alarms will be generated if necessary and pnp4nagios creates some pretty graphs:
Above is the graph for the destination host, where you can see packet loss as black bars (scaled to match the height of your graph), plus the best, average and worst ping times. I’ve tried to make it a bit ‘smokey’, but rrdgraph is quite a beast to tame.
NOTE: I made these graphs for RRDTool 1.5 and higher since they use gradients, so if you have RRDTool 1.4 or older your graphs might look quite a bit less fancy or even fail completely. Sorry! 🙂
“But BenV! That graph looks like utter garbage!”
Oh yeah?! Patches are welcome! 🙂
The graph above is made for every hop between the source and destination. The hops don’t have alarms, but sometimes it’s interesting to see that certain hops are slow all of a sudden.
Which is another interesting discussion: how relevant are these hops in between?
Well, since routing can give you different results for every packet they’re not THAT interesting, especially since the hops will often drop your ICMP packets completely. On the other hand, if you see that all hops are suddenly slow (as opposed to only the destination being slow), you can draw certain conclusions from that about your own network/internet speed.
Just be aware that “hop 4” can be router A one time and router B the next time, just as the hopcount can vary.
This ‘hop overview’ shows all the average times for all hops including the destination host. Again, individual hops don’t mean much, but seeing all hops in a single graph together with the destination host can sometimes reveal interesting patterns. (such as a slow destination host)
As with my Check MK fail2ban plugin I’ve added the part so you can edit the parameters in WATO. Here you see the default values for the plugin.
If you click on the ‘MTR’ label under “Type of check” you can create custom rules to change the parameters for a specific host, like this:
Note that you can disable these alarms by setting them to 0.
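The "0 means disabled" rule boils down to something like the sketch below. It is only an illustration with names I picked myself, not the check's real code, and it assumes the usual Nagios-style states of 0 (OK), 1 (WARN) and 2 (CRIT).

```python
OK, WARN, CRIT = 0, 1, 2


def check_level(value, warn, crit):
    """Return a Nagios-style state; a level of 0 disables that threshold."""
    if crit and value >= crit:
        return CRIT
    if warn and value >= warn:
        return WARN
    return OK


# Packet loss of 50% with both levels set to 0 never raises an alarm:
print(check_level(50, 0, 0))
# The same loss with levels 10/25 is critical:
print(check_level(50, 10, 25))
```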
Of course now you want my plugin, so without wasting more bytes:
V0.5.2: mtr-0.5.2.mkp (25683 downloads) Agent updated to stop failing when mtr ran twice. I’ll try to fix this better in a future update, but at least it doesn’t crash anymore, and after 1 failed report it returns to properly parsing the specific host.
V0.5.1: — fsckup on my side, I patched an older version. Use 0.5.2 instead 😉
V0.5: mtr-0.5.mkp (76 downloads) Alarms can now actually be disabled by setting levels to 0. Better compatibility with older mtr / unexpected mtr output, and it copes better with mtr returning multiple lines for a hop (only the first line is parsed).
V0.4: mtr-0.4.mkp (63 downloads) Initial release
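For the curious, the "only first line is parsed" behaviour from V0.5 can be sketched like this. It is not the plugin's real parser. The sample text is made up in the style of `mtr --report` output, and the exact shape of the extra lines mtr prints for a hop is an assumption on my part.

```python
import re

# Made-up sample in the style of `mtr --report`; addresses and numbers are
# invented, and the unnumbered extra line for hop 2 is an assumed format.
SAMPLE = """\
HOST: myhost                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.0.2.1                  0.0%    10    0.4   0.5   0.3   0.9   0.2
  2.|-- 198.51.100.7              10.0%    10   12.1  11.8  10.9  14.2   1.0
        |-- 198.51.100.8
  2.|-- 198.51.100.9               0.0%    10   12.0  11.9  11.0  13.0   0.6
  3.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
"""

# hop number, host, Loss%, Snt, Last, Avg, Best, Wrst, StDev
HOP_LINE = re.compile(
    r"^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%?\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_report(text):
    """Return {hop number: stats}, keeping only the first line per hop."""
    hops = {}
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or unexpected output: skip it
        hop = int(match.group(1))
        if hop in hops:
            continue  # a later line for the same hop: ignore it
        hops[hop] = {
            "host": match.group(2),
            "loss": float(match.group(3)),
            "avg": float(match.group(6)),
            "best": float(match.group(7)),
            "worst": float(match.group(8)),
        }
    return hops


hops = parse_report(SAMPLE)
for number, stats in sorted(hops.items()):
    print(number, stats)
```

Skipping anything that doesn't look like a numbered hop line is also a cheap way to survive older mtr versions that print slightly different output.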