BenV's notes

Check_MK plugin: lmsensors

by BenV on Jun.07, 2011, under Software

On the subject of Pretty Graphs (see my earlier post), I decided to write a plugin (a.k.a. ‘Package’) for Check_MK to monitor the sensor output of lmsensors (and make pretty graphs of it!).
Most machines support this out of the box these days, and it’s always interesting to see the condition of your machine. In case you don’t know, it reports the temperatures, voltages and fan speeds of your CPU and mainboard. Thus it’s a good source for making pretty graphs :)

Since these plugins are very easy to install (once you’ve got check_mk up and running, that is) and nobody had written one for lmsensors yet, I decided to do it myself. Writing Python isn’t my strongest point (yet), but these are good opportunities to learn.

One of the issues I ran into while writing the plugin was that PNP4Nagios fails on service names that have a plus character in them. For instance, I had Sensor +12V. This created the files Sensor_+12V.rrd and corresponding xml, but when one would go to the PNP4Nagios graph of that sensor it would request a file called Sensor__12V.rrd, which obviously failed.
Therefore I mangled the names a bit: characters like ‘+’ and ‘/’ are dropped, so your sensor might now simply be called MB12V instead of M/B+12V.
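
The name clean-up can be illustrated with a small sketch. This is not the code from the plugin: the helper name sanitize_sensor_name and the exact set of allowed characters are assumptions, chosen only so that the M/B+12V example comes out as MB12V.

```python
import re

# Hypothetical helper, not taken from the plugin: keep only characters that
# are unlikely to confuse PNP4Nagios when it maps a service name to a file.
_UNSAFE = re.compile(r"[^A-Za-z0-9_. -]")


def sanitize_sensor_name(name):
    """Drop characters such as '+' and '/' from a sensor label."""
    return _UNSAFE.sub("", name).strip()
```

With this rule sanitize_sensor_name("M/B+12V") gives "MB12V", while a label such as "CPU Voltage" passes through unchanged.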

Configuration of LM-Sensors
For my plugin to work you need to make sure that you have the ‘sensors’ tool. This normally comes in a package called “lmsensors” or “lm-sensors”. Note that you obviously need some kind of hardware sensor on your machine that’s supported by lm-sensors for it to work, including the required kernel module. Fortunately this will often work out of the box.
Once you have it installed, running ‘sensors’ should give output like this:

benv@localhost:~$ sensors
it8720-isa-0228
Adapter: ISA adapter
in0: +1.04 V (min = +0.00 V, max = +4.08 V)
in1: +1.66 V (min = +0.00 V, max = +4.08 V)
in2: +3.39 V (min = +0.00 V, max = +4.08 V)
+5V: +3.04 V (min = +0.00 V, max = +4.08 V)
in4: +3.10 V (min = +0.00 V, max = +4.08 V)
in5: +1.90 V (min = +0.00 V, max = +4.08 V)
in6: +4.08 V (min = +0.00 V, max = +4.08 V)
5VSB: +3.04 V (min = +0.00 V, max = +4.08 V)
Vbat: +3.30 V
fan1: 2235 RPM (min = 10 RPM)
fan2: 0 RPM (min = 0 RPM)
fan3: 2500 RPM (min = 0 RPM)
fan5: 0 RPM (min = 0 RPM)
temp1: +41.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
temp2: +36.0°C (low = +127.0°C, high = +90.0°C) sensor = thermal diode
temp3: +38.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
cpu0_vid: +1.250 V

k10temp-pci-00c3
Adapter: PCI adapter
temp1: +32.0°C (high = +70.0°C)
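
To make the structure of this output concrete, here is a rough sketch of how it could be parsed into values and limits per chip. It is only an illustration under my own assumptions (the function name parse_sensors and the regular expressions are made up here); the real agent plugin uses perl for this and may work quite differently.

```python
import re

# Assumed line shape: "<label>: <number> <unit> (<key> = <number> ..., ...)".
READING = re.compile(
    r"^(?P<name>[^:]+):\s+(?P<value>[+-]?\d+(?:\.\d+)?)\s*(?P<unit>\u00b0?C|V|RPM)\b"
)
LIMIT = re.compile(r"(\w+) = ([+-]?\d+(?:\.\d+)?)")


def parse_sensors(text):
    """Return {chip: {label: {"value", "unit", "limits"}}} from sensors output."""
    chips = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            current = None  # a blank line ends the current chip block
            continue
        match = READING.match(line)
        if match and current is not None:
            limits = {key: float(val) for key, val in LIMIT.findall(line)}
            chips[current][match.group("name")] = {
                "value": float(match.group("value")),
                "unit": match.group("unit"),
                "limits": limits,
            }
        elif ":" not in line:
            current = line  # chip header such as it8720-isa-0228
            chips[current] = {}
    return chips
```

Lines like "Adapter: ISA adapter" are skipped because no number follows the colon, and text such as "sensor = thermistor" is ignored because only numeric limits are collected.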

You might see things like ‘in0’ instead of ‘CPU Voltage’. Don’t ask me what voltage corresponds with what sensor on your mainboard, but you can rename the sensor output by editing /etc/sensors.conf or /etc/sensors3.conf, depending on your flavor of Linux.
In order for the check_mk plugin templates to recognize the sensor type, the sensor labels need to contain some indication of that type. For instance, label the temperature sensors with ‘temp’ or ‘temperature’. The default names like ‘in0’ will also work, but something like ‘Pizza sensor’ obviously won’t.
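
As an illustration of this kind of name matching, here is a small sketch. The patterns below are my own guesses for demonstration purposes and are not copied from the PNP4Nagios templates in the package, so treat the function guess_sensor_type and its rules as hypothetical.

```python
import re

# Hypothetical patterns; the real templates may match different names.
_TYPE_PATTERNS = [
    ("temp", re.compile(r"temp", re.IGNORECASE)),
    ("fan", re.compile(r"fan", re.IGNORECASE)),
    ("volt", re.compile(r"^in\d+$|volt|vcore|vbat|\d+v", re.IGNORECASE)),
]


def guess_sensor_type(label):
    """Return 'temp', 'fan' or 'volt' for a sensor label, or None if unknown."""
    for sensor_type, pattern in _TYPE_PATTERNS:
        if pattern.search(label):
            return sensor_type
    return None
```

Here ‘in0’ and ‘CPU Voltage’ both end up as voltage sensors, ‘temp1’ as a temperature sensor, and ‘Pizza sensor’ matches nothing at all.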

To change labels or ignore certain sensors because they give bogus data (not connected etc), first find your adapter type.
In the example above this is it8720-isa-0228. Now edit the sensors.conf file and add a section for this adapter if it isn’t already there.

Here’s an example for renaming in0 to “CPU Voltage” and turning off the second fan, since it’s not connected.
We’ll also change the minimum and maximum voltage for the CPU Voltage, since these limits determine when Nagios will send out an alarm:

chip "it8720-isa-0228"
set in0_min 1.0
set in0_max 2.0
label in0 "CPU Voltage"
ignore fan2

After changing the sensors file you’ll need to make lmsensors aware of the configuration change by running ‘sensors -s’ (this might need root).

benv@localhost:~$ sensors -s
benv@localhost:~$ sensors
it8720-isa-0228
Adapter: ISA adapter
CPU Voltage: +1.31 V (min = +1.00 V, max = +2.00 V)
fan1: 2235 RPM (min = 10 RPM)
fan3: 2500 RPM (min = 0 RPM)
# some stuff deleted to save space :)

Tada. Now repeat this process for all sensors :)
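
To show why these limits matter, here is a simplified sketch of how a check could turn a value and its limits into a Nagios state. The function check_sensor is hypothetical and the OK message is made up; only the two CRITICAL message formats follow what the check is reported to print. It is not the actual check code.

```python
def check_sensor(value, unit, low=None, high=None):
    """Return (state, message); state 0 is OK and 2 is CRITICAL in Nagios."""
    if high is not None and value > high:
        return (2, "CRITICAL - Sensor value %s %s, which is bigger than %s"
                % (value, unit, high))
    if low is not None and value < low:
        return (2, "CRITICAL - Sensor value %s %s, which is smaller than %s"
                % (value, unit, low))
    return (0, "OK - Sensor value %s %s" % (value, unit))
```

With the limits from the example, a CPU voltage of 1.31 V is fine, while a bogus default such as low = +127.0 on a temperature sensor would make every sane reading critical. That is a good reason to set sensible limits in sensors.conf.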

Installation:
There are two parts to installing a Check_MK plugin. First we need to install the package on the host that actually runs check_mk. This is quickly done:

root@checkmk# wget http://notes.benv.junerules.com/wp-content/uploads/2011/12/lmsensors-1.4.mkp
root@checkmk# md5sum lmsensors-1.4.mkp
115bd50557d1db7e934baa64e172e506 lmsensors-1.4.mkp
root@checkmk# check_mk -vP install lmsensors-1.4.mkp
Installing lmsensors version 1.4.
Checks:
lmsensors
Checks man pages:
lmsensors
Agents:
lmsensors
Multisite extensions:
plugins/perfometer/lmsensors.py
PNP4Nagios templates:
check_mk-lmsensors.php
check_mk-lmsensors.fan.php
check_mk-lmsensors.temp.php
check_mk-lmsensors.volt.php
root@checkmk# check_mk -II
lmsensors.fan 2 new checks
lmsensors.volt 4 new checks
root@checkmk# check_mk -O

Done. Soon there will be pretty graphs for this machine :)
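
If you would rather script the checksum step from the transcript above than eyeball the md5sum output, something along these lines should do. It is a minimal sketch using only Python’s standard hashlib; the function name file_md5 is my own.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the result of file_md5("lmsensors-1.4.mkp") with the checksum published for the package you downloaded, and only run the install when they match.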

Now for a remote machine you will need to put the agent in place. Since this is only a single file it’s trivial to do:

benv@checkmk$ scp /usr/share/check_mk/agents/lmsensors root@othermachine:/usr/share/check_mk/agents

Note that the agent plugin has to end up in the $MK_LIBDIR/plugins directory on the monitored machine, so adjust the scp destination accordingly. In my case this was /usr/lib/check_mk_agent/plugins, but it could very well be somewhere else on your system. You can find it in the check_mk_agent script if you don’t know:

benv@somemachine$ grep MK_LIBDIR `which check_mk_agent`
export MK_LIBDIR="/usr/lib/check_mk_agent"
PLUGINSDIR=$MK_LIBDIR/plugins
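
The same lookup can be scripted. The sketch below takes the text of the check_mk_agent script and derives the plugins directory from the MK_LIBDIR assignment. It assumes the assignment looks like the line shown above and that the plugins live in $MK_LIBDIR/plugins; the function name plugins_dir_from_agent is hypothetical.

```python
import re

# Matches lines such as: export MK_LIBDIR="/usr/lib/check_mk_agent"
_ASSIGN = re.compile(
    r'^\s*(?:export\s+)?MK_LIBDIR=["\']?([^"\'\s]+)["\']?', re.MULTILINE
)


def plugins_dir_from_agent(script_text):
    """Return the plugins directory derived from MK_LIBDIR, or None."""
    match = _ASSIGN.search(script_text)
    if match is None:
        return None
    return match.group(1).rstrip("/") + "/plugins"
```

Feed it the contents of the agent script (for example read with open()) and copy the lmsensors agent plugin to the directory it returns.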

Let Check_MK do an inventory on your remote machine [check_mk -II $machine] and the rest goes automagically! :)

And now we have pretty graphs for my sensors.
Comments and/or suggestions are welcome.

LMSensors in Check_MK


Updates:
Version 1.1: now has pnp templates to put graphs of the same type together.
Here’s an example:


Now we have combined graphs (v1.1)

Version 1.2: changed sed to perl in the agent plugin, since sensor names with more than one space (among other things) were giving issues. Thanks to Cyril Pawelko for finding the issue and helping with testing!

Version 1.3: minor change to PNP templates. Nico Weinreich informed me that his fan templates weren’t working correctly, so I updated the regular expression used to match corresponding sensor types. If you didn’t have this issue this update won’t do anything useful for you :)

Version 1.4: Seems like I was a dumbass and didn’t check the 1.2 package properly. This package really makes it work with perl instead of sed.
Also updated the voltage pnp template to hopefully match more voltage sensors.
Note: if the pnp4nagios template doesn’t work for you, check your pnp4nagios perfdata dir, for example /var/lib/pnp4nagios/perfdata/, and see what .rrd files exist for your host. They are based on the sensor name, so if your sensor name is “St John”, it will not match the voltage template. These names come directly from your sensors.conf (or, if you don’t have one, from the default names of the sensors).
See above on how to rename your sensors.
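
To get a quick overview of the sensor files for a host, a few lines of Python will do. This sketch assumes the usual layout of one directory per host below the perfdata dir and the ‘Sensor_’ prefix mentioned above; the function name sensor_rrd_names is my own.

```python
from pathlib import Path


def sensor_rrd_names(perfdata_dir, host):
    """Return the sorted names (without .rrd) of the sensor RRD files of a host."""
    host_dir = Path(perfdata_dir) / host
    return sorted(path.stem for path in host_dir.glob("Sensor_*.rrd"))
```

For example sensor_rrd_names("/var/lib/pnp4nagios/perfdata", "myhost") lists names such as Sensor_fan1, which you can then compare with what the templates expect.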

Version 1.5: Another bug spotted by Cyril! The pnp4nagios temperature template had a botched variable name.

Downloads:
lmsensors-1.6.mkp (5296 downloads)      SHA1: 42f7f7eebf803fb3ed755107521761bfb4fbb6bc  MD5: 199266bfd9d750243b9bc49773b6b4d6





33 Comments for this entry

  • danielbair

    The bar graphs on the right side are not working for me. I just get “invalid data: global name ‘savefloat’ is not defined”

    I am running OMD on Ubuntu 10.04 LTS with check_mk version 1.1.9i8

    -Daniel

    • BenV

      danielbair: I guess your version of check_mk doesn’t include the definition of saveint/savefloat yet. If upgrading to a newer version isn’t an option you could try to remove the savefloat function in perfometer/lmsensors.py (lines 9 and 10), it should still work if the check reports good values.

  • loreto

    Installation on the check_mk Linux server was successful (following your instructions), but
    “check_mk -II localhost” doesn’t report any lmsensors data.

  • BenV

    loreto: can you post/mail me the output of ‘sensors’ and ‘check_mk --debug --checks=lmsensors.temp -II localhost’ (if you think the temperature sensors should report something, otherwise ‘lmsensors.fan’ or ‘lmsensors.volt’)?

  • loreto

    Ciao BenV, I didn’t find you email so I’m posting the output.
    ————————————————
    – host588 – Check_mk server – Virtual Machine
    ————————————————
    [root@host588]# check_mk -vP install /root/Check_MK/Plugins/lmsensors-1.0.mkp
    Installing lmsensors version 1.0.
    Checks:
    lmsensors
    Agents:
    lmsensors
    Multisite extensions:
    plugins/perfometer/lmsensors.py

    [root@host588]# check_mk -II localhost
    cpu.loads 1 new checks
    cpu.threads 1 new checks
    df 8 new checks
    diskstat 1 new checks
    kernel 3 new checks
    kernel.util 1 new checks
    lnx_if 1 new checks
    local 1 new checks
    mem.used 1 new checks
    ntp.time 1 new checks
    postfix_mailq 1 new checks
    tcp_conn_stats 1 new checks
    uptime 1 new checks

    *****
    ** lmsensors is not listed (may be it’s a VM)
    *****

    [root@host588]# check_mk -debug --checks=lmsensors.temp -II localhost
    Network error: [Errno -2] Name or service not known

    *****
    ** Now copying lmsensors to PM
    *****
    ————————————-
    – host755 – Physical machine
    ————————————-

    root@host755# grep MK_LIBDIR $(which check_mk_agent)
    export MK_LIBDIR="/usr/lib/check_mk_agent"
    PLUGINSDIR=$MK_LIBDIR/plugins
    LOCALDIR=$MK_LIBDIR/local

    *****
    ** Because of it didn’t work, I copied lmsensors on all check_mk directories
    *****
    scp -p /usr/share/check_mk/agents/lmsensors root@host755:/usr/lib/check_mk_agent/
    scp -p /usr/share/check_mk/agents/lmsensors root@host755:/usr/lib/check_mk_agent/plugins/
    scp -p /usr/share/check_mk/agents/lmsensors root@host755:/usr/lib/check_mk_agent/local/
    scp -p /usr/share/check_mk/agents/lmsensors root@host755:/usr/share/check_mk/agents/

    chmod 755 lmsensors

    [root@host588]# check_mk -II host755
    cpu.loads 1 new checks
    cpu.threads 1 new checks
    df 8 new checks
    diskstat 1 new checks
    kernel 3 new checks
    kernel.util 1 new checks
    mem.used 1 new checks
    ntp.time 1 new checks
    tcp_conn_stats 1 new checks
    uptime 1 new checks
    *****
    ** lmsensors is not listed
    *****

    [root@host588]# check_mk -debug --checks=lmsensors.temp -II host755
    Network error: [Errno -2] Name or service not known

    Ciao

  • BenV

    My email is benv (at) junerules dot com :)

    And you’re right: lmsensors doesn’t work on virtual machines, or at least, that’s my experience so far.
    On the physical machine it should work however, depending on the sensors in your machine and whether or not they are supported by lmsensors (and in your kernel).

    The check_mk lmsensors plugin expects ‘/usr/bin/sensors’ to exist and to provide output. For example, my server here at home outputs this:
    benv@server:~$ sensors
    lm85-i2c-0-2e
    Adapter: SMBus I801 adapter at 2000
    in0: +1.56 V (min = +0.00 V, max = +3.32 V)
    Vcore: +1.34 V (min = +0.00 V, max = +2.99 V)
    +3.3V: +3.33 V (min = +2.97 V, max = +3.63 V)
    +5V: +5.10 V (min = +4.50 V, max = +5.50 V)
    +12V: +12.00 V (min = +0.00 V, max = +15.94 V)
    fan1: 1361 RPM (min = 0 RPM)
    fan3: 952 RPM (min = 0 RPM)
    fan4: 1251 RPM (min = 0 RPM)
    temp1: +63.0 C (low = +0.0 C, high = +75.0 C)
    M/B Temp: +49.0 C (low = +0.0 C, high = +65.0 C)
    temp3: +39.0 C (low = +0.0 C, high = +55.0 C)
    cpu0_vid: +1.088 V

    Can you check if you have the tool and post the output here, and/or email it to me? :)

  • BenV

    Note that I just updated the package, this version also includes a PNP4Nagios template and a little man page for the check.

  • loreto

    Ciao BenV,
    Sorry for the delay of my answer, but since the command “sensor” did not include any output on my machine, I tried to install the drivers of the manufacturer. Unfortunately, the “sensor” command still does not give any result. We are investigating.
    Ciao
    Loreto

  • stefan927

    hey ;)
    i have installed the plugin on my check_mk instance. the problem i have is that check_mk does not recognise the plugin when i re-inventory check_mk on the local machine. on a remote server this works fine. any ideas? ;)

  • BenV

    Stefan927: I’m guessing it’s either the output of sensors that’s weird or your check_mk can’t locate the agent plugin locally.
    You can try running: check_mk_agent | grep PluginsDirectory
    and check the dir it returns, there should be a file called ‘lmsensors’ there.
    Next try running that file and see if it gives proper output (a section <<<lmsensors>>> followed by sensordata).
    Also you can see if this command gives any weird errors:
    check_mk --debug --checks lmsensors.temp localhost

    Good luck hunting for the problem :)

  • stefan927

    thanks for your quick reply ;)
    exactly that was the problem.
    i have installed the latest package (lmsensors-1.5.mkp) and the lmsensors file was extracted to the wrong directory “/usr/share/check_mk/agents/” but the default path when you install check_mk is “/usr/lib/check_mk_agent/plugins”. Is this a package error? ;)

  • BenV

    Stefan927: as far as I know that’s not a package error. That is, I only tell check_mk to make a package, and check_mk (and I hope therefore the package) ‘knows’ what parts are agent plugins, libraries etc. Then again, I don’t know what the official word on this is from Mr Check_MK himself; the documentation is not very detailed on this :)

  • sippe

    Hi there and nice agent you have build. Unfortunately I’m a bit stupid or I have missed some part because I have this kind of situation:

    root@blah:/opt# telnet foo 6556
    … …
    <<<lmsensors>>>
    coretemp-isa-0000
    Core_0 +48.0 C high +73.0 C crit +85.0 C
    Core_1 +41.0 C high +73.0 C crit +85.0 C
    Core_2 +45.0 C high +73.0 C crit +85.0 C
    Core_3 +43.0 C high +73.0 C crit +85.0 C

    coretemp-isa-0001
    Core_0 +48.0 C high +73.0 C crit +85.0 C
    Core_1 +39.0 C high +73.0 C crit +85.0 C
    Core_2 +40.0 C high +73.0 C crit +85.0 C
    Core_3 +40.0 C high +73.0 C crit +85.0 C

    As you can see I have a two-CPU system with 8 cores. check_mk only sees the last CPU, I think because the core names are the same on both. Is there some easy way to fix this? I suppose CPUx_Core_y could solve this problem, or is there some clever solution? :)

  • BenV

    Hello sippe,

    That’s a bug you got there, or at least, I never ran into this situation myself and wasn’t smart enough to realize beforehand that this might occur :)

    I’ll have to look into fixing this for you, but you can try to work around the problem by renaming the sensors in /etc/sensors.conf like this:
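    chip "coretemp-isa-0000"
    label in0 "CPU1_Core0"
    label in1 "CPU1_Core1"

    chip "coretemp-isa-0001"
    label in0 "CPU2_Core0"
    label in1 "CPU2_Core1"

    (in0/in1 are just placeholders here: use the feature names your chip really reports, which for temperature sensors usually look like tempN; ‘sensors -u’ shows them. Don’t forget to run sensors -s afterwards to apply these changes.)

    I’ll try to fix it in the sensor agent soon, but I’m a bit short on time at the moment so give me a few days :)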

  • sippe

    Hi Ben!

    Many thanks to you. I really like your tool and I suppose it’s getting even better all the time. :)

    Nice work! :)

  • stefan927

    Hi Ben,
    another “bug” in the pnp4nagios template spotted.
    when i go to the pnp4nagios directly i have the following error:

    Template /var/www/virtual/xxx/monitoring/htdocs/pnp4nagios//templates/check_mk-lmsensors.volt.php does not provide array $def[].

    any ideas? ;)

  • BenV

    Stefan927: Sounds like the template isn’t able to find your sensordata .rrd files. It expects something like ‘Sensor_core0.rrd’, can you provide the name of your .rrd file?

  • stefan927

    these sensor_*.rrd’s i have.

    Sensor_12.0V.rrd
    Sensor_3.3V.rrd
    Sensor_3.3V_Standby.rrd
    Sensor_5.0V.rrd
    Sensor_5.0V_Standby.rrd
    Sensor_Core_0.rrd
    Sensor_Core_0_Temperature.rrd
    Sensor_Core_1.rrd
    Sensor_Core_1_Temperature.rrd
    Sensor_Memory_Temperature.rrd
    Sensor_Memory_Vcc.rrd
    Sensor_Processor_Vcc.rrd
    Sensor_System_Fan.rrd
    Sensor_Vbat.rrd
    Sensor_VR_Temperature.rrd

  • BenV

    Stefan927: sorry for the slow reply, busy week.
    The rrd files look fine as far as their names are concerned, so that’s not the issue.
    Maybe you can email me the details of your setup (check_mk / pnp4nagios versions etc) so we can try to figure out the problem that way, might be easier :)
    My email is benv (at) junerules dot com.

  • Janko

    Hi Ben,
    i have a Problem… :)

    ————————–
    root@noc:/opt/omd/sites/monitoring# check_mk -vP install lmsensors-1.5.mkp
    -bash: check_mk: command not found
    root@noc:/opt/omd/sites/monitoring# su monitoring
    OMD[monitoring]:~$ check_mk -vP install lmsensors-1.5.mkp
    Traceback (most recent call last):
      File "/omd/sites/monitoring/share/check_mk/modules/check_mk.py", line 4845, in <module>
        do_packaging(args)
      File "/omd/sites/monitoring/share/check_mk/modules/packaging.py", line 98, in do_packaging
        f(args)
      File "/omd/sites/monitoring/share/check_mk/modules/packaging.py", line 334, in package_install
        tar = tarfile.open(path, "r:gz")
      File "/usr/lib/python2.7/tarfile.py", line 1678, in open
        return func(name, filemode, fileobj, **kwargs)
      File "/usr/lib/python2.7/tarfile.py", line 1729, in gzopen
        raise ReadError("not a gzip file")
    tarfile.ReadError: not a gzip file
    OMD[monitoring]:~$
    —————————–

    • Janko

      Hi Ben,

      I solved the problem by downloading the file again, but now I have a new problem. I use OMD with Check_MK and I can’t find the agent file to copy to the remote server…

      • BenV

        Is it not in /usr/share/check_mk/agents/lmsensors? If you have the locate command available you can try running ‘locate lmsensors’ and see what that returns, possibly after running updatedb :)

  • spyder

    Ben,

    you did some amazing job here.
    I’ve got it working, but I have one aesthetic issue: rrd filenames are saved with the sensor name duplicated, e.g. Sensor_in3_in3.rrd or Sensor_fan1_fan1.rrd. As you can imagine this gets ugly on pnp graphs, as names are taken from rrd filenames. Here is an example
    http://fonoc.net/snap/Check_MK_gv_18916224.png

    Do you have an idea where can I change that behavior or why it happens that way?

    Thanks and keep on great work.

    • BenV

      Hm, good question. Over here I don’t see that behaviour, but most of the names come from the output of lmsensors.
      See what ‘sensors’ says, or check the agent plugin out. Are the names double there as well?
      On my machines I simply get rrd files like ‘Sensor_fan1.rrd’.

      • spyder

        sensors output, both from sensors and from check_mk_agent through your plugin, look pretty normal:

        <<<lmsensors>>>
        coretemp-isa-0000
        CPU_Physical_Tempetature +35.0 C high +80.0 C crit +100.0 C
        Core_0_Temperature +35.0 C high +80.0 C crit +100.0 C
        Core_1_Temperature +31.0 C high +80.0 C crit +100.0 C
        Core_2_Temperature +32.0 C high +80.0 C crit +100.0 C
        Core_3_Temperature +32.0 C high +80.0 C crit +100.0 C

        it8728-isa-0a30
        in0 +0.02 V min +0.00 V max +3.06 V
        in1 +2.09 V min +0.00 V max +3.06 V
        in2 +2.03 V min +0.00 V max +3.06 V
        in3 +2.05 V min +0.00 V max +3.06 V
        in4 +0.01 V min +0.00 V max +3.06 V
        in5 +1.76 V min +0.00 V max +3.06 V
        in6 +1.54 V min +0.00 V max +3.06 V
        3VSB +3.38 V min +0.00 V max +6.12 V

        All other rrd filenames (those that are not ‘Sensor..’) are normal.
        What is creating those files, check_mk or pnp4nagios?
        I was trying to figure out in code where they come from, but I couldn’t figure it out.
        My best bet is that they are generated from ‘Service Description’ + ‘Item’

        or maybe check_mk version I’m running is causing that behavior (1.2.2p3 (omd 1.10 package))?

        • BenV

          The rrd files are created by pnp4nagios (which in turn gets called by nagios through process_perfdata.pl), maybe it’s a configuration/version thing there. I’m running check_mk version 1.2.2p1 here without problems, pnp4nagios version 0.6.21.
          From process_perfdata.pl it seems it uses the Service Description as filename if the rrd storage type is set to ‘SINGLE’, but when it’s set to ‘MULTIPLE’ it will use the Service Description AND the ‘name’ of the item. Maybe that’s where it goes wrong. I have the storage type set to SINGLE here, so maybe this is the problem.

          The graph template then takes the name of the rrd file as title for the legend as you guessed.

          Seems like I’ll have to test with a pnp4nagios setup with the storage type set to MULTIPLE, but I won’t have time to test this today. I’ll get back to you on this :)

  • spyder

    Correct.
    OMD comes with pnp4nagios configured with this:
    RRD_STORAGE_TYPE = MULTIPLE

  • adrian

    Hello,

    First thank you for the script, but i have a small issue, I’m getting the following output in check_mk:

    CRIT - CRITICAL - Sensor value 35.0 C, which is smaller than +127.0

    and it’s not right, the output in sensors is:

    temp1: +35.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor

    Any idea how could i fix it ?

    • BenV

      @adrian: in order to set the low and high alarms you need to create/edit your /etc/sensors.conf (or whatever file your distro uses, it can also be sensors3.conf) and then run sensors -s.
      For example for my it8728-isa-0a30 adapter I can adjust the temperature by adding a section like this:
      chip "it8728-*"
      set temp1_min -20
      set temp1_max 127

      After editing/adjusting the file run ‘sensors -s’ and see if it’s any better :)

      Wouter.

  • brightdroid

    You can fix the issue in “check_lmsensors” using this patch:

    --- lmsensors.org 2011-06-07 16:01:01.000000000 +0200
    +++ lmsensors 2017-03-29 09:10:26.184876477 +0200
    @@ -59,8 +59,8 @@
                 break
         if values == None:
             return (3, "UNKNOWN - sensor status not found in agent output")
    -    value = float(line[1])
    -    stype = line[2]
    +    value = float(values[1])
    +    stype = values[2]
         perfdata = [ ( item, value, "", savefloat(smax) ) ]
         if smax != None and value > float(smax):
             return (2, "CRITICAL - Sensor value %s %s, which is bigger than %s" % (value, stype, smax), perfdata)

  • BenV

    @brightdroid: thanks for the patch, updated the package as v1.6! :)

  • knspradeep

    Is there any way to monitor the CPU frequency and set alert thresholds for it? Suppose my CPU normally operates at 4.2 GHz; if it drops to 3.6 GHz I need to get an alert.

    • BenV

      This sounds more like an lmsensors question than something related to the check_mk plugin, if you ask me.
      There are several tools available to get the CPU speeds, and your kernel will usually have a governor set up to handle these speeds. These governors are interchangeable; you could for example force it to always run at 4.6GHz.
      I don’t really understand your question I’m afraid :)
