BenV's notes

Software

Check_MK plugin: lmsensors

by on Jun.07, 2011, under Software

On the subject of Pretty Graphs (see my earlier post), I decided to write a plugin (a.k.a. ‘Package’) for Check_MK to monitor the sensor output of lmsensors (and make pretty graphs of it!).
Most machines support this out of the box these days, and it’s always interesting to see the conditions of your machine. In case you don’t know, it gives the temperature and voltage of your CPU and mainboard. Thus it’s a good source for making pretty graphs 🙂

Since these plugins are very easy to install (once you’ve got check_mk up and running that is) and still nobody had written one for lmsensors, I decided to do it myself. Writing python isn’t my strongest point (yet), but these are good opportunities to learn.

One of the issues I ran into while writing the plugin was that PNP4Nagios fails on service names that have a plus character in them. For instance, I had Sensor +12V. This created the files Sensor_+12V.rrd and corresponding xml, but when one would go to the PNP4Nagios graph of that sensor it would request a file called Sensor__12V.rrd, which obviously failed.
Therefore I molested the names a bit, so your sensor might now be simply called MB12V instead of M/B+12V.
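To give an idea of the kind of name cleanup involved, here is a minimal Python sketch. It is not the code from the package (the agent plugin does its renaming in the shell/perl side); the whitelist of allowed characters and the function name are my own assumptions for illustration.

```python
import re

# Hypothetical whitelist: letters, digits, space, underscore, dot and hyphen.
# Everything else (for example '+' and '/') is dropped from the name.
_UNSAFE = re.compile(r"[^A-Za-z0-9 _.-]")


def sanitize_sensor_name(name):
    """Return the sensor name without characters that may break graph lookups."""
    return _UNSAFE.sub("", name).strip()


print(sanitize_sensor_name("M/B+12V"))  # prints MB12V
```

Dropping the characters instead of replacing them with an underscore keeps the names short, at the cost of possible collisions between sensors whose names differ only in punctuation.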

Configuration of LM-Sensors
For my plugin to work you need to make sure that you have the ‘sensors’ tool. This normally comes in a package called “lmsensors” or “lm-sensors”. Note that you obviously need some kind of hardware sensor on your machine that’s supported by lm-sensors for it to work, including the required kernel module. Fortunately this will often work out of the box.
Then make sure that running ‘sensors’ gives output like this:

benv@localhost:~$ sensors
it8720-isa-0228
Adapter: ISA adapter
in0: +1.04 V (min = +0.00 V, max = +4.08 V)
in1: +1.66 V (min = +0.00 V, max = +4.08 V)
in2: +3.39 V (min = +0.00 V, max = +4.08 V)
+5V: +3.04 V (min = +0.00 V, max = +4.08 V)
in4: +3.10 V (min = +0.00 V, max = +4.08 V)
in5: +1.90 V (min = +0.00 V, max = +4.08 V)
in6: +4.08 V (min = +0.00 V, max = +4.08 V)
5VSB: +3.04 V (min = +0.00 V, max = +4.08 V)
Vbat: +3.30 V
fan1: 2235 RPM (min = 10 RPM)
fan2: 0 RPM (min = 0 RPM)
fan3: 2500 RPM (min = 0 RPM)
fan5: 0 RPM (min = 0 RPM)
temp1: +41.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
temp2: +36.0°C (low = +127.0°C, high = +90.0°C) sensor = thermal diode
temp3: +38.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
cpu0_vid: +1.250 V

k10temp-pci-00c3
Adapter: PCI adapter
temp1: +32.0°C (high = +70.0°C)

You might see things like ‘in0’ instead of ‘CPU Voltage’. Don’t ask me what voltage corresponds with what sensor on your mainboard, but you can rename the sensor output by editing /etc/sensors.conf or /etc/sensors3.conf depending on your flavor of linux.
In order for the check_mk plugin templates to recognize the type you’ll need to make sure they have some kind of indication of the type of sensor. For instance label the temperature sensors with ‘temp’ or ‘temperature’. The default names like ‘in0’ will also work, but something like ‘Pizza sensor’ obviously won’t.
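Roughly, the matching on names works like the sketch below. The patterns here are hypothetical and only meant to show the idea; the regular expressions in the actual PNP templates are different and may accept more or fewer names.

```python
import re

# Hypothetical patterns, checked in order. The real templates use their own
# regular expressions; these only illustrate matching a label to a type.
SENSOR_PATTERNS = [
    ("temp", re.compile(r"temp", re.IGNORECASE)),
    ("fan", re.compile(r"fan", re.IGNORECASE)),
    ("volt", re.compile(r"^in\d+$|volt|vbat|vid|\d+V", re.IGNORECASE)),
]


def classify_sensor(label):
    """Return 'temp', 'fan' or 'volt' for a sensor label, or None if unknown."""
    for kind, pattern in SENSOR_PATTERNS:
        if pattern.search(label):
            return kind
    return None


for label in ("temp1", "fan1", "in0", "CPU Voltage", "Pizza sensor"):
    print(label, "->", classify_sensor(label))
```

With these assumed patterns a label like ‘Pizza sensor’ ends up as None, which is the situation where no graph template would match.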

To change labels or ignore certain sensors because they give bogus data (not connected etc), first find your adapter type.
In the example above this is it8720-isa-0228. Now edit the sensors.conf file and add a section
for this adapter if it isn’t already there.

Here’s an example for renaming in0 to “CPU Voltage” and turning off the second fan since it’s not connected.
Also we’ll change the minimum and maximum voltage for the CPU Voltage — this determines when nagios will send out an alarm or not:

chip "it8720-isa-0228"
set in0_min 1.0
set in0_max 2.0
label in0 "CPU Voltage"
ignore fan2

After changing the sensors file you’ll need to make lmsensors aware of the configuration change by running ‘sensors -s’. (might need root).

benv@localhost:~$ sensors -s
benv@localhost:~$ sensors
it8720-isa-0228
Adapter: ISA adapter
CPU Voltage: +1.31 V (min = +1.00 V, max = +2.00 V)
fan1: 2235 RPM (min = 10 RPM)
fan3: 2500 RPM (min = 0 RPM)
# some stuff deleted to save space :)

Tada. Now repeat this process for all sensors 🙂
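To illustrate how the min and max values end up driving an alarm, here is a small Python sketch that parses one line of ‘sensors’ output and compares the reading against its limits. This is an assumption about the general idea, not the check from the package: the function names are made up, only min/max are looked at, and anything out of range is simply treated as critical.

```python
import re

# "name: value unit ..." as printed by 'sensors', e.g. "fan1: 2235 RPM (min = 10 RPM)"
_LINE = re.compile(r"^(?P<name>[^:]+):\s+(?P<value>[+-]?\d+(?:\.\d+)?)\s*(?P<unit>\S+)")
# "(min = +1.00 V, max = +2.00 V)" style limits
_LIMIT = re.compile(r"(min|max)\s*=\s*([+-]?\d+(?:\.\d+)?)")


def parse_sensor_line(line):
    """Parse one reading; returns None for chip/adapter lines and blanks."""
    match = _LINE.match(line)
    if match is None:
        return None
    limits = {key: float(val) for key, val in _LIMIT.findall(line)}
    return {
        "name": match.group("name").strip(),
        "value": float(match.group("value")),
        "unit": match.group("unit"),
        "limits": limits,
    }


def check_state(reading):
    """Return 0 (OK) or 2 (CRITICAL), the usual Nagios state codes."""
    low = reading["limits"].get("min")
    high = reading["limits"].get("max")
    value = reading["value"]
    if (low is not None and value < low) or (high is not None and value > high):
        return 2
    return 0


reading = parse_sensor_line("CPU Voltage: +1.31 V (min = +1.00 V, max = +2.00 V)")
print(reading["name"], reading["value"], check_state(reading))
```

A reading that sits exactly on a limit counts as OK here; whether the real check is inclusive or not is something I did not verify for this sketch.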

Installation:
There are two parts to installing a Check MK plugin. First on the host that actually runs check_mk we need to install the package. This is quickly done:

root@checkmk# wget http://notes.benv.junerules.com/wp-content/uploads/2011/12/lmsensors-1.4.mkp
root@checkmk# md5sum lmsensors-1.4.mkp
115bd50557d1db7e934baa64e172e506 lmsensors-1.4.mkp
root@checkmk# check_mk -vP install lmsensors-1.4.mkp
Installing lmsensors version 1.4.
Checks:
lmsensors
Checks man pages:
lmsensors
Agents:
lmsensors
Multisite extensions:
plugins/perfometer/lmsensors.py
PNP4Nagios templates:
check_mk-lmsensors.php
check_mk-lmsensors.fan.php
check_mk-lmsensors.temp.php
check_mk-lmsensors.volt.php
root@checkmk# check_mk -II
lmsensors.fan 2 new checks
lmsensors.volt 4 new checks
root@checkmk# check_mk -O

Done. Soon there will be pretty graphs for this machine 🙂
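If you'd rather script the checksum comparison from the session above than eyeball it, something like this Python sketch would do. The function names are made up for the example; the only library used is hashlib from the standard library.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's MD5 against a published checksum string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Usage would be along the lines of matches_published("lmsensors-1.4.mkp", "115bd50557d1db7e934baa64e172e506"), with the checksum copied from wherever the package was published.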

Now for a remote machine you will need to put the agent in place. Since this is only a single file it’s trivial to do:

benv@checkmk$ scp /usr/share/check_mk/agents/lmsensors root@othermachine:/usr/share/check_mk/agents

Note that the destination in the scp command above is only an example: the place you want to put that thing in is the $MK_LIBDIR/plugins directory on the remote machine. In my case, this was /usr/lib/check_mk_agent/plugins, but it could very well be somewhere else on your system. You can find it in the check_mk_agent script if you don’t know:

benv@somemachine$ grep MK_LIBDIR `which check_mk_agent`
export MK_LIBDIR="/usr/lib/check_mk_agent"
PLUGINSDIR=$MK_LIBDIR/plugins
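The same lookup can be done from Python, for instance when pushing the plugin to many machines from a script. This is only a sketch: it assumes the agent script sets MK_LIBDIR on a single line like the grep output above, and the function name is hypothetical.

```python
import re

# Matches lines such as: export MK_LIBDIR="/usr/lib/check_mk_agent"
_LIBDIR = re.compile(
    r"""^\s*(?:export\s+)?MK_LIBDIR=["']?([^"'\s]+)["']?""", re.MULTILINE
)


def find_plugins_dir(agent_script_text):
    """Return the plugins directory derived from MK_LIBDIR, or None."""
    match = _LIBDIR.search(agent_script_text)
    if match is None:
        return None
    return match.group(1).rstrip("/") + "/plugins"


sample = 'export MK_LIBDIR="/usr/lib/check_mk_agent"\nPLUGINSDIR=$MK_LIBDIR/plugins\n'
print(find_plugins_dir(sample))  # prints /usr/lib/check_mk_agent/plugins
```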

Let Check_MK do an inventory on your remote machine [check_mk -II $machine] and the rest goes automagically! 🙂

And now we have pretty graphs for my sensors.
Comments and/or suggestions are welcome.

LMSensors in Check_MK

Updates:
Version 1.1: now has pnp templates to put graphs of the same type together.
Here’s an example:

Now we have combined graphs (v1.1)

Version 1.2: changed sed to perl in agent plugin, sensor names with more than one space (among other things) were giving issues. Thanks to Cyril Pawelko for finding the issue and helping with testing!

Version 1.3: minor change to PNP templates — Nico Weinreich informed me that his fan templates weren’t working correctly so I updated the regular expression used to match corresponding sensor types. If you didn’t have this issue this update won’t do anything useful for you 🙂

Version 1.4: Seems like I was a dumbass and didn’t check the 1.2 package properly. This package really makes it work with perl instead of sed.
Also updated the voltage pnp template to hopefully match more voltage sensors.
Note: if the pnp4nagios template doesn’t work for you, check your pnp4nagios perfdata dir, for example /var/lib/pnp4nagios/perfdata/, and see what .rrd files exist for your host. They are based on the sensor name, so if your sensor name is “St John”, it will not match the voltage template. These names come directly from your sensors.conf (or, if you don’t have one, they are the default names for the sensors).
See above on how to rename your sensors.

Version 1.5: Another bug spotted by Cyril! The pnp4nagios temperature template had a botched variable name.

Downloads:
lmsensors-1.6.mkp (5511 downloads)      SHA1: 42f7f7eebf803fb3ed755107521761bfb4fbb6bc  MD5: 199266bfd9d750243b9bc49773b6b4d6


Fighting with PNP4Nagios

by on Jun.07, 2011, under Software

So today I noticed my pretty Check MK graphs were broken. Trying to view a random Check_MK service’s PNPGraphs gave this error:

Warning: preg_match() [function.preg-match]: Compilation failed: unknown option bit(s) set at offset 0 in /usr/lib/kohana/system/core/utf8.php on line 30

Fatal error: PCRE has not been compiled with UTF-8 support. See PCRE Pattern Modifiers for more information. This application cannot be run without UTF-8 support. in /usr/lib/kohana/system/core/utf8.php on line 38

Of course this is after I had upgraded some Slackware packages in the daily upgrades (I still run Slackware Current on non production machines, keeps things interesting) including PHP and Perl, so I wasn’t really surprised.
I reinstalled RRDTool since the perl bindings were gone (and of course I once again had to fight it) and decided to upgrade PNP4Nagios while I was at it.
However, no dice. Searching for the error gave nothing, hence this post.

To be sure it wasn’t what they (google) claimed it would be, I checked for UTF-8 support in my libpcre:


benv@graphs$ pcretest -C
PCRE version 8.12 2011-01-15
Compiled with
UTF-8 support
Unicode properties support
Newline sequence is LF
\R matches all Unicode newlines
Internal link size = 2
POSIX malloc threshold = 10
Default match limit = 10000000
Default recursion depth limit = 10000000
Match recursion uses stack

Jup, UTF-8 support is there, along with Unicode stuff. Yay.

Next thing I noticed was that the new PHP version was complaining about extensions that wouldn’t load.
For instance dbase.so didn’t exist anymore, and some other junk failed as well. It’s possible that this error has been around for a while on that machine, but it was time to fix it!
Since my php.ini was from 2008 I decided to simply take the stock /etc/httpd/php.ini-production for now.
After that change php was again bitching about not being able to load extensions, but this time they were different ones.
Apparently they now have ‘libenchant‘ for spell checking, so that was one of the other failures.
To fix that problem use ‘slackpkg install enchant‘. (that’ll teach me to run the slackpkg install-new every once in a while :p)

Restarting Apache the hard way (not “apachectl restart” but /etc/rc.d/rc.httpd restart) helped for the PNP4Nagios error.
However, upgrading it introduced a new gimmick called:

Please check the documentation for information about the following error.

Undefined index: auth_enabled
file [line]:

application/models/auth.php [22]:

back

Right. Upgrading my Check_MK to 1.1.10p3 didn’t help.
However, checking the pnp4nagios config.php file made me aware that they added some options.
After merging in the new config.php options from their sample dir it finally worked again.
The new options I had to add when going from pnp4nagios version 0.6.7 to 0.6.13 were:

$conf['zgraph_width'] = "750";
$conf['zgraph_height'] = "450";
$conf['auth_enabled'] = FALSE;
# Adjust the next one to your configuration, it's probably different :)
$conf['livestatus_socket'] = "unix:/var/lib/nagios/rw/live";
$conf['allowed_for_all_services'] = "";
$conf['allowed_for_all_hosts'] = "";
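Finding out which options are missing after an upgrade doesn't have to be done by eye. Below is a rough Python sketch that compares the `$conf[...]` keys in the sample config with the ones in your own config.php. It is a plain text comparison under the assumption that every option is set on its own line; it does not parse PHP, and the function names are made up.

```python
import re

# Matches assignments such as: $conf['auth_enabled'] = FALSE;
# Lines that start with a comment character are not matched.
_KEY = re.compile(r"""^\s*\$conf\[\s*['"]([^'"]+)['"]\s*\]""", re.MULTILINE)


def conf_keys(php_text):
    """Return the set of $conf keys assigned in a config.php text."""
    return set(_KEY.findall(php_text))


def missing_keys(sample_text, own_text):
    """Return the keys present in the sample config but absent from your own."""
    return sorted(conf_keys(sample_text) - conf_keys(own_text))


sample = "$conf['zgraph_width'] = \"750\";\n$conf['auth_enabled'] = FALSE;\n"
own = "$conf['zgraph_width'] = \"500\";\n"
print(missing_keys(sample, own))  # prints ['auth_enabled']
```

Reading both files with open(...).read() and feeding them to missing_keys() gives the list of options to merge in by hand.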

Hooray, pretty graphs are back 🙂

LMSensor Fan1 speed graph

Pretty graphs!


Review: Fable 3

by on Jun.05, 2011, under Software

A short summary of my Fable 3 experience so far.

Fable III

Fable 3


First we install it, and immediately we notice the Windows Game Live Garbage Cancer. So far so bad.
However, apart from the login bugging me every time I start the game it works fine.

The game itself looks fine, better than the previous Fable I played at least. The story is pretty standard, you’re a prince and your evil brother – the king – is ruining everything.
You’re the hero who has to murder him. Or something.
So the first thing you do is shake everyone’s hand for 5 minutes, apparently people love you when you shake their hands long enough.
Then you go through the usual “Get kicked out of the castle and start your quest” ritual. Speaking of rituals, did I mention I hate unskippable cutscenes?
Especially when starting the game, who the fuck cares that Microsoft made the game…
Anyhow, after running around a bit and getting your first skills you find yourself in the familiar guild room, a safe place to save your game and switch clothes etc.
Reminded me of Baldur’s Gate’s pocket plane.

Next you go to villages to gather allies. Which means helping them out first. Which means shaking hands, and smashing monsters.
Oh yeah, “prove you’re a hero” so they can teach you how to use your sword and gun in addition to the magic you started off with.
Great great. In the second village you run into you can do minigames like Guitar Hero, only more retarded because there are only two buttons involved.
If that wasn’t easy enough already, they also have Pie Making Hero. Same game with only 3 notes per song. So before you know it you have enough money
to buy several houses. So much for that.

Other villagers give you quests after whistling to them for an hour (did I mention it gets really old, really fast?), and strangely they all need you to get the same hidden
package in the other village, or talk to the same retard in that other village to pass a message.
I’d like to get a list of my active quests, but that isn’t unlocked at this part of the game yet.
So I decided to go on with the story to unlock that part, but just after traveling back to the village my windows decided to BSOD. Probably related to that new NVIDIA driver update from yesterday.
Anyhow, I reboot windows, restart Fable 3, get through the sickening unskippable intro garbage, log into LIVE.
[Windows LIVE] Saaay, your savegame …. isn’t.
[BenV] What?
[LIVE] Your save is corrupt, so we’ll just start the game from scratch again. Would you like a Prince or a Princess?
[BenV] !(@3(@^$*&%^*(@$#(@*$%3*REBOOT*

So much for Fable 3 and windows.

UPDATE:
Just tried to get the game working in wine. With some hassle I got through the installation.
However, the game won’t start because FS3Secu.exe crashes. Garbage DRM as always.
Couldn’t find a crack that would bypass this exe though, so no luck for Wine at the moment.


Commandline XLS/DOC to PDF converting

by on Apr.04, 2011, under Software

Since this took me 5 years to find through google, this is another reference post.
Converting word and/or excel documents to PDF format should be easy.
Unfortunately it takes Libreoffice (or whatever your variant is today), but it works:

$ libreoffice -headless -convert-to pdf /tmp/bla.xls
convert /tmp/bla.xls -> /tmp/bla.pdf using calc_pdf_Export

No garbage paid for converters needed, just garbage libreoffice.

Update: since libreoffice is still the same garbage as openoffice (maybe it’ll improve some day) I ran into this error
on my production machine:

$ libreoffice -headless -convert-to pdf /tmp/bla.xls
[Java framework] Error in function createSettingsDocument (elements.cxx).
javaldx failed!

Apparently this is caused by libreoffice wanting to do some garbage in its home directory, except my headless user doesn’t have a workable home.
Oh yeah, there’s a -nofirststartwizard option but it doesn’t help. This is solved by setting HOME to /tmp for instance:

$ HOME=/tmp libreoffice -headless -convert-to pdf /tmp/bla.xls
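When calling the converter from a script, the same HOME trick can be applied through the environment of the child process. A minimal Python sketch, assuming the binary is called libreoffice and is on the PATH; the function names are hypothetical. The options are spelled as in the commands above; newer LibreOffice releases may expect the double-dash spelling (--headless --convert-to), so adjust to your version.

```python
import os
import subprocess


def build_conversion(path, home="/tmp", binary="libreoffice"):
    """Return (command, environment) for a headless PDF conversion."""
    env = dict(os.environ)
    env["HOME"] = home  # give libreoffice a writable home directory
    command = [binary, "-headless", "-convert-to", "pdf", path]
    return command, env


def convert_to_pdf(path):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    command, env = build_conversion(path)
    subprocess.run(command, env=env, check=True)
```

Copying os.environ first keeps PATH and friends intact, so only HOME differs from the environment of the calling user.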


Xen, DRBD and live migration

by on Mar.10, 2011, under Software

Once again I have some new hardware that’s been labeled “Xen Server”.
This time I want to set it up in a way that brings some redundancy so we can actually have 1 server fail and still have our hosts up and running.
(or at least back up in a few minutes instead of several hours).
To achieve this goal I will install the latest version of Xen (which seems to be 4.01) and use DRBD with LVM for storage. (continue reading…)


Another round of Adobe Cancer

by on Feb.24, 2011, under Morons, Software

Needless to say Adobe had an update today.
Since we’re running a windows 2008 server with users that don’t have administrator rights, Adobe is very annoying to start with.

<Adobe>”Hey I have an update! Oh, I can’t install it because you need administrator rights, but I’ll keep bugging you with it anyway, MWHOAHAHHAHA”

So if that isn’t enough already (and they have an update about every week … if not more often), the piece of cancer can’t even update. Even when running the updater as administrator:

Adobe Cancer Update

Mysterious fail

Of course it would be terrible to state what the problem is, so they only give you “Error: 1403” to work with.
Thanks Adobe, really useful. It’s unfortunate that some morons here insist on needing it (Because some idiot customers send us PDF files with chinese fonts and other rubbish that isn’t handled well in better pdf readers).

Feh.


Mercurial on Windows vs Linux, spot the problem

by on Feb.17, 2011, under Software

Last week I upgraded our fileserver at work from Debian Lenny to Debian Squeeze.
Obviously a ton of stuff got ‘new’ (read: less ancient) versions, including Apache.
Apart from a reboot or two for new kernels and some config fixes everything went pretty smooth.

This week lotjuh ran into the problem that she couldn’t push to the mercurial repository from windows.
Strange, because everything worked fine from linux. Tested from both the windows 2008 server we have here and another windows 7 machine at home, they both broke with the same cryptic message:

c:\tmp> hg clone --insecure https://fileserver/repository
abort: error: _ssl.c:1325: error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure

Huh. That’s weird.
Obviously google doesn’t help with this, you get some garbage results on how mercurial didn’t do jack with https certificates before version 1.7 and their struggle to implement it.

After some digging I found this in the apache logs:

[Thu Feb 17 12:10:51 2011] [error] [client 192.168.123.321] Re-negotiation request failed
[Thu Feb 17 12:10:51 2011] [error] SSL Library Error: 336068931 error:14080143:SSL routines:SSL3_ACCEPT:unsafe legacy renegotiation disabled

Feh. Somewhere old SSL libraries are being used! Windows… always the same.

Solution:
In your apache ssl configuration (mods-enabled/ssl.conf on Debian), add this:

SSLInsecureRenegotiation on

Note that this obviously isn’t a great solution, but it’s the only way to get it to work on windows at the moment.


Linux Software Raid-1 issue

by on Jan.29, 2011, under Software

It just took me about an hour to figure this one out, so here’s the story for the next time I run into it.
Steps taken:
* New machine
* 2 harddisks (Western Digital Greens, so used wdidle3 on them!)
* Boot Slackware64 installer from PXE/NFS
* cfdisk, create 2 identical partitions, make bootable, set type to FD, write, quit
* mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
* install slackware64, grub2 and some other junk
* reboot

Sounds good right?
Well, for some reason the array kept booting up with only 1 of 2 disks active.
No errors or warnings, just kept fucking up. /proc/mdstat looked like

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1]
39061952 blocks [1/2] [_U]

Adding /dev/sda1 back (mdadm /dev/md0 --add /dev/sda1) worked fine too, the array resync-ed without problems.

After about an hour of trying to recreate superblocks and that sort of stuff I found it:
The partition type of /dev/sda1 was set to 0x83 instead of 0xFD.
Thanks cfdisk, last time I used that piece of garbage. (I’m 100% certain I set them to 0xFD, but somehow it’s bugging for me lately in cfdisk).
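Since the array came up degraded without a single warning, a small watchdog on /proc/mdstat would at least have made it obvious sooner. Here is a Python sketch that looks for a ‘_’ in the member status field (like the [_U] above). The function name is made up, and it only handles the simple layout shown in this post.

```python
import re

_ARRAY = re.compile(r"^(md\d+)\s*:")   # e.g. "md0 : active raid1 sdb1[1]"
_STATUS = re.compile(r"\[([U_]+)\]")   # e.g. "[_U]" or "[UU]"


def degraded_arrays(mdstat_text):
    """Return {array: status} for arrays with at least one missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = _ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        status = _STATUS.search(line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


sample = (
    "Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]\n"
    "md0 : active raid1 sdb1[1]\n"
    "39061952 blocks [1/2] [_U]\n"
)
print(degraded_arrays(sample))  # prints {'md0': '_U'}
```

On a real machine you would feed it open("/proc/mdstat").read() from cron and mail yourself when the result is not empty.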


Why I hate lilo

by on Jan.11, 2011, under Software

Every time I install a machine with the latest Slackware, I’m amazed again at the installed boot manager – lilo.
Sure, lilo works. Most of the time. Even when you have a raid-1 boot device.
Unless you don’t have the latest version of lilo of course.

Today I tried to continue a Slackware64 (current) install of a machine that I installed a week ago.
It worked fine, was just about to install Xen when one of the disks started acting up.
Obviously SMART didn’t help one bit:
* Report – No errors!
* Short Self test – Your disk is fine!
* You want a long test that takes 4 hours? Your machine locks up before it completes, haha!
But when the disk kept failing every time when the md1 resync hit 36%, I yanked out the disk and sent it RMA.
Dmesg showed errors like this:

[ 3362.784129] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 3362.784132] ata1.00: failed command: READ DMA EXT
[ 3362.784135] ata1.00: cmd 25/00:00:3f:60:f4/00:04:57:00:00/e0 tag 0 dma 524288 in
[ 3362.784135] res 40/00:00:02:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 3362.784136] ata1.00: status: { DRDY }
[ 3362.784139] ata1: hard resetting link
[ 3364.002049] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 3364.009142] ata1.00: configured for UDMA/33
[ 3364.009148] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 3364.009150] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
[ 3364.009152] Descriptor sense data with sense descriptors (in hex):
[ 3364.009153] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 3364.009156] 00 00 00 01
[ 3364.009158] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
[ 3364.009159] sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 57 f4 60 3f 00 04 00 00
[ 3364.009162] end_request: I/O error, dev sda, sector 1475633215
[ 3364.009174] ata1: EH complete

So today I figured I could continue installing with only half a raid-1 array.
But it didn’t boot (“Loading operating system…. *halt*”).
I figured lilo must have been installed to the MBR of the disk that I yanked, so I booted from LAN and ran lilo.
Obviously lilo complained, because /dev/sda was only half the raid-1 array and disks were missing!
Fine. I changed my boot device to /dev/md0, hoping that lilo would get the hint.


# lilo
Warning: LBA32 addressing assumed
Fatal: Not all RAID-1 disks are active; use '-H' to install to active disks only
# lilo -H
Warning: LBA32 addressing assumed
Warning: Partial RAID-1 install on active disks only: booting is not failsafe

Warning: Faulty disk in RAID-1 array; boot with caution!!
Fatal: Unusual RAID bios device code: 0xFF

*sigh*
This is why I hate lilo. If it doesn’t work, it doesn’t work.
And it never tells you why. Or maybe it does, just like windows always tells you what’s wrong when you get a blue screen.

It’s probably this bug, but I don’t care. Always something.
Time to find the sources to grub.


Humble Bundle Part 2!

by on Dec.16, 2010, under Software

Obviously when something is really successful there will be a second version of it.
Whether the motive is pure profit or for the greater good of $omething, I can only applaud it.
Especially when they promote DRM free games on linux. (and they provide the mac and win32 versions with it as well)
So The Humble Bundle is back with part 2!

This time the games offered are: Braid, Cortex Command, Machinarium, Osmos, Revenge of the Titans.
Just Machinarium alone is worth the money (if you like puzzle games and artsy graphics), and the others that I’ve played so far are good as well.
Some games are still being developed (Revenge of the Titans), but you’re eligible for future updates. (last time they even added the bundle to Steam after a while).

Like the previous Humble Bundle, YOU get to choose how much you want to pay them.
This obviously means that if you’re a windows user you pay less than when you’re a mac or linux user 😉
And then you get to decide how much of that money goes to charity (EFF / Child’s Play) and how much of it goes to the developers. (and how much the humble bundle guys get… for bandwidth etc I’m sure).

So go get it! Even if you don’t care about the games, you can give some money to the EFF instead 🙂
