BenV's notes

Tag: raid

Linux Software Raid disk upgrades

Posted on Dec. 16, 2012, under Software

Every now and then you find out that the huge disk you’ve been using (you know, the one that made you think “How on earth am I ever going to fill this one up? My biggest game can fit on this disk 100 times!” when you bought it) isn’t so huge anymore. Or at least all the free space on it has disappeared, and Nagios is whining that your disk is full or about to explode.
Some background info: my fileserver here at home has three Linux software RAID arrays (RAID-1 mirrors) on top of four physical disks. The first and smallest array holds the root filesystem that Slackware Linux boots from. The second and third arrays are both big and are simply storage for games, music, series, etc.
When I created that first array a few years ago I figured “Hm, 20GB should be enough for a Slackware install, right? Well, let’s just make it 50GB to be sure, we have plenty of space anyway on this huge disk”. Back then the ‘huge’ disks were 500GB. Meanwhile those 500GB disks have been replaced with 1TB ones, but that array stayed the same size. Today I have a set of 1.5TB drives to replace the 1TB ones. Not a huge upgrade, but I didn’t have to buy these disks, since they came from a server that had its drives upgraded as well. Anyhow, the 50GB partition managed to fill up with over 40GB of stuff that I can’t trash (mostly user home directories). I could move that data to a different partition of course, but today we’re going to resize that partition to 100GB and put the rest of the new space in the storage partition.
Off-topic note: Do you also hate it when you’re typing in a browser, hit Ctrl-W to delete your last word, and realize you just closed your tab? I sure as heck do. Good thing WordPress saves these drafts every now and then 🙂


Why I hate lilo

Posted on Jan. 11, 2011, under Software

Every time I install a machine with the latest Slackware, I’m amazed again at the installed boot manager – lilo.
Sure, lilo works. Most of the time. Even when you have a raid-1 boot device.
Unless you don’t have the latest version of lilo of course.

Today I tried to continue a Slackware64 (current) install of a machine that I installed a week ago.
It worked fine, and I was just about to install Xen when one of the disks started acting up.
Obviously SMART didn’t help one bit:
* Report – No errors!
* Short Self test – Your disk is fine!
* You want a long test that takes 4 hours? Your machine locks up before it completes, haha!
But when the disk kept failing every time the md1 resync hit 36%, I yanked it out and sent it back for RMA.
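For reference, both the resync progress and the degraded state of an array show up in /proc/mdstat. What follows is only a minimal sketch, assuming the usual layout where each array's status line ends in a member map such as [UU] (all disks present) or [U_] (one missing). The function name degraded_arrays is made up for illustration.

```shell
# Print the names of md arrays whose member map (e.g. [U_]) shows a missing disk.
# Reads /proc/mdstat-formatted text from stdin.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 sda2[0] sdb2[1]"
        /^md[0-9]+ :/ { name = $1 }
        # The status line ends in the member map; an underscore marks a missing disk
        /\[[U_]+\]/ && name != "" {
            if ($NF ~ /_/) print name
            name = ""
        }
    '
}
```

On a live system this would be run as `degraded_arrays < /proc/mdstat`.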
Dmesg showed errors like this:

[ 3362.784129] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 3362.784132] ata1.00: failed command: READ DMA EXT
[ 3362.784135] ata1.00: cmd 25/00:00:3f:60:f4/00:04:57:00:00/e0 tag 0 dma 524288 in
[ 3362.784135] res 40/00:00:02:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 3362.784136] ata1.00: status: { DRDY }
[ 3362.784139] ata1: hard resetting link
[ 3364.002049] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 3364.009142] ata1.00: configured for UDMA/33
[ 3364.009148] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 3364.009150] sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
[ 3364.009152] Descriptor sense data with sense descriptors (in hex):
[ 3364.009153] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 3364.009156] 00 00 00 01
[ 3364.009158] sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
[ 3364.009159] sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 57 f4 60 3f 00 04 00 00
[ 3364.009162] end_request: I/O error, dev sda, sector 1475633215
[ 3364.009174] ata1: EH complete
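As a sanity check on that log, the failing sector can be recovered from the CDB line itself. In a READ(10) command (opcode 0x28), bytes 2 to 5 hold the big-endian logical block address, so `57 f4 60 3f` should decode to the sector in the `end_request` line. The transfer length in bytes 7 and 8, `04 00`, is 1024 blocks, which at an assumed 512 bytes per sector gives the 524288 bytes shown after `dma`. A small sketch of the decode, with the helper name read10_lba invented for this example:

```shell
# Decode the 32-bit LBA from a READ(10) CDB given as space-separated hex bytes.
read10_lba() {
    # Split the argument into positional parameters: $1 = opcode, $3..$6 = LBA bytes
    set -- $1
    if [ "$1" != "28" ]; then
        echo "not a READ(10) CDB" >&2
        return 1
    fi
    # Combine the four LBA bytes, most significant first
    echo $(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
}

read10_lba "28 00 57 f4 60 3f 00 04 00 00"   # prints 1475633215
```

The result matches the `sector 1475633215` reported by the kernel, so the error really is a plain read timing out at that one spot on sda.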

So today I figured I could continue installing with only half a raid-1 array.
But it didn’t boot (“Loading operating system…” *halt*).
I figured lilo must have been installed to the MBR of the disk that I yanked, so I booted from LAN and ran lilo.
Obviously lilo complained, because /dev/sda was only half the raid-1 array and disks were missing!
Fine. I changed my boot device to /dev/md0, hoping that lilo would get the hint.


# lilo
Warning: LBA32 addressing assumed
Fatal: Not all RAID-1 disks are active; use '-H' to install to active disks only
# lilo -H
Warning: LBA32 addressing assumed
Warning: Partial RAID-1 install on active disks only: booting is not failsafe

Warning: Faulty disk in RAID-1 array; boot with caution!!
Fatal: Unusual RAID bios device code: 0xFF

*sigh*
This is why I hate lilo. If it doesn’t work, it doesn’t work.
And it never tells you why. Or maybe it does, just like Windows always tells you what’s wrong when you get a blue screen.

It’s probably this bug, but I don’t care. Always something.
Time to find the sources to grub.
