From cf5920165180a047e5114e0c450ddf57a7819741 Mon Sep 17 00:00:00 2001
From: Nico Schottelius
-You've a raid and you want to monitor it with FreeBSD. That may or
+You've a raid and you want to monitor it with FreeBSD. That may or
may not be a problem. I'll try to summarise all information I got. If
you know that there's something incorrect or outdated, please contact
me. In general, monitoring the state of a raid may be problematic if
the hardware does not expose the needed information or exposes it only
via notification (it sends a message "raid status changed" through
the driver, which you can try to grep out of syslog, but you cannot
-monitor it actively).

And the other one:

For more information on supported devices have a look at amr(4).

(This is untested by me, just found it on the net.)

On http://lists.freebsd.org/pipermail/freebsd-proliant/2006-October/000169.html I also found the relevant strings to look for:

You could also use hpacucli, which can be found at http://people.freebsd.org/~jcagle/. I have no experience with it, so if you have, you can send me a report or scripts to monitor it and I can include them here (the hint was sent by Jaimie Sirovich).

Install and configure sysutils/3dm. This installs a daemon that provides a web interface and can also notify you via e-mail if something happens. This is perhaps the easiest way of monitoring raid in FreeBSD.

The other possibility to monitor 3ware raids is via tw_cli.

This is a software raid driver for many different cards; have a look at ataraid(4).

Somebody in ##freebsd (irc.freenode.org) pasted the URL http://www.monkeybrains.net/~rudy/example/raid_status.html, which contains a script that monitors gmirror, 3ware (via tw_cli) and also ataraid (ar0) via atacontrol. For archiving, the script is mirrored below:
+
### FreeBSD gmirror software raid
-
As you might expect, monitoring this raid is pretty easy. We achieved that with the following two scripts:
+As you might expect, monitoring this raid is pretty easy.
+We achieved that with the following two scripts:
ddna044% cat /usr/local/scripts/fbsd_raid_monitor/cfs_gmirror.sh
#!/bin/sh
#==============================================================================
# Copyright (c) 2007, Netstream AG
# Author: Nico Schottelius <nico-freebsd-raid-monitoring <at> schottelius.org>
# Created: 2007-04-23
# Description: Display state of all gmirror devices
# Created-By: /home/user/nico/firmen/netstream/sh/neues_skript.sh
#==============================================================================
gmirror list | \
awk -F: 'BEGIN { print "gmirror devices";
print "---------------";
}
/^Geom name:/ {
name=$2
}
/^State:/ {
print name ":" $2
}'
And the one that is called by cron:
ddna044% cat /usr/local/scripts/fbsd_raid_monitor/cfrib_gmirror.sh
#!/bin/sh
#==============================================================================
# Copyright (c) 2007, Netstream AG
# Author: Nico Schottelius <nico-freebsd-raid-monitoring <at> schottelius.org>
# Created: 2007-04-23
# Description: Report broken devices.
# Created-By: /home/user/nico/firmen/netstream/sh/neues_skript.sh
#==============================================================================
# Quote the path so the script also works from a directory containing spaces
check="$(dirname "$0")/cfs_gmirror.sh"
# Skip first two lines: header
"$check" | awk -F": " 'BEGIN { getline; getline } $2 !~ /COMPLETE/ { print $1 ":" $2 }'
-###
-LSI / Symbios Megaraid (amr driver)
+
+
+### LSI / Symbios Megaraid (amr driver)
There are two possibilities to monitor amr-based devices:
The utility "amrstat" is available in ports as sysutils/amrstat and is FOSS. Calling it reveals all needed information:
@@ -37,14 +38,14 @@ LSI / Symbios Megaraid (amr driver)
#!/bin/sh -f
#
# Display status of RAID volumes on amr(4) controllers using the LSI MegaRC
# utility. If more than one adapter exists, add additional unit numbers to
# $adapters.
#
# $Id$
#
# If there is a global system configuration file, suck it in.
#
if [ -r /etc/defaults/periodic.conf ]; then
. /etc/defaults/periodic.conf
source_periodic_confs
fi
adapters="0"
rc=0
case "${daily_amr_status_enable:-YES}" in
[Nn][Oo])
;;
*)
for adapter in $adapters; do
echo ""
echo "AMR RAID status (adapter ${adapter}):"
/usr/local/sbin/megarc -ldinfo -a${adapter} -Lall -nolog |\
sed '1,/Information Of Logical Drive/d' || rc=$?
done
;;
esac
exit "$rc"
mpt-based devices can be monitored under Linux with the kernel module "mptctl" and the FOSS tool "mpt-status". There currently seems to be no equivalent support under FreeBSD. For more information about mpt have a look at mpt(4).
+
### ciss
-
Known tools:
###
-###
This driver is used for most HP / Compaq controllers and is (afaik) found in almost all modern SAS/SATA systems provided by HP. As described in http://www.unixadmintalk.com/f41/monitoring-raid-arrays-51889/, you can monitor it via camcontrol:
# camcontrol inquiry da0
pass0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-0 device
pass0: 135.168MB/s transfers
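
The volume state is embedded in the inquiry string between the angle brackets, so a cron job can test for it. The following is only a minimal sketch: the function name `ciss_volume_state` is made up for illustration, and it assumes that the last word inside `<...>` is the state and that a healthy volume reports `OK` as in the output above. The strings reported for degraded or rebuilding volumes are not covered here.

```sh
#!/bin/sh
# Sketch only: extract the state word from `camcontrol inquiry` output.
# Assumption: the inquiry line looks like
#   pass0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-0 device
# and the last word inside <...> is the volume state.
ciss_volume_state() {
    # Reads camcontrol output on stdin and prints the state word (e.g. "OK").
    # sed keeps only the text between < and >; awk prints its last field.
    sed -n 's/^[^<]*<\([^>]*\)>.*/\1/p' | awk 'NR == 1 { print $NF }'
}

# Possible use from cron (da0 is the example device from above):
#   state=$(camcontrol inquiry da0 | ciss_volume_state)
#   [ "$state" = "OK" ] || echo "ciss volume da0 reports: ${state:-unknown}"
```

Reading from stdin keeps the parsing separate from the `camcontrol` call, so the function can be tried against saved output before it is trusted in cron.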
-### Adaptec: aac
-
#!/bin/sh
# raid_status - check the state of the RAID.
# This script works for various types of RAID devices (currently 3ware, gmirror, and BSD 'ar0' raids).
# WARNING: Install the proper CLI program for your 3ware card, if you use 3ware.
# Set up a cronjob like this:
# */16 * * * * /home/rudy/bin/raid_status CRON
### Copyright (c) 2006, Rudy Rucker All rights reserved.
### Redistribution and use of script, with or without modification, is
### permitted provided that the following condition is met:
### Redistributions of source code must retain the above copyright
### notice, this list of conditions and the following disclaimer.
### THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
### ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
### IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
### ARE DISCLAIMED.
# ----------- Change Log ------------
# Mon Oct 11 15:20:37 PDT 2004 - rudy
# Original script.
# Tue Feb 7 01:28:07 PST 2006 - rudy
# Added 9500 and 9550 support
# Fri Jun 9 10:38:33 PDT 2006 - rudy
# works for 'ar' and 'tw' mirrored arrays
# Tue Sep 12 10:23:13 PDT 2006 - rudy
# Added gmirror and realized that not all 3ware's are the same...
MODE=$1
TWCLI="/usr/local/bin/tw_cli"
GMIRROR="/sbin/gmirror"
ATACONTROL="/sbin/atacontrol"
AWK="/usr/bin/awk"
GREP="/usr/bin/grep"
MAIL="/usr/bin/mail"
EMAIL="noc@example.com"
# if this is not a 3ware card, check the atacontrol
if [ -c /dev/twed0 ] && [ -x $TWCLI ]; then
# 3ware card ... 8000 series
STATUS=`$TWCLI info c0 u0 | $GREP "^Status" | $AWK '{print $2}'`;
VALID='OK'
ESTATUS_CMD="$TWCLI info c0 u0";
# double check the 3ware output in case it returned nothing...
# Umm... this is the only raid card on which I have witnessed this bug
if [ "X$STATUS" = "X" ]; then
sleep 1;
STATUS=`$TWCLI info c0 u0 | $GREP "^Status" | $AWK '{print $2}'`;
fi
elif [ -c /dev/da0 ] && [ -x $TWCLI ]; then
# Note, there are plenty of other device names that use da0... this script is
# not for those... works with:
# 3ware 9550SX, 9500S
STATUS=`$TWCLI info c0 | $GREP "^u0" | $AWK '{print $3}'`;
VALID='OK'
ESTATUS_CMD="$TWCLI info c0 u0"
elif [ -c /dev/mirror/gm0 ] && [ -x $GMIRROR ]; then
# gmirror /dev/mirror/gm0
STATUS=`$GMIRROR status gm0 | $GREP "^mirror" | $AWK '{print $2}'`;
VALID='COMPLETE'
ESTATUS_CMD="$GMIRROR list";
elif [ -c /dev/ar0 ] && [ -x $ATACONTROL ]; then
# Motherboard promise and others
STATUS=`$ATACONTROL status ar0 | $GREP "status" | $AWK -F 'status: ' '{print $2}'`;
VALID='READY'
ESTATUS_CMD="/sbin/atacontrol status ar0"
else
echo "Unknown Raid type.... ";
if [ -x $TWCLI ]; then
echo " + found $TWCLI";
else
echo " - can't exec $TWCLI";
fi
if [ -x $ATACONTROL ]; then
echo " + found $ATACONTROL";
else
echo " - can't exec $ATACONTROL";
fi
if [ -x $GMIRROR ]; then
echo " + found $GMIRROR";
else
echo " - can't exec $GMIRROR";
fi
exit;
fi
# Okay, we checked the raid status and know what the return code should be.
if [ "$STATUS" = "$VALID" ]; then
if [ "$MODE" = "CRON" ]; then
exit;
fi
echo "OK condition";
$ESTATUS_CMD
exit;
fi
# ERROR! Either print to TTY or send an email, based on MODE (which is arg[1])
if [ "$MODE" = "CRON" ]; then
# $HOST is normally unset under /bin/sh (it is a csh variable), so ask hostname(1)
$ESTATUS_CMD | $MAIL -s "[ERROR] Raid array on $(hostname) returned $STATUS" "$EMAIL"
else
echo "ERROR condition"
$ESTATUS_CMD
fi
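
The archived script only looks at the first unit of each type (gm0, u0, ar0). For hosts with several mirrors, the gmirror branch could be generalised along the following lines. This is a sketch and not part of Rudy Rucker's script: the function name `gmirror_not_complete` is an assumption for illustration, and it relies on the usual `gmirror status` layout, where each mirror line starts with `mirror/` and carries the state in the second column.

```sh
#!/bin/sh
# Sketch only: list every gmirror device whose state is not COMPLETE.
# Assumption: input is the output of `gmirror status`, e.g.
#         Name    Status  Components
#   mirror/gm0  COMPLETE  ad4 (ACTIVE)
#                         ad6 (ACTIVE)
gmirror_not_complete() {
    # Header and continuation lines are indented, so only lines that
    # begin with "mirror/" are considered.
    awk '/^mirror\// && $2 != "COMPLETE" { print $1 ": " $2 }'
}

# Possible use from cron (mailing root is just an example):
#   broken=$(gmirror status | gmirror_not_complete)
#   [ -z "$broken" ] || echo "$broken" | mail -s "gmirror problem on $(hostname)" root
```

Because the function prints nothing when all mirrors are COMPLETE, a cron job built on it stays silent in the healthy case, in the same spirit as the `CRON` mode of the script above.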