[Leaplist] 24-7 redundancy for poe folk

Bryan J Smith b.j.smith at ieee.org
Thu Jul 30 00:27:52 EDT 2009


On Mon, 2009-07-27 at 01:42 -0400, Randall Perry wrote:
> Solution found.
> I have spent the past week playing and configuring Zabbix.
> The free and opensource utility allows you to monitor a myriad of
> events and configure triggers.
> It supports SNMP, service status (including just ping), SMS
> notification via attached cellphone, email notification, and tons
> more.
> Through their zabbix agent running on each server, you can also
> monitor CPU usage, disk usage, mem usage, etc. (and historically
> graph/report it).
> I know others will really like this product.

You know, I should have mentioned Zabbix and a few other options.
Looking back, my apologies for not doing so.  Zabbix will monitor many
services with_out_ an agent on those systems, which is _exactly_ what
you originally wanted.  I feel like a fool for not pointing that out.

There were several reasons, including one big oversight on my part.

One is that I'm _always_ using Tier-1 PC OEM servers, namely HP and IBM,
with Dell far, far less.  So I always have the SNMP hardware agents
loaded and whatnot, plus traps I'll manually set up to throw via
net-snmp or other agents/options.  With those agents, I'm typically
deploying HP OpenView (commercial) or Nagios (Open Source) as the
collector/monitor.  I also mentioned Nagios because it's in Extra
Packages for Enterprise Linux (EPEL)[1].

Two, and this is a major oversight on my part, I _failed_ to recognize
that Zabbix is also in EPEL.  Duh!  I should have looked for it, but I
assumed it was not.  So, again, my apologies there, especially
considering it was such an ideal fit.  You can set up a Zabbix server
with all sorts of probes, and it does the job of checking various
services without any client/agent configuration on the other end.  At the same
time, you can add its agent and even add SNMP later on.  I guess I got
focused on storage and other aspects and just mentioned Nagios without
getting deeper.
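
To make the agentless idea concrete, here is a minimal Python sketch of
that kind of probe: the monitoring box simply tries to open a TCP
connection to each service, and nothing is installed on the target.
This is only an illustration of the concept, not how Zabbix implements
its checks.  The function names (check_tcp_service, probe_all) and the
host names in the usage note are hypothetical.

```python
import socket


def check_tcp_service(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake;
        # the "with" block closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def probe_all(targets, timeout=3.0):
    """Probe a dict of {name: (host, port)} and return {name: is_up}."""
    return {
        name: check_tcp_service(host, port, timeout)
        for name, (host, port) in targets.items()
    }
```

Calling probe_all({"web": ("www.example.com", 80), "mail":
("mail.example.com", 25)}) would return a dict of booleans, one per
service.  A real monitor adds scheduling, history and notification on
top of checks like this, which is the part Zabbix or Nagios gives you.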

Lastly, combined with number two (which was my ignorance), there is my
usual attitude: unless I have used something in a production environment
(those who know me know what I call a production environment ;), I don't
like to even mention it.  So, again, that's why I mentioned Nagios.  It's
not that Nagios is more "production ready" than Zabbix; I just have never
used Zabbix other than to play with at home.  And, again, I didn't
realize that Zabbix was in EPEL.

[1] https://fedoraproject.org/wiki/EPEL  


On Mon, 2009-07-27 at 14:47 -0400, Jesse Rhoads wrote:
> Just to give you guys a heads up, I personally had to abandon the use
> of Zabbix.  I too thought it would be easier than Nagios, and after I
> tried installing it on some production machines, very strange things
> started acting up (random core dumps, kernel panics, etc), so I
> suspect there may be something very unstable in it.  It definitely was
> enough for me to pull the plug on it immediately.  These problems
> occurred only after installing Zabbix daemons, and after disabling
> Zabbix the problems have not returned.  I hate to make
> assumptions but I can only reasonably assume that Zabbix was the
> cause, and I can't put something unstable on my servers.
> I did not have the time or motivation to try to track down what it was
> doing, but I did want to pass along that I have a standing
> red-flag-YMMV warning on it.

What distro are you running, and where are you getting Nagios from?

E.g., although Red Hat won't support the Fedora Project's EPEL
components (regardless of the fact that a lot of Red Hat employees build
them), it will support the underlying platform (Red Hat Enterprise
Linux), and that includes analyzing kdumps.  Red Hat Support Engineering
(SEG) can quickly tell you what is crashing the system from those
kdumps, even if it's a non-supported platform component (they just won't
help engineer a fix).

Just wanted to mention that in case you're running Red Hat Enterprise
Linux.  It's one of the things the subscriptions fund (as well as
upstream developers, some of whom regularly do SEG work and discover
issues in kdumps that result in patches going upstream).

> Another alternative is to simply use Nagios with the Groundwork
> Monitor framework around it.  That is what I am adopting here.  It
> takes a bit to get up and running but is solid so far.

Nagios is a solid, Open Source SNMP collector/front-end.  I've fed it
from all sorts of closed and open source agents, traps, etc.


-- 
Bryan J Smith          Professional, Technical Annoyance
b.j.smith at ieee.org    http://www.linkedin.com/in/bjsmith
--------------------------------------------------------
I don't have a "favorite Linux distro."  I use, develop
and support community efforts, often built around Linux.
Technology and solutions are my focus, not dragging in
assumptions, marketing and other concepts which dominate
non-community developed software, which I left long ago.


