[ale] Monitoring Solutions

Jim Kinney jim.kinney at gmail.com
Wed Feb 24 20:18:21 EST 2021


The abuse of sysadmins by the monitoring tool is why these tools get turned off.

Never, ever set an alert to page anyone except the person who wants it turned on. That shit will get shut down fast. It takes about 2 weeks of 24x7 pages.

Dashboards. That's a solution that works. Alerts go to a dashboard, and the color changes on a big banner that is visible from across the room.
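
As a rough illustration of the banner idea, here is a minimal Python sketch. The severity names, the color mapping, and the Alert type are all made up for the example and are not taken from Nagios or any other tool; the only point is that the banner shows the color of the worst active alert and nobody gets paged.

```python
from dataclasses import dataclass

# Hypothetical severity levels, ordered from least to most urgent.
SEVERITY_ORDER = ["ok", "warning", "critical"]

# Hypothetical mapping from severity to the banner color on the wall display.
BANNER_COLORS = {"ok": "green", "warning": "yellow", "critical": "red"}


@dataclass
class Alert:
    source: str    # host or service that raised the alert
    severity: str  # one of SEVERITY_ORDER


def banner_color(alerts):
    """Return the color for the worst active alert; green when there are none."""
    worst = "ok"
    for alert in alerts:
        if SEVERITY_ORDER.index(alert.severity) > SEVERITY_ORDER.index(worst):
            worst = alert.severity
    return BANNER_COLORS[worst]


if __name__ == "__main__":
    active = [Alert("fileserver", "warning"), Alert("mailhost", "ok")]
    print(banner_color(active))
```

A real dashboard would pull the alert list from whatever the monitoring system exposes; that part is left out here on purpose.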

If the problem exists 24x7, the solution is a 24x7 live support contract.

On February 24, 2021 5:26:33 PM EST, Jeff Hubbs via Ale <ale at ale.org> wrote:
>I will say that one Nagios environment at A Previous Employer^TM made
>sysadminning hell because Nagios was implemented by the netops folks who
>never took into account what the systems involved actually did. Why yes,
>my file server lights up all its cores at 3AM when it runs ClamAV on all
>the Samba filespace (>200MiB/s read!) or when it performs a mksquashfs
>on same; don't bother me!
>
>On 2/24/21 4:08 AM, Leam Hall via Ale wrote:
>> Working on my career skills, and I need to add a monitoring solution.
>>
>> Is Nagios still the first choice in open source options? I need to 
>> learn server, network, and application monitoring to beef up a 
>> challenge area in my Site Reliability Engineering skill set.
>>
>> Happy Wednesday!
>>
>> Leam
>>

-- 
Computers amplify human error
Super computers are really cool