Steve Litt via plug on 17 Oct 2023 17:19:53 -0700



Re: [PLUG] Could you put that in an email?


Walt Mankowski via plug said on Tue, 17 Oct 2023 16:36:50 -0400

>On Tue, Oct 17, 2023 at 03:00:11AM -0400, Steve Litt via plug wrote:
>> L K via plug said on Mon, 16 Oct 2023 18:38:40 -0400
>>   
>> >Hey Pluggers,
>> >  I have a home arch linux server I use for various hobbyist
>> >services. I'd like to get notified via email if it goes down.
>> >I usually would run
>> >such things on my server but... well.. you see my dilemma :D. Can
>> >anyone recommend a service they have used for such? I'd prefer a
>> >pull-based one (like issuing a web request from the server) for
>> >security. I have a bad association with pagerduty :P.
>> >
>> >Lou  
>> 
>> Sounds like a job for a shellscript to me. The shellscript contains a
>> subroutine to test whether the remote server is doing its job, and
>> another subroutine to email error messages, so pseudocode looks like
>> this:
>> 
>> forever
>>    test remote server
>>    if problem
>>       email people
>>    sleep 1 minute
>> 
>> The shellscript could be launched as a daemon by systemd, sysvinit,
>> runit, s6 or OpenRC.  
>
>Of course at a high level it would work something like that, but the
>devil is in the details. Here are a few questions off the top of my
>head:
>
>* Where will this shell script run?

On the monitoring machine, started by the init system during boot.

>
>* How will it test the remote server?

It will test for some piece of proper behavior by the remote server. If
it's a remote Postgres server with a known table, the test might be as
simple as a psql command.
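
A minimal sketch of such a test, assuming a Postgres server the
monitoring machine can reach with psql. The host, database, and user
names are hypothetical placeholders, and the query is only an example;
a real check would more likely select from the known table.

```shell
#!/bin/sh
# Hypothetical connection settings; replace with your own.
DB_HOST="${DB_HOST:-db.example.com}"
DB_NAME="${DB_NAME:-mydb}"
DB_USER="${DB_USER:-monitor}"

# Returns 0 if the remote Postgres answers a trivial query,
# nonzero if the connection or the query fails.
test_remote_server() {
    # PGCONNECT_TIMEOUT keeps a dead host from hanging the check.
    PGCONNECT_TIMEOUT=10 psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" \
        -At -c 'SELECT 1' >/dev/null 2>&1
}
```

The calling script only looks at the exit status, so the same function
shape works for any other kind of server: swap the psql line for
whatever command exercises the service.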

>* Does something need to be running on the remote server too?

Just the server being monitored.

>* How will you email the people that need to know?

One way is to have a list of the recipients in a file. A failed test
runs a script that, for each recipient, puts the error message in a
properly formatted email file and stuffs it in the SMTP or Nullmailer
queue. Others are much more knowledgeable about this part of it than I,
but I've done stuff like this in the past.
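
Roughly, and only as a sketch: this assumes a sendmail-compatible
command is installed (Nullmailer and most MTAs provide one) that takes
the recipient as an argument and the message on stdin. The file path
and the From address are made up for illustration.

```shell
#!/bin/sh
# Hypothetical settings; one recipient address per line in the file.
RECIPIENTS_FILE="${RECIPIENTS_FILE:-/etc/monitor/recipients.txt}"
FROM_ADDR="${FROM_ADDR:-monitor@example.com}"
SENDMAIL="${SENDMAIL:-/usr/sbin/sendmail}"

# Print a minimal message: headers, a blank line, then the body.
# usage: format_email recipient subject body
format_email() {
    printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' \
        "$FROM_ADDR" "$1" "$2" "$3"
}

# Queue one message per recipient, skipping blank lines and # comments.
# usage: email_people subject body
email_people() {
    while IFS= read -r rcpt || [ -n "$rcpt" ]; do
        case "$rcpt" in
            ''|'#'*) continue ;;
        esac
        format_email "$rcpt" "$1" "$2" | "$SENDMAIL" "$rcpt"
    done < "$RECIPIENTS_FILE"
}
```

A failed test would then call something like
email_people "server check failed" "psql could not reach the database".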

>
>* How will you ensure that whatever took out the remote server doesn't
>  take out the test script too?

There's no component of the test loop on the remote server. For
instance, if it's a Postgres remote server, how will a down
remote Postgres take down a psql command on the monitoring machine?

>
>* How do you get it to stop pinging you so that you're not flooded
>  with alerts every minute when it goes down?

Ah, good point. The interval would need to be a variable. After the
first message, it changes from 60 to 3600 or whatever. Once it tests
good again, the 3600 is switched back to 60. Or perhaps there might be
a manual component to the reset, so if the server keeps going up and
down, nobody gets pinged excessively.
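
The variable-interval idea could look something like this. It is a
sketch of the automatic version only, with no manual reset, and
test_remote_server and email_people are hypothetical functions assumed
to be defined elsewhere in the script.

```shell
#!/bin/sh
NORMAL_INTERVAL=60     # seconds between checks while the server tests good
QUIET_INTERVAL=3600    # seconds between checks after an alert went out

# Echo the next sleep interval, given the last test's exit status.
next_interval() {
    if [ "$1" -eq 0 ]; then
        echo "$NORMAL_INTERVAL"
    else
        echo "$QUIET_INTERVAL"
    fi
}

# The forever loop: test, email on a problem, then sleep.
# test_remote_server and email_people are assumed to exist.
monitor_forever() {
    while :; do
        if test_remote_server; then status=0; else status=$?; fi
        [ "$status" -eq 0 ] ||
            email_people "server check failed" "exit status $status"
        sleep "$(next_interval "$status")"
    done
}
```

With these numbers a server that stays down produces one email an hour
instead of one a minute, and the first good test drops the interval
back to 60 seconds.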

>Presumably packages designed for this sort of testing will have
>solutions to most/all of these potential issues.

No package would need to be designed for the monitoring I suggest.
Whatever the client side needs from the server, the test would simply
exercise that need in a non-destructive way.

SteveT

Steve Litt 

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug