JP Vossen on 23 Sep 2009 00:01:09 -0700

Re: [PLUG] correct way to do this in bash

> Date: Tue, 22 Sep 2009 18:19:34 -0400
> From: Mag Gam <>
> Currently for my research I have a process that writes 24 hours and 5
> days a week.  Its writes to standard out and there is standard error.
> There is a lot of data, close to 300Gb a day therefore I can't lose a
> minute of outage.

Can you change the process at all?  If you can, tweaking the way it 
writes things, as Walt suggested, might be a good idea.  Or tweaking it 
so that it knows to pause & buffer for a moment while you swap log files 
out from under it when you send it some signal.
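
If 'process' happens to be (or is wrapped by) a shell script, the 
signal idea could look roughly like the sketch below.  This is only an 
illustration: the log_root variable, the reopen_logs name and the 
choice of SIGUSR1 are my assumptions, and a compiled program would have 
to do the equivalent in its own language.

```shell
#!/bin/bash
# Sketch: switch log files on a signal without stopping the writer.
# log_root, reopen_logs and the use of USR1 are assumptions for this example.

log_root=${LOG_ROOT:-/phys/data}

# Point this shell's stdout and stderr at today's log files.
reopen_logs() {
    local day
    day=$(date '+%Y%m%d')
    mkdir -p -m 0750 "$log_root/$day" || return 1
    # exec with only redirections re-targets the current shell's own
    # file descriptors; it does not start a new program.
    exec >>"$log_root/$day/$day.crac.out" 2>>"$log_root/$day/$day.crac.err.out"
}

# Bash runs a trap handler only after the current command finishes,
# so the switch happens between commands, not in the middle of one.
trap reopen_logs USR1
```

Cron would then send the signal at 8 AM (kill -USR1 with the right PID) 
instead of killing anything.  Note that a child process that is already 
running keeps its old file descriptors; only output written by the 
shell itself, or by commands it starts afterwards, goes to the new 
files.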

> I am capturing daily reports and cutoff is at 8:00AM.

That's an odd time.  It's usually midnight so you have a "day" of data, 
rather than 24 hours of data from two different days...  But you already 
know that, so...  Whatever...  :-)

> process > /phys/data/20090922/20090922.crac.out
> 2>/phys/data/20090922/20090922.crac.err.out
> At 7:59AM I kill the process using cron and restart the process at
> 8:00AM everyday for 5 days using cron. I lose 1 min of simulation data
> :-(.
> Is there a clever way to have my process run or restart at 8:00AM
> without cron and no interruption? Or is this the preferred way?

Q: Why do you lose 1 minute of data?
A: Because you are doing 2 steps in cron, and it's not granular enough. 

Solution: do it all in 1 step in cron.  You could probably cram this all 
into your crontab, but it would be more maintainable to put it in a 
shell script and call that from cron every day at 8 AM.
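
The cron side would then be a single entry, something like the fragment 
below.  The script path is hypothetical, and I am assuming your "5 days" 
are Monday through Friday; adjust the last time field if not.

```
# m  h  dom mon dow  command
  0  8  *   *   1-5  /usr/local/bin/rotate_process_logs.sh
```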

----- begin UNTESTED Shell script -----

#!/bin/bash -
# Log rotation for process

# Get the date for the log file and create the dir
now=$(date '+%Y%m%d')  # Same YYYYMMDD layout as your existing paths
[ -d "/phys/data/$now" ] || mkdir -p -m 0750 "/phys/data/$now"
# man mkdir, -m sets the perms in 1 step

# Kill the old one
killall process  # Or however you do it

# Start the new one with the new log files
process >  "/phys/data/$now/$now.crac.out" \
         2> "/phys/data/$now/$now.crac.err.out"

----- end UNTESTED Shell script -----

You will still lose data for however long it takes 'process' to die and 
restart, but hopefully that's under a second, instead of over a minute.

Depending on how reliable everything is, you might need to add some 
error checking to make sure everything happens as it's supposed to.  If 
you are doing it from cron in 2 steps like I think you are, and that 
works well enough except for the gap, it's probably reliable enough.
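
As a rough idea of what that error checking might look like, here is a 
sketch of two helpers.  The names die and wait_for_exit, the 0.1 second 
poll interval and the default of 50 tries are all my own choices, and it 
assumes pgrep is installed and that sleep accepts fractional seconds 
(GNU sleep does).  The point is that killall only sends the signal, so 
the old process may still be alive when the next line runs.

```shell
#!/bin/bash
# Hypothetical helpers for the rotation script; names are illustrative.

# die MESSAGE: print MESSAGE to stderr and exit with status 1.
die() {
    printf '%s\n' "$*" >&2
    exit 1
}

# wait_for_exit NAME [TRIES]: poll until no process named exactly NAME
# is left.  Returns 1 if one is still running after TRIES polls.
wait_for_exit() {
    local name=$1 tries=${2:-50}
    while pgrep -x "$name" >/dev/null; do
        (( tries-- > 0 )) || return 1
        sleep 0.1
    done
}

# Possible use in the script above, between the kill and the restart:
#   killall process || die "killall failed, was process running?"
#   wait_for_exit process || die "old process is still running"
```

Whether you want to abort or start the new copy anyway when the old one 
refuses to die is a judgment call; this sketch just makes the failure 
visible.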

I specified /bin/bash, but depending on your OS and version, /bin/sh 
might be bash in Bourne mode, or dash, or something else.  Bash in 
POSIX/Bourne mode might be slightly faster to start up, and dash 
probably will be too, so you might be able to shave fractions of a 
second and get closer to 8 AM by experimenting with other shells a bit.  
Bash has so many great user/interactive features that it's a bit slower 
to fire up than others.  (Paul, I don't wanna hear about zsh, that's 
got to be even worse! :)
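
If you want to measure rather than guess, something like this sketch 
would do.  The time_shell function name and the default of 100 runs are 
made up for the example, and the numbers will depend entirely on your 
machine.

```shell
#!/bin/bash
# time_shell SHELL [RUNS]: start SHELL RUNS times, each time running
# only the no-op ':' so that startup cost dominates.
time_shell() {
    local sh=$1 runs=${2:-100} i
    for ((i = 0; i < runs; i++)); do
        "$sh" -c ':' || return 1
    done
}

# 'time' is a bash keyword, so it can time a shell function directly:
#   time time_shell /bin/bash
#   time time_shell /bin/sh
```

Divide the "real" figure by the number of runs to get a per-start 
estimate for each shell.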

Also, if you really care about "8 AM", make sure you are running NTP on 
the server.

Good luck,
JP Vossen, CISSP            |:::======|
My Account, My Opinions     |=========|
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
Philadelphia Linux Users Group