Rich Freeman on 7 Apr 2011 05:33:08 -0700



Re: [PLUG] gnu parallel and tar


I figured I'd see what I get on my system (a recent Phenom II X4).
All operations are on a tmpfs (there's quite a bit of free RAM here
right now).

On Mon, Apr 4, 2011 at 4:21 PM, Julien Vehent <julien@linuxwall.info> wrote:
>>> Initial file:

ls -s access_log --block-size=1
481239040 access_log

>>> === with bzip2 ====

time bzip2 -z -9 access_log

real    3m1.397s
user    3m1.124s
sys     0m0.187s

ls -s access_log.bz2 --block-size=1
15572992 access_log.bz2


>>> === with lbzip2 ====

time lbzip2 -n 4 -z -9 -S access_log

lbzip2: "access_log": condvar counters:
lbzip2: any worker tried to consume from splitter:                    547
lbzip2: any worker stalled                       :                      9
lbzip2: muxer tried to consume from workers      :                    869
lbzip2: muxer stalled                            :                    434
lbzip2: splitter tried to consume from muxer     :                    953
lbzip2: splitter stalled                         :                    419
lbzip2: fchown("access_log.bz2"): Operation not permitted
lbzip2: unlink("access_log"): Operation not permitted

real    0m49.680s
user    3m16.039s
sys     0m0.723s

Wow...

Not sure what CPU you were running - the Phenom II has a fair bit of
cache, which is probably helping.  Also, I'm running off of tmpfs, so
maybe IO is slowing you down.  According to atop, the lbzip2 process
was almost always over 390% CPU, and often at 399%.
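
As a rough cross-check on the atop reading, the same figure can be
estimated from the time output alone: (user + sys) / real gives the
average number of cores kept busy.  A minimal sketch, assuming POSIX sh
and awk; the helper names to_seconds and cpu_percent are made up here
for illustration:

```shell
# Convert a time-style value such as "3m16.039s" to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Average CPU utilisation in percent: (user + sys) / real * 100.
# Arguments: real, user, sys, exactly as printed by time.
cpu_percent() {
    real=$(to_seconds "$1")
    user=$(to_seconds "$2")
    sys=$(to_seconds "$3")
    awk -v r="$real" -v u="$user" -v s="$sys" \
        'BEGIN { printf "%.0f\n", (u + s) / r * 100 }'
}

# The four-thread tmpfs run above:
cpu_percent 0m49.680s 3m16.039s 0m0.723s    # prints 396
```

That is an average over the whole run, so it sits a little below the
399% peaks but agrees with the "almost always over 390%" observation.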

ls -s access_log.bz2 --block-size=1
15638528 access_log.bz2
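
To put the two output sizes side by side, here is a small sketch
(assuming POSIX sh and awk; size_report is a hypothetical helper, not
part of either tool).  Note that ls -s reports allocated space rather
than the exact byte length, so the numbers are approximate:

```shell
# Compare two compressed sizes against the original size.
# Arguments: original bytes, bzip2 bytes, lbzip2 bytes (from ls -s above).
size_report() {
    awk -v o="$1" -v a="$2" -v b="$3" 'BEGIN {
        printf "ratio %.2f vs %.2f, %d bytes larger (%.2f%%)\n",
               o / a, o / b, b - a, (b - a) / a * 100
    }'
}

size_report 481239040 15572992 15638528
# prints: ratio 30.90 vs 30.77, 65536 bytes larger (0.42%)
```

So the parallel run costs well under 1% in output size on this file.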

And with two worker threads (-n 2):

time lbzip2 -n 2 -z -9 access_log
lbzip2: fchown("access_log.bz2"): Operation not permitted
lbzip2: unlink("access_log"): Operation not permitted

real    1m33.072s
user    3m5.242s
sys     0m1.323s
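
For a rough sense of how well this scales, the wall-clock times above
can be turned into a speedup relative to the single-threaded bzip2 run
and a per-thread efficiency (speedup / threads).  This is only a sketch,
assuming POSIX sh and awk, with made-up helper names; bzip2 and lbzip2
are different implementations, so it is not a pure parallel-speedup
measurement:

```shell
# Convert a time-style value such as "1m33.072s" to seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Arguments: baseline real time, parallel real time, thread count.
speedup() {
    base=$(to_seconds "$1")
    par=$(to_seconds "$2")
    awk -v b="$base" -v p="$par" -v n="$3" \
        'BEGIN { printf "%.2fx speedup, %.0f%% efficiency\n", b / p, b / p / n * 100 }'
}

speedup 3m1.397s 0m49.680s 4    # prints: 3.65x speedup, 91% efficiency
speedup 3m1.397s 1m33.072s 2    # prints: 1.95x speedup, 97% efficiency
```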

I'd be really curious how much of your problem was IO.

I tried it again on a real drive, but again I have lots of RAM so
cache is going to have a big impact here (for reads, and to a big
extent writes as well since we're not far over the 30 second commit
interval):

time lbzip2 -n 4 -z -9 -S access_log
lbzip2: "access_log": condvar counters:
lbzip2: any worker tried to consume from splitter:                    548
lbzip2: any worker stalled                       :                     10
lbzip2: muxer tried to consume from workers      :                    898
lbzip2: muxer stalled                            :                    448
lbzip2: splitter tried to consume from muxer     :                    970
lbzip2: splitter stalled                         :                    436

real    0m47.912s
user    3m10.099s
sys     0m0.560s

time lbzip2 -n 2 -z -9 -S access_log
lbzip2: "access_log": condvar counters:
lbzip2: any worker tried to consume from splitter:                    539
lbzip2: any worker stalled                       :                      3
lbzip2: muxer tried to consume from workers      :                   1007
lbzip2: muxer stalled                            :                    503
lbzip2: splitter tried to consume from muxer     :                   1028
lbzip2: splitter stalled                         :                    494

real    1m32.018s
user    3m2.492s
sys     0m1.845s


Hmm, results look almost the same (strangely enough, a hair faster!).
I suspect that you were either IO bound, or that maybe you have a CPU
with a pretty small cache.

Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug