Julien Vehent on 4 Apr 2011 13:21:41 -0700
Re: [PLUG] gnu parallel and tar
On Mon, 4 Apr 2011 15:45:55 -0400, Austin Murphy wrote:
> On Mon, Apr 4, 2011 at 2:37 PM, Julien Vehent <julien@linuxwall.info> wrote:
>> On Mon, 4 Apr 2011 13:21:59 -0400, Austin Murphy wrote:
>>> I've had a good experience with lbzip2, a multi-threaded
>>> implementation of bzip.....
>>
>> Initial file:
>>
>> $ ls -s jmeter-server-node1.log --block-size=1
>> 689274880 jmeter-server-node1.log
>>
>> === with bzip2 ====
>>
>> $ time bzip2 -z -9 jmeter-server-node1.log
>> real    8m33.220s
>> user    8m31.444s
>> sys     0m0.880s
>>
>> $ ls -s jmeter-server-node1.log.bz2 --block-size=1
>> 1589248 jmeter-server-node1.log.bz2
>> ....
>>
>> === with lbzip2 ====
>>
>> $ time lbzip2 -n 4 -z -9 -S jmeter-server-node1.log
>> real    5m37.425s
>> user    20m57.227s
>> sys     0m5.016s
>>
>> $ ls -s jmeter-server-node1.log.bz2 --block-size=1
>> 1601536 jmeter-server-node1.log.bz2
>> ....
>>
>> Compression is of the same level, but I'm surprised to see that while
>> lbzip2 is 65% faster, it also uses 250% more user time than bzip2. The
>> efficiency per-core is a lot lower, but I'm happy to be using all my
>> cores.
>
> My understanding is that bzip2 is highly optimized to avoid cache
> misses. If you have too many threads running at once you might be
> blowing out a shared cache. You might try running with -n 2 or letting
> it decide how many threads to run.

Well, that's interesting:

$ time lbzip2 -n 2 -z -9 jmeter-server-node1.log
real    5m55.924s
user    11m47.732s
sys     0m2.480s

I'm only using 2 threads and I get almost the same performance as with
4 threads.
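
To put numbers on the comparison, the `time` outputs quoted in this thread can be turned into a wall-clock speedup, a per-thread efficiency, and a CPU-time ratio against plain bzip2. The following is only a rough sketch: the helper names (`to_seconds`, `compare`) and the `RUNS` table are made up for illustration, and the timings are copied by hand from the runs above.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '8m33.220s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# label -> (threads, real, user), copied from the runs quoted in this thread
RUNS = {
    "bzip2":       (1, "8m33.220s", "8m31.444s"),
    "lbzip2 -n 4": (4, "5m37.425s", "20m57.227s"),
    "lbzip2 -n 2": (2, "5m55.924s", "11m47.732s"),
}

def compare(baseline: str, candidate: str) -> tuple[float, float, float]:
    """Return (speedup, per-thread efficiency, user CPU ratio) vs. baseline."""
    _, base_real, base_user = RUNS[baseline]
    threads, real, user = RUNS[candidate]
    speedup = to_seconds(base_real) / to_seconds(real)    # wall-clock gain
    efficiency = speedup / threads                        # 1.0 = perfect scaling
    cpu_ratio = to_seconds(user) / to_seconds(base_user)  # total CPU work
    return speedup, efficiency, cpu_ratio

if __name__ == "__main__":
    for label in ("lbzip2 -n 4", "lbzip2 -n 2"):
        speedup, efficiency, cpu_ratio = compare("bzip2", label)
        print(f"{label}: speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.2f}, user CPU {cpu_ratio:.2f}x")
```

With these inputs the 4-thread run comes out at about 1.5 times the wall-clock speed of bzip2 for about 2.5 times the user CPU time, and the 2-thread run gets about 1.4 times the speed for about 1.4 times the CPU time, which is why the two lbzip2 runs finish so close together.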

Parallelism is hard :)

Julien
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug