Paul L. Snyder on 10 May 2005 15:03:09 -0000


Re: [PLUG] Tutorials


Quoting Jon Nelson <quincy@linuxnotes.net>:

> Mark M. Hoffman said:
> > * Jon Nelson <quincy@linuxnotes.net> [2005-05-06 15:14:57 -0400]:
> >>
> >>     $ find ../dir1/ | xargs tar cvf test.tar
> >
> > Ugh, no.  The xargs man page says:
> >
> > 	xargs reads arguments from the standard input, delimited
> > 	by  blanks  (which can  be protected with double or single
> > 	quotes or a backslash) or newlines, and executes the command
> > 	(default is /bin/echo) one or more times with any initial-
> > 	                       ^^^^^^^^^^^^^^^^^
> > 	arguments followed by arguments read from standard input.
> >
> > If you use xargs with tar that way (on a big enough directory tree)
> > you will end up missing files.
[...]
> If I understand your post correctly you feel that on a larger tree you
> might encounter files with spaces in them.  [...]

The bit that Mark underlined was the "one or more times".

> That's why I mentioned '-print0' in first post.  I believe this would
> take care of the above:
> 
>     $ find ../dir1/ -print0 | xargs --null tar cvf test.tar
[...]

This is a good thing to do.
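To see why, here is a small sketch of the whitespace problem, assuming a find and xargs that understand '-print0' and '-0' (the GNU and BSD versions do; '-0' is the short form of '--null'). The scratch directory and the file name "my file.txt" are made up for illustration. printf is used as the command so that each argument xargs passes lands on its own line and can be counted.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"
touch "my file.txt"

# Newline-delimited input: xargs also splits on the blank, so the command
# receives two arguments, './my' and 'file.txt'.
plain=$(find . -name "my*" | xargs printf '%s\n' | wc -l | tr -d ' ')

# NUL-delimited input: the whole name arrives as a single argument.
nul=$(find . -name "my*" -print0 | xargs -0 printf '%s\n' | wc -l | tr -d ' ')

echo "without -print0: $plain argument(s); with -print0: $nul argument(s)"
```

The first count should come out as 2 and the second as 1, so without '-print0' tar would be asked for two files that don't exist.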

> I don't think the '-r' option for 'tar' is necessary because the 'tar'
> command is executed once, not for every argument.  Really, I guess '-r'
> or '-c' would work.

Here's the nub: if the arguments xargs collects exceed the size allowed
for a single command line, xargs will execute tar more than once.  Each
time tar is executed with the 'c' action, the archive is created from
scratch, clobbering any files that an earlier run put there.

By default, according to 'man xargs', the maximum number of chars per
line is 131072, not including environment variables.  If the OS has
limits that are less than this, xargs will warn and reduce the size.
(Theoretically; I haven't tested this.)  Now, you won't usually hit
these limits when tarring up your home directory on a Linux box, but it's
good to keep them in mind in the interests of portability, reliability,
and pedantry.  With deeply nested directories, line lengths can grow
very quickly.
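
If you want to check where you stand on a given system, something like the following might do as a rough sketch. It assumes a find and xargs that support '-print0' and '-0', and the directory '.' is just a placeholder for the tree you mean to archive. 'getconf ARG_MAX' reports the OS limit on arguments plus environment, and putting 'echo' in front of the tar command makes xargs print the command lines it would build instead of running them.

```shell
#!/bin/sh
# Ask the system for its limit (in bytes) on the combined size of a new
# process's arguments and environment.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Placeholder: point this at the tree you intend to archive.
dir=.

# Dry run: 'echo' stands in front of tar, so xargs prints one line per
# command it would execute. (File names containing newlines would throw
# the count off; this is only a quick check.)
runs=$(find "$dir" -print0 | xargs -0 echo tar rf test.tar | wc -l | tr -d ' ')
echo "xargs would run tar $runs time(s)"
```

Anything above one run means a 'c' action would have clobbered the archive along the way.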

You can simulate the effects of long lines using the '-s' option, which
limits the maximum number of characters that xargs will consider
acceptable. (Here we also specify '-t', which prints the generated
command lines before execution.)

  % ls
  foo.beer  foo.dot  foo.gif  foo.xml
  
  % find . -name "foo*" | xargs -t -s38 tar cvf test.tar
  tar cvf test.tar ./foo.beer ./foo.dot
  ./foo.beer
  ./foo.dot
  tar cvf test.tar ./foo.gif ./foo.xml
  ./foo.gif
  ./foo.xml

We can see that tar is executed twice, because adding a third
argument would exceed the line length.  Now, checking the archive
contents, we see that only the files from the second tar are present.

  % tar tf test.tar
  ./foo.gif
  ./foo.xml

Specifying 'r' as the action for tar rather than 'c' causes each 
subsequent tar to append, rather than clobber.  You'll usually want
to zap the target tarfile first, because 'r' adds to whatever the
archive already holds instead of replacing it.  (There is an analogy
there to the '>' and '>>' file redirection shell syntax.)
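
Putting that together, a minimal sketch of the append approach, rerunning the example above in a scratch directory. It assumes a tar whose 'r' action creates the archive when it is missing (GNU tar does), and an xargs that accepts '-0' as the short form of '--null'. The file names are the same made-up ones as before.

```shell
#!/bin/sh
set -e
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"
touch foo.beer foo.dot foo.gif foo.xml

# Zap any stale archive first: 'r' appends to whatever is already there.
rm -f test.tar

# -s38 is small enough that the four names cannot fit on one command
# line, so tar runs more than once. With 'r', each run appends.
find . -name "foo*" -print0 | xargs -0 -s38 tar rf test.tar

# List the archive; all four files should be present.
listing=$(tar tf test.tar | sort)
echo "$listing"
```

Swapping 'rf' back to 'cf' in the xargs line should leave only the files from the last tar run in the listing.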

Another option is to use 'tar u', which creates the archive if it does
not exist and adds or updates files depending on modification times and
presence in the archive.

HTH,
pls
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug