Fred Stluka via plug on 1 Jun 2021 18:28:28 -0700



Re: [PLUG] Using diff


Michael, and other PLUG folks,

I've been meaning forever to write a script:
- subtract = subtract one file from another and show the lines
   that occur only in the 1st file.

Now that you brought it up, and since Rich tipped me off to
comm, I decided to finally do it.  While I was at it, I also wrote:
- intersect = show the lines that occur in both files
- union = show the lines that occur in one or both files

And I added these options to all of them:
-i = ignore case
-a = show duplicates (like UNION ALL in a SQL query)

The subtract and intersect scripts are basically just one-line
calls to:
% comm -23
% comm -12
like what Rich showed us.
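
As a rough illustration of those two calls, here is a minimal sketch.
It is not the actual subtract/intersect scripts: the function names and
the temp-file handling are assumptions made for this example, and each
input is sorted first because comm expects sorted files.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the real subtract/intersect scripts.

# subtract FILE1 FILE2: print lines that occur only in FILE1.
subtract() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    # -2 hides lines unique to the 2nd file, -3 hides lines common to both.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# intersect FILE1 FILE2: print lines that occur in both files.
intersect() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    # -1 and -2 hide the lines unique to either file.
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

With the functions defined, "subtract second first" would list what is
on the second list but not on the first.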

The union script is basically:
% sort --merge

I also made them all sort the files first, in case they aren't already
sorted.
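
A minimal sketch of the union idea, including the -a option described
above.  This is not the actual union script: the function name is made
up for the example, and it uses plain sort (which sorts and merges in
one step) rather than sort --merge, so the inputs need not be sorted
already.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not the real union script.

# union FILE...      print each distinct line once
# union -a FILE...   keep duplicates (like UNION ALL in SQL)
union() {
    if [ "$1" = "-a" ]; then
        shift
        sort "$@"       # sort and merge, duplicates kept
    else
        sort -u "$@"    # sort and merge, duplicates dropped
    fi
}
```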

And I threw in a bunch of error checking, -h/--help for usage info,
comments, etc.  You can see them at:
- http://bristle.com/Tips/Mac/Unix/subtract
- http://bristle.com/Tips/Mac/Unix/intersect
- http://bristle.com/Tips/Mac/Unix/union

I didn't have to write the other one I was considering:
- join = join the lines of 2 files on a common field

It already exists.  See:
% man join
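
For anyone who hasn't used it, a small sketch of what join does.  The
sample data and file names are made up for illustration.  By default
join matches on the first whitespace-separated field, and both files
must already be sorted on that field.

```shell
#!/bin/sh
# Sketch only: made-up sample data to show the default behavior of join.
people=$(mktemp)
langs=$(mktemp)
printf '1 alice\n2 bob\n3 carol\n' > "$people"
printf '1 perl\n3 python\n' > "$langs"

# Only keys present in both files are printed (here, 1 and 3).
joined=$(join "$people" "$langs")
printf '%s\n' "$joined"
# Output:
# 1 alice perl
# 3 carol python

rm -f "$people" "$langs"
```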

I wrote them for Mac, but they're mostly just plain Unix/Linux.
Should work anywhere.  The one exception is the -i option I added
to subtract and intersect.  The -i option of the union script should
work fine.  But for subtract and intersect, I map it to the -i option of
comm on Mac.  That option may not exist on your flavor of Linux.

If it doesn't, you may have to change the value of the LC_COLLATE env var
before calling comm.  Or force the input files to all upper or lower
case first, or something.  If you go that route, you can use my
scripts as filters:
- http://bristle.com/Tips/Mac/Unix/lowercase
- http://bristle.com/Tips/Mac/Unix/uppercase
Or just copy their contents, which are basically one-liners that
call tr:
% tr "[:upper:]" "[:lower:]"
% tr "[:lower:]" "[:upper:]"
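
A minimal sketch of that workaround, for a comm that has no -i option.
This is not part of the actual scripts, and the function name is made
up for the example.  Note that the output comes back in lower case,
since the original case is discarded before the comparison.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not part of the real scripts.

# subtract_nocase FILE1 FILE2: lines only in FILE1, ignoring case.
subtract_nocase() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Force both inputs to lower case, then sort them for comm.
    tr '[:upper:]' '[:lower:]' < "$1" | sort > "$a"
    tr '[:upper:]' '[:lower:]' < "$2" | sort > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```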

I may add that, but not right now.  I've been sitting in my garden
working on my laptop all day:
- http://bristle.com/GardenOffice.jpg
I decided to catch up on a major backlog of PLUG posts, spotted
your question, and got inspired to whip these up quickly,
but my battery was already pretty low and I'm now at 4%.  Gotta
run...

BTW, feel free also to check out the rest of my Linux/Unix and
Mac scripts at:
- http://bristle.com/Tips/Unix
- http://bristle.com/Tips/Mac/Unix

Enjoy!  (Oops!  3% and counting...)

--Fred
------------------------------------------------------------------------
Fred Stluka -- http://bristle.com -- Glad to be of service!
Open Source: Without walls and fences, we need no Windows or Gates.
------------------------------------------------------------------------

On 5/27/21 8:29 PM, Walt Mankowski via plug wrote:
I usually end up doing this in Perl or Python. The diff solution is
nice. If you're interested, I've attached a little Perl script I just
whipped up. It has the advantage of not requiring the lists to be
sorted.
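
The attached Perl and Python scripts are not reproduced in this
archive.  As a rough stand-in, here is a shell sketch of one way to get
the same result without sorting, using grep with fixed-string,
whole-line patterns.  It is an assumption about the general approach,
not a translation of either script, and the function name is made up.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not the attached Perl/Python scripts.

# only_in_second FIRST SECOND: print lines of SECOND that do not appear
# in FIRST.  Neither file needs to be sorted; the order of SECOND is kept.
only_in_second() {
    # -F fixed strings, -x match whole lines, -v invert, -f read patterns
    grep -Fxv -f "$1" "$2"
}
```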

Then I wrote a Python version, too. :)

Walt

On Thu, May 27, 2021 at 07:36:05PM -0400, Michael Lazin via plug wrote:
Awesome, thanks!   Comm got the job done in a pinch but I am going to try
diff tomorrow morning.  Thanks again for your help.

Sincerely,

Michael Lazin

On Thu, May 27, 2021, 6:01 PM Carlos M. Fernández <aremmes@gmail.com> wrote:

This has worked for me before:

diff -Nau first second | grep '^+'

This assumes that first and second are sorted.
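
One caveat with the one-liner above: grep '^+' also matches the
"+++ second" header line that unified diff prints.  A minimal sketch of
a variant that drops the two header lines and the leading "+" (the
function name is made up for the example):

```shell
#!/bin/sh
# Sketch only: hypothetical helper built on the diff one-liner.

# added_lines FIRST SECOND: print lines that diff reports as added in SECOND.
added_lines() {
    # Delete the two-line "---"/"+++" header, then print only the lines
    # marked "+", with the marker stripped.
    diff -u "$1" "$2" | sed '1,2d' | sed -n 's/^+//p'
}
```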

On Thu, May 27, 2021, 17:37 Michael Lazin via plug <
plug@lists.phillylinux.org> wrote:

I used the sort command to make two lists.  One list has everything on
the first list and more.  They are already alphabetically sorted.  I want
to know what is on the second list that is not on the first list.  I am
trying to use diff but perhaps there is a better way.   I tried awk and an
ugly egrep and now I am giving up and asking for help.  Thanks for your
time.

Sincerely,

Michael Lazin

___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
