Rich Freeman on 22 Dec 2018 07:59:33 -0800



Re: [PLUG] Git: net time gain or loss?


On Sat, Dec 22, 2018 at 10:31 AM Tim Allen <flipper@peregrinesalon.com> wrote:
>
> The fact git commits are just deltas is a huge win over previous
> version control systems which would make entire copies of the code
> base when branching;

Actually, it is the opposite.  With git, EVERY commit records a
complete snapshot of the code base, whether branching or otherwise.
Ironically, cvs/subversion actually did store deltas with regular
commits (not sure offhand about branches).

However, git also uses content-hash deduplication at both the
directory and file level, and packing does further compression on
top of this.  So, these copies don't cost much.
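
To make the deduplication point concrete, here is a minimal Python
sketch.  The blob ID calculation (SHA-1 over "blob <size>\0" plus the
content) is how git names file contents in a SHA-1 repository; the
ObjectStore class is a hypothetical toy for illustration, not git's
on-disk format.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        # Identical content maps to the same key, so it is stored only once.
        self.objects.setdefault(oid, content)
        return oid


store = ObjectStore()
first = store.put(b"hello\n")
second = store.put(b"hello\n")        # same file seen again in a later commit
third = store.put(b"hello world\n")   # different content, new object

print(first == second)      # the two snapshots share one object
print(len(store.objects))   # number of distinct objects actually stored
```

An unchanged file therefore costs nothing extra in a new commit: the
new tree simply points at the object that already exists.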

Git commits only look like deltas because git efficiently diffs each
commit against the previous commit and presents the result as if it
were storing a delta.  Since everything is content-hashed at the
individual directory level, this comparison does not need to descend
into identical subdirectories - if two subdirectories have the same
hash then all their contents down to the leaves are identical.
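
A rough Python sketch of that comparison, under simplifying
assumptions: directories are modeled as nested dicts, files as bytes,
and tree_hash is a made-up stand-in for git's real tree encoding.  The
toy recomputes hashes on the fly, whereas git already has them stored
in the tree objects.  All file names here are hypothetical.

```python
import hashlib


def tree_hash(node):
    """Hash a node: bytes are files, dicts are directories (toy encoding)."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    h = hashlib.sha1(b"tree\0")
    for name in sorted(node):
        h.update(name.encode() + b"\0" + tree_hash(node[name]).encode())
    return h.hexdigest()


def changed_paths(old, new, path="", visited=None):
    """Return paths that differ, without descending into equal subtrees."""
    if visited is not None:
        visited.append(path or ".")
    # Equal hashes mean everything below this point is identical: stop here.
    if old is not None and new is not None and tree_hash(old) == tree_hash(new):
        return []
    if isinstance(old, dict) and isinstance(new, dict):
        changed = []
        for name in sorted(set(old) | set(new)):
            child = f"{path}/{name}" if path else name
            changed += changed_paths(old.get(name), new.get(name), child, visited)
        return changed
    # A changed, added, or removed entry.
    return [path]


old = {
    "app-admin": {"pkg-a": b"one", "pkg-b": b"two"},
    "dev-ruby": {"spy": {"notes": b"unchanged"}},
}
new = {
    "app-admin": {"pkg-a": b"one", "pkg-b": b"two"},
    "dev-ruby": {"spy": {"notes": b"unchanged", "spy-1.0.0": b"new file"}},
}

visited = []
changes = changed_paths(old, new, visited=visited)
print(changes)   # ['dev-ruby/spy/spy-1.0.0']
print(visited)   # app-admin is visited once, but nothing beneath it
```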

Just a quick example:
$ git show 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b --pretty=raw

commit 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b
tree 318de13e110f12598aba543dc423375861ee0987
parent 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf
author Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
committer Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
gpgsig -----BEGIN PGP SIGNATURE-----

 iQEzBAABCAAdFiEEIggVRmJzp0YePtgn2zR/k4ZU+jQFAlwd0PcACgkQ2zR/k4ZU
 +jSRYggAu9aIYjXj5clC8w2HghjhuYRFEC0PuWKUfEkHw5zcaNIoL2z2hpHSuYEf
 eEJDiCdofiCWIfvqZMM+90J0ibIBqoPWO18vzniazpOk4/Wilfa93GhWvt61NmVu
 GHp+E0PRo0yDy1rEjKXbAopdU2TyAeZoFMfeuOV/00aP3n80gGqx4D8wqRSsbdoH
 ZLx8u2LryzWVb3PkAgj5Y7l4MxGcV9j98UOq3+519D3mzOV6vdhy47vkfKT2WlBn
 x8CXZg41/WSvdEBJgwCIJxdoGg1V05/65TA3PDAz7lyYTvsF/tbobpjoVXULuubr
 XsAokH1aIlbd+y6WtTkauNp9s6K20w==
 =i/4Z
 -----END PGP SIGNATURE-----

    dev-ruby/spy: add 1.0.0

    Signed-off-by: Hans de Graaff <graaff@gentoo.org>
    Package-Manager: Portage-2.3.51, Repoman-2.3.11

The lines above are the entirety of a git commit record.  Of note are
the parent and tree hashes.  The parent references the previous
commit.  The tree references the content of the commit.
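
If you want to pull those fields out programmatically, something like
the following Python sketch would do.  parse_commit_header is a
hypothetical helper, not part of git; it assumes "key value" header
lines ending at the first blank line, with lines starting with a space
continuing the previous value.  The sample text is the record above
with the signature left out for brevity.

```python
def parse_commit_header(raw: str) -> dict:
    """Parse the header of `git show --pretty=raw` style commit text.

    Values are lists because a key can repeat (a merge has several parents).
    """
    fields = {}
    last_key = None
    for line in raw.splitlines():
        if line == "":
            break  # a blank line separates the header from the message
        if line.startswith(" ") and last_key is not None:
            # Continuation of a multi-line value such as gpgsig.
            fields[last_key][-1] += "\n" + line[1:]
            continue
        key, _, value = line.partition(" ")
        fields.setdefault(key, []).append(value)
        last_key = key
    return fields


raw = """commit 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b
tree 318de13e110f12598aba543dc423375861ee0987
parent 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf
author Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
committer Hans de Graaff <graaff@gentoo.org> 1545457911 +0100

    dev-ruby/spy: add 1.0.0
"""

header = parse_commit_header(raw)
print(header["tree"])
print(header["parent"])
```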

Let's look at the tree itself:
$ git ls-tree 318de13e110f12598aba543dc423375861ee0987

100644 blob fbf45aff6770abddd596c684404ac94cd9c67487    .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b    app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338    app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6    app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea    app-arch
...
040000 tree fe55dee9af9cb6aa3446329e0dc20e7a467f3edf    dev-ruby
...

The first line is an actual file, and the rest are all references to
other trees, which are subdirectories.  Let's compare this to the
parent commit of the one I showed above:
$ git show 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf --pretty=raw | grep tree
tree 53840547387cf92dfc754838d288a779965fa922

$ git ls-tree 53840547387cf92dfc754838d288a779965fa922
100644 blob fbf45aff6770abddd596c684404ac94cd9c67487    .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b    app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338    app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6    app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea    app-arch
...
040000 tree 6aa08fb81dc6ed1d96d1b6ad94d4731ad01b781d    dev-ruby
...

As you can see, all the hashes are identical with the exception of
the one for dev-ruby, which means that this directory contains all
the changes in that commit.  All the other objects in the two commits
are shared.
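
The same comparison can be done mechanically.  This Python sketch
parses the two ls-tree listings shown above (including the elided
"..." lines, which it skips) and reports the entries whose hashes
differ.  parse_ls_tree is a hypothetical helper, and splitting on
whitespace assumes entry names that do not begin with spaces.

```python
def parse_ls_tree(text: str) -> dict:
    """Map entry name -> (mode, type, hash) from `git ls-tree` output."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # skip blank lines and the elided "..." markers
        mode, kind, oid, name = parts
        entries[name] = (mode, kind, oid)
    return entries


commit_tree = """\
100644 blob fbf45aff6770abddd596c684404ac94cd9c67487    .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b    app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338    app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6    app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea    app-arch
...
040000 tree fe55dee9af9cb6aa3446329e0dc20e7a467f3edf    dev-ruby
...
"""

parent_tree = """\
100644 blob fbf45aff6770abddd596c684404ac94cd9c67487    .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b    app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338    app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6    app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea    app-arch
...
040000 tree 6aa08fb81dc6ed1d96d1b6ad94d4731ad01b781d    dev-ruby
...
"""

new_entries = parse_ls_tree(commit_tree)
old_entries = parse_ls_tree(parent_tree)

# An entry differs if it is missing on one side or its hash changed.
differing = sorted(
    name
    for name in set(new_entries) | set(old_entries)
    if new_entries.get(name) != old_entries.get(name)
)
print(differing)   # ['dev-ruby']
```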

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug