Rich Freeman on 22 Dec 2018 07:59:33 -0800
Re: [PLUG] Git: net time gain or loss?
On Sat, Dec 22, 2018 at 10:31 AM Tim Allen <flipper@peregrinesalon.com> wrote:
>
> The fact git commits are just deltas is a huge win over previous
> version control systems which would make entire copies of the code
> base when branching;

Actually, it is the opposite. With git, EVERY commit makes a complete copy of the code base, whether branching or otherwise. Ironically, cvs/subversion actually did store deltas with regular commits (not sure offhand about branches).

However, git also makes use of content-hashing deduplication at both the directory and file level, and packing does further compression on top of this. So, these copies don't cost much.

Git commits only look like deltas because git efficiently diffs each commit against the previous commit and presents the results as if it were storing a delta. Since everything is content-hashed at the individual directory level, this comparison does not need to descend into identical subdirectories: if two subdirectories have the same hash, then all their contents down to the leaves are identical.
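As a rough illustration of why that comparison is cheap, here is a toy Python sketch of hashing a directory tree and diffing two snapshots. It is not git's actual object format: directories are modeled as nested dicts, files as bytes, and the names tree_hash, changed_paths, and the example file names are all made up for illustration. Real git stores each subtree's hash in its parent tree, so it does not need to recompute hashes the way this sketch does.

```python
import hashlib


def tree_hash(node):
    """Content hash of a file (bytes) or a directory (dict of name -> node).

    Toy scheme, NOT git's real encoding: a directory's hash is derived
    from its sorted entry names and the hashes of its children.
    """
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    h = hashlib.sha1(b"tree\0")
    for name in sorted(node):
        h.update(name.encode() + b"\0" + tree_hash(node[name]).encode() + b"\n")
    return h.hexdigest()


def changed_paths(old, new, prefix="", visited=None):
    """Return paths that differ, never descending into equal subtrees.

    `visited` (optional list) records every path that was compared.
    """
    if visited is not None:
        visited.append(prefix or "/")
    if tree_hash(old) == tree_hash(new):
        return []  # same hash: everything below is identical, stop here
    if isinstance(old, bytes) or isinstance(new, bytes):
        return [prefix]  # a file changed (or a file/directory swap)
    paths = []
    for name in sorted(set(old) | set(new)):
        child = f"{prefix}/{name}" if prefix else name
        if name not in old or name not in new:
            paths.append(child)  # added or removed entry
        else:
            paths.extend(changed_paths(old[name], new[name], child, visited))
    return paths


# Hypothetical snapshots: only dev-ruby/spy gains a file in the new one.
old = {
    ".gitignore": b"*.tmp\n",
    "app-admin": {"foo": {"foo-1.ebuild": b"EAPI=7\n"}},
    "dev-ruby": {"spy": {"spy-0.4.ebuild": b"EAPI=6\n"}},
}
new = {
    ".gitignore": b"*.tmp\n",
    "app-admin": {"foo": {"foo-1.ebuild": b"EAPI=7\n"}},
    "dev-ruby": {"spy": {"spy-0.4.ebuild": b"EAPI=6\n",
                         "spy-1.0.0.ebuild": b"EAPI=6\n"}},
}

visited = []
changed = changed_paths(old, new, visited=visited)
print(changed)
print(visited)
```

The visited list shows that app-admin is compared once at the top level and never entered, which is the point being made about identical subdirectories.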
Just a quick example:

$ git show 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b --pretty=raw
commit 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b
tree 318de13e110f12598aba543dc423375861ee0987
parent 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf
author Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
committer Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEIggVRmJzp0YePtgn2zR/k4ZU+jQFAlwd0PcACgkQ2zR/k4ZU
 +jSRYggAu9aIYjXj5clC8w2HghjhuYRFEC0PuWKUfEkHw5zcaNIoL2z2hpHSuYEf
 eEJDiCdofiCWIfvqZMM+90J0ibIBqoPWO18vzniazpOk4/Wilfa93GhWvt61NmVu
 GHp+E0PRo0yDy1rEjKXbAopdU2TyAeZoFMfeuOV/00aP3n80gGqx4D8wqRSsbdoH
 ZLx8u2LryzWVb3PkAgj5Y7l4MxGcV9j98UOq3+519D3mzOV6vdhy47vkfKT2WlBn
 x8CXZg41/WSvdEBJgwCIJxdoGg1V05/65TA3PDAz7lyYTvsF/tbobpjoVXULuubr
 XsAokH1aIlbd+y6WtTkauNp9s6K20w==
 =i/4Z
 -----END PGP SIGNATURE-----

    dev-ruby/spy: add 1.0.0

    Signed-off-by: Hans de Graaff <graaff@gentoo.org>
    Package-Manager: Portage-2.3.51, Repoman-2.3.11

The lines above are the entirety of a git commit record. Of note are the parent and tree hashes. The parent references the previous commit. The tree references the content of the commit.

Let's look at the tree itself:

$ git ls-tree 318de13e110f12598aba543dc423375861ee0987
100644 blob fbf45aff6770abddd596c684404ac94cd9c67487 .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338 app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6 app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea app-arch
...
040000 tree fe55dee9af9cb6aa3446329e0dc20e7a467f3edf dev-ruby
...

The first line is an actual file, and the rest are all references to other trees, which are subdirectories.
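To make the point about the commit record concrete, here is a small Python sketch that pulls the tree and parent references out of raw commit text like the output shown earlier. The function name parse_commit is made up for illustration, and the sample text drops the gpgsig header and the message trailers for brevity. The parser assumes headers end at the first blank line and that continuation lines (as used by gpgsig) start with a single space.

```python
def parse_commit(raw):
    """Split raw commit text into (headers, message).

    headers maps each key to a list of values, since a key such as
    "parent" can appear more than once (a merge has several parents).
    Assumption: header lines end at the first blank line, and a line
    starting with a space continues the previous header's value.
    """
    head, _, message = raw.partition("\n\n")
    headers = {}
    key = None
    for line in head.split("\n"):
        if line.startswith(" ") and key is not None:
            headers[key][-1] += "\n" + line[1:]  # continuation line
        else:
            key, _, value = line.partition(" ")
            headers.setdefault(key, []).append(value)
    return headers, message


# Sample taken from the `git show --pretty=raw` output quoted in this
# message, with the gpgsig header and message trailers left out.
RAW = """\
commit 0aec0acdd9d816c7158c4fa128e8ee90a8c7bc7b
tree 318de13e110f12598aba543dc423375861ee0987
parent 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf
author Hans de Graaff <graaff@gentoo.org> 1545457911 +0100
committer Hans de Graaff <graaff@gentoo.org> 1545457911 +0100

    dev-ruby/spy: add 1.0.0
"""

headers, message = parse_commit(RAW)
print("tree:   ", headers["tree"][0])
print("parent: ", headers["parent"][0])
print("subject:", message.strip())
```

Nothing in the record describes which files changed; the commit only points at a tree and at its parent, and any diff is worked out afterwards from those two references.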
Let's compare this to the parent commit of the one I showed above:

$ git show 3853f9eb18437f7a2b58ec0d95ffbb2e5a604dcf --pretty=raw | grep tree
tree 53840547387cf92dfc754838d288a779965fa922

$ git ls-tree 53840547387cf92dfc754838d288a779965fa922
100644 blob fbf45aff6770abddd596c684404ac94cd9c67487 .gitignore
040000 tree 1698d0d404f7bb4d43237db78aa4255d9abc460b app-accessibility
040000 tree 21c9a92c4a775bf5f30eab3fc2eb4c6d9b1bb338 app-admin
040000 tree e54f63e6ebb1a0c30b65cfb23785be7df3559cd6 app-antivirus
040000 tree ea6c2b02f1dc1352e7a8466e64b20a4671aa3fea app-arch
...
040000 tree 6aa08fb81dc6ed1d96d1b6ad94d4731ad01b781d dev-ruby
...

As you can see, all the hashes are identical with the exception of dev-ruby, which means that this directory contains all the changes in that commit. All the other objects in the two commits are shared.

--
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug