Re: [PLUG] Git: net time gain or loss?

Rich Freeman on 21 Dec 2018 07:50:00 -0800

From: Rich Freeman <r-plug@thefreemanclan.net>
To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
Subject: Re: [PLUG] Git: net time gain or loss?
Date: Fri, 21 Dec 2018 10:49:44 -0500
Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>
Sender: "plug" <plug-bounces@lists.phillylinux.org>

On Fri, Dec 21, 2018 at 2:19 AM Will <staticphantom@gmail.com> wrote:
>
> One large general git gripe I pose for this mailing list that
> surprisingly was done well with Visual Source Safe that I cannot
> replicate with svn, mercurial, or our beloved git. Working at a
> particular position, we had all of our projects for every customer and
> every release of our base application all in a single repository (feel
> free to cringe here with VSS that was not backed up for years). At
> the start of every project, we would pin the base application version
> while checking in new versions of our customers' augmented custom
> code. I have not been able in any repository to pin parts of one
> branch to a single commit while allowing other parts to continue to
> function as a repo would normally all in the same branch. Effectively
> it was as if a single folder could be set manually to specific tags
> as everything else would move along with the head revision all in
> the same branch. While a neat feature, I can only assume I could not
> replicate the behavior with other tools because the behavior is
> such a bad anti-pattern/worst practice that it could be considered
> just stupid and banned from ever being done in something like Git.
>
> I would be really interested in how this could be done, and curious how
> messy it would be to have a single branch in git reference different
> commits from head in the same branch.

I'm not seeing any easy change to git that would make this possible.

I do know that there is interest in improving sub-repository support
in git, which seems like it might handle something like this.

You could also accomplish this with a combination of convention and
git hooks, such as rejecting any commit that changes a "read-only"
part of the tree.  Tools like git-flow could also be used to
facilitate this kind of workflow.
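
To make the hook idea a bit more concrete, here is a minimal sketch of a
pre-commit hook in Python.  The "base-app/" prefix and the function names
are made up for illustration; the only git command it relies on is
"git diff --cached --name-only", which lists the staged paths.

```python
#!/usr/bin/env python3
"""Sketch: refuse commits that touch a "read-only" part of the tree."""
import subprocess
import sys

# Hypothetical pinned directory; adjust to your own layout.
PROTECTED_PREFIXES = ("base-app/",)


def protected_changes(paths, prefixes=PROTECTED_PREFIXES):
    """Return the changed paths that fall under a protected prefix."""
    return [p for p in paths if p.startswith(tuple(prefixes))]


def staged_paths():
    """List the paths staged for the commit being created."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def main():
    """Return 1 (abort the commit) if any staged path is protected."""
    bad = protected_changes(staged_paths())
    if bad:
        print("Commit touches read-only paths:", file=sys.stderr)
        for path in bad:
            print("  " + path, file=sys.stderr)
        return 1
    return 0

# To use this as .git/hooks/pre-commit, end the file with:
#     raise SystemExit(main())
```

Keep in mind that a client-side hook like this is only a convention: it
can be skipped with "git commit --no-verify", so real enforcement would
need the same check in a server-side hook.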

Google has their repo tool that probably can accomplish something like
this workflow.  I'm not a huge fan of repo personally because of how
it ends up working for end-users trying to reproduce issues.  However,
it is basically a tool for stitching together many repositories into a
single project, which allows each repository to independently track
history.

Repo is driven by a manifest file that describes all the various
repositories that are incorporated into a project.  Those repositories
can all track branches/tags/etc.  So, if you're working on module A
you can have your repo track a development branch for module A, and a
stable-v3 branch for all the other modules, or even a specific stable
tag for those.  However, that all involves editing XML files to tell
the repo tool what to do.  Now, those XML files can be stored
centrally in a git repo so that your devs just point the repo tool at
it and it gives them the right stuff, so that your module A team gets
the dev branch for module A, and your support team gets the
stable-fixes-dev branch for all the modules, and so on.
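
As a rough illustration of what such a manifest expresses, here is a
small Python sketch that reads one and reports which revision each
project tracks.  The manifest contents (remote URL, module names, branch
and tag names) are invented for the example; only the remote / default /
project elements and their name, path and revision attributes come from
the manifest format.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest: module-a follows a dev branch, module-c is held
# at a tag, and everything else falls back to the default stable branch.
MANIFEST = """\
<manifest>
  <remote name="origin" fetch="https://git.example.com/" />
  <default remote="origin" revision="stable-v3" />
  <project name="module-a" path="module-a" revision="dev" />
  <project name="module-b" path="module-b" />
  <project name="module-c" path="module-c" revision="refs/tags/v3.1" />
</manifest>
"""


def tracked_revisions(manifest_xml):
    """Map each project's checkout path to the revision it tracks."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    fallback = default.get("revision") if default is not None else None
    return {
        project.get("path", project.get("name")): project.get("revision", fallback)
        for project in root.iter("project")
    }


for path, revision in tracked_revisions(MANIFEST).items():
    print(path, "->", revision)
```

A different team would simply get a different manifest, for example one
where every project falls through to the default branch.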

What I don't like about repo is that it doesn't automatically capture
the state history of the whole collection of repositories.  Let me
explain what I mean a bit.

Suppose I want to reproduce a bug on a monolithic repository.  I can
check it out, bisect it, and so on.  If I report to a developer that
I'm having a problem I can give them a commit hash and that hash alone
allows the developer to completely reproduce all the source code I am
using.
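
The reason a single hash is so useful is that bisecting is just a binary
search over one ordered history.  A minimal sketch of the idea in Python,
assuming a simple linear history (real git history is a graph, and
"git bisect" handles that for you); the function and argument names are
mine:

```python
def first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    commits is ordered oldest to newest, commits[0] is known good and
    commits[-1] is known bad.  is_bad(commit) tests one checkout.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return commits[bad]
```

With one hash per state, each step is one checkout and one test.  With
100 independently moving repositories there is no single ordered list to
search.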

The problem with repo is that in the typical use case the manifest is
pointing at some branch on dozens or hundreds of repositories.  Unless
you are using a tag (which has to exist in every repo) nothing is
pinning down exactly what code is being fetched.  When you run "repo
sync" it goes out and pulls the latest commit on each of those 100
repositories.  Every one of them has a hash, but repo doesn't
consolidate what all those hashes are anywhere, and it has no way to
go fetch that same combination of repositories.   So, if I run into a
problem but don't know which specific repository is at fault I can't
tell a developer what code to go check out.  At best I could try to
dig up a list of 100 hashes and they can try to manually check all
those out, because there is no tool in repo (that I'm aware of) that
can completely reproduce the same state.  The only exception is with
release tags - if whoever owns those repositories goes and adds a v1.0
tag to all 100 repositories and I clone that tag on all of them, now
they're all pinned.  However, unless you're creating a LOT of tags in
your repo you can't really bisect issues that way.
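
One workaround is to record the hashes yourself by writing out a copy of
the manifest with every project pinned to the commit it currently has
checked out.  Below is a minimal Python sketch of that, assuming you have
already collected a mapping of project path to commit hash (for example
by running "git rev-parse HEAD" in each project directory).  The function
name is mine.  As far as I can tell repo can emit a similar file itself
with "repo manifest -r", so check that first.

```python
import xml.etree.ElementTree as ET


def pin_manifest(manifest_xml, heads):
    """Return manifest XML with each project pinned to a commit hash.

    heads maps a project's path (or its name, when no path is given) to
    the commit hash to record.  A project missing from heads raises
    KeyError rather than being silently left on a moving branch.
    """
    root = ET.fromstring(manifest_xml)
    for project in root.iter("project"):
        key = project.get("path", project.get("name"))
        project.set("revision", heads[key])
    return ET.tostring(root, encoding="unicode")
```

Handing someone that one pinned file, or committing it next to each build,
gives them something close to the single hash you get from a monolithic
repository.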

Where I'd see this a lot is with Android.  Let's say I run into a
Bluetooth issue in a 3rd-party Android build.  If I made that build
using repo I can't really tell the developers how to reproduce it -
maybe they'll go and sync the repo and now the issue is gone.  At best
I could download a daily tarball of the whole thing and test on that,
but now you're bypassing the VCS tool entirely which just demonstrates
a weakness in the tool.

Maybe I'm just not completely grokking repo.  If somebody on the list
is a big repo fan and has some advice I'm all ears.  I haven't used it
recently either, so maybe the tool has evolved, but a quick skim of
the docs suggests not.

Now, all that said, I consider the fact that git commits are atomic
across the entire repository to be a feature.  Back when Gentoo was using
CVS it was impossible to really guarantee tree-level QA because
commits basically happened at the file level.  I could change a file
over here, and you could change a file over there, and neither of us
might have tested against the other's changes.  Having commits be
atomic at the repository level creates the resource contention issue I
mentioned in my earlier email, but it also creates a guarantee of
tree-wide consistency.

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug

