Fred Stluka on 21 Dec 2018 13:31:58 -0800
Re: [PLUG] Git: net time gain or loss?
Rich,
> So, if two people attempt to push at similar times, the first to get in will have a successful push, and the next will get an error. That second committer must do a pull and a rebase, and then push again. If the pace of commits is large enough then this can become a significant bottleneck, with lots of committers spending a lot of time rebasing commits only to fail repeatedly to push them.
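A minimal sketch of the push race described above, assuming a shared "origin" remote and a "master" branch:

  # Two developers clone the same repo and each commit locally to master.
  # Developer A pushes first and wins the race:
  git push origin master

  # Developer B's push is then rejected as a non-fast-forward, because B's
  # commits are no longer based on the current remote head:
  git push origin master          # rejected: "fetch first" / non-fast-forward

  # The recovery described above: replay the local commits onto the new head,
  # then try the push again (and possibly lose the race again):
  git pull --rebase origin master
  git push origin master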
Yes, I have seen the situation where people are pushing so often that my push is sometimes based on older commits, and I have to pull again before pushing. You are right -- I can see how this could become a serious problem at scale.

But I'm not sure what rebase has to do with this. When I get this error, I always just pull, which does a merge. If the changes were made in unrelated parts of the files, the merge is automatic. If not, there are "merge conflicts" that I have to resolve manually because, for example, another developer and I made unrelated changes to the same line of code. If the merge is automatic, it's fast, and so my next push works fine. If not, I may take some time to manually resolve the conflicts, and someone may meanwhile do another push, so my next push fails and I have to do another pull first. But I've never needed rebase in this scenario. What am I missing?

For scalability, what about the workflow where folks fork (clone) the repo, make their change, and issue a "pull request"? Then a smaller set of senior people does all the pulls from the forked repos into the main repo, resolving all of the merge conflicts, or rejecting the pull request so that the guy who did the fork has to re-pull, resolve the conflicts, and issue a new pull request. In this scenario, no one ever really pushes to the main repo; changes just get pulled upstream. Would you expect this to also not scale? I see it used on large-scale FOSS projects like Django.

--Fred

------------------------------------------------------------------------
Fred Stluka -- Bristle Software, Inc. -- http://bristle.com
#DontBeATrump -- Make America Honorable Again!
------------------------------------------------------------------------
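A minimal sketch of the fork-and-pull-request workflow described above. The URLs, the "fix-widget" branch, and the "upstream"/"origin" remote names are placeholders, and the pull request itself is opened in the forge's web UI rather than on the command line:

  # One-time setup: clone your fork, and track the shared repo as "upstream".
  git clone git@example.com:me/project.git
  cd project
  git remote add upstream git@example.com:project/project.git

  # Do the work on a topic branch and push it to your own fork only.
  git checkout -b fix-widget
  # ... edit, commit ...
  git push origin fix-widget

  # Open a pull request from me/project:fix-widget against the main repo.
  # If upstream moves on and the request no longer merges cleanly, update the
  # branch locally and push to the fork again; the maintainers never take
  # these pushes directly, they only pull.
  git fetch upstream
  git merge upstream/master
  git push origin fix-widget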
On 12/20/18 6:06 PM, Rich Freeman wrote:

> On Thu, Dec 20, 2018 at 5:22 PM Fred Stluka <fred@bristle.com> wrote:
>> Yeah, Git scales. Linus wrote it to manage the huge number of committers to Linux around the world.
>
> Git sort-of scales. Linus has a fairly unique workflow in the FOSS world. The official Linux repo has but a single committer. It might grow large in size, and have many commits per day, but it never has more than one person committing at the same time.
>
> Git can handle any number of incoming commits at the same time, /as long as those commits target different branches./ If more than one person attempts to commit to a single branch at the same time, then only one can succeed without a merge commit, because once one commit is merged the next is no longer parented against the current head, and a fast-forward commit is not possible. It is rarely desirable to have an automated repository accept non-fast-forward pushes, because nobody will have actually looked at the resulting merge commits prior to them being committed, and if there are conflicts there is no possibility of manual review.
>
> So, if two people attempt to push at similar times, the first to get in will have a successful push, and the next will get an error. That second committer must do a pull and a rebase, and then push again. If the pace of commits is large enough then this can become a significant bottleneck, with lots of committers spending a lot of time rebasing commits only to fail repeatedly to push them.
>
> Now, the fact that git is distributed does allow all those committers to continue to accumulate work in their private repos and ignore the bottleneck, and then push all their commits at once when there is less contention. These pushes are all-or-nothing, so they probably aren't going to be penalized for having 100 commits to push all at once. It does delay the dissemination of work, however. Other VCS implementations are more file-based, and thus there isn't the same kind of repository-level locking difficulty.
>
> All that said, I think it is usually manageable in practice. And there are workarounds. The Linux workflow of course works where people cascade their commits up: their patches basically sit queued up in email inboxes until somebody applies them. Another workaround would be to have a collection of staging branches where individuals can push their changes, and then a scheduled process checks for merge conflicts and, if there are none, merges the branch. That approach would result in many merge commits, which some find distasteful, unless you rebase them, but rebasing precludes GPG-signing the commits.
>
> I would not say that git is perfect. However, in practice many of its issues are the result of it not fitting in with preconceptions around how a VCS ought to work, and potential users might do well to consider if there is an opportunity to improve things by changing their processes.
>
> Now, going back to my earlier post, just as I have little hope of my company ever managing 100-page requirement specifications in anything other than Word, I also have little hope of them ever using git. I'm happy when I see people using Subversion - they're more likely to just stick everything on a shared drive, or maybe make occasional zip snapshots and stick them in SharePoint.
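The staging-branch workaround described above could be scripted roughly as follows. This is a sketch rather than a tested implementation; the "staging/*" branch naming scheme, the "origin" remote, and the idea of running it from cron are all assumptions:

  #!/bin/sh
  # Hypothetical scheduled job: merge each staging/* branch into master
  # only when it merges cleanly; leave conflicting branches for humans.
  git fetch origin
  git checkout master
  git reset --hard origin/master

  for branch in $(git for-each-ref --format='%(refname:short)' 'refs/remotes/origin/staging/*'); do
      if git merge --no-ff --no-edit "$branch"; then
          echo "Merged $branch"
      else
          git merge --abort
          echo "Skipped $branch: conflicts need manual review"
      fi
  done

  git push origin master

The --no-ff keeps an explicit merge commit per staging branch, which is exactly the merge-commit clutter versus GPG-signing trade-off mentioned in the message above.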
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug