Rich Freeman on 16 Nov 2017 19:47:37 -0800



Re: [PLUG] Revision Control for the Rest of Us


On Thu, Nov 16, 2017 at 8:34 PM, JP Vossen <jp@jpsdomain.org> wrote:
>
> 2. No DB = fail, all 3 use some kind of DB inside the ./.vcs/ dir (meaning
> ./.bzr/, ./.git/, or ./.hg/ dir).  The plain-text hackability is the *only*
> real advantage that RCS has, I'd argue.
>

I'm not sure that RCS needs a DB given its design/limitations.

Note, I'm not a fan of RCS (or CVS) in any way, shape, or form.
Having done a lot of the validation work for the Gentoo cvs->git
migration, my eyes bleed from looking at raw RCS files.

One big difference between RCS and git (and probably several of the
other modern VCSs) is that RCS is purely file-based (as is CVS, which
is for the most part a layer on top of RCS).  If you make changes to
15 files and commit the changes, with RCS you get 15 files that each
have a commit recorded in them.  They might have the same timestamp
and author, but otherwise there is nothing that really links them.  On
the other hand, with git if you commit changes to 15 files at the same
time you end up with a single commit record, which references a single
tree that captures the new contents of all 15 files.  In git that
commit is atomic; in RCS it is really 15 separate per-file commits.
When doing a cvs->git migration, finding all the matching commits
across all those files is one of the many challenges.
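
To make the matching problem concrete, here is a minimal Python sketch
of one heuristic for regrouping per-file revisions into changesets:
treat revisions with the same author and log message that land within
a short time window as one commit.  The FileRev record, the
group_into_changesets name, and the 300-second window are all
assumptions for illustration; this is not a description of what the
Gentoo migration actually did.

```python
from collections import namedtuple

# Hypothetical per-file revision record, e.g. parsed from an RCS ,v file.
FileRev = namedtuple("FileRev", "path author message timestamp")


def group_into_changesets(revs, window=300):
    """Group per-file revisions into probable multi-file commits.

    Revisions join an existing group when they share its author and
    log message, arrive within `window` seconds of the group's latest
    revision, and touch a file the group does not already contain.
    """
    changesets = []
    open_sets = {}  # (author, message) -> most recent group for that key
    for rev in sorted(revs, key=lambda r: r.timestamp):
        key = (rev.author, rev.message)
        current = open_sets.get(key)
        start_new = (
            current is None
            or rev.timestamp - current[-1].timestamp > window
            or any(r.path == rev.path for r in current)
        )
        if start_new:
            current = []
            changesets.append(current)
            open_sets[key] = current
        current.append(rev)
    return changesets


if __name__ == "__main__":
    revs = [
        FileRev("a.c", "rich", "fix bug", 1000),
        FileRev("b.c", "rich", "fix bug", 1002),
        FileRev("a.c", "rich", "fix bug", 5000),
    ]
    for cs in group_into_changesets(revs):
        print([r.path for r in cs])
```

Even this toy version has to guess: two unrelated commits with the
same author and message inside the window would be merged, which is
exactly the kind of ambiguity that makes validation tedious.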

Now, one advantage the file-based model brings is more tolerance for
concurrency.  Two people can be working in different parts of a CVS
repository at the same time and their commits will not interfere,
because the history of each file is completely independent.  On the
other hand, with git every commit on a branch builds on the previous
one, so as far as the data model goes all changes block each other
(though the software will try to resolve these automatically if they
don't change the same regions of the same files).  If you have a high
commit rate in git that can potentially become a bit of a problem.
With Linux there is no issue because there is no central repository
that everybody commits to, but with most other large projects it can
become an issue when you go to push and end up having to rebase and
check for conflicts.
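
The difference can be sketched with a toy model in Python.  Both
classes below are hypothetical simplifications, not how either tool is
implemented: PerFileRepo checks only whether the one file being
committed has moved on, while SingleHeadRepo accepts a push only if it
was built on the current branch tip.

```python
class PerFileRepo:
    """CVS/RCS-style toy: every file has its own revision counter."""

    def __init__(self):
        self.revisions = {}  # path -> latest revision number

    def commit(self, path, base_rev):
        # Rejected only if *this* file changed since the dev checked it out.
        if self.revisions.get(path, 0) != base_rev:
            return False
        self.revisions[path] = base_rev + 1
        return True


class SingleHeadRepo:
    """git-style toy: one branch tip covers the whole tree."""

    def __init__(self):
        self.head = 0  # integer stand-in for a commit id

    def push(self, parent):
        # Rejected unless the new commit descends from the current tip.
        if parent != self.head:
            return False
        self.head += 1
        return True


if __name__ == "__main__":
    cvs = PerFileRepo()
    print(cvs.commit("a.c", 0), cvs.commit("b.c", 0))  # different files

    git = SingleHeadRepo()
    print(git.push(0), git.push(0))  # both devs started from tip 0
```

In the toy, the two devs touching different files both succeed against
PerFileRepo, while the second push to SingleHeadRepo is refused until
it is redone on top of the new tip, even though the files differ.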

The flip side of this is that with git the entire repository is always
consistent.  The first dev to push their changes makes the destination
repo look exactly like the one they presumably QA tested.  The second
has to do a rebase and any needed QA before pushing.  Then, when they
push, the entire repository is consistent with what they were looking
at.  With a file-level implementation like CVS or RCS, if you have
changes going on in multiple places then you don't necessarily end up
with checkpoints that match what either dev actually tested.
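
Here is a small Python sketch of that point.  The simulate function
and the dict-of-revisions "tree" are illustrative assumptions: each
dev tests a tree and then commits, and we compare the states the
repository passes through against the trees anybody actually tested.

```python
def simulate(base, edits, rebase_before_push):
    """Return (repo states after each commit, trees the devs tested).

    base:  dict mapping path -> revision at checkout time
    edits: one dict per dev with the files that dev changed
    rebase_before_push: True models the git workflow, where a dev
        rebases onto the current tip and tests that tree; False models
        per-file commits, where each dev only tested base + own change.
    """
    repo = dict(base)
    states, tested = [], []
    for edit in edits:
        start = repo if rebase_before_push else base
        tested.append({**start, **edit})  # what this dev ran QA on
        repo = {**repo, **edit}           # what the repo now contains
        states.append(repo)
    return states, tested


if __name__ == "__main__":
    base = {"a.c": 1, "b.c": 1}
    edits = [{"a.c": 2}, {"b.c": 2}]
    for rebase in (False, True):
        states, tested = simulate(base, edits, rebase)
        untested = [s for s in states if s not in tested]
        print("rebase" if rebase else "per-file", "untested states:", untested)
```

Without the rebase step the repository ends up at a combination of
a.c and b.c that neither dev ever had in front of them; with it, every
state the repository passes through was somebody's tested tree.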

Oh, and let's not talk about branching in CVS or RCS (I assume RCS
handles it the same as CVS but I could be wrong there).  The way CVS
handles branching makes that plain-text hackability look not all that
hackable...  I realize that nobody was really advocating for RCS on
the basis of branching, but I figured I'd mention it.  As was already
pointed out, git was built for branching.

In any case, to get back to the original point, because RCS tracks its
history on a file level there really isn't much of a loss in not
having a database.  I suspect that scanning backwards in the history
of a single RCS file should perform even better than it does in git,
simply because it is much flatter (with git there are many references
which could potentially require a lot more seeking if you're doing a
linear scan, though as I posted about a few years ago you can
parallelize a lot of repo-level operations as they branch out from the
linked list of commits).
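
As a rough illustration of the "flatter" claim, here is a Python toy
that counts object reads.  It assumes a heavily simplified object
store (a dict of commits and trees keyed by made-up ids) and ignores
packfiles, caching, and delta application on both sides, so it says
nothing about real-world timings.

```python
def rcs_history_reads(revisions):
    """One file, one flat list of revisions: one read per revision."""
    return len(revisions)


def git_file_history(store, head, path):
    """Walk commit -> root tree -> subtrees for each commit on the chain.

    Returns the distinct blob ids seen for `path`, newest first, plus
    the number of objects that had to be read along the way.
    """
    parts = path.split("/")
    versions, reads = [], 0
    commit_id = head
    while commit_id is not None:
        commit = store[commit_id]
        node = store[commit["tree"]]
        reads += 2  # the commit and its root tree
        for name in parts[:-1]:
            node = store[node[name]]
            reads += 1  # one more tree per directory level
        blob_id = node[parts[-1]]
        if not versions or versions[-1] != blob_id:
            versions.append(blob_id)  # the file changed in this commit
        commit_id = commit["parent"]
    return versions, reads


if __name__ == "__main__":
    store = {
        "c1": {"tree": "t1", "parent": None},
        "t1": {"src": "s1"},
        "s1": {"main.c": "b1"},
        "c2": {"tree": "t2", "parent": "c1"},
        "t2": {"src": "s1", "README": "r2"},
        "c3": {"tree": "t3", "parent": "c2"},
        "t3": {"src": "s3", "README": "r2"},
        "s3": {"main.c": "b3"},
    }
    print(git_file_history(store, "c3", "src/main.c"))
    print(rcs_history_reads(["1.2", "1.1"]))
```

In this toy the git walk has to touch every commit plus a tree per
directory level, including commits that never changed the file, while
the RCS side only ever looks at that one file's revisions.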

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug