Dustin Getz on 8 Oct 2012 16:32:33 -0700



Intro / Question


> independently validate that a git repository and a cvs repository are “identical.”

is it sufficient to validate the end goal, and not the intermediate history? (which means you can just export the final state, hash the exports and compare.) I ask because...

> Cvs basically stores per-file history anyway. I’d like to use the git and cvs executables to do any reading of the repositories to ensure that there are no errors in “interpretation.”

this is the step which may give you trouble, due to the impedance mismatch you mention: CVS commits are per-file, git commits are per repo. I'm not sure how history import tools handle this, but I think you will need to understand it in order to validate it, because it seems to me that there are several perfectly valid ways to convert cvs history into git history.
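
For what it's worth, the "export the final state, hash the exports and compare" idea could look roughly like the Python sketch below. It assumes both repositories have already been exported to plain directories (for example with `cvs export` and `git archive`). The names `tree_digests` and `compare_exports` are made up for illustration. Anything like keyword expansion or line-ending differences between the two exports would have to be normalized before the hashes mean much.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata in case a checkout was used instead of an export.
        dirnames[:] = [d for d in dirnames if d not in (".git", "CVS")]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def compare_exports(cvs_root, git_root):
    """Return (only_in_cvs, only_in_git, differing) as sorted lists of relative paths."""
    a = tree_digests(cvs_root)
    b = tree_digests(git_root)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, differing
```

If all three lists come back empty, the two exports have the same files with the same contents. This says nothing about whether the intermediate history matches.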

On Sun, Oct 7, 2012 at 7:33 PM, Michael Bevilacqua-Linn <michael.bevilacqualinn@gmail.com> wrote:
Hey Rich,

Sorry for the radio silence, been crazy lately. 

Seems like Amazon's elastic map/reduce would work for this, assuming that you really do need to run it in parallel.  You could spin up instances with git/svn installed and push a copy of the repos up into S3, and then use Hadoop's streaming features to write the actual map and reduce jobs in a scripting language of your choice.

At least that's probably what I'd end up doing...
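
A rough Python sketch of what the map and reduce steps might look like under Hadoop streaming, which passes records as tab-separated lines and sorts the mapper output by key before the reducer sees it. The input record layout (`side<TAB>relative_path<TAB>local_file`) and the function names are assumptions for illustration only, not something Hadoop or EMR prescribes.

```python
import hashlib
from itertools import groupby


def mapper(lines):
    """Emit 'relative_path<TAB>side<TAB>digest' for each input record.

    Assumed (hypothetical) input format: 'side<TAB>relative_path<TAB>local_file',
    where side is 'cvs' or 'git' and local_file is readable on the worker.
    """
    for line in lines:
        side, rel, local = line.rstrip("\n").split("\t")
        with open(local, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        yield f"{rel}\t{side}\t{digest}"


def reducer(lines):
    """Given mapper output sorted by key, emit the paths whose two sides disagree."""
    records = (line.rstrip("\n").split("\t") for line in lines)
    for rel, group in groupby(records, key=lambda r: r[0]):
        by_side = {side: digest for _, side, digest in group}
        # A path missing on one side shows up as None and also counts as a mismatch.
        if by_side.get("cvs") != by_side.get("git"):
            yield f"{rel}\tMISMATCH"
```

In an actual streaming job, each function would sit behind a small wrapper script that reads `sys.stdin` and prints each result line.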

Thanks,
MBL

On Fri, Oct 5, 2012 at 6:28 AM, Rich Freeman <rich@thefreemanclan.net> wrote:
On Fri, Oct 5, 2012 at 6:21 AM, Rich Freeman <rich@thefreemanclan.net> wrote:
> Hi, first post to the group and I've yet to attend a meeting, so I'll start
> with an intro.

And apologies to the group for what appears to have been an htmlized
post with inconsistent formatting.  I don't know if that is taboo on
this list, but I used the groups interface to create it and it clearly
isn't WYSIWYG, or even WYSISYM.

Rich