James E Keenan on 15 Dec 2016 08:33:34 -0800

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

[Philadelphia-pm] Devel-Git-MultiBisect: follow-up to Dec 12 2016 meeting

This is a follow-up to the discussion of CPAN distribution Devel-Git-MultiBisect from the Philadelphia Perlmongers meeting on Monday, December 12 2016. I would once again like to thank phiadelphia.pm for offering me that opportunity to present my work.

At approximately this point in the slides -- http://thenceforward.net/perl/talks/phlpm20161212/slide034.html -- Walt Mankowski asked why I felt it necessary, after testing commit 110 out of 220 -- i.e., the first bisection point -- I then proceeded to test commit 109 out of 220 rather than proceeding directly to the next bisection point which would, in this case, be commit 55 out of 220. The inference was that some of my test runs might have been unnecessary. I believe Geoff Avery and others -- speak up here, so you all get credit! -- concurred.

Another way of putting this: Suppose that we populated a list of the index numbers of commits visited in the course of multisection. Suppose that that list was 0-based rather than 1-based. Then that list would start like this:

    [ 0, 219, 109, 108, 54, 81, 80, 67, 66, 60, 59, 57, 56, 55 ... ]

Walt -- correct me if I'm misstating this -- was wondering why visits to steps 108, 80, 66, etc., were needed.

I didn't have a good response then, and I still don't now. I can state that, in terms of getting to my desired final result (see http://thenceforward.net/perl/talks/phlpm20161212/slide050.html), my multiple bisection code is correct, even if it is somewhat inefficient.

Thinking about this over the past few days, I realized that it would be helpful to create, for my benefit and yours, a simplified framework for approaching the problem. In my github repo for this distribution (though not yet on CPAN), I have added this file: https://github.com/jkeenan/devel-git-multibisect/blob/master/examples/generalized-multisection.t. This program abstracts away the presenting problem and git in order to just focus on the procedure for multisection of a list of strings in the same format as produced by Devel::Git::MultiBisect::Transitions::multisect_all_targets(). The bisection procedure I am currently using in that method can be found in multisect_list(), which starts at line 265, and more specifically in the if block starting at line 317.

Now, I suspect that some of the bisection steps could probably be eliminated -- but different steps from those people were pointing out during the meeting. If we took a look at the complete list of indexes visited, it would be:

    [ 0, 219, 109, 108, 54, 81, 80, 67, 66, 60, 59, 57, 56, 55,
      137, 136, 96, 95, 75, 74, 65, 64, 58 ]

That holds the full 23 steps taken to identify all 3 transition points in the 220-item commit range discussed at the meeting. But if we know that the value at the first real bisection point (index 109) is identical to that at the end-of-range point (index 219), then we should be able to avoid ever visiting any index between 109 and 219. Hence, indexes 137 and 136 could be eliminated.

But that's a different set of potentially eliminatable steps from what was raised at the meeting.

So, if any of you would like to stare at the code and offer a suggestion, feel free to clone that repo, poke around and offer a patch.

Thank you very much.
Jim Keenan

Philadelphia-pm mailing list