Jeff Abrahamson on 14 Dec 2004 01:42:24 -0000

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: perldb weirdness

On Sun, Dec 12, 2004 at 10:27:11PM -0500, Michael Styer wrote:
>   [85 lines, 432 words, 2550 characters]  Top characters: etsnoarh
> I've been looking at this code wondering what the heck is going on, and I
> can't make any sense of it, so I'll ask.

Yes, it's a bit odd looking, I admit, and it's doing something you
might not think about very often.

Here's the algorithm in pseudo-code.  It finds the connected
components of an undirected graph.  Cf. Cormen et al., ch 22 (first
edition) or ch 21 (second edition).

Quoting from Cormen, 2nd edition:

      MAKE-SET(x) creates a new set whose only member (and thus
      representative) is x. Since the sets are disjoint, we require
      that x not already be in some other set.

      UNION(x, y) unites the dynamic sets that contain x and y, say Sx
      and Sy, into a new set that is the union of these two sets. The
      two sets are assumed to be disjoint prior to the operation. The
      representative of the resulting set is any member of Sx union
      Sy, although many implementations of UNION specifically choose
      the representative of either Sx or Sy as the new
      representative. Since we require the sets in the collection to
      be disjoint, we "destroy" sets Sx and Sy, removing them from the

      FIND-SET(x) returns a pointer to the representative of the
      (unique) set containing x.

The algorithm for connected components is then

    for each vertex v in V(G)
    for each edge (u,v) in E(G)
        if find-set(u) != find_set(v)

Quoting from Cormen again, the procedure initially places each vertex
v in its own set. Then, for each edge (u, v), it unites the sets
containing u and v.  After all the edges are processed, two vertices
are in the same connected component if and only if the corresponding
objects are in the same set.

So the thing I need is to say "which set is x in?"

I could maintain a set of hashes and test each one.  But the more
logical thing is to have each x point to a hash (which contains it).
Multiple elements might point to the same hash if the hash contains
multiple things.

Another naive way to do it is to have a hash where each element is a
key and its value is its set number.  Then unions require iterating
over all keys.  This gets slow for large sets.

Thus, I have a hash that has elements as keys and references to (I was
hoping pointers to) hashes as values.  A union is thus just a matter
of adding one element's hash's keys to the other's, then setting the
two hash pointers to the same hash reference.

It would have worked in C. ;-)

See below for the original comments and questions.


> Jeff Abrahamson wrote:
> > Here's the fuller code section that shows me doing what I just said.
> > The key part is the block beginning with the comment "union of sets".
> > 
> >     my %sets = ();
> >     for my $v (keys %V) {
> > 	$sets{$v} = [$v];
> So the value of $sets{$v} is an array ref with a single element, namely
> $v? I don't understand why you would want this, but OK, that's cool...
> >     }
> >     for my $v_begin (keys %E) {
> > 	my $adj = $E{$v_begin};
> > 	for my $v_end (keys %{$adj}) {
> > 	    if($sets{$v_begin} != $sets{$v_end}) {
> but here, you're comparing those arrayrefs for equality... except that
> I'm sure that this comparison is in effect something like 
>     if ('ARRAY(0x8522038)' != 'ARRAY(0x8198432)') {
>         do stuff...
>     }
> isn't it? I can't think of how to check that, but is that what you want?
> > 		# union of sets
> > 		my %s1 = %{$sets{$v_begin}};
> > 		my %s2 = %{$sets{$v_end}};
> Then here, I'm surprised you can even do this at all. I tried something
> similar just now and I get "Can't coerce array into hash at ./
> line 26." because $sets($v_begin} is going to be an arrayref, not a
> hashref, right?
> > 		for my $key (keys %s1) {
> > 		    $s2{$key} = 1;
> I'm not sure what you're doing here; %s1 and %s2 are created and
> destroyed each time through this block, and you're not doing anything
> with them, so putting the keys from %s1 into %s2 wouldn't seem to make
> any difference.
> > 		}
> > 		$sets{$v_begin} = $sets{$v_end};   # set pointers equal
> > 	    }
> > 	}
> >     }
> >     my %unique_sets;
> >     for my $s (keys %sets) {
> > 	print $sets{$s}, "\n";
> Isn't this going to print the string 'ARRAY(0x8522038)'... is that what
> you want?
> > 	$unique_sets{$sets{$s}} = $sets{$s};
> Here you're setting up %unique_sets so that its keys are strings of the
> form 'ARRAY(0x8522038)' and its values are references to the same arrays
> that the values of %sets refer to. This seems rather odd...
> > 	print $sets{$s}, "\n";
> Printing 'ARRAY(0x8522038)' again...
> >     }
> So, ah, I'm mystified. Is there some wizardry going on here that I'm not
> getting? What's the point of all the playing around with
> 'ARRAY(0x8522038)' strings directly? It seems very un-perlish to me, but
> I have a distinct feeling I may not be understanding something.
> Hoping for some enlightenment...
> Mike
> -
> **Majordomo list services provided by PANIX <URL:>**
> **To Unsubscribe, send "unsubscribe phl" to**


 Jeff Abrahamson  <>    +1 215/837-2287
 GPG fingerprint: 1A1A BA95 D082 A558 A276  63C6 16BF 8C4C 0D1D AE4B

 A cool book of games, highly worth checking out:

Attachment: signature.asc
Description: Digital signature