Searching packed data

Mark Dominus on 27 Sep 2005 14:03:06 -0000

From: mjd-perl-pm@plover.com (Mark Dominus)

To: phl@lists.pm.org

Subject: Searching packed data

Date: Tue, 27 Sep 2005 09:59:23 -0400

Organization: Plover Systems

Reply-to: phl@lists.pm.org

Sender: owner-phl@lists.pm.org

A guy in comp.lang.perl.moderated has a pretty interesting problem. He has digits of pi in what is called "BCD" format. This is a format that encodes two decimal digits per character. The two digits $a and $b are packed into a character with $a in the high bits and $b in the low bits. For example, the character "R" = 0x52 represents the two decimal digits 5 and 2.

To represent digits of pi, say the ten digits "1415926536", we pack each pair of digits into a character. The easy way to do this in Perl is

    pack "H*", "1415926536"

and the result is the 5-character string "\x14\x15\x92\x65\x36".

Now suppose we have a very large string, say one million characters long, which therefore represents two million digits of pi. And we want to know if the digit sequence "123456" appears anywhere. Obviously, one way to do this is to unpack the two million digits and search the resulting two-megabyte string:

    my $pos = index(unpack("H*", $pi), "123456");

But the guy in clp.moderated doesn't want to do this; he wants to search the string *without* unpacking it.

Without intending any comment on whether this is a sensible thing to want to do, I wrote code to solve the problem. It came out very nice, and I thought other folks might be interested in the exercise.

My code will show up in clp.moderated when the moderator gets around to approving it. In the meantime, you might want to try it yourself; I found it fun.

What I learned from this project: it's a lot easier to write good code when I've gotten a full night's sleep.
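
For anyone who wants something concrete to compare against, here is a minimal sketch of one possible approach. It is not the code posted to clp.moderated, and the names find_unpacked, aligned_pattern and find_packed are made up for illustration. The idea is that the digits being sought can start either in the high half of a character or in the low half, so the sketch builds one regex for each case and matches it against the packed string directly. It assumes the search string is a nonempty string of decimal digits.

```perl
use strict;
use warnings;

# Baseline from the message: unpack to a digit string, then search that.
sub find_unpacked {
    my ($bcd, $needle) = @_;
    return index(unpack("H*", $bcd), $needle);
}

# Hypothetical helper: regex source for digits that begin on a character
# boundary. Each full pair of digits becomes one exact character; a leftover
# final digit fixes only the high half of the last character.
sub aligned_pattern {
    my ($digits) = @_;
    my $pattern = join '', map { sprintf '\x%s', $_ } $digits =~ /(\d\d)/g;
    if (length($digits) % 2) {
        my $d = substr $digits, -1;
        $pattern .= sprintf '[\x%s0-\x%sF]', $d, $d;
    }
    return $pattern;
}

# Hypothetical search: returns the digit offset of the first occurrence of
# $needle in the packed string, or -1, without unpacking the string.
sub find_packed {
    my ($bcd, $needle) = @_;
    my @found;

    # Case 1: the needle starts in a high half (even digit offset).
    my $even = aligned_pattern($needle);
    push @found, 2 * $-[0] if $bcd =~ /$even/;

    # Case 2: the needle starts in a low half (odd digit offset), so the
    # first character may have anything in its high half.
    my $first = substr $needle, 0, 1;
    my $lead  = '[' . join('', map { sprintf '\x%X%s', $_, $first } 0 .. 15) . ']';
    my $odd   = $lead . aligned_pattern(substr $needle, 1);
    push @found, 2 * $-[0] + 1 if $bcd =~ /$odd/;

    return -1 unless @found;
    my ($min) = sort { $a <=> $b } @found;
    return $min;
}

my $pi = pack "H*", "1415926536";
print find_packed($pi, "926"),  "\n";   # 4, same as find_unpacked
print find_packed($pi, "4159"), "\n";   # 1, same as find_unpacked
```

Both functions report offsets counted in digits, so their results can be compared directly. Whether the regex version is actually faster than unpacking on a million-character string is not something this sketch tries to establish.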

Follow-Ups:

Re: Searching packed data
From: Meng Weng Wong <mengwong@dumbo.pobox.com>
