To LUGNET HomepageTo LUGNET News HomepageTo LUGNET Guide Homepage
 Help on Searching
 
Post new message to lugnet.off-topic.geekOpen lugnet.off-topic.geek in your NNTP NewsreaderTo LUGNET News Traffic PageSign In (Members)
 Off-Topic / Geek / 2642
2641  |  2643
Subject: 
Re: Bitstring vector to list conversion in Perl5
Newsgroups: 
lugnet.off-topic.geek
Date: 
Thu, 4 Jan 2001 06:45:35 GMT
Viewed: 
87 times
  
In lugnet.off-topic.geek, Dan Boger writes:
$veclength = length unpack("b*",$v);  # only because I don't know how many
                                      # bits are in it - not efficient
@n = ();
for ($index=0;$index<$veclength,$index++) {
  push @n, $index if vec($v,$index,1);
}

-=-=-=-=-=-=-

$veclength is terribly inefficient, but I could think of no way of finding
out the length of the vector - I'm sure you have that little bit of
information stored someplace...

$veclength = length($v) * 8;

It isn't precise, but it's fast -- and a few bits of slop would be perfectly
OK here since the slop would be 0-bits.  :-)


this isn't the fastest way...  you might be able to do something with
map, but you need to supply the broken down list to it as an argument,
so you'll still end up with the non-sparse array loaded in memory.

Well, plus, map{} is also almost always slower than looping (especially with
use integer, which would be important here too).

I was imagining there must be a way to do it that's O(n) in low-level CPU
performance but O(1) in high-level Perl performance -- something that takes
advantage of CPU string-searching hardware or at least Perl's fast regex
engine.

What about something sneaky using a regex and pos?

This seems to run at a pretty decent clip...about ten million bits per second
on a string that has goodies at the beginning and end and is empty everywhere
else...


-=-=-=-=-=-=-

sub furl (@) {
   use integer;
   my $bits = "";
   foreach (@_) { vec($bits, $_, 1) = 1; }
   $bits;
}

sub unfurl ($) {
   use integer;
   my $bits = shift;
   my @nums;
   my $ibyte;
   while ($bits =~ m/[^\0]/g) {
      my $byte = unpack("b*", substr($bits, ($ibyte=pos($bits)-1), 1));
      push(@nums, $ibyte*8+pos($byte)-1) while $byte =~ m/1/g;
   }
   @nums;
}

-=-=-=-=-=-=-

my $bits = furl(qw(
   2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101
   103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197
   199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311
   313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431
   433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557
   563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661
   673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809
   811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937
   941 947 953 967 971 977 983 991 997
));

$bits = $bits . "\0" x 1_000_000 . $bits;

my @nums = unfurl($bits);

print "@nums\n";

-=-=-=-=-=-=-


I wonder what instructions /[^\0]/ compiles down to.  Is it smart enough to
know to scan full-steam-ahead for the first byte not equal to zero using
hardware instructions or would it construct a 256-byte lookup table and
do it the hard way?  Oh well, it's probably not the bottleneck either way.

--Todd



Message is in Reply To:
  Re: Bitstring vector to list conversion in Perl5
 
(...) well, I can think of several ways of doing it, but since $v might be millions of bits long, we don't really want to get the full list in memory ever. So I guess the simplest way would be to run through the vector, indexing the 1s: (...) (23 years ago, 3-Jan-01, to lugnet.off-topic.geek)

3 Messages in This Thread:

Entire Thread on One Page:
Nested:  All | Brief | Compact | Dots
Linear:  All | Brief | Compact
    

Custom Search

©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR