Geeking : 2642


Off-Topic / Geek / 2642	2641 \| 2643

Subject:	Re: Bitstring vector to list conversion in Perl5
Newsgroups:	lugnet.off-topic.geek
Date:	Thu, 4 Jan 2001 06:45:35 GMT
Viewed:	87 times

In lugnet.off-topic.geek, Dan Boger writes: > $veclength = length unpack("b*",$v); # only because I don't know how many > # bits are in it - not efficient > @n = (); > for ($index=0;$index<$veclength,$index++) { > push @n, $index if vec($v,$index,1); > } > > -=-=-=-=-=-=- > > $veclength is terribly inefficient, but I could think of no way of finding > out the length of the vector - I'm sure you have that little bit of > information stored someplace... $veclength = length($v) * 8; It isn't precise, but it's fast -- and a few bits of slop would be perfectly OK here since the slop would be 0-bits. :-) > this isn't the fastest way... you might be able to do something with > map, but you need to supply the broken down list to it as an argument, > so you'll still end up with the non-sparse array loaded in memory. Well, plus, map{} is also almost always slower than looping (especially with use integer, which would be important here too). I was imagining there must be a way to do it that's O(n) in low-level CPU performance but O(1) in high-level Perl performance -- something that takes advantage of CPU string-searching hardware or at least Perl's fast regex engine. What about something sneaky using a regex and pos? This seems to run at a pretty decent clip...about ten million bits per second on a string that has goodies at the beginning and end and is empty everywhere else... -=-=-=-=-=-=- sub furl (@) { use integer; my $bits = ""; foreach (@_) { vec($bits, $_, 1) = 1; } $bits; } sub unfurl ($) { use integer; my $bits = shift; my @nums; my $ibyte; while ($bits =~ m/[^\0]/g) { my $byte = unpack("b*", substr($bits, ($ibyte=pos($bits)-1), 1)); push(@nums, $ibyte*8+pos($byte)-1) while $byte =~ m/1/g; } @nums; } -=-=-=-=-=-=- my $bits = furl(qw( 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 )); $bits = $bits . "\0" x 1_000_000 . $bits; my @nums = unfurl($bits); print "@nums\n"; -=-=-=-=-=-=- I wonder what instructions /[^\0]/ compiles down to. Is it smart enough to know to scan full-steam-ahead for the first byte not equal to zero using hardware instructions or would it construct a 256-byte lookup table and do it the hard way? Oh well, it's probably not the bottleneck either way. --Todd

Message is in Reply To:

		Re: Bitstring vector to list conversion in Perl5
(...) well, I can think of several ways of doing it, but since $v might be millions of bits long, we don't really want to get the full list in memory ever. So I guess the simplest way would be to run through the vector, indexing the 1s: (...) (23 years ago, 3-Jan-01, to lugnet.off-topic.geek)

3 Messages in This Thread:

Entire Thread on One Page:: Nested: All | Brief | Compact | Dots
Linear: All | Brief | Compact

Custom Search