Subject: Re: Strip surplus white space with Perl
Newsgroups: lugnet.off-topic.geek
Date: Sun, 29 Oct 2000 22:00:13 GMT
Todd Lehman wrote:
> > > Ahh, but what happens if you have a carriage return followed by a
> > > space? The second method eliminates the space and leaves the
> > > carriage return, which isn't what he asked for.
>
> Ahh, but that *is* what he asked for. Fredrik said, "If two consecutive
> characters are <SPACE> and <RET>, for example, it doesn't matter to me
> which one is preserved and which one is chopped off."
There are some days when I shouldn't touch a keyboard. I read and reread
his initial posting and could have sworn that it said "It *does* matter to
me which one..." I think you just went in and edited the message. ;)
I guess it's one of those days.
> > $string =~ s/(\t\t+| +|\r\r+|\n\n+|\f\f+)/substr($&,1,1)/eg;
>
> My goodness. I suppose you could do it that way, but why go through all that
> trouble when all you want to do is this?--
>
> $string =~ s/(\s)\1+/$1/g;
Umm, I didn't think of it? :) Yours *is* a much cleaner solution...
It also proves Lindsey's axiom: "No matter how much time or thought
you put into a problem, someone will always have a better solution
than you."
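To make the behaviour of that one-liner concrete, here is a minimal sketch that wraps it in a helper. The name squeeze_repeats and the sample strings are made up for illustration; only the substitution itself comes from the thread. Note that \1 has to match the same character that (\s) captured, so a mixed pair such as a space followed by a newline is left alone.

```perl
use strict;
use warnings;

# Collapse each run of one repeated whitespace character to a single copy.
# (Hypothetical helper name; the regex is the one quoted above.)
sub squeeze_repeats {
    my ($string) = @_;
    $string =~ s/(\s)\1+/$1/g;
    return $string;
}

# Runs of the same character shrink to one character each.
my $mixed = squeeze_repeats("a   b\t\tc\n\n\nd");

# A space followed by a newline is two different characters, so
# the backreference never matches and the string is unchanged.
my $pair = squeeze_repeats("end \nnext");

print "[$mixed]\n[$pair]\n";
```

If different whitespace characters should also be merged into one, the pattern would need to be something other than a backreference; that is a separate design choice and not what the quoted line does.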
> Could you give a concrete example? I'm curious -- I've always believed that
> /g is faster (and clearer code) than using a while, even when there aren't
> lots of matches. Doesn't
>
> while ($string =~ s///) {}
>
> have O(n^2) performance, where n is the number of hits in the string? Or are
> you hiding some anchors in there?
I'm going to pull this directly from the "Perl Cookbook" -- that's where I
remember seeing it. Section 1.7, expanding and compressing tabs.
    while (<>) {
        1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
        print;
    }
"If you're looking at the second while loop and wondering why it
couldn't have been written as part of a simple s///g instead, it's
because you need to recalculate the length from the start of the
line again each time (stored in $`) rather than merely from where the
last match occurred."
After reading that paragraph a while back I decided that I'd stick with
while loops for most cases -- sometimes while debugging or modifying
my code I don't think about this type of change, so I wouldn't necessarily
have the presence of mind to switch from /g to a while loop. But if it's
already in a while loop, I don't have to worry about it.
Mind you, I only do this when I might access $& et al.
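
For anyone who wants to see the difference, here is a rough sketch that runs the Cookbook expression both ways on one line with two tabs. The sub names and the sample input are mine, chosen for illustration. In the loop version every pass starts over, so $` includes the spaces inserted by the previous pass; in the single s///ge pass, $` does not see those earlier replacements, so later tab stops can come out wrong.

```perl
use strict;
use warnings;

# Expand tabs to 8-column stops, rescanning from the start of the line
# after every substitution so that $` reflects the spaces already inserted.
sub expand_with_loop {
    my ($line) = @_;
    1 while $line =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
    return $line;
}

# Same replacement expression in a single s///ge pass, for comparison.
# Here $` does not include the spaces substituted earlier in the line.
sub expand_with_g {
    my ($line) = @_;
    $line =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/ge;
    return $line;
}

my $input  = "a\tbc\td";
my $looped = expand_with_loop($input);
my $single = expand_with_g($input);

# Brackets make the trailing and embedded spaces easy to count.
printf "loop: [%s]\n", $looped;
printf "/ge:  [%s]\n", $single;
```

With the loop, the first tab becomes seven spaces (column 1 to column 8) and the second becomes six (column 10 to column 16). Comparing the two printed lines shows whether the single-pass version lands on the same columns.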
Chris