Subject: Re: Strip surplus white space with Perl
Newsgroups: lugnet.off-topic.geek
Date: Sun, 29 Oct 2000 22:00:13 GMT
Todd Lehman wrote:
> > Ahh, but what happens if you have a carriage return followed by a
> > space?  The second method eliminates the space and leaves the
> > carriage return, which isn't what he asked for.

> Ahh, but that *is* what he asked for.  Fredrik said, "If two consecutive
> characters are <SPACE> and <RET>, for example, it doesn't matter to me
> which one is preserved and which one is chopped off."

There are some days when I shouldn't touch a keyboard.  I read and reread
his initial posting and could have sworn that it said "It *does* matter to
me which one..."  I think you just went in and edited the message.  ;)
I guess it's one of those days.

> >    $string =~ s/(\t\t+|  +|\r\r+|\n\n+|\f\f+)/substr($&,1,1)/eg;

> My goodness.  I suppose you could do it that way, but why go through all that
> trouble when all you want to do is this?--
>
>    $string =~ s/(\s)\1+/$1/g;

Umm, I didn't think of it?  :)  Yours *is* a much cleaner solution...
It also proves Lindsey's axiom: "No matter how much time or thought
you put into a problem, someone will always have a better solution
than you."
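
For comparison, here is a minimal sketch that runs both substitutions on the same input. The sub names and the sample string are made up for illustration; only the two regexes come from the thread. Both collapse a run of one repeated whitespace character to a single copy and leave mixed pairs such as space followed by carriage return alone.

```perl
use strict;
use warnings;

# Longer form from the thread: one alternative per whitespace character,
# keeping the second character of each run via substr($&, 1, 1).
sub squeeze_alternation {
    my ($string) = @_;
    $string =~ s/(\t\t+|  +|\r\r+|\n\n+|\f\f+)/substr($&,1,1)/eg;
    return $string;
}

# Shorter form from the thread: capture one whitespace character and
# match further copies of that same character with the backreference \1.
sub squeeze_backref {
    my ($string) = @_;
    $string =~ s/(\s)\1+/$1/g;
    return $string;
}

# Hypothetical sample: a run of spaces, a run of tabs, a run of newlines,
# and a trailing space + CR + LF that neither version should touch.
my $sample = "a  b\t\t\tc\n\n d \r\n";

print "same result\n"
    if squeeze_alternation($sample) eq squeeze_backref($sample);
```

Note that \s covers the same five characters the alternation lists, and on newer perls possibly the vertical tab as well, so the two are not guaranteed identical on every input.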

> Could you give a concrete example?  I'm curious -- I've always believed that
> /g is faster (and clearer code) than using a while, even when there aren't
> lots of matches.  Doesn't
>
>    while ($string =~ s///) {}
>
> have O(n^2) performance, where n is the number of hits in the string?  Or are
> you hiding some anchors in there?
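
To make the question concrete, here is a rough sketch that counts how often each form runs the substitution. The sub names and the sample string are assumptions for illustration. It does not measure time; it only shows that the loop form re-runs the match from the start of the string once per hit, which is where the quadratic worry comes from, while /g handles every hit in one pass.

```perl
use strict;
use warnings;

# Loop form: every successful s/// is followed by another attempt that
# starts scanning again at the beginning of the string.
sub squeeze_with_loop {
    my ($string) = @_;
    my $passes = 0;
    $passes++ while $string =~ s/(\s)\1+/$1/;
    return ($string, $passes);
}

# /g form: one pass over the string; s///g returns the number of hits.
sub squeeze_with_g {
    my ($string) = @_;
    my $hits = ($string =~ s/(\s)\1+/$1/g) || 0;
    return ($string, $hits);
}

# Hypothetical sample: five double-space runs separated by 'x'.
my $sample = join 'x', ('  ') x 5;

my ($loop_out, $passes) = squeeze_with_loop($sample);
my ($g_out,    $hits)   = squeeze_with_g($sample);

print "loop: $passes successful passes, each restarting at position 0\n";
print "/g:   $hits hits in a single pass\n";
print "results agree\n" if $loop_out eq $g_out;
```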

I'm going to pull this directly from the "Perl Cookbook" -- that's where I
remember seeing it.  Section 1.7, expanding and compressing tabs.

   while (<>) {
      1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
      print;
   }

   "If you're looking at the second while loop and wondering why it
   couldn't have been written as part of a simple s///g instead, it's
   because you need to recalculate the length from the start of the
   line again each time (stored in $`) rather than merely from where the
   last match occurred."

After reading that paragraph a while back I decided that I'd stick with
while loops for most cases -- sometimes while debugging or modifying
my code I don't think about this type of change, so I wouldn't necessarily
have the presence of mind to switch from /g to a while loop.  But if it's
already in a while loop, I don't have to worry about it.

Mind you, I only do this when I might access $& et al.
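
The Cookbook's point about $` can be checked with a small sketch. The sub names and the sample line are made up for illustration; the substitution itself is the one quoted above. With /g, $` is measured against the original string, so tabs after the first are padded as if the earlier ones had not been expanded yet.

```perl
use strict;
use warnings;

# Loop form from the Cookbook: each pass re-runs the match, so $` is the
# already-expanded text in front of the next run of tabs.
sub expand_with_loop {
    my ($line) = @_;
    1 while $line =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
    return $line;
}

# Single s///g pass: $` refers to the original, unexpanded string, so
# the column arithmetic is off for every run of tabs after the first.
sub expand_with_g {
    my ($line) = @_;
    $line =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/ge;
    return $line;
}

# Hypothetical sample line with two tabs.
my $line = "a\tb\tc";

printf "loop: 'c' at column %d\n", index(expand_with_loop($line), 'c');  # 16
printf "/g:   'c' at column %d\n", index(expand_with_g($line), 'c');     # 14
```

In the loop version 'c' lands on a tab stop (column 16); in the /g version it ends up at column 14.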

Chris



Message has 1 Reply:
  Re: Strip surplus white space with Perl
 
On Sun, Oct 29, 2000 at 10:00:13PM +0000, Christopher Lindsey wrote: [snip] (...) of course, if you're using $&, you don't care about efficiency anyway, so you might as well put your regexp in a while loop. But if you do care about speed, you'd go (...) (24 years ago, 29-Oct-00, to lugnet.off-topic.geek)
