Subject: Re: Whitespace and comment remover in perl
Newsgroups: lugnet.robotics.rcx.pbforth
Date: Wed, 15 Dec 1999 15:22:49 GMT
Ralph Hempel wrote:

> This is a good start. Here are my comments on each of the regexp lines...
> By the way, my Tcl script handles most of this stuff properly, but
> I'm waiting until the new year to develop XMODEM uploads for it....
>
>>     s/\\.*$//g; # remove \ comments
>>     s/\(.*?\)//g; # remove ( ... )
>
> - Except if comments are nested or inside a quoted string!!! Remember that
>   regexps are greedy, they find the LONGEST match. If you have, say...
>
>   ( some comment string ) CODE IS HERE ( more comments )
>
>   then the entire string is removed.

Your statement about the LONGEST match may be correct for Tcl, but not
for this Perl regexp! Perl quantifiers are greedy by default too, but
look at the second regexp: the '?' after '.*' makes it expand to the
smallest match instead (this is a Perl specialty, I think).
The only problem is nested comments like

( comment ( comment on comment ) )

Here the match ends at the first ')', so the very last ')' is not
removed.
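
The difference between the greedy and the non-greedy form, and the leftover ')' in the nested case, can be checked with a small sketch. It is written in Python rather than Perl, on the assumption that Python's `re` module treats `.*` and `.*?` the same way for these patterns; the sample lines are the ones from this thread.

```python
import re

line = "( some comment string ) CODE IS HERE ( more comments )"

# Greedy: .* runs to the last ')' on the line, so the code is removed too.
greedy = re.sub(r"\(.*\)", "", line)

# Non-greedy: .*? stops at the first ')' after each '(', so the code survives.
lazy = re.sub(r"\(.*?\)", "", line)

# Nested parentheses: the match ends at the first ')', leaving the last one behind.
nested = re.sub(r"\(.*?\)", "", "( comment ( comment on comment ) )")

print(repr(greedy))
print(repr(lazy))
print(repr(nested))
```

Only the non-greedy pattern keeps `CODE IS HERE`; neither pattern copes with nesting.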

If you use ( ) inside quotes, you are right: the parentheses and the
text between them will be removed from the quoted string.
I'll fix this.
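
One possible way to protect quoted text, shown only as a sketch and not as the fix that was actually made to the script: match quoted spans and comments with a single alternation, keep the quoted spans, and drop the comments. The name `strip_comments` is made up for this example, and it assumes that every quoted string opens and closes with '"' on the same line. Python is used here on the assumption that the same pattern would carry over to Perl.

```python
import re

# Either a double-quoted span (group 1, kept) or a comment (dropped).
# Assumption: quotes open and close on the same line.
TOKEN = re.compile(r'("[^"]*")|\(.*?\)|\\.*$')

def strip_comments(line: str) -> str:
    """Remove ( ... ) and backslash comments, leaving quoted text untouched."""
    # group(1) is None when a comment matched, so the comment becomes "".
    return TOKEN.sub(lambda m: m.group(1) or "", line)

print(repr(strip_comments('." keep (this) text" ( drop this ) \\ and this')))
```

Because the regexp engine scans from left to right, whichever construct starts first on the line wins, so a '(' inside a quoted string is never seen as the start of a comment.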


>>     s/^\s*//g; # remove leading whitespace ( \t\r\n\f)
>>     s/\s+$//g; # remove trailing whitespace ( \t\r\n\f)
>
>   You might want to remove any sequence of one or more whitespace chars
>   unless they are inside quotes, of course...

The first regexp matches only at the beginning of the line (the '^' at
the start of the pattern anchors it there).
The second regexp matches only at the end of the line (the '$' at the
end of the pattern anchors it there). Whitespace between words is
therefore left alone.
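
A quick check of the two anchored patterns, again as a Python sketch on the assumption that `^`, `$` and `\s` behave the same way here as in the Perl script. The sample line is invented for the example.

```python
import re

line = "  \t: SQUARE   DUP * ;  \r\n"

# '^' anchors the match at the start, so only leading whitespace is removed.
no_leading = re.sub(r"^\s*", "", line)

# '$' anchors the match at the end, so only trailing whitespace is removed.
trimmed = re.sub(r"\s+$", "", no_leading)

print(repr(trimmed))
```

The three spaces between `SQUARE` and `DUP` are still there afterwards, which is restriction 2 in the list that follows.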

You are right, there are restrictions in this little script:
1. If a bracket comment or a quoted string spans more than one line,
   the script will fail, because it works on a line-by-line basis.
2. Repeated whitespace between words is not removed.
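
Both restrictions could be lifted along the following lines. This is a sketch under assumptions, not part of the posted script: the helper names `strip_paren_comments` and `squeeze` are made up, the first helper expects the whole file as one string, and the second assumes quoted strings stay on one line. Python is used on the assumption that the patterns translate directly to Perl.

```python
import re

def strip_paren_comments(text: str) -> str:
    """Remove ( ... ) comments from a whole file, even across line breaks."""
    # DOTALL lets '.' match newlines; quoted strings are NOT protected here.
    return re.sub(r"\(.*?\)", "", text, flags=re.DOTALL)

# Either a double-quoted span (group 1, kept) or a run of spaces/tabs.
SPACES = re.compile(r'("[^"]*")|[ \t]+')

def squeeze(line: str) -> str:
    """Collapse runs of spaces and tabs to one space, except inside quotes."""
    return SPACES.sub(lambda m: m.group(1) or " ", line)

print(repr(strip_paren_comments("1 ( spans\ntwo lines ) 2")))
print(repr(squeeze(': HI   ."  two  spaces" ;')))
```

Reading the whole file at once trades the simple line-by-line loop for more memory use, which should not matter for source files of this size.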

Greetings, Carsten


