Subject: Re: Whitespace and comment remover in perl
Newsgroups: lugnet.robotics.rcx.pbforth
Date: Wed, 15 Dec 1999 15:22:49 GMT
Viewed: 1315 times

Ralph Hempel wrote:
>
> This is a good start. Here's my comments on each of the regexp lines...
> By the way, my Tcl script handles most of this stuff properly, but
> I'm waiting until the new year to develop XMODEM uploads for it....
>
> s/\\.*$//g; # remove \ comments
> s/\(.*?\)//g; # remove ( ... )
>
> - Except if comments are nested or a quoted string!!! Remember that
> regexps are greedy, they find the LONGEST match. If you have, say...
>
> ( some comment string ) CODE IS HERE ( more comments )
>
> then the entire string is removed..
Your statement about the LONGEST match may be correct for Tcl, and it is
Perl's default too, but it does not apply to this regexp!
Look at the second regexp:
the '?' after '.*' means expand to the smallest match (this is a Perl
specialty, I think), so in your example each comment is removed on its
own and the code in between is kept.
The only problem is nested comments like
( comment ( comment on comment ) )
where the match ends at the first ')', so the very last ')' will not be
removed.
If you use ( ) inside quotes, you are right: the parentheses and the
text between them will be removed from the quoted string.
I'll fix this.
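
To make the greedy versus non-greedy difference concrete, here is a small sketch that runs both variants on the example line from the quoted message, plus the nested and quoted cases. The greedy pattern `\(.*\)` is not part of the script; it is included only for comparison, and the `." ... "` line is an invented Forth string used purely for illustration.

```perl
use strict;
use warnings;

# Example line taken from the quoted message.
my $line = '( some comment string ) CODE IS HERE ( more comments )';

# Greedy (for comparison only): .* runs to the LAST ')', so the code is lost.
(my $greedy = $line) =~ s/\(.*\)//g;

# Non-greedy, as in the script: .*? stops at the FIRST ')'.
(my $lazy = $line) =~ s/\(.*?\)//g;

# Nested comment: the match ends at the first ')', the last ')' survives.
(my $nested = '( comment ( comment on comment ) )') =~ s/\(.*?\)//g;

# Hypothetical quoted string: the parenthesised text inside it is removed too.
(my $quoted = '." value (in mm) " CR') =~ s/\(.*?\)//g;

print "greedy: [$greedy]\n";
print "lazy:   [$lazy]\n";
print "nested: [$nested]\n";
print "quoted: [$quoted]\n";
```

Only the non-greedy form keeps `CODE IS HERE`; the nested and quoted cases show the two problems that remain.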
>
> s/^\s*//g; # remove leading whitespaces ( \t\r\n\f)
> s/\s+$//g; # remove trailing whitespaces ( \t\r\n\f)
>
> You might want to remove any sequence of one or more whitespace chars
> unless they are inside quotes, of course...
The first regexp matches only at the beginning of the string (there is a
'^' at the start of the search part).
The second regexp matches only at the end of the string (there is a '$'
at the end of the search part).
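
A minimal sketch of what the two anchored substitutions do to a single line. The sample Forth definition is made up for illustration, and the `/g` flag is dropped here because an anchored pattern can match only once per line anyway.

```perl
use strict;
use warnings;

# Hypothetical input line with leading, interior and trailing whitespace.
my $line = "\t  : SQUARE   DUP  * ;  \r\n";

my $trimmed = $line;
$trimmed =~ s/^\s*//;   # '^' anchors the match to the start of the string
$trimmed =~ s/\s+$//;   # '$' anchors the match to the end of the string

print "[$trimmed]\n";
```

The whitespace between the words is left exactly as it was; only the two ends of the line are touched.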
You are right, there are restrictions in this little script:
1. If a bracket comment or a quoted area spans more than one line,
then the script will fail, because it works on a line-by-line basis.
2. Duplicate whitespace between words is not removed.
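
Both restrictions can be reproduced in a few lines. This is only a sketch under assumptions: the sample lines are invented, the whole-text variant with the `/s` modifier is one possible workaround rather than what the script does, and neither substitution pays any attention to quoted strings.

```perl
use strict;
use warnings;

# Restriction 1: a bracket comment that spans two lines (invented sample).
my @lines = ("( a comment that\n", "  continues here ) 1 2 +\n");

# Line by line, neither line contains a complete "( ... )" pair,
# so nothing is removed.
my @per_line = map { (my $copy = $_) =~ s/\(.*?\)//g; $copy } @lines;

# Possible workaround: process the whole text at once. With /s the '.'
# also matches newlines, so the comment can cross the line break.
my $text = join '', @lines;
(my $whole = $text) =~ s/\(.*?\)//gs;

# Restriction 2: collapsing runs of whitespace needs an extra substitution.
# This one would also alter whitespace inside quoted strings.
(my $collapsed = ': SQUARE   DUP  * ;') =~ s/\s+/ /g;

print @per_line;
print $whole;
print "$collapsed\n";
```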
Greetings Carsten