Subject: 
Re: posting weirdness
Newsgroups: 
lugnet.admin.general
Date: 
Sun, 17 Dec 2000 17:56:25 GMT
On Sun, Dec 17, 2000 at 02:47:03AM +0000, Todd Lehman wrote:
I'm not sure yet[1] if this was a stupidly written regex on my part or a bona
fide bug in perl's[2] regex engine.  Either way, the post you submitted caused
the system to spend as much time as it possibly could (until the process was
killed) examining the body of the post for a certain pattern.  That pattern
match went like this...

   $body =~ s/(?:\n\s*)+$//;

...and its purpose is to strip the body of a post of any run of completely
blank lines at the very bottom (i.e., at the end of the string).  It doesn't match
blank lines in the middle of the string, but it does (did) get extremely
tripped up by them.

My gut says that this was not a bug in perl, but simply a case of *MASSIVE*
backtracking on certain data input due to the fact that \n is part of the \s
character class.

Yeah, that's what it looks like to me.  It would match any newline and
keep adding up whitespace until it found the beginning of a new
paragraph.  Then it'll backtrack to the next blank line, try again, and fail
once more... if you have 80 lines, it'd have to do that 80 times...
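The backtracking described above can be observed with a small timing sketch. It uses Python's `re` module as a stand-in for Perl, on the assumption that both are backtracking engines and behave similarly on this pattern; it makes no claim about Perl's exact step counts. `make_body` and `time_strip` are hypothetical helpers, and the sample body is invented for illustration. Timings will vary by machine, so none are quoted here.

```python
import re
import time

# The original pattern from the quoted post: a quantified group that
# itself contains \s*, and \s can also match the newline.
ORIGINAL = re.compile(r"(?:\n\s*)+$")


def make_body(blank_lines):
    """Hypothetical post body: a run of newlines in the middle, then more text."""
    return "first paragraph" + "\n" * blank_lines + "second paragraph"


def time_strip(body):
    """Apply the substitution once (like Perl's s/// without /g) and time it."""
    start = time.perf_counter()
    result = ORIGINAL.sub("", body, count=1)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Keep the counts small: the running time is expected to grow very
    # quickly as blank lines are added.
    for n in (8, 12, 16, 18):
        result, elapsed = time_strip(make_body(n))
        # Nothing is stripped, because the blank lines are not at the end.
        print(f"{n:2d} blank lines: {elapsed:.5f}s, unchanged={result == make_body(n)}")
```

The match always fails for these bodies, since the blank lines are followed by text. All the time goes into trying different ways of dividing the newlines between `\n` and `\s*`.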

Wouldn't this work in its stead?

  $body =~ s/\n\s*$//;

This way, you don't have to deal with the whole *+ thing (a * nested
inside a +), which is always bad...
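A quick way to check that the simplified pattern strips the same trailing blank lines as the original is to run both over a few sample bodies. This is a minimal sketch in Python's `re` module (standing in for Perl); `strip_trailing` and the sample strings are hypothetical and only illustrate the comparison.

```python
import re

NESTED = re.compile(r"(?:\n\s*)+$")  # original: quantified group containing \s*
SINGLE = re.compile(r"\n\s*$")       # suggested: one newline, then \s* takes the rest


def strip_trailing(pattern, body):
    # count=1 mirrors Perl's s/// without the /g flag.
    return pattern.sub("", body, count=1)


# Invented sample bodies, including blank lines in the middle of the text.
SAMPLES = [
    "one line",
    "one line\n",
    "para one\n\npara two\n\n\n",
    "indented tail\n   \n\t\n",
]

if __name__ == "__main__":
    for body in SAMPLES:
        a = strip_trailing(NESTED, body)
        b = strip_trailing(SINGLE, body)
        print(repr(body), "->", repr(a), "(same)" if a == b else "(different)")
```

Because `\s` already matches `\n`, a single `\n` followed by `\s*` can swallow the whole trailing run, so the outer `+` adds nothing to what gets matched.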

I rewrote the pattern match like this instead...

   $body =~ s/(?:\n[ \t]*)+$//;

Heh, this works too, of course :)  I'm not sure, but aren't there some other
"whitespace" chars that fall within the \s class?
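On the question of what else falls within `\s`: the sketch below lists which common ASCII characters match `\s` and which match the narrower `[ \t]` class. It uses Python's `re` with the `re.ASCII` flag, where `\s` is documented as `[ \t\n\r\f\v]`. Perl's own `\s` set may differ slightly (in particular, the vertical tab was not always included), so treat this as an approximation. The helper names and the CRLF sample are hypothetical.

```python
import re

CANDIDATES = {
    "space": " ",
    "tab": "\t",
    "newline": "\n",
    "carriage return": "\r",
    "form feed": "\f",
    "vertical tab": "\v",
}


def in_ascii_whitespace_class(ch):
    """True if ch matches \\s under Python's ASCII rules."""
    return re.fullmatch(r"\s", ch, flags=re.ASCII) is not None


def in_space_tab_class(ch):
    """True if ch matches the narrower [ \\t] class from the rewritten pattern."""
    return re.fullmatch(r"[ \t]", ch) is not None


BROAD = re.compile(r"(?:\n\s*)+$", re.ASCII)  # original pattern
NARROW = re.compile(r"(?:\n[ \t]*)+$")        # rewritten pattern

if __name__ == "__main__":
    for name, ch in CANDIDATES.items():
        print(f"{name:16s} \\s={in_ascii_whitespace_class(ch)}  [ \\t]={in_space_tab_class(ch)}")

    # Invented body with CRLF line endings: the two patterns strip different amounts.
    crlf_body = "text\r\n\r\n"
    print(repr(BROAD.sub("", crlf_body, count=1)))
    print(repr(NARROW.sub("", crlf_body, count=1)))
```

One practical consequence: on bodies with carriage returns or form feeds in the trailing lines, the `[ \t]` version stops at those characters, so it can leave behind material that the `\s` version would have removed.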

--
Dan Boger / dan@peeron.com / www.peeron.com / ICQ: 1130750
<set:6075_1>:  Wolfpack Tower (LEGO/SYSTEM/Castle/Wolfpack), '92, 232 pcs, 4 figs



Message has 1 Reply:
  Re: posting weirdness
 
(...) It's actually infinitely worse than that. The NFA it produced actually had exponential performance, so it would try to do it 2^80 (2 raised to the 80th power) times, which would take about 302,200,000,000,000,000 seconds or about ten thousand (...) (23 years ago, 17-Dec-00, to lugnet.admin.general)

Message is in Reply To:
  Re: posting weirdness
 
(...) OK, please try again now! (...) Whew! Thank you so much for finding and reporting this! I'm not sure yet[1] if this was a stupidly written regex on my part or bona fide bug in perl's[2] regex engine. Either way, the post you submitted caused (...) (23 years ago, 17-Dec-00, to lugnet.admin.general)

©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR