Subject: A Beautiful Irony
Newsgroups: lugnet.off-topic.geek
Date: Thu, 17 Jun 1999 23:09:03 GMT
  
I just noticed a curious URL in this message...

   http://www.lugnet.com/news/display.cgi?lugnet.off-topic.debate:1191

...the one with the embedded parentheses, near the bottom of the message.

The web interface's code for parsing this is incorrect, but it happens to
display that hyperlink correctly because a space character follows the last
character of the URL.  Here's one with parentheses that trips it up nicely:

   http://www.lugnet.com/news/display.cgi?lugnet.general:325

I've got some new code for doing URLs correctly in vastly more cases, and
I'll put it into place when I'm satisfied that it's working super-great.

Anyway, I just had a humorous bug occur during regression testing of the
new code/algorithm...

The new code, while chewing on a URL, does left-to-right depth-counting on
bracketing pairs like () and [] and {}.  Thus, it makes the (reasonable)
assumption that if a URL ever does contain parentheses, then those parens
are probably going to be paired up, and if a closing paren is reached that
wasn't opened in the URL, then it's not part of the URL.

So it's got this little supporting data structure (Perl5 code):

   my @depth = (0, 0, 0, 0, 0);
   my %depth_index = qw( < 0  >  0  ( 1  )  1  [ 2  ]  2  { 3  }  3 );
   my %depth_delta = qw( < 1  > -1  ( 1  ) -1  [ 1  ] -1  { 1  } -1 );

And as it scans across the characters in the URL one by one, the %depth_index
hash tells which counter to update; the %depth_delta hash tells whether it's
going deeper or backing out.  If any of the depth counts ever goes negative,
then it truncates the URL there and places the closing </A> tag at that point,
writing the remainder out as regular plain HTML.
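
For anyone who wants to see the scan spelled out, here is a minimal sketch of
the loop described above.  It is not the actual LUGNET code: the split_url
name, the return convention, and the example.com URL are all made up for
illustration, and it assumes the candidate URL has already been cut at
whitespace.  Only the two lookup tables come from the post.

```perl
use strict;
use warnings;

# Lookup tables as given in the post: which counter a character updates,
# and whether it opens (+1) or closes (-1) a level.
my %depth_index = qw( < 0  >  0  ( 1  )  1  [ 2  ]  2  { 3  }  3 );
my %depth_delta = qw( < 1  > -1  ( 1  ) -1  [ 1  ] -1  { 1  } -1 );

# Hypothetical helper (not from the post): split $text into the part that
# belongs to the URL and the trailing remainder, cutting at the first
# closing character that was never opened.
sub split_url {
    my ($text) = @_;
    my @depth = (0, 0, 0, 0, 0);
    for my $pos (0 .. length($text) - 1) {
        my $ch = substr($text, $pos, 1);
        next unless exists $depth_index{$ch};   # not a bracketing character
        my $i = $depth_index{$ch};
        $depth[$i] += $depth_delta{$ch};
        if ($depth[$i] < 0) {
            # Unmatched closer: the URL ends just before it.
            return (substr($text, 0, $pos), substr($text, $pos));
        }
    }
    return ($text, '');                         # everything was balanced
}

# Illustrative input: one balanced pair plus one stray closing paren.
my ($url, $rest) = split_url('http://example.com/page_(draft))');
print qq{<A HREF="$url">$url</A>$rest\n};
```

With that input the stray final paren lands outside the link, while the
balanced pair stays inside it.  The sketch does no HTML escaping, which a real
implementation would need.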

Well, to make a short story long, here's the funny part:  my code originally
said this, and this was the bug:

   my @depth = (0, 0, 0, 0, 0);
   my %depth_index = qw( < 0  >  0  \( 1  \)  1  [ 2  ]  2  { 3  }  3 );
   my %depth_delta = qw( < 1  > -1  \( 1  \) -1  [ 1  ] -1  { 1  } -1 );
                                    ^^    ^^
                                     (oops)

Note the \'s before the ('s and )'s.  I was just escaping them so that the
program wouldn't get a syntax error when I tried to run it.  But of course
Perl is smarter than I am, and one of the reasons that Perl has qw() in the
first place is so that \-escaping doesn't have to be strictly necessary.
What actually went into the hashes in those positions were the strings \(
and \) rather than ( and ), which caused the algorithm to fail silently and
not detect opening parentheses.
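
To make the silent failure concrete, here is a small sketch.  The %buggy_index
table is a stand-in for what the post describes ending up in the hash; it is
written out with explicit strings rather than qw(), so it doesn't depend on how
any particular Perl version treats backslashes inside qw().  The bad_keys
helper is hypothetical, just one possible guard, and not something from the
original code.

```perl
use strict;
use warnings;

# Inside qw(), balanced parentheses nest, so no escaping is needed.
my %depth_index = qw( < 0  >  0  ( 1  )  1  [ 2  ]  2  { 3  }  3 );

# Stand-in for the table described above: the paren keys carry a leading
# backslash, so each of them is two characters long.
my %buggy_index = (
    '<' => 0, '>' => 0, '\\(' => 1, '\\)' => 1,
    '[' => 2, ']' => 2, '{' => 3, '}' => 3,
);

# Hypothetical guard: every key should be exactly one character.
# Returns the offending keys, sorted (an empty list means the table is fine).
sub bad_keys {
    my ($table) = @_;
    my @bad = grep { length($_) != 1 } keys %$table;
    @bad = sort @bad;
    return @bad;
}

# A lookup of a plain '(' simply misses in the stand-in table, with no
# error or warning, which is why the scanner never counts it.
for my $name (qw( depth_index buggy_index )) {
    my $table = $name eq 'depth_index' ? \%depth_index : \%buggy_index;
    my $found = exists $table->{'('} ? 'found' : 'missing';
    my @bad   = bad_keys($table);
    print "$name: '(' is $found; suspicious keys: @bad\n";
}
```

A check like bad_keys, run once at startup, would turn the silent miss into
something visible.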

The irony of course is that the bug is a trivial syntax problem -- a typo,
almost -- and it has precisely everything to do with paren-counting.

Of all the possible places in the coding universe to rediscover the hard
way that Perl does paren-counting on qw()-rvalues, it had to be this.  :)
I'll ne'er forget it now!

--Todd


