Subject: Re: Regular Expression help?
Newsgroups: lugnet.off-topic.geek
Date: Tue, 8 Jul 2003 19:02:52 GMT
In lugnet.off-topic.geek, Dan Boger wrote:
> On Tue, Jul 08, 2003 at 06:44:29PM +0000, David Eaton wrote:
> > For *BORDER* tags, you're in luck:
> >
> > /<img(\s+[^bB]\w+=("[^"]"|\S*))*>/i
>                           ^
> missing "*" here --------/

Whoops, yep.

> > But that is on the (valid?) presumption that the ONLY attribute for an
> > <img> tag that starts with the letter "B" is "BORDER". I suppose I
> > could be wrong what with all the custom attributes that MSIE has, etc.
> > But if there ARE other attributes that start with B, then the above
> > DOESN'T work, since it'll ignore anything that's got an attribute that
> > starts with B.
>
> wouldn't this also break on this:
>
> <img src="1.jpg" alt="Some Text with Contrived B=something in it">

Wouldn't it be fine, since the pattern matches on the double quote first? I.e.,
it would match alt="........." as a single unit? I'll test...

Ran as a test:

use strict;
use warnings;

# Each test tag is annotated with the result the pattern should give.
foreach (
    '<img src="foo.jpg">',                                   # matches
    '<img border="" src="foo.jpg">',                         # no match
    '<img border= src="foo.jpg">',                           # no match
    '<img src="foo.jpg" border="0">',                        # no match
    '<img src="foo.jpg" border=0>',                          # no match
    '<img src="foo.jpg" border="0" height="75">',            # no match
    '<img src="foo.jpg" border=0 height="75">',              # no match
    '<img src="foo.jpg" width="50">',                        # matches
    '<img src="1.jpg" alt="Some Text with Contrived B=something in it">',             # matches
    '<img src="1.jpg" alt="Some Text with Contrived B=something in it" border=1>',    # no match
    '<img src="1.jpg" alt="Some Text with Contrived B=something in it" border="1">',  # no match
    '<img src="1.jpg" border=1 alt="Some Text with Contrived B=something in it">',    # no match
    '<img src="1.jpg" border="1" alt="Some Text with Contrived B=something in it">',  # no match
) {
    # Match an <img> tag only if none of its attribute names begins with "b".
    if (/<img(\s+[^bB]\w+=("[^"]*"|\S*))*>/i) {
        print "Y - $_\n";
    } else {
        print "N - $_\n";
    }
}

Output:

Y - <img src="foo.jpg">
N - <img border="" src="foo.jpg">
N - <img border= src="foo.jpg">
N - <img src="foo.jpg" border="0">
N - <img src="foo.jpg" border=0>
N - <img src="foo.jpg" border="0" height="75">
N - <img src="foo.jpg" border=0 height="75">
Y - <img src="foo.jpg" width="50">
Y - <img src="1.jpg" alt="Some Text with Contrived B=something in it">
N - <img src="1.jpg" alt="Some Text with Contrived B=something in it" border=1>
N - <img src="1.jpg" alt="Some Text with Contrived B=something in it" border="1">
N - <img src="1.jpg" border=1 alt="Some Text with Contrived B=something in it">
N - <img src="1.jpg" border="1" alt="Some Text with Contrived B=something in it">

So, it seems to work... But as admitted, only because "BORDER" is the only
<img> attribute starting with a "B"...
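
If other B-attributes ever did turn up, one possible way around that limitation (a rough sketch, not something tested against real data) would be to swap the [^bB] first-letter test for a negative lookahead that rejects only the exact name "border". The attribute name "bogus" below is made up purely for illustration, and the variable names are hypothetical:

```perl
use strict;
use warnings;

# Pattern from the post: rejects any attribute whose name starts with "b".
my $by_first_letter = qr/<img(\s+[^bB]\w+=("[^"]*"|\S*))*>/i;

# Sketch of an alternative: the negative lookahead (?!border\b) rejects only
# the exact attribute name "border" and lets other b-names through.
my $by_lookahead = qr/<img(\s+(?!border\b)\w+=("[^"]*"|\S*))*>/i;

# "bogus" is a made-up attribute name, used only for illustration.
foreach my $tag (
    '<img src="foo.jpg" bogus="1">',
    '<img src="foo.jpg" border="0">',
    '<img src="foo.jpg" alt="B=something">',
) {
    # First column: original pattern. Second column: lookahead variant.
    printf "%s %s - %s\n",
        ($tag =~ $by_first_letter ? 'Y' : 'N'),
        ($tag =~ $by_lookahead    ? 'Y' : 'N'),
        $tag;
}
```

The two patterns should differ only on the first tag: the original rejects it because "bogus" starts with a B, while the lookahead version accepts it. Both should still reject the border="0" tag and accept the one with B= inside the quoted alt text.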

DaveE



Message has 1 Reply:
  Re: Regular Expression help?
 
(...) David, Thanks for this additional approach. I thought that I had tried something similar to this, but I must have gotten some portion of it wrong, as I could get no matches in my data. Thanks again, -Andy Lynch (21 years ago, 9-Jul-03, to lugnet.off-topic.geek)

Message is in Reply To:
  Re: Regular Expression help?
 
(...) ^ missing "*" here --------/ (...) wouldn't this also break on this: <img src="1.jpg" alt="Some Text with Contrived B=something in it"> :) But other than that, good idea to look at what's valid :) (21 years ago, 8-Jul-03, to lugnet.off-topic.geek)
