To LUGNET HomepageTo LUGNET News HomepageTo LUGNET Guide Homepage
 Help on Searching
 
Post new message to lugnet.off-topic.geekOpen lugnet.off-topic.geek in your NNTP NewsreaderTo LUGNET News Traffic PageSign In (Members)
 Off-Topic / Geek / 2499
2498  |  2500
Subject: 
Perl5 regexen: exponential backtracking behavior when mixing \n and \s
Newsgroups: 
lugnet.off-topic.geek
Date: 
Sun, 17 Dec 2000 03:16:22 GMT
Viewed: 
47 times
  
Anyone out there running Perl 5.6 or higher?  Try running this little test
proggy that compares the approximate CPU execution times of two regexen:

-----------------------------------begin-----------------------------------
#!/bin/perl

use strict; $^W = 1;

sub cputime
{
   my ($user, $sys, $cuser, $csys) = times();
   return $user + $sys + $cuser + $csys;
}

my $nmax = 22;

print "\n", 's/(?:\n[ \t]*)+$//', "\n";
for my $n (0 .. $nmax)
{
   my $foo = "TOP\n" . ("\n" x $n) . "BOTTOM\n";
   my $t0 = cputime();
   $foo =~ s/(?:\n[ \t]*)+$//;
   my $t1 = cputime();
   printf "%s %.2f\n", $n, $t1 - $t0;
}

print "\n", 's/(?:\n\s*)+$//', "\n";
for my $n (0 .. $nmax)
{
   my $foo = "TOP\n" . ("\n" x $n) . "BOTTOM\n";
   my $t0 = cputime();
   $foo =~ s/(?:\n\s*)+$//;
   my $t1 = cputime();
   printf "%s %.2f\n", $n, $t1 - $t0;
}
------------------------------------end------------------------------------


I ran it on four systems and two versions of Perl...

Notes:

1.  CPU timing is accurate only to approximately 1/100 second -- depends on
    the system's clock tick.  Actual resolution is 1/128 second on some
    systems.  CPU times of 0.01 seconds are sampling noise.

2.  The execution times themselves isn't important -- it's the fact that they
    go up exponentially in the \n\s* case and don't in the \n[ \t]* case.

--Todd


Perl: 5.005_03
CPU: Single-processor 333 MHz. Pentium II box
OS: Red Hat Linux 6.1

-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00

s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.01
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.01
12 0.02
13 0.04
14 0.09
15 0.18
16 0.32
17 0.63
18 1.29
19 2.56
20 5.10
21 10.13
22 20.37
------------------------------------end------------------------------------


Perl: 5.004_04
CPU: Single-processor 400 MHz. Pentium II box
OS: FreeBSD 2.2.7

-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.01
21 0.00
22 0.00

s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.02
13 0.04
14 0.06
15 0.14
16 0.27
17 0.54
18 1.09
19 2.17
20 4.35
21 8.71
22 17.35
------------------------------------end------------------------------------


Perl: 5.004_04
CPU: 4-processor 333 MHz. Ultra Sparc 20 box
OS: SunOS 5.7

-----------------------------------begin-----------------------------------
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.01
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00

s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.01
13 0.03
14 0.07
15 0.12
16 0.25
17 0.50
18 1.00
19 1.99
20 3.97
21 7.93
22 15.87
------------------------------------end------------------------------------


Perl: 5.005_03
CPU: 16-processor 375 MHz. IBM SP Power3
OS: AIX 4.3

-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.01
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00

s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.02
13 0.03
14 0.08
15 0.15
16 0.30
17 0.60
18 1.20
19 2.40
20 4.81
21 9.61
22 19.19
------------------------------------end------------------------------------



Message has 1 Reply:
  Re: Perl5 regexen: exponential backtracking behavior when mixing \n and \s
 
(...) [lindsey@wolrab ~]$ uname -a Linux wolrab 2.2.16-22smp #1 SMP Tue Aug 22 16:39:21 EDT 2000 i686 unknown [lindsey@wolrab ~]$ cat /etc/redhat-release Red Hat Linux release 7.0 (Guinness) s/(?:\n[ \t]*)+$// 0 0.00 1 0.00 2 0.00 3 0.00 4 0.00 5 (...) (24 years ago, 17-Dec-00, to lugnet.off-topic.geek)

5 Messages in This Thread:


Entire Thread on One Page:
Nested:  All | Brief | Compact | Dots
Linear:  All | Brief | Compact
    

Custom Search

©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR