Subject:
|
Perl5 regexen: exponential backtracking behavior when mixing \n and \s
|
Newsgroups:
|
lugnet.off-topic.geek
|
Date:
|
Sun, 17 Dec 2000 03:16:22 GMT
|
Viewed:
|
64 times
|
| |
| |
Anyone out there running Perl 5.6 or higher? Try running this little test
proggy that compares the approximate CPU execution times of two regexen:
-----------------------------------begin-----------------------------------
#!/bin/perl
use strict; $^W = 1;
sub cputime
{
my ($user, $sys, $cuser, $csys) = times();
return $user + $sys + $cuser + $csys;
}
my $nmax = 22;
print "\n", 's/(?:\n[ \t]*)+$//', "\n";
for my $n (0 .. $nmax)
{
my $foo = "TOP\n" . ("\n" x $n) . "BOTTOM\n";
my $t0 = cputime();
$foo =~ s/(?:\n[ \t]*)+$//;
my $t1 = cputime();
printf "%s %.2f\n", $n, $t1 - $t0;
}
print "\n", 's/(?:\n\s*)+$//', "\n";
for my $n (0 .. $nmax)
{
my $foo = "TOP\n" . ("\n" x $n) . "BOTTOM\n";
my $t0 = cputime();
$foo =~ s/(?:\n\s*)+$//;
my $t1 = cputime();
printf "%s %.2f\n", $n, $t1 - $t0;
}
------------------------------------end------------------------------------
I ran it on four systems and two versions of Perl...
Notes:
1. CPU timing is accurate only to approximately 1/100 second -- depends on
the system's clock tick. Actual resolution is 1/128 second on some
systems. CPU times of 0.01 seconds are sampling noise.
2. The execution times themselves isn't important -- it's the fact that they
go up exponentially in the \n\s* case and don't in the \n[ \t]* case.
--Todd
Perl: 5.005_03
CPU: Single-processor 333 MHz. Pentium II box
OS: Red Hat Linux 6.1
-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00
s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.01
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.01
12 0.02
13 0.04
14 0.09
15 0.18
16 0.32
17 0.63
18 1.29
19 2.56
20 5.10
21 10.13
22 20.37
------------------------------------end------------------------------------
Perl: 5.004_04
CPU: Single-processor 400 MHz. Pentium II box
OS: FreeBSD 2.2.7
-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.01
21 0.00
22 0.00
s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.02
13 0.04
14 0.06
15 0.14
16 0.27
17 0.54
18 1.09
19 2.17
20 4.35
21 8.71
22 17.35
------------------------------------end------------------------------------
Perl: 5.004_04
CPU: 4-processor 333 MHz. Ultra Sparc 20 box
OS: SunOS 5.7
-----------------------------------begin-----------------------------------
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.00
15 0.00
16 0.00
17 0.01
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00
s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.01
13 0.03
14 0.07
15 0.12
16 0.25
17 0.50
18 1.00
19 1.99
20 3.97
21 7.93
22 15.87
------------------------------------end------------------------------------
Perl: 5.005_03
CPU: 16-processor 375 MHz. IBM SP Power3
OS: AIX 4.3
-----------------------------------begin-----------------------------------
s/(?:\n[ \t]*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.00
11 0.00
12 0.00
13 0.00
14 0.01
15 0.00
16 0.00
17 0.00
18 0.00
19 0.00
20 0.00
21 0.00
22 0.00
s/(?:\n\s*)+$//
0 0.00
1 0.00
2 0.00
3 0.00
4 0.00
5 0.00
6 0.00
7 0.00
8 0.00
9 0.00
10 0.01
11 0.01
12 0.02
13 0.03
14 0.08
15 0.15
16 0.30
17 0.60
18 1.20
19 2.40
20 4.81
21 9.61
22 19.19
------------------------------------end------------------------------------
|
|
Message has 1 Reply:
5 Messages in This Thread:
- Entire Thread on One Page:
- Nested:
All | Brief | Compact | Dots
Linear:
All | Brief | Compact
|
|
|
|