GNU bug report logs - #17700
[PATCH] dfa: speed-up for a pattern that many atoms are catenated

Previous Next

Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Thu, 5 Jun 2014 11:41:01 UTC

Severity: normal

Tags: patch

Done: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17700 in the body.
You can then email your comments to 17700 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Thu, 05 Jun 2014 11:41:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Norihiro Tanaka <noritnk <at> kcn.ne.jp>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Thu, 05 Jun 2014 11:41:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: bug-grep <at> gnu.org
Subject: [PATCH] dfa: speed-up for a pattern that many atoms are catenated
Date: Thu, 05 Jun 2014 20:32:44 +0900
[Message part 1 (text/plain, inline)]
When many atoms are catenated, dfamust() is very slow in order that
pushing a string into `in' list is slow.  This change fixes it.

I tested below to confirm the effect.

  $ printf '%08192d\n' 0 | time -p src/grep -f - /dev/null
[0001-dfa-speed-up-for-a-pattern-that-many-atoms-are-caten.patch (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Thu, 05 Jun 2014 16:50:02 GMT) Full text and rfc822 format available.

Message #8 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 17700 <at> debbugs.gnu.org
Cc: Aharon Robbins <arnold <at> skeeve.com>
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Thu, 05 Jun 2014 09:48:41 -0700
[Message part 1 (text/plain, inline)]
On 06/05/2014 04:32 AM, Norihiro Tanaka wrote:
> +      memchr (cp, *lookfor, lenin - (cp - lookin));
> +      if (!cp)

Thanks, but this part can't be right, as memchr's result is discarded.

It seems to me that much of the performance benefit comes from using a 
faster implementation of strstr, and that the DFA code will be better 
off if it simply uses the system strstr rather than rolling its own.  
(The DFA code dates back before strstr was standardized, which is why it 
has its own implementation.)

I installed the attached patch to do that and got a big speedup:

$ printf '%08192d\n' 0 | time -p src/old/grep -f - /dev/null
real 16.14
user 16.13
sys 0.00
$ printf '%08192d\n' 0 | time -p src/grep -f - /dev/null
real 0.79
user 0.79
sys 0.00

Could you please look at the remaining part of your patch and see 
whether it's a win if it's merged to what's now installed? Thanks.

PS.  Aharon, I assume this'll affect Gawk, in that you'll need to 
provide a strstr if you want to be portable to ancient systems that lack 
it.  strstr was standardized in C89 so it'd have to be a pretty ancient 
system, and it may be better just to let this slide.
[diff.patch (text/x-patch, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Thu, 05 Jun 2014 22:50:01 GMT) Full text and rfc822 format available.

Message #11 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Aharon Robbins <arnold <at> skeeve.com>, 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 07:49:22 +0900
Thanks.

Paul Eggert wrote:
> It seems to me that much of the performance benefit comes from using a
> faster implementation of strstr, and that the DFA code will be better
> off if it simply uses the system strstr rather than rolling its own.

`lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires
it.  So I wasn't replace it to strstr();

after `grep: undo part of previous change', no longer faster.

$ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null

master  : real 5.74  user 5.40  sys 0.17
my patch: real 0.08  user 0.04  sys 0.04





Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 00:50:02 GMT) Full text and rfc822 format available.

Message #14 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: Aharon Robbins <arnold <at> skeeve.com>, 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Thu, 05 Jun 2014 17:49:23 -0700
Norihiro Tanaka wrote:
> after `grep: undo part of previous change', no longer faster.
>
> $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null
>
> master  : real 5.74  user 5.40  sys 0.17
> my patch: real 0.08  user 0.04  sys 0.04

This isn't matching the results I get on my platform.  If 
src/d5dfa69/grep is the old version, src/709e7e5/grep the current master 
(i.e., just use system strstr), and src/grep is the old version with 
your patch, I get:

$ printf '%02048d\n' 0 | time -p src/d5dfa69/grep -f - /dev/null
real 0.33
user 0.33
sys 0.00
$ printf '%02048d\n' 0 | time -p src/709e7e5/grep -f - /dev/null
real 0.04
user 0.04
sys 0.00
$ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null
real 0.03
user 0.03
sys 0.00

So it looks like your patch confers some advantage, but on my platform 
almost all the speedup is achieved simply by switching to the system strstr.

> `lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires it.

Yes, that's why I needed to make the most recent change in the current 
master; it arranges for strstr's 2nd arg to be null-terminated.




Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 06:44:02 GMT) Full text and rfc822 format available.

Message #17 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: arnold <at> skeeve.com
To: noritnk <at> kcn.ne.jp, eggert <at> cs.ucla.edu
Cc: 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 00:42:45 -0600
Is strstr() even a good idea? dfa needs to be able to match NUL
bytes in the data.  If this prevents that, then it's a problem.

I'm merely asking - I didn't look hard at where the change was made.
If it matching NUL bytes isn't affected then, no problem.

Thanks,

Arnold

Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> Norihiro Tanaka wrote:
> > after `grep: undo part of previous change', no longer faster.
> >
> > $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null
> >
> > master  : real 5.74  user 5.40  sys 0.17
> > my patch: real 0.08  user 0.04  sys 0.04
>
> This isn't matching the results I get on my platform.  If 
> src/d5dfa69/grep is the old version, src/709e7e5/grep the current master 
> (i.e., just use system strstr), and src/grep is the old version with 
> your patch, I get:
>
> $ printf '%02048d\n' 0 | time -p src/d5dfa69/grep -f - /dev/null
> real 0.33
> user 0.33
> sys 0.00
> $ printf '%02048d\n' 0 | time -p src/709e7e5/grep -f - /dev/null
> real 0.04
> user 0.04
> sys 0.00
> $ printf '%02048d\n' 0 | time -p src/grep -f - /dev/null
> real 0.03
> user 0.03
> sys 0.00
>
> So it looks like your patch confers some advantage, but on my platform 
> almost all the speedup is achieved simply by switching to the system strstr.
>
> > `lookfor' isn't terminated by `\-'\0' in istrstr(), but strstr() requires it.
>
> Yes, that's why I needed to make the most recent change in the current 
> master; it arranges for strstr's 2nd arg to be null-terminated.




Reply sent to Norihiro Tanaka <noritnk <at> kcn.ne.jp>:
You have taken responsibility. (Fri, 06 Jun 2014 13:20:02 GMT) Full text and rfc822 format available.

Notification sent to Norihiro Tanaka <noritnk <at> kcn.ne.jp>:
bug acknowledged by developer. (Fri, 06 Jun 2014 13:20:03 GMT) Full text and rfc822 format available.

Message #22 received at 17700-done <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Aharon Robbins <arnold <at> skeeve.com>, 17700-done <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 22:19:13 +0900
Paul Eggert wrote:
> So it looks like your patch confers some advantage, but on my platform
> almost all the speedup is achieved simply by switching to the system strstr.

First, I tested on CentOS 5.10.  Next, I tested on RHEL 6.5, and get
result as same as you.  strstr() on CentOS 5.10 may be too old.  I'm
sure that this bug has already been fixed.  So closing.

Thanks,
Norihiro





Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 13:44:01 GMT) Full text and rfc822 format available.

Message #25 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: arnold <at> skeeve.com
Cc: eggert <at> cs.ucla.edu, 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 22:43:28 +0900
Arnold Robbins wrote:
> Is strstr() even a good idea? dfa needs to be able to match NUL
> bytes in the data.  If this prevents that, then it's a problem.

dfamust() doesn't change a result, so no problem.  BTW, even before make
this change, dfamusts is terminated by NUL byte.

Thanks,
Norihiro





Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 14:21:01 GMT) Full text and rfc822 format available.

Message #28 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 07:19:36 -0700
[Message part 1 (text/plain, inline)]
Norihiro Tanaka wrote:
> strstr() on CentOS 5.10 may be too old.

Yes it is.  But 'configure' is supposed to detect this.  config.log 
should say something like this:

configure:26283: checking whether strstr works in linear time
configure:26357: gcc -std=gnu99 -o conftest -g -O2   conftest.c  >&5
configure:26357: $? = 0
configure:26357: ./conftest
configure:26357: $? = 142
configure: program exited with status 142

This is how it behaves for me, on RHEL 6.5 and on Solaris 11.1.  Could 
you please investigate why it is not occurring on CentOS 5.10?  For 
example, why does the attached program work?  It should fail.
[strstr-test.c (text/x-csrc, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 14:27:01 GMT) Full text and rfc822 format available.

Message #31 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, arnold <at> skeeve.com
Cc: 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Fri, 06 Jun 2014 07:25:40 -0700
Norihiro Tanaka wrote:
> BTW, even before make
> this change, dfamusts is terminated by NUL byte.

Yes, that's right, this patch doesn't affect whether the DFA code works 
correctly on NUL bytes.  There is a performance issue: if the pattern 
contains NUL bytes the DFA code doesn't operate as efficiently as it 
will does the pattern contains only non-NUL bytes.  But this performance 
issue is unaffected by the recent change.

It's low priority to fix the performance issue, but if we wanted to fix 
it among many other things we should use memmem instead of strstr.




Information forwarded to bug-grep <at> gnu.org:
bug#17700; Package grep. (Fri, 06 Jun 2014 22:47:02 GMT) Full text and rfc822 format available.

Message #34 received at 17700 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17700 <at> debbugs.gnu.org
Subject: Re: bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms
 are catenated
Date: Sat, 07 Jun 2014 07:46:00 +0900
Paul Eggert wrote:
> Could you please investigate why it is not occurring on CentOS 5.10?
> For example, why does the attached program work?  It should fail.

Sorry, I don't update gnulib.  After update, it was returned immediately.
I have also confirmed speed-up in grep.

Thanks,
Norihiro





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 05 Jul 2014 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 291 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.