GNU bug report logs - #17427
grep -v -l and -v -L fail to early terminate

Package: grep

Reported by: Jörn Hees <dev <at> joernhees.de>

Date: Tue, 6 May 2014 22:36:03 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17427 in the body.
You can then email your comments to 17427 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#17427; Package grep. (Tue, 06 May 2014 22:36:03 GMT)

Acknowledgement sent to Jörn Hees <dev <at> joernhees.de>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Tue, 06 May 2014 22:36:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jörn Hees <dev <at> joernhees.de>
To: bug-grep <at> gnu.org
Subject: grep -v -l and -v -L fail to early terminate
Date: Wed, 7 May 2014 00:15:48 +0200
Hi,

I have a bunch of very big files which _should_ follow a simple line format.
I spotted some errors in these files and now want to search for files containing at least one line violating the specified format.
As soon as such a line is found, grep could terminate, but it doesn't seem to.

The use case I describe is neither plain -l (--files-with-matches) nor -L (--files-without-match); it's rather --files-with-at-least-one-non-match.
I tried grep -v -l, which seems to work but doesn't do -l's early termination :(.

Jörn


Toy example for line format 'a b c d':

# insert offender as first line:
echo 'a c d c' > test.tmp

# insert many valid lines (warning ~ 70 MB file):
awk 'BEGIN { for (i=0 ; i < 10000000 ; i++) print "a b c d" }' >> test.tmp

# run grep:
time grep -v -l 'a b c d' test.tmp
test.tmp

real    0m2.758s
user    0m2.692s
sys     0m0.060s


# counterexample, which is very fast (matches the 2nd line and quits):
time grep -l 'a b c d' test.tmp
test.tmp

real    0m0.032s
user    0m0.000s
sys     0m0.032s
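
To make the difference between -l, -L and -v -l concrete, here is a minimal sketch on three tiny files. The file names, the temporary directory, and the added -x and -F options (whole-line, fixed-string matching) are assumptions for illustration and are not part of the report above.

```shell
# Illustrative files for the line format 'a b c d'.
dir=$(mktemp -d)
printf 'a b c d\na b c d\n' > "$dir/valid.txt"    # every line conforms
printf 'a b c d\na c d c\n' > "$dir/mixed.txt"    # one offending line
printf 'a c d c\nx y z\n'   > "$dir/invalid.txt"  # no conforming line

# -l: files with at least one matching line (valid.txt, mixed.txt)
with_match=$(grep -l -xF 'a b c d' "$dir/valid.txt" "$dir/mixed.txt" "$dir/invalid.txt")

# -L: files with no matching line at all (invalid.txt).
# The exit status of -L has varied between GNU grep releases, so it is ignored here.
without_match=$(grep -L -xF 'a b c d' "$dir/valid.txt" "$dir/mixed.txt" "$dir/invalid.txt") || true

# -v -l: files with at least one NON-matching line (mixed.txt, invalid.txt)
with_non_match=$(grep -v -l -xF 'a b c d' "$dir/valid.txt" "$dir/mixed.txt" "$dir/invalid.txt")

printf -- '-l:\n%s\n-L:\n%s\n-v -l:\n%s\n' "$with_match" "$without_match" "$with_non_match"
```

Only -v -l reports mixed.txt together with invalid.txt, which is the "at least one violating line" behaviour the report asks for.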





Information forwarded to bug-grep <at> gnu.org:
bug#17427; Package grep. (Wed, 07 May 2014 07:08:02 GMT)

Message #8 received at 17427 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jörn Hees <dev <at> joernhees.de>, 
 17427 <at> debbugs.gnu.org
Subject: Re: bug#17427: grep -v -l and -v -L fail to early terminate
Date: Wed, 07 May 2014 00:07:24 -0700
How about using -m1?




Information forwarded to bug-grep <at> gnu.org:
bug#17427; Package grep. (Wed, 07 May 2014 07:18:02 GMT)

Message #11 received at 17427 <at> debbugs.gnu.org:

From: Jörn Hees <dev <at> joernhees.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17427 <at> debbugs.gnu.org
Subject: Re: bug#17427: grep -v -l and -v -L fail to early terminate
Date: Wed, 7 May 2014 09:17:45 +0200
On 7 May 2014, at 09:07, Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> How about using -m1?

Ouch :-D, please accept my apologies for not spotting this.
I still think that -m1 should be implied for -v -l and -v -L as it seems to be implied for -l and -L.

This works:

time grep -v -l -m1 'a b c d' test.tmp
test.tmp

real    0m0.032s
user    0m0.000s
sys     0m0.032s

Cheers,
Jörn
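
On grep versions without the fix, the -m1 workaround from this exchange could be wrapped in a small helper. This is only a sketch: the function name files_with_violation is hypothetical, and the -x and -F options are an added assumption so that each line must equal the given string exactly. Note that -m is an extension and is not required by POSIX.

```shell
# Hypothetical helper: list the files that contain at least one line
# which is not exactly the given fixed string.
files_with_violation() {
  pattern=$1
  shift
  # -v  select non-matching lines
  # -l  print only the names of files with a selected line
  # -m1 stop reading each file after the first selected line
  grep -v -l -m1 -xF -- "$pattern" "$@"
}

# Example call (test.tmp as created in the toy example):
#   files_with_violation 'a b c d' test.tmp
```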





Information forwarded to bug-grep <at> gnu.org:
bug#17427; Package grep. (Wed, 07 May 2014 12:06:02 GMT)

Message #14 received at 17427 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Jorn Hees <dev <at> joernhees.de>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 17427 <at> debbugs.gnu.org
Subject: bug#17427: grep -v -l and -v -L fail to early terminate
Date: Wed, 07 May 2014 21:05:02 +0900
It seems that the done_on_match and exit_on_match flags don't take effect when -v (invert) is used.
This patch fixes it.

Norihiro
[0001-grep-done-on-match-for-L-l-or-q-with-invert.patch (text/plain, attachment)]
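
The behaviour at stake can be pictured with a small awk-based sketch. It is not grep's code and does not reproduce the attached patch. It only assumes, as the flag names and the patch title suggest, that a scan may stop at the first selected line, where -v makes the non-matching lines the selected ones. The helper name first_selected_line and its "invert" argument are hypothetical.

```shell
# Hypothetical helper, not grep's implementation: print the number of the
# first selected line, i.e. the point at which a scan of the file could stop.
#   $1 = fixed string to look for
#   $2 = file to scan
#   $3 = "invert" to emulate -v (optional)
first_selected_line() {
  awk -v pat="$1" -v inv="${3:-}" '
    {
      selected = (index($0, pat) > 0)             # plain substring match
      if (inv == "invert") selected = !selected   # -v flips the selection
    }
    selected { print NR; exit }                   # stop at the first selected line
  ' "$2"
}
```

For the toy file from the report, the plain scan could stop at line 2 and the inverted scan at line 1, so neither needs to read the remaining lines.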

Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Thu, 08 May 2014 16:20:03 GMT)

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Thu, 08 May 2014 23:04:01 GMT)

Notification sent to Jörn Hees <dev <at> joernhees.de>:
bug acknowledged by developer. (Thu, 08 May 2014 23:04:02 GMT)

Message #21 received at 17427-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, Jörn Hees
 <dev <at> joernhees.de>
Cc: 17427-done <at> debbugs.gnu.org
Subject: Re: bug#17427: grep -v -l and -v -L fail to early terminate
Date: Thu, 08 May 2014 16:03:11 -0700
Thanks for the patch.  I tweaked its commit message (see first 
attachment).  While reviewing it I found opportunities to clarify and/or 
simplify related code, so I did that too (see second attachment).  Both 
are installed and I am marking this bug report as done.
[0001-grep-improve-performance-of-v-when-combined-with-L-l.patch (text/plain, attachment)]
[0002-grep-simplify-and-clarify-invert-related-code.patch (text/plain, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 06 Jun 2014 11:24:04 GMT)

This bug report was last modified 9 years and 337 days ago.
