GNU bug report logs -
#24941
Early termination bug in grep 2.26
Previous Next
Reported by: Gary Johnson <garyjohn <at> spocom.com>
Date: Mon, 14 Nov 2016 02:42:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 24941 in the body.
You can then email your comments to 24941 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Mon, 14 Nov 2016 02:42:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Gary Johnson <garyjohn <at> spocom.com>
:
New bug report received and forwarded. Copy sent to
bug-grep <at> gnu.org
.
(Mon, 14 Nov 2016 02:42:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
There was some recent discussion on the vim_dev list of a failure to
update a Vim package which was found to be due to an update of grep
from 2.25 to 2.26. The details of the grep behavior are discussed
here:
https://www.linuxquestions.org/questions/slackware-14/pkgtools-grep-bug-in-slackware[64]-current-4175593054/
In short, it seems to be due to the "grep: /dev/null output
speedup" commit of 2016-05-01, af6af288eac28951b5eee1eaaf373e22b2193b7b.
When grep terminates early, it closes the pipe it's reading stdin
from, which terminates the program on the other side of that pipe
early, before that program has completed its task.
Oops.
Regards,
Gary
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Mon, 14 Nov 2016 15:33:02 GMT)
Full text and
rfc822 format available.
Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):
On Sun, Nov 13, 2016 at 5:56 PM, Gary Johnson <garyjohn <at> spocom.com> wrote:
> There was some recent discussion on the vim_dev list of a failure to
> update a Vim package which was found to be due to an update of grep
> from 2.25 to 2.26. The details of the grep behavior are discussed
> here:
>
> https://www.linuxquestions.org/questions/slackware-14/pkgtools-grep-bug-in-slackware[64]-current-4175593054/
>
> In short, it seems to be due to the "grep: /dev/null output
> speedup" commit of 2016-05-01, af6af288eac28951b5eee1eaaf373e22b2193b7b.
> When grep terminates early, it closes the pipe it's reading stdin
> from, which terminates the program on the other side of that pipe
> early, before that program has completed its task.
>
> Oops.
Thank you for the report.
Oops, indeed.
While I see nothing in the POSIX specification
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html)
that requires it to read all input in such a case, I think the current
behavior violates least-surprise.
Here is an example to demonstrate. I believe that POSIX does not
dictate what this code prints:
grep-2.25:
$ (seq 10000000; echo $? 1>&2) | grep . > /dev/null
0
grep-2.26:
$ (seq 10000000; echo $? 1>&2) | grep . > /dev/null
141
Paul, what do you think about making your heuristic apply only for non-pipes?
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Mon, 14 Nov 2016 15:33:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 08:18:02 GMT)
Full text and
rfc822 format available.
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
Jim Meyering wrote:
> Paul, what do you think about making your heuristic apply only for non-pipes?
I'm a bit dubious, as grep exits early for other reasons, and did so before the
patch in question. For example, here:
seq 10000000000 | grep -q .
This is as of grep 2.19 (2014), due to a bug report where grep performed badly
by not exiting early <https://bugs.gnu.org/17427>. Which suggests that we'll get
bug reports in this area no matter what grep does....
Looking at other implementations, Solaris 11 grep is similar with -q. And
FreeBSD-current grep exits early for this case:
seq 10000000000 | grep -f /dev/null
Come to think of it, perhaps GNU grep should do a similar optimization for -f
/dev/null, if only to keep up with FreeBSD.
All the above being said, I am sympathetic to the bug report. Perhaps we can
eliminate the optimization if there are no file arguments and only options in
the set -EFivx are specified. Something like that anyway. The idea is to catch
common cases in old (and really, broken) scripts, without hurting performance in
general.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 08:18:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 19:36:02 GMT)
Full text and
rfc822 format available.
Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):
On Tue, Nov 15, 2016 at 12:17 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
>>
>> Paul, what do you think about making your heuristic apply only for
>> non-pipes?
>
> I'm a bit dubious, as grep exits early for other reasons, and did so before
> the patch in question. For example, here:
>
> seq 10000000000 | grep -q .
>
> This is as of grep 2.19 (2014), due to a bug report where grep performed
> badly by not exiting early <https://bugs.gnu.org/17427>. Which suggests that
> we'll get bug reports in this area no matter what grep does....
The issue of -q is separate, because anyone who has invoked grep with
-q has long been exposed to this behavior already. And behavior could
even be inferred based on the POSIX description. I think it is fine
for grep -q to terminate early in all cases.
My concern is when no "exit-early"-implying option (none of at least
-l, -q, -m) is specified, say within some script that has always
worked, yet grep-2.26 makes OP's example fail most surprisingly when
at a distance (i.e., the invocation of grep was hidden) someone
unwittingly redirects standard output to /dev/null.
> Looking at other implementations, Solaris 11 grep is similar with -q. And
> FreeBSD-current grep exits early for this case:
>
> seq 10000000000 | grep -f /dev/null
>
> Come to think of it, perhaps GNU grep should do a similar optimization for
> -f /dev/null, if only to keep up with FreeBSD.
>
> All the above being said, I am sympathetic to the bug report. Perhaps we can
> eliminate the optimization if there are no file arguments and only options
> in the set -EFivx are specified. Something like that anyway. The idea is to
> catch common cases in old (and really, broken) scripts, without hurting
> performance in general.
I suppose you mean in addition to the S_ISFIFO test? That sounds good.
We should retain the optimization when reading from stdin that is a
non-pipe.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 19:36:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 22:14:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 24941 <at> debbugs.gnu.org (full text, mbox):
On Tue, 15 Nov 2016 11:35:15 -0800
Jim Meyering <jim <at> meyering.net> wrote:
> I suppose you mean in addition to the S_ISFIFO test? That sounds good.
> We should retain the optimization when reading from stdin that is a
> non-pipe.
This can also happen in stdin. If we redirect stdout to /dev/null,
grep-2.26 exits immediately and prompt is returned.
- grep-2.25
$ grep . >/dev/null
a <<input a and newline
b <<input b and newline
c <<input c and newline
- grep-2.26
$ src/grep . >/dev/null
a <<input a and newline
$
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 23:14:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 24941 <at> debbugs.gnu.org (full text, mbox):
On Tue, Nov 15, 2016 at 2:13 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
>
> On Tue, 15 Nov 2016 11:35:15 -0800
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> I suppose you mean in addition to the S_ISFIFO test? That sounds good.
>> We should retain the optimization when reading from stdin that is a
>> non-pipe.
>
> This can also happen in stdin. If we redirect stdout to /dev/null,
> grep-2.26 exits immediately and prompt is returned.
>
> - grep-2.25
>
> $ grep . >/dev/null
> a <<input a and newline
> b <<input b and newline
> c <<input c and newline
>
> - grep-2.26
>
> $ src/grep . >/dev/null
> a <<input a and newline
> $
Good point. While I suspect that would be much less likely to cause
trouble in practice, I would now rephrase:
We should retain the optimization when reading from stdin that is
neither a pipe nor a tty.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Tue, 15 Nov 2016 23:34:01 GMT)
Full text and
rfc822 format available.
Message #32 received at 24941 <at> debbugs.gnu.org (full text, mbox):
On 11/15/2016 03:13 PM, Jim Meyering wrote:
> We should retain the optimization when reading from stdin that is
> neither a pipe nor a tty.
I am toying with the idea of retaining the optimization only if
lseek-to-EOF succeeds, a heuristic that is a bit more restrictive. This
arguably would conform better to the POSIX requirement that when grep
exits "the file offset in the open file description is properly
positioned just past the last byte processed by the utility." See the
INPUT FILES section of
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_04>.
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Wed, 16 Nov 2016 00:11:02 GMT)
Full text and
rfc822 format available.
Message #35 received at 24941 <at> debbugs.gnu.org (full text, mbox):
On Tue, Nov 15, 2016 at 3:33 PM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> On 11/15/2016 03:13 PM, Jim Meyering wrote:
>>
>> We should retain the optimization when reading from stdin that is
>> neither a pipe nor a tty.
>
> I am toying with the idea of retaining the optimization only if lseek-to-EOF
> succeeds, a heuristic that is a bit more restrictive. This arguably would
> conform better to the POSIX requirement that when grep exits "the file
> offset in the open file description is properly positioned just past the
> last byte processed by the utility." See the INPUT FILES section of
> <http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_04>.
I like it. That would make the offset in the input file predictable in
those cases.
Reply sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
You have taken responsibility.
(Sat, 19 Nov 2016 09:49:01 GMT)
Full text and
rfc822 format available.
Notification sent
to
Gary Johnson <garyjohn <at> spocom.com>
:
bug acknowledged by developer.
(Sat, 19 Nov 2016 09:49:01 GMT)
Full text and
rfc822 format available.
Message #40 received at 24941-done <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
This turned into more work than I expected, as I kept finding performance
glitches and/or correctness bugs in the neighborhood. I installed the attached
set of patches. Patch 03 is the crucial one. Patch 10 trivially fixes an earlier
test of mine and I'm too lazy to write a separate email for it.
This fixes the problem for me, so I'm taking the liberty of closing this bug report.
[0001-grep-avoid-unnecessary-isatty-calls.patch (text/x-diff, attachment)]
[0002-grep-improve-diagnostic-on-lseek-failure.patch (text/x-diff, attachment)]
[0003-grep-scale-back-dev-null-speedup.patch (text/x-diff, attachment)]
[0004-grep-drain-the-input-pipe-faster.patch (text/x-diff, attachment)]
[0005-grep-avoid-unnecessary-gettext-call.patch (text/x-diff, attachment)]
[0006-grep-avoid-O-N-2-buffer-reallocation.patch (text/x-diff, attachment)]
[0007-grep-treat-f-dev-null-like-m0.patch (text/x-diff, attachment)]
[0008-grep-tune-f-dev-null.patch (text/x-diff, attachment)]
[0009-grep-f-dev-null-L-PAT-FILE-outputs-FILE.patch (text/x-diff, attachment)]
[0010-tests-use-returns_-rather-than.patch (text/x-diff, attachment)]
Information forwarded
to
bug-grep <at> gnu.org
:
bug#24941
; Package
grep
.
(Sun, 20 Nov 2016 01:41:01 GMT)
Full text and
rfc822 format available.
Message #43 received at 24941-done <at> debbugs.gnu.org (full text, mbox):
On Sat, Nov 19, 2016 at 1:47 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> This turned into more work than I expected, as I kept finding performance
> glitches and/or correctness bugs in the neighborhood. I installed the
> attached set of patches. Patch 03 is the crucial one. Patch 10 trivially
> fixes an earlier test of mine and I'm too lazy to write a separate email for
> it.
>
> This fixes the problem for me, so I'm taking the liberty of closing this bug
> report.
Impressive work. Thanks a lot!
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sun, 18 Dec 2016 12:24:03 GMT)
Full text and
rfc822 format available.
This bug report was last modified 7 years and 133 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.