GNU bug report logs - #17640
grep with -m reads the entire input
Reported by: Marc Aldorasi <m101010a <at> gmail.com>
Date: Fri, 30 May 2014 06:53:02 UTC
Severity: normal
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Report forwarded to bug-grep <at> gnu.org: bug#17640; Package grep. (Fri, 30 May 2014 06:53:02 GMT)
Acknowledgement sent to Marc Aldorasi <m101010a <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 30 May 2014 06:53:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
With grep 2.18, the -m option would cause grep to stop reading input
after printing the requested number of matching lines. With version
2.19, grep reads the entire input before exiting. Interestingly, grep
does not read the entire input if the -c or -C0 options are added in
addition to -m, and also when using -l or -q instead of -m. I believe
this is caused by commit 5122195.
Information forwarded to bug-grep <at> gnu.org: bug#17640; Package grep. (Fri, 30 May 2014 07:09:01 GMT)
Message #8 received at 17640 <at> debbugs.gnu.org:
> With grep 2.18, the -m option would cause grep to stop reading input
> after printing the requested number of matching lines. With version
> 2.19, grep reads the entire input before exiting.
Can you give an example of the failure? What platform are you running
on? I couldn't reproduce the problem on Fedora 20 x86-64. Here's how I
tried:
$ seq 1000000 >million
$ (grep -m1000 0 | wc -l; wc -l) <million
1000
995994
and these numbers look correct to me.
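The expected figures in this check can be derived independently of grep. The following C sketch is not part of the thread; the function names are hypothetical, and it assumes that grep leaves the shared file offset just after the last selected line, so that the second wc -l counts only the lines that follow it.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if the decimal representation of n (n >= 1) contains digit. */
static bool contains_digit(long n, int digit)
{
  assert(digit >= 0 && digit <= 9);
  for (; n > 0; n /= 10)
    if (n % 10 == digit)
      return true;
  return false;
}

/* Line number, within the output of "seq last_line", of the nth line
   that contains digit.  Returns -1 if there are fewer than nth such
   lines.  Hypothetical helper, for illustration only. */
static long line_of_nth_match(long nth, int digit, long last_line)
{
  long seen = 0;
  for (long line = 1; line <= last_line; line++)
    if (contains_digit(line, digit) && ++seen == nth)
      return line;
  return -1;
}
```

Under those assumptions, line_of_nth_match(1000, 0, 1000000) is 4006, which leaves 1000000 - 4006 = 995994 lines for the second wc -l, in agreement with the output shown above.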
Information forwarded to bug-grep <at> gnu.org: bug#17640; Package grep. (Fri, 30 May 2014 15:58:01 GMT)
Message #11 received at 17640 <at> debbugs.gnu.org:
On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <m101010a <at> gmail.com> wrote:
> With grep 2.18, the -m option would cause grep to stop reading input
> after printing the requested number of matching lines. With version
> 2.19, grep reads the entire input before exiting. Interestingly, grep
> does not read the entire input if the -c or -C0 options are added in
> addition to -m, and also when using -l or -q instead of -m. I believe
> this is caused by commit 5122195.
Thanks a lot for the report. Just in time.
I confirm that it's a bug introduced in 2.19.
To test, run "seq 1000000 > million", then
"strace -e read grep 0 million" first using grep-2.18
(shows just a few read syscalls), and then with 2.19,
which shows grep reading the entire million-line file.
Here's an incomplete patch. Obviously there's a lot more
to be added, including NEWS and a nontrivial test. This
was introduced by commit v2.18-140-g6f07900
[grep-m-patch.txt (text/plain, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#17640; Package grep. (Fri, 30 May 2014 16:00:04 GMT)
Message #14 received at 17640 <at> debbugs.gnu.org:
On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <jim <at> meyering.net> wrote:
> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <m101010a <at> gmail.com> wrote:
>> With grep 2.18, the -m option would cause grep to stop reading input
>> after printing the requested number of matching lines. With version
>> 2.19, grep reads the entire input before exiting. Interestingly, grep
>> does not read the entire input if the -c or -C0 options are added in
>> addition to -m, and also when using -l or -q instead of -m. I believe
>> this is caused by commit 5122195.
>
> Thanks a lot for the report. Just in time.
> I confirm that it's a bug introduced in 2.19.
> To test, run "seq 1000000 > million", then
> "strace -e read grep 0 million" first using grep-2.18
> (shows just a few read syscalls), and then with 2.19,
> which shows grep reading the entire million-line file.
Correction: to reproduce, you'll have to insert -m1 in that grep command.
> Here's an incomplete patch. Obviously there's a lot more
> to be added, including NEWS and a nontrivial test. This
> was introduced by commit v2.18-140-g6f07900
Reply sent to Jim Meyering <jim <at> meyering.net>: You have taken responsibility. (Fri, 30 May 2014 16:36:02 GMT)
Notification sent to Marc Aldorasi <m101010a <at> gmail.com>: bug acknowledged by developer. (Fri, 30 May 2014 16:36:03 GMT)
Message #19 received at 17640-done <at> debbugs.gnu.org:
On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <jim <at> meyering.net> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <jim <at> meyering.net> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <m101010a <at> gmail.com> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines. With version
>>> 2.19, grep reads the entire input before exiting. Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m. I believe
>>> this is caused by commit 5122195.
>>
>> Thanks a lot for the report. Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>> "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
>
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>
>> Here's an incomplete patch. Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900
This bears some explanation. I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours
in case there's feedback).
Prior to grep-2.19, with --max-count=N, this first disjunct would
be true after the Nth match, because pending would be 0:
if ((!outleft && !pending) || (nlines && done_on_match))
  goto finish_grep;
However, a seemingly unrelated change affected how "pending" is set:
pending = out_quiet ? 0 : out_after;
We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:
if (out_after < 0)
  out_after = default_context;
But the recent context-related change invalidated that assumption:
- default_context = 0;
+ default_context = -1;
Here's the patch:
[0001-grep-fix-max-count-N-m-N-to-stop-reading-after-Nth-m.txt (text/plain, attachment)]
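The interaction described in this message can be modelled in isolation. The C sketch below is not grep's source: the variable names (outleft, pending, nlines, done_on_match, out_quiet, out_after, default_context) come from the snippets quoted above, while the function names are hypothetical. It also assumes that -c makes out_quiet true and that -l and -q set done_on_match, which would match the reporter's observations but is not stated in the thread. The clamped variant at the end is only one conceivable guard, not necessarily what the attached patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors "if (out_after < 0) out_after = default_context;".
   A negative out_after means no trailing-context option was given. */
static int resolve_out_after(int out_after, int default_context)
{
  return out_after < 0 ? default_context : out_after;
}

/* Mirrors "pending = out_quiet ? 0 : out_after;". */
static int compute_pending(bool out_quiet, int out_after)
{
  return out_quiet ? 0 : out_after;
}

/* Mirrors the test that guards "goto finish_grep;". */
static bool should_stop(long outleft, int pending, long nlines,
                        bool done_on_match)
{
  assert(outleft >= 0);
  return (!outleft && !pending) || (nlines && done_on_match);
}

/* Hypothetical guard (an assumption, not confirmed by the thread):
   never let a negative out_after leak into pending. */
static int compute_pending_clamped(bool out_quiet, int out_after)
{
  int after = out_after < 0 ? 0 : out_after;
  return out_quiet ? 0 : after;
}
```

With default_context equal to 0, a run without context options resolves out_after to 0, so pending is 0 and should_stop(0, 0, 1, false) is true once the Nth line has been printed. With default_context equal to -1, pending becomes -1, !pending is false, and the first disjunct no longer fires, so reading continues. Passing -C0 (out_after of 0), a true out_quiet, or a true done_on_match each restores the early exit in this model.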
Information forwarded to bug-grep <at> gnu.org: bug#17640; Package grep. (Fri, 30 May 2014 19:08:02 GMT)
Message #22 received at 17640-done <at> debbugs.gnu.org:
On Fri, May 30, 2014 at 9:34 AM, Jim Meyering <jim <at> meyering.net> wrote:
...
> Here's the patch:
FYI, I've adjusted the commit log to point to the correct diff:
This bug was introduced by commit v2.18-139-g5122195.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 28 Jun 2014 11:24:04 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.