GNU bug report logs - #13530
head: memory exhausted when printing all from stdin but last P/E bytes


Package: coreutils;

Reported by: Lei Zhang <lei.zhang <at> uwaterloo.ca>

Date: Wed, 23 Jan 2013 02:43:01 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.





Report forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 02:43:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Lei Zhang <lei.zhang <at> uwaterloo.ca>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 23 Jan 2013 02:43:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Lei Zhang <lei.zhang <at> uwaterloo.ca>
To: bug-coreutils <at> gnu.org
Subject: head: memory exhausted when printing all from stdin but last P/E bytes
Date: Tue, 22 Jan 2013 21:32:57 -0500
[Message part 1 (text/plain, inline)]
Hi All,

We found a bug in the `head' program of coreutils 8.20:

Invoking `head -c -P' or `head -c -E' causes memory exhaustion.

However, smaller units (e.g., b, K, M) work fine, and bigger units
(e.g., Z, Y) fail properly (by outputting "number of bytes is so large
that it is not representable").  `-n' also works fine.

A bit of digging reveals that the bug is at line 322 of head.c
(coreutils 8.20):

204: elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t
n_elide_0)
205: {
322:     b = xcalloc (n_bufs, sizeof *b); /** this statement fails **/
398: }

We also examined n_elide and n_bufs before that statement; they are
indeed very large:

n_elide: 1125899906842624
n_bufs: 137438953474

According to the following comment in the source file:

> CAUTION: do not fail (out of memory) when asked to elide
> a ridiculous amount, but when given only a small input.  */

We think this is a bug and bring this issue to your attention. Thanks!
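For context, the reported n_bufs follows directly from the rounding arithmetic just before the failing xcalloc. A minimal sketch of that computation, assuming a READ_BUFSIZE of 8192 (the constant's actual value in head.c may differ):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed buffer size; head.c derives its own READ_BUFSIZE.  */
#define READ_BUFSIZE 8192

/* Reproduce the pointer-array sizing from elide_tail_bytes_pipe.  */
static uintmax_t
compute_n_bufs (uintmax_t n_elide)
{
  /* Round n_elide up to a multiple of READ_BUFSIZE.  */
  uintmax_t rem = READ_BUFSIZE - (n_elide % READ_BUFSIZE);
  uintmax_t n_elide_round = n_elide + rem;
  return n_elide_round / READ_BUFSIZE + 1;
}
```

With n_elide = 2^50 (i.e. `-c -1P`) this yields the reported 137438953474 buffer pointers, roughly 1 TiB for the pointer array alone on a 64-bit system, which explains the immediate allocation failure.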


Environments:

$ uname -a
Linux anti-think 3.7.3-1-ARCH #1 SMP PREEMPT Thu Jan 17 18:52:30 CET 2013
x86_64 GNU/Linux

$ pacman -Qi coreutils
Name           : coreutils
Version        : 8.20-1
URL            : http://www.gnu.org/software/coreutils
Licenses       : GPL3
Groups         : base
Provides       : None
Depends On     : glibc  pam  acl  gmp  libcap
Optional Deps  : None
Required By    : ca-certificates  dbus  filesystem  linux  mkinitcpio  perl
 sysvinit-tools  util-linux
Conflicts With : None
Replaces       : None
Installed Size : 13820.00 KiB
Packager       : Allan McRae <allan <at> archlinux.org>
Architecture   : x86_64
Build Date     : Wed 24 Oct 2012 03:57:11 AM EDT
Install Date   : Sun 28 Oct 2012 01:51:06 PM EDT
Install Reason : Explicitly installed
Install Script : Yes
Description    : The basic file, shell and text manipulation utilities of
the GNU operating system

CPU: Intel(R) Core(TM) i5-3320M @ 2.60GHz
Memory: 4 GB

Best regards,

-- 
Lei Zhang
Department of Electrical and Computer Engineering
University of Waterloo
 200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
[Message part 2 (text/html, inline)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 12:36:02 GMT) Full text and rfc822 format available.

Message #8 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Lei Zhang <lei.zhang <at> uwaterloo.ca>
Cc: 13530 <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Wed, 23 Jan 2013 12:34:39 +0000
On 01/23/2013 02:32 AM, Lei Zhang wrote:
> Hi All,
>
> We found a bug in the `head' program of coreutils 8.20:
>
> Invoking `head -c -P' or `head -c -E' will cause memory exhaustion.
>
> However, smaller units (e.g., b, K, M) work fine; bigger units (e.g., Z, Y)
> fail properly
> (by outputing "number of bytes is so large that it is not representable").
> And `-n' also
> works fine.
>
> A bit dig reveals that the bug is introduced at line 322 of head.c
> (coreutils 8.20):
>
> 204: elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t
> n_elide_0)
> 205: {
> 322:     b = xcalloc (n_bufs, sizeof *b); /** this statement fails **/
> 398: }
>
> We also examined n_elide and n_bufs before that statement. Actually, they
> are very
> large:
>
> n_elide: 1125899906842624
> n_bufs: 137438953474
>
> According to the following comment in the source file:
>
>> CAUTION: do not fail (out of memory) when asked to elide
>> a ridiculous amount, but when given only a small input.  */
>
> We think this is a bug and bring this issue to your attention. Thanks!

There is the argument that we _should_ allocate
everything up front, to indicate immediately
that the system can't (currently) support the requested operation,
but given the 'currently' caveat above I guess it's
more general to fail only when we actually run out of memory?

So to do that, we can either realloc our pointer array
in chunks, or use the simpler approach in the patch below
and take advantage of kernel overcommit strategies.
(calloc is problematic for overcommit, as zeroing the
memory fully allocates it to the process.)
The caveat is that this depends on the kernel's
overcommit configuration and possibly on current memory conditions.

thanks,
Pádraig.

diff --git a/src/head.c b/src/head.c
index cf84a95..909be04 100644
--- a/src/head.c
+++ b/src/head.c
@@ -197,7 +197,7 @@ copy_fd (int src_fd, FILE *o_stream, uintmax_t n_bytes)
   return COPY_FD_OK;
 }

-/* Print all but the last N_ELIDE lines from the input available via
+/* Print all but the last N_ELIDE bytes from the input available via
    the non-seekable file descriptor FD.  Return true upon success.
    Give a diagnostic and return false upon error.  */
 static bool
@@ -314,18 +314,22 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
       size_t n_read;
       bool buffered_enough;
       size_t i, i_next;
+      size_t n_alloc = 0;
       char **b;
       /* Round n_elide up to a multiple of READ_BUFSIZE.  */
       size_t rem = READ_BUFSIZE - (n_elide % READ_BUFSIZE);
       size_t n_elide_round = n_elide + rem;
       size_t n_bufs = n_elide_round / READ_BUFSIZE + 1;
-      b = xcalloc (n_bufs, sizeof *b);
+      b = xnmalloc (n_bufs, sizeof *b);

       buffered_enough = false;
       for (i = 0, i_next = 1; !eof; i = i_next, i_next = (i_next + 1) % n_bufs)
         {
-          if (b[i] == NULL)
-            b[i] = xmalloc (READ_BUFSIZE);
+          if (! buffered_enough)
+            {
+              b[i] = xmalloc (READ_BUFSIZE);
+              n_alloc = i + 1;
+            }
           n_read = full_read (fd, b[i], READ_BUFSIZE);
           if (n_read < READ_BUFSIZE)
             {
@@ -389,7 +393,7 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
         }

     free_mem:
-      for (i = 0; i < n_bufs; i++)
+      for (i = 0; i < n_alloc; i++)
         free (b[i]);
       free (b);

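The idea behind the patch above, allocating the pointer array without zeroing it, allocating each data buffer only when input actually arrives, and tracking how many were allocated so cleanup frees only those, can be sketched in isolation (buffer sizes and the simulated input below are illustrative, and error checks are omitted for brevity):

```c
#include <stdlib.h>
#include <string.h>

/* Simulate reading in_len bytes through a ring of n_bufs buffers,
   allocating each data buffer lazily on first use.  Returns the
   number of buffers actually allocated: with a tiny input, even a
   huge n_bufs costs only the pointer array plus one data buffer.  */
static size_t
lazy_ring (size_t n_bufs, size_t bufsize, size_t in_len)
{
  char **b = malloc (n_bufs * sizeof *b);  /* not calloc: no zeroing */
  size_t n_alloc = 0;
  size_t i = 0;
  int buffered_enough = 0;

  while (in_len > 0)
    {
      if (!buffered_enough)
        {
          b[i] = malloc (bufsize);         /* lazy allocation */
          n_alloc = i + 1;
          if (n_alloc == n_bufs)
            buffered_enough = 1;           /* ring is full; reuse slots */
        }
      size_t n_read = in_len < bufsize ? in_len : bufsize;
      memset (b[i], 0, n_read);            /* stand-in for full_read */
      in_len -= n_read;
      i = (i + 1) % n_bufs;
    }

  for (size_t j = 0; j < n_alloc; j++)     /* free only what was used */
    free (b[j]);
  free (b);
  return n_alloc;
}
```

For example, a ring sized for a huge elide count still allocates a single data buffer when fed three bytes, which is exactly the behavior the patch is after.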




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 12:56:02 GMT) Full text and rfc822 format available.

Message #11 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Lei Zhang <lei.zhang <at> uwaterloo.ca>, 13530 <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Wed, 23 Jan 2013 13:53:47 +0100
On 01/23/2013 01:34 PM, Pádraig Brady wrote:
> There is the argument that we _should_ allocate
> everything up front to indicate immediately
> that the system can't (currently) support the requested operation,
> but given the 'currently' caveat above I guess it's
> more general to fail when we actually run out of mem?

head doesn't "allocate everything up front" - instead, it only
allocates the pointer array which would hold the actual data.
The strategy in elide_tail_lines_pipe() seems more robust:
it allocates memory when needed.

Or another (probably martian) idea:
what about a tmpfile()?

>       free_mem:
> -      for (i = 0; i < n_bufs; i++)
> +      for (i = 0; i < n_alloc; i++)
>           free (b[i]);
>         free (b);

BTW: both in the old and the new version, the loop can break
if (b[i] == 0) because the array is filled from the beginning.

-      for (i = 0; i < n_bufs; i++)
+      for (i = 0; i < n_bufs && b[i]; i++)

This makes "echo 123 | head -c -T" ~40% faster here on my PC.

Have a nice day,
Berny
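Bernhard's early-exit suggestion relies on an invariant: the array is filled from the front, so the first NULL slot marks the end of the allocated prefix. A minimal sketch of such a cleanup loop (the return count is added here only for illustration):

```c
#include <stdlib.h>
#include <stddef.h>

/* Free the data buffers of a pointer array that is filled from the
   front: once b[i] is NULL, no later slot can be non-NULL, so the
   scan stops early instead of walking all n_bufs slots.  Returns
   the number of buffers freed.  */
static size_t
free_prefix (char **b, size_t n_bufs)
{
  size_t i;
  for (i = 0; i < n_bufs && b[i]; i++)
    free (b[i]);
  return i;
}
```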






Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 13:06:01 GMT) Full text and rfc822 format available.

Message #14 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: Lei Zhang <lei.zhang <at> uwaterloo.ca>, 13530 <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Wed, 23 Jan 2013 13:03:55 +0000
On 01/23/2013 12:53 PM, Bernhard Voelker wrote:
> On 01/23/2013 01:34 PM, Pádraig Brady wrote:
>> There is the argument that we _should_ allocate
>> everything up front to indicate immediately
>> that the system can't (currently) support the requested operation,
>> but given the 'currently' caveat above I guess it's
>> more general to fail when we actually run out of mem?
>
> head doesn't "allocate everything up front" - instead, it only
> allocates the pointer array which would hold the actual data.

Sure. I was wondering whether that should change
to allocate everything up front so as to exit early.

> The strategy in elide_tail_lines_pipe() seems more robust:
> it allocates memory when needed.

Yes probably.

> Or another (probably martian) idea:
> what about a tmpfile()?

Not worth it I think.

>>        free_mem:
>> -      for (i = 0; i < n_bufs; i++)
>> +      for (i = 0; i < n_alloc; i++)
>>            free (b[i]);
>>          free (b);
>
> BTW: both in the old and the new version, the loop can break
> if (b[i] == 0) because the array is filled from the beginning.

The new version only frees what's allocated and so gets this benefit too.

> -      for (i = 0; i < n_bufs; i++)
> +      for (i = 0; i < n_bufs && b[i]; i++)
>
> This makes "echo 123 | head -c -T" ~40% faster here on my PC.

nice

thanks,
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 13:23:01 GMT) Full text and rfc822 format available.

Message #17 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Lei Zhang <lei.zhang <at> uwaterloo.ca>, 13530 <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Wed, 23 Jan 2013 14:21:29 +0100
On 01/23/2013 02:03 PM, Pádraig Brady wrote:
> On 01/23/2013 12:53 PM, Bernhard Voelker wrote:
>> head doesn't "allocate everything up front" - instead, it only
>> allocates the pointer array which would hold the actual data.
> 
> Sure. I was wondering whether that should change
> to allocate everything up front so as to exit early.

I think that's not a good idea, because head would ENOMEM
even if it wouldn't need the memory (depending on the
input). Why should the following fail?

  $ echo 123 | head -c -T

I think there's no reason to fail until we really know it would
fail. In this case, we can only know it when we actually come
to the point when we need the memory.

It doesn't matter if we have
  $ echo 123 | head -c -10
or
  $ echo 123 | head -c -P

>>>        free_mem:
>>> -      for (i = 0; i < n_bufs; i++)
>>> +      for (i = 0; i < n_alloc; i++)
>>>            free (b[i]);
>>>          free (b);
>>
>> BTW: both in the old and the new version, the loop can break
>> if (b[i] == 0) because the array is filled from the beginning.
> 
> The new version only frees what's allocated and so gets this benefit too.

ah yes, sorry, I didn't look close enough.

Have a nice day,
Berny




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Wed, 23 Jan 2013 13:29:01 GMT) Full text and rfc822 format available.

Message #20 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: Lei Zhang <lei.zhang <at> uwaterloo.ca>, 13530 <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Wed, 23 Jan 2013 13:27:15 +0000
On 01/23/2013 01:21 PM, Bernhard Voelker wrote:
> On 01/23/2013 02:03 PM, Pádraig Brady wrote:
>> On 01/23/2013 12:53 PM, Bernhard Voelker wrote:
>>> head doesn't "allocate everything up front" - instead, it only
>>> allocates the pointer array which would hold the actual data.
>>
>> Sure. I was wondering whether that should change
>> to allocate everything up front so as to exit early.
>
> I think that's not a good idea, because head would ENOMEM
> even if it wouldn't need the memory (depending on the
> input). Why should the following fail?
>
>    $ echo 123 | head -c -T
>
> I think there's no reason to fail until we really know it would
> fail. In this case, we can only know it when we actually come
> to the point when we need the memory.
>
> It doesn't matter if we have
>    $ echo 123 | head -c -10
> or
>    $ echo 123 | head -c -P

Yes, my point is that the above could become a problem:

today    | head -c -P # This no-op is OK
tomorrow | head -c -P # If we actually send lots of data, we fail

cheers,
Pádraig.




Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Fri, 12 Apr 2013 18:23:02 GMT) Full text and rfc822 format available.

Notification sent to Lei Zhang <lei.zhang <at> uwaterloo.ca>:
bug acknowledged by developer. (Fri, 12 Apr 2013 18:23:02 GMT) Full text and rfc822 format available.

Message #25 received at 13530-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Lei Zhang <lei.zhang <at> uwaterloo.ca>
Cc: 13530-done <at> debbugs.gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Fri, 12 Apr 2013 19:18:49 +0100
[Message part 1 (text/plain, inline)]
On 01/23/2013 12:34 PM, Pádraig Brady wrote:
> On 01/23/2013 02:32 AM, Lei Zhang wrote:
>> Hi All,
>>
>> We found a bug in the `head' program of coreutils 8.20:
>>
>> Invoking `head -c -P' or `head -c -E' will cause memory exhaustion.
>>
>> However, smaller units (e.g., b, K, M) work fine; bigger units (e.g., Z, Y)
>> fail properly
>> (by outputing "number of bytes is so large that it is not representable").
>> And `-n' also
>> works fine.
>>
>> A bit dig reveals that the bug is introduced at line 322 of head.c
>> (coreutils 8.20):
>>
>> 204: elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t
>> n_elide_0)
>> 205: {
>> 322:     b = xcalloc (n_bufs, sizeof *b); /** this statement fails **/
>> 398: }
>>
>> We also examined n_elide and n_bufs before that statement. Actually, they
>> are very
>> large:
>>
>> n_elide: 1125899906842624
>> n_bufs: 137438953474
>>
>> According to the following comment in the source file:
>>
>>> CAUTION: do not fail (out of memory) when asked to elide
>>> a ridiculous amount, but when given only a small input.  */
>>
>> We think this is a bug and bring this issue to your attention. Thanks!
> 
> There is the argument that we _should_ allocate
> everything up front to indicate immediately
> that the system can't (currently) support the requested operation,
> but given the 'currently' caveat above I guess it's
> more general to fail when we actually run out of mem?
> 
> So to do that we can either realloc our pointer array
> in chunks or use the simpler approach in the patch below,
> and take advantage of kernel overcommit strategies.
> (calloc is problematic for overcommit as zeroing the
> memory fully allocates it to the process).
> The caveat with that though is that it's dependent
> on the overcommit config for the kernel and possibly
> current memory conditions.
> 
> thanks,
> Pádraig.
> 
> diff --git a/src/head.c b/src/head.c
> index cf84a95..909be04 100644
> --- a/src/head.c
> +++ b/src/head.c
> @@ -197,7 +197,7 @@ copy_fd (int src_fd, FILE *o_stream, uintmax_t n_bytes)
>    return COPY_FD_OK;
>  }
> 
> -/* Print all but the last N_ELIDE lines from the input available via
> +/* Print all but the last N_ELIDE bytes from the input available via
>     the non-seekable file descriptor FD.  Return true upon success.
>     Give a diagnostic and return false upon error.  */
>  static bool
> @@ -314,18 +314,22 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
>        size_t n_read;
>        bool buffered_enough;
>        size_t i, i_next;
> +      size_t n_alloc = 0;
>        char **b;
>        /* Round n_elide up to a multiple of READ_BUFSIZE.  */
>        size_t rem = READ_BUFSIZE - (n_elide % READ_BUFSIZE);
>        size_t n_elide_round = n_elide + rem;
>        size_t n_bufs = n_elide_round / READ_BUFSIZE + 1;
> -      b = xcalloc (n_bufs, sizeof *b);
> +      b = xnmalloc (n_bufs, sizeof *b);
> 
>        buffered_enough = false;
>        for (i = 0, i_next = 1; !eof; i = i_next, i_next = (i_next + 1) % n_bufs)
>          {
> -          if (b[i] == NULL)
> -            b[i] = xmalloc (READ_BUFSIZE);
> +          if (! buffered_enough)
> +            {
> +              b[i] = xmalloc (READ_BUFSIZE);
> +              n_alloc = i + 1;
> +            }
>            n_read = full_read (fd, b[i], READ_BUFSIZE);
>            if (n_read < READ_BUFSIZE)
>              {
> @@ -389,7 +393,7 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
>          }
> 
>      free_mem:
> -      for (i = 0; i < n_bufs; i++)
> +      for (i = 0; i < n_alloc; i++)
>          free (b[i]);
>        free (b);

I expect to push soon, the attached more complete fix to realloc the array.

thanks,
Pádraig.

[head-elide-tail-pipe-mem.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Fri, 12 Apr 2013 19:11:01 GMT) Full text and rfc822 format available.

Message #28 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: 13530 <at> debbugs.gnu.org
Cc: lei.zhang <at> uwaterloo.ca, P <at> draigBrady.com
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Fri, 12 Apr 2013 21:06:28 +0200
Pádraig Brady wrote:
...
> I expect to push soon, the attached more complete fix to realloc the array.

Thanks!  That change looks fine and passed tests here, too.




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Fri, 12 Apr 2013 19:25:01 GMT) Full text and rfc822 format available.

Message #31 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org, P <at> draigBrady.com
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Fri, 12 Apr 2013 21:19:56 +0200
On 04/12/2013 09:06 PM, Jim Meyering wrote:
> Pádraig Brady wrote:
> ...
>> I expect to push soon, the attached more complete fix to realloc the array.
> 
> Thanks!  That change looks fine and passed tests here, too.

+1

Have a nice day,
Berny




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 11 May 2013 11:24:03 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Mon, 27 May 2013 02:27:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Mon, 27 May 2013 02:38:03 GMT) Full text and rfc822 format available.

Message #38 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Mon, 27 May 2013 03:35:01 +0100
On 05/27/2013 01:29 AM, Jim Meyering wrote:
> Pádraig Brady wrote:
> 
>> unarchive 13530
>> stop
> 
> Thanks.
> 
> ...
>>> @@ -358,6 +356,14 @@ elide_tail_bytes_pipe (const char *filename, int fd, uintmax_t n_elide_0)
>>>
>>>            if (buffered_enough)
>>>              {
>>> +              if (n_elide_0 != n_elide)
>>> +                {
>>> +                  error (0, 0, _("memory exhausted while reading %s"),
>>> +                         quote (filename));
>>> +                  ok = false;
>>> +                  goto free_mem;
>>> +                }
>>> +
> ...
>> Oh right, it's coming back to me a bit now.
>> So removing these upfront checks only makes sense if the program
>> could, under different amounts of free memory, fulfil the operation
>> up to the specified limit.
>> In this case, though, the program could never fulfil the request,
>> so it's better to fail early, as the code currently does?
> 
> Well, it *can* fulfill the request whenever the request is degenerate,
> i.e., when the size of the input is smaller than N and also small enough
> to be read into memory.

Sure, but...

> Technically, we could handle this case the same way we handle it
> in tac.c: read data from nonseekable FD and write it to a temporary file.
> I'm not sure it's worth the effort here, though.

Yes ideally we could avoid this limit,
though I don't see it as a priority either.

> ...
>>> -(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
>>> +(ulimit -v 20000; head --bytes=-$OFF_T_MAX < /dev/null) || fail=1
> 
> I'm inclined to make the above (nonseekable input) cases succeed,
> for consistency with the seekable-input case like this:
> 
>     : > empty
>     head --bytes=-E empty
> 
> I confess that I did not like the way my manual test ended up
> using so much memory... but it couldn't know if it was going
> to be able to succeed without actually reading/allocating all
> of that space.
> 
> If we give up immediately, we fail unnecessarily in cases like
> the above where the input is smaller than N.
> 
> What do you think?

... I'm inclined to treat a value that could never be
fulfilled in total as invalid.
Otherwise one might run into unexpected limits in the _future_.
This is the same way we immediately error on values over UINTMAX_MAX,
rather than silently truncating them to something we can handle.

thanks,
Pádraig.
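The `n_elide_0 != n_elide` guard quoted earlier in this message works because the uintmax_t request is converted into a size_t for in-memory buffering; on systems where size_t is narrower, the conversion truncates, and comparing back against the original detects it. A minimal sketch of that detection (name shortened; purely illustrative):

```c
#include <stdint.h>
#include <stddef.h>

/* Return 1 if the requested elide count cannot be represented in a
   size_t, i.e. the in-memory buffering strategy could never satisfy
   it once the input grows large enough.  */
static int
elide_request_truncated (uintmax_t n_elide_0)
{
  size_t n_elide = n_elide_0;   /* possibly truncating conversion */
  return n_elide_0 != n_elide;
}
```

On a typical 64-bit GNU/Linux system size_t and uintmax_t have the same width and the check never fires; on 32-bit systems it rejects counts above SIZE_MAX, which is exactly the case the discussed test adjustment works around.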




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 00:09:01 GMT) Full text and rfc822 format available.

Message #41 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Tue, 28 May 2013 02:07:20 +0200
Pádraig Brady wrote:
...
>> -(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
>> +(ulimit -v 20000; head --bytes=-$OFF_T_MAX < /dev/null) || fail=1
>>
>>  Exit $fail
>> --
>> 1.8.3
>>
>
> So the test would have to be adjusted to take the min(SIZE_MAX, OFF_T_MAX)?
>
> HEAD_TAIL_LIMIT=$(printf '%s\n' $SIZE_MAX $OFF_T_MAX | sort -g | head -n1)

I didn't see a reason to include OFF_T_MAX, since this has to do
solely with in-memory buffering, and no seeking.
Also, I had to subtract BUFSIZ, so I made another adjustment.

This works for me:

From 6b20bb5c9c464277f843a1d3f418824275f25f6b Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> fb.com>
Date: Mon, 27 May 2013 17:01:14 -0700
Subject: [PATCH] tests: head-c: avoid spurious failure with a 32-bit SIZE_MAX

* tests/misc/head-c.sh: When eliding N bytes from a non-seekable
input, N must be slightly smaller than SIZE_MAX in order to handle
input longer than N bytes, since the current implementation buffers
N bytes in memory.  This command would fail on 32-bit systems,
where SIZE_MAX < 1E:
  head --bytes=-E < /dev/null
Instead of "E", use a value slightly smaller than SIZE_MAX.
---
 tests/misc/head-c.sh | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 37a86ce..70a6ccc 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -19,6 +19,7 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ head
 require_ulimit_v_
+getlimits_

 # exercise the fix of 2001-08-18, based on test case from Ian Bruce
 echo abc > in || framework_failure_
@@ -28,9 +29,16 @@ case "$(cat out)" in
   *) fail=1 ;;
 esac

+# Use a limit of N = SIZE_MAX - max_BUFSIZ
+# The "- max_BUFSIZ" term is because head must be able to add BUFSIZ
+# to the selected value of N without exceeding SIZE_MAX.
+# Since we've seen BUFSIZ up to 128K, use 256K to be safe.
+max_BUFSIZ=$(expr 256 '*' 1024)
+lim=$(expr $SIZE_MAX - $max_BUFSIZ)
+
 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front
 # based on the value passed to -c
-(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
+(ulimit -v 20000; head --bytes=-$lim < /dev/null) || fail=1

 Exit $fail
--
1.8.3




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 00:28:01 GMT) Full text and rfc822 format available.

Message #44 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org,
	Pádraig Brady <P <at> draigBrady.com>, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Mon, 27 May 2013 17:25:36 -0700
On 05/27/2013 05:07 PM, Jim Meyering wrote:

> +max_BUFSIZ=$(expr 256 '*' 1024)
> +lim=$(expr $SIZE_MAX - $max_BUFSIZ)

Can't this code fail, due to overflow, on non-GMP hosts?  See:

http://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html

and look for "$SIZE_MAX".




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 00:56:02 GMT) Full text and rfc822 format available.

Message #47 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org,
	Pádraig Brady <P <at> draigBrady.com>, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Tue, 28 May 2013 02:54:28 +0200
Paul Eggert wrote:

> On 05/27/2013 05:07 PM, Jim Meyering wrote:
>
>> +max_BUFSIZ=$(expr 256 '*' 1024)
>> +lim=$(expr $SIZE_MAX - $max_BUFSIZ)
>
> Can't this code fail, due to overflow, on non-GMP hosts?  See:
>
> http://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html
>
> and look for "$SIZE_MAX".

Thanks for mentioning that.
I propose to move your subtract_one variable into init.cfg
and then use it like this:

diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 70a6ccc..41ea52b 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -34,7 +34,15 @@ esac
 # to the selected value of N without exceeding SIZE_MAX.
 # Since we've seen BUFSIZ up to 128K, use 256K to be safe.
 max_BUFSIZ=$(expr 256 '*' 1024)
-lim=$(expr $SIZE_MAX - $max_BUFSIZ)
+
+# Normally we would just write this,
+#  lim=$(expr $SIZE_MAX - $max_BUFSIZ)
+# But that fails for non-GMP expr.  See this:
+# https://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html
+# Instead, use that same approach to obtain SIZE_MAX-1, and *then*
+# subtract $max_BUFSIZ.
+lim=$(echo $SIZE_MAX | sed "$subtract_one")
+lim=$(expr $lim - $max_BUFSIZ)

 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 01:06:01 GMT) Full text and rfc822 format available.

Message #50 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org,
	Pádraig Brady <P <at> draigBrady.com>, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Tue, 28 May 2013 03:04:21 +0200
Paul Eggert wrote:

> On 05/27/2013 05:07 PM, Jim Meyering wrote:
>
>> +max_BUFSIZ=$(expr 256 '*' 1024)
>> +lim=$(expr $SIZE_MAX - $max_BUFSIZ)
>
> Can't this code fail, due to overflow, on non-GMP hosts?  See:
>
> http://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html
>
> and look for "$SIZE_MAX".

Here are two patches.
The first factors out the definition into a new function.
The second uses it in the revised head-c-fixing patch.
Both tests still pass, though I haven't yet run them
against a GMP-free expr.

From f97095c19244d61af4172ab457b5bc79081ada79 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> fb.com>
Date: Mon, 27 May 2013 17:59:41 -0700
Subject: [PATCH 1/2] maint: factor out new subtract_one_ function

* tests/misc/cut-huge-range.sh (subtract_one): Move definition
of this sed script to init.cfg so we can use it from another test.
* init.cfg (subtract_one_): New function, from that variable.
---
 init.cfg                     | 24 ++++++++++++++++++++++++
 tests/misc/cut-huge-range.sh | 22 +---------------------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/init.cfg b/init.cfg
index c48607c..b013c4e 100644
--- a/init.cfg
+++ b/init.cfg
@@ -596,4 +596,28 @@ require_gnu_()
     || skip_ 'not running on GNU/Hurd'
 }

+subtract_one_()
+{
+  # sed script to subtract one from the input.
+  # Each input line should consist of a positive decimal number.
+  # Each output line's number is one less than the input's.
+  # There is no limit (other than line length) on the number's magnitude.
+  local subtract_one='
+    s/$/@/
+    : again
+    s/0@/@9/
+    s/1@/0/
+    s/2@/1/
+    s/3@/2/
+    s/4@/3/
+    s/5@/4/
+    s/6@/5/
+    s/7@/6/
+    s/8@/7/
+    s/9@/8/
+    t again
+  '
+  sed "$subtract_one"
+}
+
 sanitize_path_
diff --git a/tests/misc/cut-huge-range.sh b/tests/misc/cut-huge-range.sh
index 7816577..ae7cc70 100755
--- a/tests/misc/cut-huge-range.sh
+++ b/tests/misc/cut-huge-range.sh
@@ -21,31 +21,11 @@ print_ver_ cut
 require_ulimit_v_
 getlimits_

-# sed script to subtract one from the input.
-# Each input line should consist of a positive decimal number.
-# Each output line's number is one less than the input's.
-# There's no limit (other than line length) on the number's magnitude.
-subtract_one='
-  s/$/@/
-  : again
-  s/0@/@9/
-  s/1@/0/
-  s/2@/1/
-  s/3@/2/
-  s/4@/3/
-  s/5@/4/
-  s/6@/5/
-  s/7@/6/
-  s/8@/7/
-  s/9@/8/
-  t again
-'
-
 # Ensure we can cut up to our sentinel value.
 # This is currently SIZE_MAX, but could be raised to UINTMAX_MAX
 # if we didn't allocate memory for each line as a unit.
 # Don't use expr to subtract one, since SIZE_MAX may exceed its maximum value.
-CUT_MAX=$(echo $SIZE_MAX | sed "$subtract_one")
+CUT_MAX=$(echo $SIZE_MAX | subtract_one_ )

 # From coreutils-8.10 through 8.20, this would make cut try to allocate
 # a 256MiB bit vector.  With a 20MB limit on VM, the following would fail.
--
1.8.3
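For readers skimming the patch above, the sed program implements an arbitrary-precision decrement entirely textually, which is how the tests sidestep expr's integer-width limits. A standalone sketch of the same script:

```shell
# Append a sentinel "@", then repeatedly borrow through trailing zeros
# (0@ -> @9) and decrement the first non-zero digit from the right.
subtract_one='
  s/$/@/
  : again
  s/0@/@9/
  s/1@/0/
  s/2@/1/
  s/3@/2/
  s/4@/3/
  s/5@/4/
  s/6@/5/
  s/7@/6/
  s/8@/7/
  s/9@/8/
  t again
'
printf '%s\n' 42 18446744073709551616 | sed "$subtract_one"
# → 41
# → 18446744073709551615
```

Note the second input is 2^64, one more than a 64-bit SIZE_MAX: the decrement works on decimal strings of any length, where expr would overflow.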


From 905cd2b5c503c82894b433767810ff2f1e40b69d Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> fb.com>
Date: Mon, 27 May 2013 17:01:14 -0700
Subject: [PATCH 2/2] tests: head-c: avoid spurious failure with a 32-bit
 SIZE_MAX

* tests/misc/head-c.sh: When eliding N bytes from a non-seekable
input, N must be slightly smaller than SIZE_MAX in order to handle
input longer than N bytes, since the current implementation buffers
N bytes in memory.  This command would fail on 32-bit systems,
where SIZE_MAX < 1E:
  head --bytes=-E < /dev/null
Instead of "E", use a value slightly smaller than SIZE_MAX.
---
 tests/misc/head-c.sh | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 37a86ce..155c7f6 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -19,6 +19,7 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ head
 require_ulimit_v_
+getlimits_

 # exercise the fix of 2001-08-18, based on test case from Ian Bruce
 echo abc > in || framework_failure_
@@ -28,9 +29,24 @@ case "$(cat out)" in
   *) fail=1 ;;
 esac

+# Use a limit of N = SIZE_MAX - max_BUFSIZ
+# The "- max_BUFSIZ" term is because head must be able to add BUFSIZ
+# to the selected value of N without exceeding SIZE_MAX.
+# Since we've seen BUFSIZ up to 128K, use 256K to be safe.
+max_BUFSIZ=$(expr 256 '*' 1024)
+
+# Normally we would just write this,
+#  lim=$(expr $SIZE_MAX - $max_BUFSIZ)
+# But that fails for non-GMP expr.  See this:
+# https://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html
+# Instead, use that same approach to obtain SIZE_MAX-1, and *then*
+# subtract $max_BUFSIZ.
+lim=$(echo $SIZE_MAX | subtract_one_)
+lim=$(expr $lim - $max_BUFSIZ)
+
 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front
 # based on the value passed to -c
-(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
+(ulimit -v 20000; head --bytes=-$lim < /dev/null) || fail=1

 Exit $fail
--
1.8.3
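As background for the commit message above: with a negative count, GNU `head --bytes=-N` prints everything except the last N bytes, and for a non-seekable input the implementation must buffer up to N bytes. A minimal illustration (assuming GNU head):

```shell
# Print all but the last 2 bytes of "abcdef".
printf 'abcdef' | head --bytes=-2
# → abcd
```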




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 01:16:02 GMT) Full text and rfc822 format available.

Message #53 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org,
	Pádraig Brady <P <at> draigBrady.com>, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Mon, 27 May 2013 18:14:32 -0700
On 05/27/2013 06:04 PM, Jim Meyering wrote:
> +lim=$(echo $SIZE_MAX | subtract_one_)
> +lim=$(expr $lim - $max_BUFSIZ)

Sorry, I don't see how this will work either.
It's common for a GMP-less expr to handle values
only up to SIZE_MAX / 2, and subtracting just 1
won't work around that problem.

Maybe divide by 10 instead?  That's easy to do
textually.  (I don't know what the test is about
so I'm not sure what to suggest.)
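Paul's textual divide-by-10 is indeed easy with sed: dropping the final digit of a decimal string is integer division by ten, so the value never has to fit in expr's integer type. A minimal sketch:

```shell
# Drop the last digit: floor division by 10, done purely textually.
echo 18446744073709551615 | sed 's/.$//'
# → 1844674407370955161
```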




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 01:41:01 GMT) Full text and rfc822 format available.

Message #56 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: lei.zhang <at> uwaterloo.ca, 13530 <at> debbugs.gnu.org,
	Pádraig Brady <P <at> draigBrady.com>, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Tue, 28 May 2013 03:38:55 +0200
Paul Eggert wrote:
> On 05/27/2013 06:04 PM, Jim Meyering wrote:
>> +lim=$(echo $SIZE_MAX | subtract_one_)
>> +lim=$(expr $lim - $max_BUFSIZ)
>
> Sorry, I don't see how this will work either.
> It's common for a GMP-less expr to handle values
> only up to SIZE_MAX / 2, and subtracting just 1
> won't work around that problem.

I was concentrating on the 32-bit case, where GMP-less expr
works fine on $SIZE_MAX.  It's on 64-bit that it's a problem, so...

> Maybe divide by 10 instead?  That's easy to do
> textually.  (I don't know what the test is about
> so I'm not sure what to suggest.)

I'd rather not divide by 10 all around.
That'd decrease it too much in the 32-bit case.
I took a different tack:


From 0d81799b18bef79b49d9042111f9dee3312796a7 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> fb.com>
Date: Mon, 27 May 2013 17:01:14 -0700
Subject: [PATCH] tests: head-c: avoid spurious failure with a 32-bit SIZE_MAX

* tests/misc/head-c.sh: When eliding N bytes from a non-seekable
input, N must be slightly smaller than SIZE_MAX in order to handle
input longer than N bytes, since the current implementation buffers
N bytes in memory.  This command would fail on 32-bit systems,
where SIZE_MAX < 1E:
  head --bytes=-E < /dev/null
Instead of "E", use a value slightly smaller than SIZE_MAX.
---
 tests/misc/head-c.sh | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 37a86ce..a81754e 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -19,6 +19,7 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ head
 require_ulimit_v_
+getlimits_

 # exercise the fix of 2001-08-18, based on test case from Ian Bruce
 echo abc > in || framework_failure_
@@ -28,9 +29,27 @@ case "$(cat out)" in
   *) fail=1 ;;
 esac

+# Use a limit of N = SIZE_MAX - max_BUFSIZ
+# The "- max_BUFSIZ" term is because head must be able to add BUFSIZ
+# to the selected value of N without exceeding SIZE_MAX.
+# Since we've seen BUFSIZ up to 128K, use 256K to be safe.
+max_BUFSIZ=$(expr 256 '*' 1024)
+
+# It's ok to use a 10-digit $SIZE_MAX, because expr uses wider intmax_t.
+# However, when $SIZE_MAX is longer, it's a 20-digit quantity that is
+# too large for GMP-disabled expr to handle, so use $SSIZE_MAX there.
+# We'd use $SSIZE_MAX in both cases, but want to keep the number as
+# large as possible for the 32-bit case, since there, it may barely
+# exceed the size of available RAM, while SSIZE_MAX probably will not.
+test $(printf $SIZE_MAX|wc -c) -lt 11 \
+  && big=$SIZE_MAX \
+  || big=$SSIZE_MAX
+
+lim=$(expr $big - $max_BUFSIZ) \
+
 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front
 # based on the value passed to -c
-(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
+(ulimit -v 20000; head --bytes=-$lim < /dev/null) || fail=1

 Exit $fail
--
1.8.3
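A standalone sketch of the patch's width test, with the usual 64-bit limits hard-coded for illustration (the real test obtains them from `getlimits_`). A 32-bit SIZE_MAX (4294967295) is 10 digits, so a GMP-less expr can still subtract from it; the 20-digit 64-bit value cannot be used, so the test falls back to SSIZE_MAX:

```shell
# Assumed 64-bit values, for illustration only.
SIZE_MAX=18446744073709551615
SSIZE_MAX=9223372036854775807

# Fewer than 11 digits => expr's intmax_t can handle $SIZE_MAX directly.
test $(printf $SIZE_MAX | wc -c) -lt 11 \
  && big=$SIZE_MAX \
  || big=$SSIZE_MAX
echo "$big"
# → 9223372036854775807
```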




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Tue, 28 May 2013 09:14:01 GMT) Full text and rfc822 format available.

Message #59 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, Paul Eggert <eggert <at> cs.ucla.edu>,
	13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Tue, 28 May 2013 10:11:42 +0100
On 05/28/2013 02:38 AM, Jim Meyering wrote:
> Paul Eggert wrote:
>> On 05/27/2013 06:04 PM, Jim Meyering wrote:
>>> +lim=$(echo $SIZE_MAX | subtract_one_)
>>> +lim=$(expr $lim - $max_BUFSIZ)
>>
>> Sorry, I don't see how this will work either.
>> It's common for a GMP-less expr to handle values
>> only up to SIZE_MAX / 2, and subtracting just 1
>> won't work around that problem.
> 
> I was concentrating on the 32-bit case, where GMP-less expr
> works fine on $SIZE_MAX.  It's on 64-bit that it's a problem, so...
> 
>> Maybe divide by 10 instead?  That's easy to do
>> textually.  (I don't know what the test is about
>> so I'm not sure what to suggest.)
> 
> I'd rather not divide by 10 all around.
> That'd decrease it too much in the 32-bit case.
> I took a different tack:
> 
> 
>>From 0d81799b18bef79b49d9042111f9dee3312796a7 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <meyering <at> fb.com>
> Date: Mon, 27 May 2013 17:01:14 -0700
> Subject: [PATCH] tests: head-c: avoid spurious failure with a 32-bit SIZE_MAX
> 
> * tests/misc/head-c.sh: When eliding N bytes from a non-seekable
> input, N must be slightly smaller than SIZE_MAX in order to handle
> input longer than N bytes, since the current implementation buffers
> N bytes in memory.  This command would fail on 32-bit systems,
> where SIZE_MAX < 1E:
>   head --bytes=-E < /dev/null
> Instead of "E", use a value slightly smaller than SIZE_MAX.
> ---
>  tests/misc/head-c.sh | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
> index 37a86ce..a81754e 100755
> --- a/tests/misc/head-c.sh
> +++ b/tests/misc/head-c.sh
> @@ -19,6 +19,7 @@
>  . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
>  print_ver_ head
>  require_ulimit_v_
> +getlimits_
> 
>  # exercise the fix of 2001-08-18, based on test case from Ian Bruce
>  echo abc > in || framework_failure_
> @@ -28,9 +29,27 @@ case "$(cat out)" in
>    *) fail=1 ;;
>  esac
> 
> +# Use a limit of N = SIZE_MAX - max_BUFSIZ
> +# The "- max_BUFSIZ" term is because head must be able to add BUFSIZ
> +# to the selected value of N without exceeding SIZE_MAX.
> +# Since we've seen BUFSIZ up to 128K, use 256K to be safe.
> +max_BUFSIZ=$(expr 256 '*' 1024)
> +
> +# It's ok to use a 10-digit $SIZE_MAX, because expr uses wider intmax_t.
> +# However, when $SIZE_MAX is longer, it's a 20-digit quantity that is
> +# too large for GMP-disabled expr to handle, so use $SSIZE_MAX there.
> +# We'd use $SSIZE_MAX in both cases, but want to keep the number as
> +# large as possible for the 32-bit case, since there, it may barely
> +# exceed the size of available RAM, while SSIZE_MAX probably will not.
> +test $(printf $SIZE_MAX|wc -c) -lt 11 \
> +  && big=$SIZE_MAX \
> +  || big=$SSIZE_MAX
> +
> +lim=$(expr $big - $max_BUFSIZ) \

trailing \ is not needed.

The above assumes intmax_t is always wider than size_t,
which might be OK?
Also, since OFF_T_MAX is checked earlier in the code,
it assumes that off_t is wider than size_t, which isn't
the case if _FILE_OFFSET_BITS is not 64.

You could take the approach to just use $SSIZE_MAX,
and assume 64 bit testing covers the 32 bit case implicitly?
That would also assume that off_t was never narrower than size_t,
but that's probably OK.

These limits are painful. Sorry for the churn :(

thanks,
Pádraig.





Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Sun, 02 Jun 2013 03:49:01 GMT) Full text and rfc822 format available.

Message #62 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: lei.zhang <at> uwaterloo.ca, Paul Eggert <eggert <at> cs.ucla.edu>,
	13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Sun, 02 Jun 2013 05:46:53 +0200
Pádraig Brady wrote:
...
>> +# Use a limit of N = SIZE_MAX - max_BUFSIZ
>> +# The "- max_BUFSIZ" term is because head must be able to add BUFSIZ
>> +# to the selected value of N without exceeding SIZE_MAX.
>> +# Since we've seen BUFSIZ up to 128K, use 256K to be safe.
>> +max_BUFSIZ=$(expr 256 '*' 1024)
>> +
>> +# It's ok to use a 10-digit $SIZE_MAX, because expr uses wider intmax_t.
>> +# However, when $SIZE_MAX is longer, it's a 20-digit quantity that is
>> +# too large for GMP-disabled expr to handle, so use $SSIZE_MAX there.
>> +# We'd use $SSIZE_MAX in both cases, but want to keep the number as
>> +# large as possible for the 32-bit case, since there, it may barely
>> +# exceed the size of available RAM, while SSIZE_MAX probably will not.
>> +test $(printf $SIZE_MAX|wc -c) -lt 11 \
>> +  && big=$SIZE_MAX \
>> +  || big=$SSIZE_MAX
>> +
>> +lim=$(expr $big - $max_BUFSIZ) \
>
> trailing \ is not needed.
>
> The above assumes intmax_t is always wider than size_t,
> which might be OK?
> Also, since OFF_T_MAX is checked earlier in the code,
> it assumes that off_t is wider than size_t, which isn't
> the case if _FILE_OFFSET_BITS is not 64.
>
> You could take the approach to just use $SSIZE_MAX,
> and assume 64 bit testing covers the 32 bit case implicitly?
> That would also assume that off_t was never narrower than size_t,
> but that's probably OK.
>
> These limits are painful. Sorry for the churn :(

This started because the test passed on 64-bit, yet failed on 32-bit.
If it were to use $SSIZE_MAX, the test could mistakenly pass when
run without the underlying patch, because on a 32-bit system,
it's not uncommon to have 2^31 bytes of RAM, so the allocation
could succeed.

However, that's so much simpler that I now think it's the way to go.
Thanks!


From aeaeb87c134ce748527ba99e749b7369fcba2438 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> fb.com>
Date: Mon, 27 May 2013 17:01:14 -0700
Subject: [PATCH] tests: head-c: avoid spurious failure with a 32-bit size_t

* tests/misc/head-c.sh: Don't try to elide 1 exabyte, since on
32-bit systems, that number is not representable as a size_t.
This command would fail on 32-bit systems, where SIZE_MAX < 1E:
  head --bytes=-E < /dev/null
Instead of "E", use $SSIZE_MAX.
For discussion, see http://bugs.gnu.org/13530
---
 tests/misc/head-c.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
index 37a86ce..a5b6c6a 100755
--- a/tests/misc/head-c.sh
+++ b/tests/misc/head-c.sh
@@ -19,6 +19,7 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ head
 require_ulimit_v_
+getlimits_

 # exercise the fix of 2001-08-18, based on test case from Ian Bruce
 echo abc > in || framework_failure_
@@ -31,6 +32,6 @@ esac
 # Only allocate memory as needed.
 # Coreutils <= 8.21 would allocate memory up front
 # based on the value passed to -c
-(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
+(ulimit -v 20000; head --bytes=-$SSIZE_MAX < /dev/null) || fail=1

 Exit $fail
--
1.8.3.101.g727a46b




Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Sun, 02 Jun 2013 10:54:01 GMT) Full text and rfc822 format available.

Message #65 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: lei.zhang <at> uwaterloo.ca, Paul Eggert <eggert <at> cs.ucla.edu>,
	13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Sun, 02 Jun 2013 11:51:55 +0100
On 06/02/2013 04:46 AM, Jim Meyering wrote:
> diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
> index 37a86ce..a5b6c6a 100755
> --- a/tests/misc/head-c.sh
> +++ b/tests/misc/head-c.sh
> @@ -19,6 +19,7 @@
>  . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
>  print_ver_ head
>  require_ulimit_v_
> +getlimits_
> 
>  # exercise the fix of 2001-08-18, based on test case from Ian Bruce
>  echo abc > in || framework_failure_
> @@ -31,6 +32,6 @@ esac
>  # Only allocate memory as needed.
>  # Coreutils <= 8.21 would allocate memory up front
>  # based on the value passed to -c
> -(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
> +(ulimit -v 20000; head --bytes=-$SSIZE_MAX < /dev/null) || fail=1

+1

thanks!
Pádraig.





Information forwarded to bug-coreutils <at> gnu.org:
bug#13530; Package coreutils. (Sun, 02 Jun 2013 16:27:01 GMT) Full text and rfc822 format available.

Message #68 received at 13530 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: lei.zhang <at> uwaterloo.ca, Paul Eggert <eggert <at> cs.ucla.edu>,
	13530 <at> debbugs.gnu.org, coreutils <at> gnu.org
Subject: Re: bug#13530: head: memory exhausted when printing all from stdin
	but last P/E bytes
Date: Sun, 02 Jun 2013 18:25:03 +0200
Pádraig Brady wrote:
> On 06/02/2013 04:46 AM, Jim Meyering wrote:
>> diff --git a/tests/misc/head-c.sh b/tests/misc/head-c.sh
>> index 37a86ce..a5b6c6a 100755
>> --- a/tests/misc/head-c.sh
>> +++ b/tests/misc/head-c.sh
>> @@ -19,6 +19,7 @@
>>  . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
>>  print_ver_ head
>>  require_ulimit_v_
>> +getlimits_
>>
>>  # exercise the fix of 2001-08-18, based on test case from Ian Bruce
>>  echo abc > in || framework_failure_
>> @@ -31,6 +32,6 @@ esac
>>  # Only allocate memory as needed.
>>  # Coreutils <= 8.21 would allocate memory up front
>>  # based on the value passed to -c
>> -(ulimit -v 20000; head --bytes=-E < /dev/null) || fail=1
>> +(ulimit -v 20000; head --bytes=-$SSIZE_MAX < /dev/null) || fail=1
>
> +1

Pushed.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 01 Jul 2013 11:24:03 GMT) Full text and rfc822 format available.


