GNU bug report logs - #13098
[PATCH] cut.c: Fix memory leak

Previous Next

Package: coreutils;

Reported by: Cojocaru Alexandru <xojoc <at> gmx.com>

Date: Thu, 6 Dec 2012 02:17:01 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 13098 in the body.
You can then email your comments to 13098 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Thu, 06 Dec 2012 02:17:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Cojocaru Alexandru <xojoc <at> gmx.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 06 Dec 2012 02:17:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Cojocaru Alexandru <xojoc <at> gmx.com>
To: bug-coreutils <at> gnu.org
Subject: [PATCH] cut.c: Fix memory leak
Date: Thu, 6 Dec 2012 03:11:48 +0100
From 82f2b062c0e21d9a0d64f9ceab363d2a79f5a6eb Mon Sep 17 00:00:00 2001
From: Cojocaru Alexandru <xojoc <at> gmx.com>
Date: Thu, 6 Dec 2012 03:03:41 +0100
Subject: [PATCH] cut: fix memory leak

* src/cut.c (set_fields): don't allocate memory for
`printable_field' if there are no finite ranges.
The bug was introduced on 2012-2-7 via commit 2e636af.
---
 src/cut.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/cut.c b/src/cut.c
index 4219d24..87abd15 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -500,14 +500,13 @@ set_fields (const char *fieldstr)
       if (rp[i].hi > max_range_endpoint)
         max_range_endpoint = rp[i].hi;
     }
-  if (max_range_endpoint < eol_range_start)
-    max_range_endpoint = eol_range_start;
 
   /* Allocate an array large enough so that it may be indexed by
      the field numbers corresponding to all finite ranges
      (i.e. '2-6' or '-4', but not '5-') in FIELDSTR.  */
 
-  printable_field = xzalloc (max_range_endpoint / CHAR_BIT + 1);
+  if (n_rp)
+    printable_field = xzalloc (max_range_endpoint / CHAR_BIT + 1);
 
   qsort (rp, n_rp, sizeof (rp[0]), compare_ranges);
 
@@ -531,8 +530,11 @@ set_fields (const char *fieldstr)
 
   if (output_delimiter_specified
       && !complement
-      && eol_range_start && !is_printable_field (eol_range_start))
-    mark_range_start (eol_range_start);
+      && eol_range_start
+      && printable_field && !is_printable_field (eol_range_start))
+    {
+      mark_range_start (eol_range_start);
+    }
 
   free (rp);
 
-- 
1.8.0.1



-- 
Cojocaru Alexandru <xojoc <at> gmx.com>




Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Thu, 06 Dec 2012 16:43:04 GMT) Full text and rfc822 format available.

Notification sent to Cojocaru Alexandru <xojoc <at> gmx.com>:
bug acknowledged by developer. (Thu, 06 Dec 2012 16:43:04 GMT) Full text and rfc822 format available.

Message #10 received at 13098-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Cojocaru Alexandru <xojoc <at> gmx.com>
Cc: 13098-done <at> debbugs.gnu.org
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Thu, 06 Dec 2012 16:42:13 +0000
[Message part 1 (text/plain, inline)]
On 12/06/2012 02:11 AM, Cojocaru Alexandru wrote:
>>From 82f2b062c0e21d9a0d64f9ceab363d2a79f5a6eb Mon Sep 17 00:00:00 2001
> From: Cojocaru Alexandru <xojoc <at> gmx.com>
> Date: Thu, 6 Dec 2012 03:03:41 +0100
> Subject: [PATCH] cut: fix memory leak
>
> * src/cut.c (set_fields): don't allocate memory for
> `printable_field' if there are no finite ranges.
> The bug was introduced on 2012-2-7 via commit 2e636af.
> ---
>   src/cut.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/src/cut.c b/src/cut.c
> index 4219d24..87abd15 100644
> --- a/src/cut.c
> +++ b/src/cut.c
> @@ -500,14 +500,13 @@ set_fields (const char *fieldstr)
>         if (rp[i].hi > max_range_endpoint)
>           max_range_endpoint = rp[i].hi;
>       }
> -  if (max_range_endpoint < eol_range_start)
> -    max_range_endpoint = eol_range_start;
>
>     /* Allocate an array large enough so that it may be indexed by
>        the field numbers corresponding to all finite ranges
>        (i.e. '2-6' or '-4', but not '5-') in FIELDSTR.  */
>
> -  printable_field = xzalloc (max_range_endpoint / CHAR_BIT + 1);
> +  if (n_rp)
> +    printable_field = xzalloc (max_range_endpoint / CHAR_BIT + 1);
>
>     qsort (rp, n_rp, sizeof (rp[0]), compare_ranges);
>
> @@ -531,8 +530,11 @@ set_fields (const char *fieldstr)
>
>     if (output_delimiter_specified
>         && !complement
> -      && eol_range_start && !is_printable_field (eol_range_start))
> -    mark_range_start (eol_range_start);
> +      && eol_range_start
> +      && printable_field && !is_printable_field (eol_range_start))
> +    {
> +      mark_range_start (eol_range_start);
> +    }
>
>     free (rp);

This looks right. subsequent accesses to the now not alloced bit array,
are avoided when max_range_endpoint = 0.

$ valgrind cut-before --output=" " -f1234567- /dev/null 2>&1 | grep reachable
==20949==    still reachable: 154,323 bytes in 2 blocks

$ valgrind cut-after --output=" " -f1234567- /dev/null 2>&1 | grep reachable
==20816==    still reachable: 2 bytes in 1 blocks

I wouldn't describe it as a leak though, rather a redundant allocation.
Since this is an edge case minor performance issue, I don't think
it needs a NEWS entry and will just adjust the commit summary a little
to say "advoid a redundant memory allocation".

Hmm, it might be a bit more consistent to guard all
references to the bit vector array with max_range_endpoint?
How about the attached?

thanks!
Pádraig.

[cut-avoid-alloc.diff (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Thu, 06 Dec 2012 17:38:02 GMT) Full text and rfc822 format available.

Message #13 received at 13098-done <at> debbugs.gnu.org (full text, mbox):

From: Cojocaru Alexandru <xojoc <at> gmx.com>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 13098-done <at> debbugs.gnu.org
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Thu, 6 Dec 2012 17:59:41 +0100
> Hmm, it might be a bit more consistent to guard all
> references to the bit vector array with max_range_endpoint?
> How about the attached?

I checked for `n_rp' and `printable_field' becuase it felt more
natural.
I reasoned as follows: first check if we have any finite ranges (n_rp) if yes,
allocate memory for `printable_field'. Then check if `printable_field'
has been allocated memory, if yes access it.
Anyway both are good solutions.

Best regards,
Cojocaru Alexandru




Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Fri, 07 Dec 2012 04:34:02 GMT) Full text and rfc822 format available.

Message #16 received at 13098 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: 13098 <at> debbugs.gnu.org
Cc: xojoc <at> gmx.com, P <at> draigBrady.com
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Fri, 07 Dec 2012 05:32:53 +0100
Pádraig Brady wrote:
...
> How about the attached?

> From: Cojocaru Alexandru <xojoc <at> gmx.com>
> Date: Thu, 6 Dec 2012 03:03:41 +0100
> Subject: [PATCH] cut: avoid a redundant heap allocation
>
> * src/cut.c (set_fields): Don't allocate memory for
> `printable_field' if there are no finite ranges.
> The extra allocation was introduced via commit v8.10-3-g2e636af.
...

Thanks to both of you.
That's a fine bug fix, actually.
Consider that before, this would fail on my 64-bit system:

    $ : | cut -b999999999999999999-
    cut: memory exhausted
    [Exit 1]
    $

Now, it no longer tries to allocate all that memory, so completes normally:

    $ : | cut -b999999999999999999-
    $




Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Fri, 07 Dec 2012 06:42:01 GMT) Full text and rfc822 format available.

Message #19 received at 13098 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: 13098 <at> debbugs.gnu.org
Cc: xojoc <at> gmx.com, P <at> draigBrady.com
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Fri, 07 Dec 2012 07:41:14 +0100
Jim Meyering wrote:

> Pádraig Brady wrote:
> ...
>> How about the attached?
>
>> From: Cojocaru Alexandru <xojoc <at> gmx.com>
>> Date: Thu, 6 Dec 2012 03:03:41 +0100
>> Subject: [PATCH] cut: avoid a redundant heap allocation
>>
>> * src/cut.c (set_fields): Don't allocate memory for
>> `printable_field' if there are no finite ranges.
>> The extra allocation was introduced via commit v8.10-3-g2e636af.
> ...
>
> Thanks to both of you.
> That's a fine bug fix, actually.
> Consider that before, this would fail on my 64-bit system:
>
>     $ : | cut -b999999999999999999-
>     cut: memory exhausted
>     [Exit 1]
>     $
>
> Now, it no longer tries to allocate all that memory, so completes normally:
>
>     $ : | cut -b999999999999999999-
>     $

Note that on a 32-bit system it fails the same way in both cases:

    $ : | src/cut -b999999999999999999-
    src/cut: byte offset '999999999999999999' is too large
    [Exit 1]

And even with 2^32-1, it'll probably just allocate the memory,
assuming 500MB doesn't cause trouble.
To differentiate portably, we probably have to use ulimit:

Limit VM to 22000k, but make the old cut require about 10x that:

    $ (ulimit -v 22000; :| cut -b$(echo 2^31|bc)-)
    cut: memory exhausted
    [Exit 1]

The new, just-built one passes:

    $ (ulimit -v 22000; :| ./cut -b$(echo 2^31|bc)-)
    $

If someone feels like turning the above into a test
case, please do (and update NEWS).  Otherwise, I'll
get to it over the weekend.




Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Sat, 08 Dec 2012 20:28:02 GMT) Full text and rfc822 format available.

Message #22 received at 13098 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: 13098 <at> debbugs.gnu.org
Cc: xojoc <at> gmx.com, P <at> draigBrady.com
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Sat, 08 Dec 2012 21:27:12 +0100
Jim Meyering wrote:
> Jim Meyering wrote:
>
>> Pádraig Brady wrote:
>> ...
>>> How about the attached?
>>
>>> From: Cojocaru Alexandru <xojoc <at> gmx.com>
>>> Date: Thu, 6 Dec 2012 03:03:41 +0100
>>> Subject: [PATCH] cut: avoid a redundant heap allocation
>>>
>>> * src/cut.c (set_fields): Don't allocate memory for
>>> `printable_field' if there are no finite ranges.
>>> The extra allocation was introduced via commit v8.10-3-g2e636af.
>> ...
>>
>> Thanks to both of you.
>> That's a fine bug fix, actually.
>> Consider that before, this would fail on my 64-bit system:
>>
>>     $ : | cut -b999999999999999999-
>>     cut: memory exhausted
>>     [Exit 1]
>>     $
>>
>> Now, it no longer tries to allocate all that memory, so completes normally:
>>
>>     $ : | cut -b999999999999999999-
>>     $
>
> Note that on a 32-bit system it fails the same way in both cases:
>
>     $ : | src/cut -b999999999999999999-
>     src/cut: byte offset '999999999999999999' is too large
>     [Exit 1]
>
> And even with 2^32-1, it'll probably just allocate the memory,
> assuming 500MB doesn't cause trouble.
> To differentiate portably, we probably have to use ulimit:
>
> Limit VM to 22000k, but make the old cut require about 10x that:
>
>     $ (ulimit -v 22000; :| cut -b$(echo 2^31|bc)-)
>     cut: memory exhausted
>     [Exit 1]
>
> The new, just-built one passes:
>
>     $ (ulimit -v 22000; :| ./cut -b$(echo 2^31|bc)-)
>     $
>
> If someone feels like turning the above into a test
> case, please do (and update NEWS).  Otherwise, I'll
> get to it over the weekend.

Here it is:

From a2833b8399f56bff6fe08eb212c01a5cb1b74606 Mon Sep 17 00:00:00 2001
From: Jim Meyering <jim <at> meyering.net>
Date: Sat, 8 Dec 2012 12:04:14 -0800
Subject: [PATCH] tests: add test case and note that last week's cut change is
 a bug fix

* tests/misc/cut-huge-to-eol-range.sh: New test, showing that
the change in v8.20-51-g7d03466 is a bug fix after all.
* tests/local.mk (all_tests): Add it.
* NEWS (Bug fixes): Mention it.
---
 NEWS                                |  4 ++++
 tests/local.mk                      |  1 +
 tests/misc/cut-huge-to-eol-range.sh | 30 ++++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+)
 create mode 100755 tests/misc/cut-huge-to-eol-range.sh

diff --git a/NEWS b/NEWS
index 7c17869..a7a5bc3 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,10 @@ GNU coreutils NEWS                                    -*- outline -*-

 ** Bug fixes

+  cut with a range like "N-" no longer allocates N/8 bytes.  That buffer
+  would never be used, and allocation failure could cause cut to fail.
+  [bug introduced in coreutils-8.10]
+
   cut no longer accepts the invalid range 0-, which made it print empty lines.
   Instead, cut now fails and emits an appropriate diagnostic.
   [This bug was present in "the beginning".]
diff --git a/tests/local.mk b/tests/local.mk
index d5bb6f7..5eeddd5 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -246,6 +246,7 @@ all_tests =					\
   tests/misc/pwd-option.sh			\
   tests/misc/chcon-fail.sh			\
   tests/misc/cut.pl				\
+  tests/misc/cut-huge-to-eol-range.sh		\
   tests/misc/wc.pl				\
   tests/misc/wc-files0-from.pl			\
   tests/misc/wc-files0.sh			\
diff --git a/tests/misc/cut-huge-to-eol-range.sh b/tests/misc/cut-huge-to-eol-range.sh
new file mode 100755
index 0000000..fea8505
--- /dev/null
+++ b/tests/misc/cut-huge-to-eol-range.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+# Ensure that cut does not allocate mem for a range like -b9999999999999-
+
+# Copyright (C) 2012 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ cut
+require_ulimit_
+
+# 2^31-1
+# From coreutils-8.10 through 8.20, this would make cut try to allocate
+# a 256MiB bit vector.  With a 20MB limit on VM, the following would fail.
+(ulimit -v 20000; : | cut -b2147483647- > err 2>&1) || fail=1
+
+compare /dev/null err || fail=1
+
+Exit $fail
--
1.8.0.1.352.gfb4c622




Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Sun, 09 Dec 2012 12:59:02 GMT) Full text and rfc822 format available.

Message #25 received at 13098 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: xojoc <at> gmx.com, 13098 <at> debbugs.gnu.org
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Sun, 09 Dec 2012 12:57:59 +0000
On 12/08/2012 08:27 PM, Jim Meyering wrote:
> +# 2^31-1

> +# From coreutils-8.10 through 8.20, this would make cut try to allocate
> +# a 256MiB bit vector.  With a 20MB limit on VM, the following would fail.
> +(ulimit -v 20000; : | cut -b2147483647- > err 2>&1) || fail=1
> +
> +compare /dev/null err || fail=1

Perhaps use INT_MAX from getlimits?
In either case +1

thanks,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#13098; Package coreutils. (Sun, 09 Dec 2012 23:17:02 GMT) Full text and rfc822 format available.

Message #28 received at 13098 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: xojoc <at> gmx.com, 13098 <at> debbugs.gnu.org
Subject: Re: bug#13098: [PATCH] cut.c: Fix memory leak
Date: Mon, 10 Dec 2012 00:16:18 +0100
Pádraig Brady wrote:

> On 12/08/2012 08:27 PM, Jim Meyering wrote:
>> +# 2^31-1
>
>> +# From coreutils-8.10 through 8.20, this would make cut try to allocate
>> +# a 256MiB bit vector.  With a 20MB limit on VM, the following would fail.
>> +(ulimit -v 20000; : | cut -b2147483647- > err 2>&1) || fail=1
>> +
>> +compare /dev/null err || fail=1
>
> Perhaps use INT_MAX from getlimits?
> In either case +1

Good idea.  Thanks for the review.
Pushed.  Here's the delta.

diff --git a/tests/misc/cut-huge-to-eol-range.sh b/tests/misc/cut-huge-to-eol-range.sh
index fea8505..4e3ee03 100755
--- a/tests/misc/cut-huge-to-eol-range.sh
+++ b/tests/misc/cut-huge-to-eol-range.sh
@@ -19,11 +19,11 @@
 . "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 print_ver_ cut
 require_ulimit_
+getlimits_

-# 2^31-1
 # From coreutils-8.10 through 8.20, this would make cut try to allocate
 # a 256MiB bit vector.  With a 20MB limit on VM, the following would fail.
-(ulimit -v 20000; : | cut -b2147483647- > err 2>&1) || fail=1
+(ulimit -v 20000; : | cut -b$INT_MAX- > err 2>&1) || fail=1

 compare /dev/null err || fail=1




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 07 Jan 2013 12:24:46 GMT) Full text and rfc822 format available.

This bug report was last modified 11 years and 83 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.