GNU bug report logs - #17553
du unit suggestion


Package: coreutils;

Reported by: Reuben Thomas <rrt <at> sc3d.org>

Date: Thu, 22 May 2014 19:48:02 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17553 in the body.
You can then email your comments to 17553 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#17553; Package coreutils. (Thu, 22 May 2014 19:48:02 GMT)

Acknowledgement sent to Reuben Thomas <rrt <at> sc3d.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 22 May 2014 19:48:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Reuben Thomas <rrt <at> sc3d.org>
To: bug-coreutils <bug-coreutils <at> gnu.org>
Subject: du unit suggestion
Date: Thu, 22 May 2014 20:47:29 +0100
[Message part 1 (text/plain, inline)]
It would be helpful to this addle-pated individual if du would output the
same units as it accepts as SIZE inputs, so that one could more readily
tell whether one was getting 1000-based or 1024-based units.

For additional clarity, it would help if for output the suffix were "B" as
at present for 1000-based units and "iB" for 1024-based (and
correspondingly it would be nice if "iB"-suffixed units were accepted as
input). As far as input goes, it's backwards-compatible; for output it is
not, if other programs are trying to parse the human-readable output, but
maybe that's not a problem.

-- 
http://rrt.sc3d.org

Information forwarded to bug-coreutils <at> gnu.org:
bug#17553; Package coreutils. (Thu, 22 May 2014 22:59:02 GMT)

Message #8 received at 17553 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Reuben Thomas <rrt <at> sc3d.org>
Cc: 17553 <at> debbugs.gnu.org
Subject: Re: bug#17553: du unit suggestion
Date: Thu, 22 May 2014 23:58:07 +0100
On 05/22/2014 08:47 PM, Reuben Thomas wrote:
> It would be helpful to this addle-pated individual if du would output the
> same units as it accepts as SIZE inputs, so that one could more readily
> tell whether one was getting 1000-based or 1024-based units.
> 
> For additional clarity, it would help if for output the suffix were "B" as
> at present for 1000-based units and "iB" for 1024-based (and
> correspondingly it would be nice if "iB"-suffixed units were accepted as
> input). As far as input goes, it's backwards-compatible; for output it is
> not, if other programs are trying to parse the human-readable output, but
> maybe that's not a problem.

Yes, this is not ideal, but du already does half of what you want.
With this file:

  $ truncate -s 1MiB file.in

We can output the appropriate suffixes for a _particular_ power.

  $ du --apparent-size -BKiB file.in
  1024KiB	file.in
  $ du --apparent-size -BKB file.in
  1049kB	file.in
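
The two figures above follow from plain arithmetic, on the assumption
(consistent with the output shown) that du rounds a partial block up to
the next whole block. A minimal sketch; the variable names are only
illustrative:

```shell
# Apparent size of file.in as created by `truncate -s 1MiB` (1024 * 1024 bytes).
bytes=$((1024 * 1024))

# Ceiling division: how many blocks are needed to hold `bytes`.
kib_blocks=$(( (bytes + 1024 - 1) / 1024 ))   # 1024-byte (KiB) blocks
kb_blocks=$(( (bytes + 1000 - 1) / 1000 ))    # 1000-byte (kB) blocks

echo "${kib_blocks}KiB"   # 1024KiB
echo "${kb_blocks}kB"     # 1049kB
```

1048576 bytes is exactly 1024 KiB blocks, but 1048.576 kB blocks, which
rounds up to 1049.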

However if we want to auto scale the number with -h we lose the suffix,
and have an ambiguity:

  $ du --apparent-size -h file.in
  1.0M	file.in
  $ du --apparent-size -h --si file.in
  1.1M	file.in

If you wanted to get auto scaling with suffixes you could use
the new numfmt utility which has various number formatting options.
The advantage of that is it concentrates the myriad of number formatting
options in a single location, and allows processing of numbers before
final presentation by numfmt. For example:

  $ du -B1 . | sort -k1,1n | numfmt --to=iec-i | tail -n5
  104Mi ./gnulib/.git/objects/pack
  216Mi ./gnulib/.git/objects
  218Mi ./gnulib/.git
  274Mi ./gnulib
  479Mi .
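
To see the ambiguity and the numfmt way around it on a single number,
here is a small sketch. It assumes GNU numfmt (coreutils 8.21 or later)
and the C locale; the variable names are illustrative only:

```shell
# Fix the locale so the decimal separator is always "."
LC_ALL=C; export LC_ALL

# 1 MiB expressed in bytes, as for file.in above.
bytes=1048576

# The same number in three presentations.
iec=$(numfmt --to=iec "$bytes")                  # 1024-based, bare suffix
si=$(numfmt --to=si "$bytes")                    # 1000-based, bare suffix
iec_i=$(numfmt --to=iec-i --suffix=B "$bytes")   # 1024-based, explicit "iB"

printf '%s\n' "$iec" "$si" "$iec_i"
```

The first two results carry the same bare "M" suffix, like du -h with
and without --si, while the third states the base explicitly.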

thanks,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#17553; Package coreutils. (Fri, 23 May 2014 10:14:02 GMT)

Message #11 received at 17553 <at> debbugs.gnu.org (full text, mbox):

From: Reuben Thomas <rrt <at> sc3d.org>
To: Pádraig Brady <P <at> draigbrady.com>
Cc: 17553 <at> debbugs.gnu.org
Subject: Re: bug#17553: du unit suggestion
Date: Fri, 23 May 2014 11:13:02 +0100
[Message part 1 (text/plain, inline)]
Thanks for clarifying what is possible. Currently du's behaviour is more
inconsistent than I'd realised. I see that if one wanted more complicated
units it would make sense to use an external utility, but here I'd just
like -h's output to be consistent with --apparent-size's output.

It seems that -iB suffixed units are already accepted as input, which was
my second request, but not documented in the man page. I'd like to fix
this, but without duplicating information. What's best: to add a phrase
saying that SI units are accepted and/or to make a cross-reference to the
relevant node of the info manual?

-- 
http://rrt.sc3d.org

Information forwarded to bug-coreutils <at> gnu.org:
bug#17553; Package coreutils. (Fri, 23 May 2014 12:13:02 GMT)

Message #14 received at 17553 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Reuben Thomas <rrt <at> sc3d.org>
Cc: 17553 <at> debbugs.gnu.org
Subject: Re: bug#17553: du unit suggestion
Date: Fri, 23 May 2014 13:12:29 +0100
[Message part 1 (text/plain, inline)]
tl;dr

  You can get what you want currently by doing:

    du() { env du -B1 "$@" | numfmt --to=iec-i --suffix=B; }

  Though there are some potential changes.
  I'm 60:40 against applying the attached block-size-iec-iB.patch
  I'm 60:40 for applying the attached block-size-help-iB.patch

details below...

On 05/23/2014 11:13 AM, Reuben Thomas wrote:
> Thanks for clarifying what is possible. Currently du's behaviour is more
> inconsistent than I'd realised. I see that if one wanted more complicated
> units it would make sense to use an external utility, but here I'd just
> like -h's output to be consistent with --apparent-size's output.

With -B '..B' output you mean?

To show all current output possibilities:

$ dd bs=1MiB count=2 if=/dev/zero of=file.test
$ for B in 1 1000 1024 KB KiB si human; do
    printf '%s (%s)\n' "$(du -B$B file.test)" "$B"
  done
2097152	file.test (1)
2098	file.test (1000)
2048	file.test (1024)
2098kB	file.test (KB)
2048KiB	file.test (KiB)
2.1M	file.test (si)
2.0M	file.test (human)

For backward compatibility I'm a bit reluctant to have -h (or -Bhuman)
without --si output the 'i' suffix. It is tempting, but another reason
not to do this is that the current implementation would also output
the 'B' suffix, which might be OK for du, but we also need to consider
ls and df, where this is also significant.

I guess we could extend the --block-size option to support this
(simple patch is attached)

$ for B in iec-iB si-B; do printf '%s (%s)\n' "$(src/du -B$B file.test)" "$B"; done
2.0MiB	file.test (iec-iB)
2.1MB	file.test (si-B)

However, as mentioned, the current implementation doesn't support
outputting the 'i' IEC distinguishing character without the 'B' suffix.
Also, given the messiness in this area, I'm reluctant to add to it.
numfmt gives all the flexibility you need here, and you can set up:

  du() { env du -B1 "$@" | numfmt --to=iec-i --suffix=B; }

That would be more general and just about as awkward as:

  alias du='du -Biec-iB'
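
A sketch of the wrapper approach in use, assuming GNU du, numfmt and
truncate are available. The function name du_iec is hypothetical, chosen
so that the real du is not shadowed (hence `command du` rather than
`env du`):

```shell
LC_ALL=C; export LC_ALL

# Hypothetical wrapper name; the real `du` stays reachable.
du_iec() {
    # -B1 makes du print sizes in bytes; numfmt then rewrites only the
    # first field and passes the rest of each line (the name) through.
    command du -B1 "$@" | numfmt --to=iec-i --suffix=B
}

# Demo on a throwaway sparse file with a 1 MiB apparent size.
tmpdir=$(mktemp -d)
truncate -s 1MiB "$tmpdir/file.in"
line=$(du_iec --apparent-size "$tmpdir/file.in")
size=${line%%[[:space:]]*}    # keep only the size column
echo "$size"
rm -rf "$tmpdir"
```

Because the sizes stay as plain byte counts until the last stage, a
sort can be placed between du and numfmt, as in the earlier pipeline.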

> It seems that -iB suffixed units are already accepted as input, which was
> my second request, but not documented in the man page. I'd like to fix
> this, but without duplicating information. What's best: to add a phrase
> saying that SI units are accepted and/or to make a cross-reference to the
> relevant node of the info manual?

Yes, the man page doesn't mention that the integer is optional or
the consequences of that, nor does it mention the 'human' and 'si' possibilities.
With man pages it is tricky to get the right level of info.
BTW, the note that the integer is optional was removed from the description in:
  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=9939#53
I've attached a proposed patch to show -B'MiB' as an option,
while leaving the SIZE description as before.
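
To illustrate the consequence of omitting the integer, here is a small
sketch, assuming GNU du and truncate; the temporary file and variable
names are illustrative only:

```shell
tmpdir=$(mktemp -d)
truncate -s 1MiB "$tmpdir/file.in"

# With an explicit integer the block count is printed bare...
with_int=$(du --apparent-size -B1KiB "$tmpdir/file.in" | cut -f1)
# ...while omitting the integer makes du append the unit to the count.
without_int=$(du --apparent-size -BKiB "$tmpdir/file.in" | cut -f1)

echo "$with_int $without_int"
rm -rf "$tmpdir"
```

Both forms use the same 1024-byte block; only the second labels its
output with the unit.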

thanks,
Pádraig.
[block-size-iec-iB.patch (text/x-patch, attachment)]
[block-size-help-iB.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#17553; Package coreutils. (Fri, 23 May 2014 13:03:01 GMT)

Message #17 received at 17553 <at> debbugs.gnu.org (full text, mbox):

From: Reuben Thomas <rrt <at> sc3d.org>
To: Pádraig Brady <P <at> draigbrady.com>
Cc: 17553 <at> debbugs.gnu.org
Subject: Re: bug#17553: du unit suggestion
Date: Fri, 23 May 2014 14:02:40 +0100
[Message part 1 (text/plain, inline)]
On 23 May 2014 13:12, Pádraig Brady <P <at> draigbrady.com> wrote:

> tl;dr
>
>   You can get what you want currently by doing:
>
>     du() { env du -B1 "$@" | numfmt --to=iec-i --suffix=B; }
>

Thanks very much, that's certainly good enough for me.

My understanding of the rest of what you wrote is that fixing the
inconsistency in du's output is complicated by backwards compatibility, and
because the code used to do so is used elsewhere in coreutils. Pity. It'd
certainly be nice to see consistent output by default.

> With -B '..B' output you mean?
>

Sorry, yes.

Thanks very much for your efforts.

Added tag(s) notabug. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 02 Jun 2014 17:52:02 GMT)

bug closed, send any further explanations to 17553 <at> debbugs.gnu.org and Reuben Thomas <rrt <at> sc3d.org> Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 02 Jun 2014 17:52:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 01 Jul 2014 11:24:03 GMT)

bug unarchived. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 19 Sep 2014 00:06:03 GMT)

Forcibly Merged 17553 18503. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 19 Sep 2014 00:06:03 GMT)

Disconnected #18503 from all other report(s). Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 19 Sep 2014 00:46:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 17 Oct 2014 11:24:05 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.