GNU bug report logs - #54785: for floating point, printf should use double like in C instead of long double
To reply to this bug, email your comments to 54785 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Fri, 08 Apr 2022 09:18:01 GMT) Full text and rfc822 format available.
Acknowledgement sent to Vincent Lefevre <vincent <at> vinc17.net>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 08 Apr 2022 09:18:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
The printf command assumes that floating-point arguments are long double
values, which can yield surprising results, while most of the time the
double type is assumed by applications (for instance, this is the case
of XPath).
For instance:
$ zsh -fc '/usr/bin/printf "%a\n" $((43./2**22))'
0xa.c0000000000025cp-20
instead of
0xa.cp-20
(Note that ksh uses long double internally, but does not ensure the
round trip back to long double, so that this is incorrect anyway; see
https://unix.stackexchange.com/questions/422122/why-does-0-1-expand-to-0-10000000000000001-in-zsh
by Stephane Chazelas.)
I suppose that the issue is at the parsing level (after parsing the
value as a double, i.e. rounding to a double, the resulting binary
value can internally be stored either in a double or a long double
without changing its value).
I suggest parsing the argument as a "long double" only if the "L"
length modifier is provided, like in C.
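The parsing-level explanation can be checked with exact rational arithmetic. The Python sketch below is only an illustration, not the coreutils code: `round_to_bits` is a hypothetical helper that rounds a decimal string to the nearest value with a given significand width (ties to even, exponent range ignored). It assumes 53 bits for double and 64 bits for the x86 extended long double.

```python
from fractions import Fraction

def round_to_bits(text, bits):
    """Round the decimal string `text` to the nearest value whose
    significand has `bits` bits (ties to even). Hypothetical helper:
    subnormals and overflow are ignored."""
    x = Fraction(text)
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e such that 2**(bits-1) <= x / 2**e < 2**bits.
    e = x.numerator.bit_length() - x.denominator.bit_length() - bits
    while x / Fraction(2) ** e >= 2 ** bits:
        e += 1
    while x / Fraction(2) ** e < 2 ** (bits - 1):
        e -= 1
    n = round(x / Fraction(2) ** e)  # round() on a Fraction rounds ties to even
    return sign * n * Fraction(2) ** e

text = "1.0251998901367188e-05"   # decimal string from the report's test case
exact = Fraction(43, 2 ** 22)     # the value the shell computed

as_double = round_to_bits(text, 53)
as_extended = round_to_bits(text, 64)

# Parsing with 53 bits recovers 43/2**22 exactly.
print(as_double == exact)
# Parsing with 64 bits keeps extra low-order bits from the decimal string,
# shown here in units of 2**-80 (the last place of the 64-bit significand).
print(hex(int((as_extended - exact) * 2 ** 80)))
```

Once the string has been rounded to a double, widening it to 64 bits adds only zero bits, so it would print as 0xa.cp-20 in either type.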
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Sat, 09 Apr 2022 19:32:02 GMT)
Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):
Vincent Lefevre wrote in <https://bugs.gnu.org/54785>:
> $ zsh -fc '/usr/bin/printf "%a\n" $((43./2**22))'
> 0xa.c0000000000025cp-20
>
> instead of
>
> 0xa.cp-20
To summarize, this test case is:
printf '%a\n' 1.0251998901367188e-05
and the problem is that converting 1.0251998901367188e-05 to long double
prints the too-precise "0xa.c0000000000025cp-20", whereas you want it to
convert to double (which matches what most other programs do) and to
print "0xa.cp-20" or equivalent.
> (Note that ksh uses long double internally, but does not ensure the
> round trip back to long double
Yes, ksh messes up here. However, it's more important for Coreutils
printf to be compatible with the GNU shell, and Bash uses long double:
$ echo $BASH_VERSION
5.1.8(1)-release
$ /usr/bin/printf --version | head -n1
printf (GNU coreutils) 8.32
$ printf '%a\n' 1.0251998901367188e-05
0xa.c0000000000025cp-20
$ /usr/bin/printf '%a\n' 1.0251998901367188e-05
0xa.c0000000000025cp-20
> I suggest to parse the argument as a "long double" only if the "L"
> length modifier is provided, like in C.
Thanks, good idea.
I checked, and this also appears to be a POSIX conformance issue. POSIX
says that floating point operands "shall be evaluated as if by the
strtod() function". This means double, not long double.
Whatever decision we make here, we should be consistent with Bash so
I'll cc this email to bug-bash.
I propose that we change both coreutils and Bash to use 'double' rather
than 'long double' here, unless the user specifies the L modifier (e.g.,
"printf '%La\n' ..."). I've written up a patch (attached) to Bash 5.2
alpha to do that. Assuming the Bash maintainer likes this proposal, I
plan to implement something similar for Coreutils printf.
[bash-double.diff (text/x-patch, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Mon, 11 Apr 2022 18:54:01 GMT)
Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):
On 4/9/22 3:31 PM, Paul Eggert wrote:
>> I suggest to parse the argument as a "long double" only if the "L"
>> length modifier is provided, like in C.
> Thanks, good idea.
>
> I checked, and this also appears to be a POSIX conformance issue. POSIX
> says that floating point operands "shall be evaluated as if by the
> strtod() function". This means double, not long double.
>
> Whatever decision we make here, we should be consistent with Bash so I'll
> cc this email to bug-bash.
>
> I propose that we change both coreutils and Bash to use 'double' rather
> than 'long double' here, unless the user specifies the L modifier (e.g.,
> "printf '%La\n' ...". I've written up a patch (attached) to Bash 5.2 alpha
> to do that. Assuming the Bash maintainer likes this proposal, I plan to
> implement something similar for Coreutils printf.
It sounds like there are three cases.
1. If the `L' modifier is supplied, as an extension (POSIX doesn't allow
length modifiers for the printf utility), use long double. This would
work in both default and posix modes.
2. In posix mode, use strtod() and double.
3. In default mode, use the existing code to get the highest possible
precision, as the code has done for over 20 years.
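The three cases amount to a small decision table. The Python sketch below only restates them; `float_type_for` and its parameter names are hypothetical and are not taken from the Bash source.

```python
def float_type_for(has_L_modifier, posix_mode):
    """Return the C type a floating-point operand would be parsed as
    under the three-case proposal (illustrative only)."""
    if has_L_modifier:
        # Case 1: the L extension selects long double in either mode.
        return "long double"
    if posix_mode:
        # Case 2: posix mode parses as if by strtod(), i.e. double.
        return "double"
    # Case 3: default mode keeps the existing high-precision behavior.
    return "long double"

for has_L in (True, False):
    for posix in (True, False):
        print(has_L, posix, float_type_for(has_L, posix))
```

Under this table, a plain `%a` in default mode keeps printing long double digits. Only posix mode switches to double.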
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet <at> case.edu http://tiswww.cwru.edu/~chet/
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Tue, 12 Apr 2022 00:28:02 GMT)
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
On 2022-04-11 14:52:50 -0400, Chet Ramey wrote:
> It sounds like there are three cases.
>
> 1. If the `L' modifier is supplied, as an extension (POSIX doesn't allow
> length modifiers for the printf utility), use long double. This would
> work in both default and posix modes.
>
> 2. In posix mode, use strtod() and double.
>
> 3. In default mode, use the existing code to get the highest possible
> precision, as the code has done for over 20 years.
Do users really need more precision than double in the context of
bash (or Coreutils) printf?
Moreover, if the argument comes from a double written in decimal
(as often done), using more precision will actually show garbage
that was not present initially. This was how I found this issue
with Coreutils printf (as I expected parsing as a double, this
was very disturbing):
cventin% /usr/bin/printf "%a\n" $((12196067*2**(-22)))
0xb.a18e300000000d1p-2
Note also that the "long double" precision depends on the architecture,
so that this may confuse users who work with different architectures.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Mon, 25 Apr 2022 15:04:02 GMT)
Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):
On 4/11/22 11:52, Chet Ramey wrote:
> On 4/9/22 3:31 PM, Paul Eggert wrote:
> It sounds like there are three cases.
>
> 1. If the `L' modifier is supplied, as an extension (POSIX doesn't allow
> length modifiers for the printf utility), use long double. This would
> work in both default and posix modes.
>
> 2. In posix mode, use strtod() and double.
>
> 3. In default mode, use the existing code to get the highest possible
> precision, as the code has done for over 20 years.
That'll fix the POSIX compatibility bug. However, it may be better for
Bash to just do (1) if 'L' is supplied and (2) otherwise, even if this
is less precise than (3). Doing it this simpler way will likely be more
useful for the small number of people who care whether 'printf' uses
'double' or 'long double' internally (and nobody else will care).
Doing it this way is what BSD sh does, and it's good to be compatible
with BSD sh. Similarly for dash and for other shells.
It's also what other GNU programs do. For example, on x86-64:
$ awk 'BEGIN {printf "%.100g\n", 0.1}'
0.1000000000000000055511151231257827021181583404541015625
$ emacs -batch -eval '(message "%.100g" 0.1)'
0.1000000000000000055511151231257827021181583404541015625
$ printf "%.100g\n" 0.1
0.1000000000000000000013552527156068805425093160010874271392822265625
printf is the outlier here, and although its answer is closer to the
mathematical value, that's not as useful as being closer to what most
other apps do.
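The two expansions of 0.1 above can be reproduced with exact integer arithmetic. This is a sketch under the assumption that double has a 53-bit significand and the x86-64 long double a 64-bit one. The helper names `tenth_with_bits` and `exact_decimal` are made up for this illustration.

```python
from fractions import Fraction

def tenth_with_bits(bits):
    """Nearest value to 1/10 with a `bits`-bit significand.
    0.1 lies in [2**-4, 2**-3), so the last place is 2**-(bits + 3)."""
    scale = 2 ** (bits + 3)
    return Fraction(round(Fraction(1, 10) * scale), scale)

def exact_decimal(frac):
    """Exact decimal expansion of a fraction in (0, 1) whose
    denominator is a power of two."""
    k = frac.denominator.bit_length() - 1
    assert frac.denominator == 2 ** k
    # num / 2**k == num * 5**k / 10**k, so the digits are num * 5**k.
    return "0." + str(frac.numerator * 5 ** k).zfill(k)

print(exact_decimal(tenth_with_bits(53)))  # what awk and Emacs print
print(exact_decimal(tenth_with_bits(64)))  # what the long double printf prints
```

Both strings are exact. The difference comes from where the decimal input was rounded, not from the formatting.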
Perhaps it was OK for sh printf to use long double 20 years ago. I even
had a hand in implementing that.[1] But nowadays it feels like a
misfire. The overwhelming majority of apps developed over the past 20
years use newer languages like JavaScript and Java, which do not
support 'long double'; when interoperating with these apps, using 'long
double' in Bash printf likely causes more trouble than it cures.
[1]:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=830de2708207b7e48464e4778b55e582bac49832
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Mon, 25 Apr 2022 18:23:01 GMT)
Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):
On 4/25/22 11:03 AM, Paul Eggert wrote:
> On 4/11/22 11:52, Chet Ramey wrote:
>> On 4/9/22 3:31 PM, Paul Eggert wrote:
>
>> It sounds like there are three cases.
>>
>> 1. If the `L' modifier is supplied, as an extension (POSIX doesn't allow
>> length modifiers for the printf utility), use long double. This would
>> work in both default and posix modes.
>>
>> 2. In posix mode, use strtod() and double.
>>
>> 3. In default mode, use the existing code to get the highest possible
>> precision, as the code has done for over 20 years.
>
> That'll fix the POSIX compatibility bug. However, it may be better for Bash
> to just do (1) if 'L' is supplied and (2) otherwise, even if this is less
> precise than (3). Doing it this simpler way will likely be more useful for
> the small number of people who care whether 'printf' uses 'double' or 'long
> double' internally (and nobody else will care).
Thanks for the input.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet <at> case.edu http://tiswww.cwru.edu/~chet/
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Mon, 25 Apr 2022 19:07:02 GMT)
Message #23 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/25/22 11:22, Chet Ramey wrote:
> Thanks for the input.
You're welcome. Whenever you decide what to do about this, could you
please let us know? I'd like coreutils printf to stay compatible with
Bash printf. Thanks.
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Mon, 25 Apr 2022 23:52:02 GMT)
Message #26 received at submit <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 25, 2022, at 13:06, Paul Eggert wrote:
>
> I'd like coreutils printf to stay compatible with Bash printf. Thanks.
>
Is there any interest/motivation for making {coreutils printf, bash printf} consistent with glibc printf? There's a minor but notable inconsistency between them for the %a format. See
https://lists.gnu.org/archive/html/coreutils/2022-04/msg00020.html
I asked about this on the coreutils list, but no response.
Thanks,
Glenn
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Tue, 26 Apr 2022 01:34:02 GMT)
Message #29 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/25/22 16:50, Glenn Golden wrote:
>
>
> On Mon, Apr 25, 2022, at 13:06, Paul Eggert wrote:
>>
>> I'd like coreutils printf to stay compatible with Bash printf. Thanks.
>>
>
> Is there any interest/motivation for consistentizing {coreutils printf, bash printf} with glibc printf? There's a minor but notable inconsistency between them for %a format. See
>
> https://lists.gnu.org/archive/html/coreutils/2022-04/msg00020.html
>
> I asked about this on the coreutils list, but no response.
To some extent it's the same problem. If Bash and coreutils printf
change to use 'double', they'll output the same thing that C printf outputs.
But to some extent it's a different problem, as the Bash and coreutils
printf use glibc printf with long double, and the latter isn't working
consistently with double. I suppose filing a glibc bug report might
address this different problem.
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Wed, 27 Apr 2022 12:12:02 GMT)
Message #32 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 25, 2022, at 19:33, Paul Eggert wrote:
> On 4/25/22 16:50, Glenn Golden wrote:
>>
>>
>> On Mon, Apr 25, 2022, at 13:06, Paul Eggert wrote:
>>>
>>> I'd like coreutils printf to stay compatible with Bash printf. Thanks.
>>>
>>
>> Is there any interest/motivation for consistentizing {coreutils printf, bash printf} with glibc printf? There's a minor but notable inconsistency between them for %a format. See
>>
>> https://lists.gnu.org/archive/html/coreutils/2022-04/msg00020.html
>>
>> I asked about this on the coreutils list, but no response.
>
> To some extent it's the same problem. If Bash and coreutils printf
> change to use 'double', they'll output the same thing that C printf outputs.
>
> But to some extent it's a different problem, as the Bash and coreutils
> printf use glibc printf with long double, and the latter isn't working
> consistently with double. I suppose filing a glibc bug report might
> address this different problem.
>
Ok, I see what you mean, thanks for the explanation. I'll pose the question (or maybe file a bug report) on the glibc list.
Thanks,
Glenn
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Wed, 27 Apr 2022 17:26:01 GMT)
Message #35 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/27/22 05:10, Glenn Golden wrote:
> Ok, I see what you mean, thanks for the explanation. I'll pose the question (or maybe file a bug report) on the glibc list.
By the way I now think I see a reason for why glibc does things the way
it does: it minimizes output size.
'double' has 53 bits counting the hidden bit: the 52 stored bits give
52/4 = 13 hex digits, plus one leading digit that is either 0
(unnormalized) or 1 (normalized).
'long double' has 64 bits and with 64/4 you have 16 hex digits, where
the leading digit is 0-7 (unnormalized), 8-f (normalized).
Any proposal to change 'long double' to always output leading 0 or 1
needs to deal with the fact that this'd lengthen the output string.
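The two digit layouts can be compared on the value from the original report. The Python sketch below is an illustration only and does not reproduce glibc's code: `hex_float` is a hypothetical formatter whose `lead_bits` argument says how many significand bits go into the digit before the point (1 for the double-style layout, 4 for the 64-bit layout described above).

```python
from fractions import Fraction

def hex_float(x, lead_bits):
    """Format a positive Fraction with a power-of-two denominator in
    %a style, putting `lead_bits` bits in the leading hex digit."""
    # Find e such that 2**(lead_bits-1) <= x / 2**e < 2**lead_bits.
    e = x.numerator.bit_length() - x.denominator.bit_length() - lead_bits
    while x / Fraction(2) ** e >= 2 ** lead_bits:
        e += 1
    while x / Fraction(2) ** e < 2 ** (lead_bits - 1):
        e -= 1
    m = x / Fraction(2) ** e
    lead = int(m)
    frac = m - lead
    digits = ""
    while frac:                      # terminates: denominator is a power of two
        frac *= 16
        d = int(frac)
        digits += "0123456789abcdef"[d]
        frac -= d
    point = "." + digits if digits else ""
    return f"0x{lead:x}{point}p{e:+d}"

value = Fraction(43, 2 ** 22)
print(hex_float(value, 1))   # leading digit 1, as for double
print(hex_float(value, 4))   # leading digit 8-f, as for 64-bit long double

# Digit budget: 53 = 1 + 13*4 bits versus 64 = 16*4 bits.
print(1 + (53 - 1) // 4, 64 // 4)
```

Both strings denote the same number. Only the split between the leading digit and the exponent differs.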
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Wed, 27 Apr 2022 18:53:01 GMT)
Message #38 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On Apr 27 2022, Paul Eggert wrote:
> Any proposal to change 'long double' to always output leading 0 or 1 needs
> to deal with the fact that this'd lengthen the output string.
The real reason for not using that format is that 64 bits can be
conveniently formatted without having to shift around the mantissa, and
having to handle the shifted out bits specially. For IEEE quad with its
112 bits of mantissa and the hidden bit, this format is again more
convenient.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Fri, 29 Apr 2022 20:06:02 GMT)
Message #41 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/25/22 3:06 PM, Paul Eggert wrote:
> On 4/25/22 11:22, Chet Ramey wrote:
>> Thanks for the input.
>
> You're welcome. Whenever you decide what to do about this, could you please
> let us know? I'd like coreutils printf to stay compatible with Bash printf.
> Thanks.
I think I'm going to stick with the behavior I proposed, fixing the POSIX
conformance issue and preserving backwards compatibility, until I hear more
about whether backwards compatibility is an issue here.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet <at> case.edu http://tiswww.cwru.edu/~chet/
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Fri, 29 Apr 2022 23:17:01 GMT)
Message #44 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/29/22 13:04, Chet Ramey wrote:
>
> I think I'm going to stick with the behavior I proposed, fixing the POSIX
> conformance issue and preserving backwards compatibility, until I hear more
> about whether backwards compatibility is an issue here.
Come to think of it, as far as POSIX is concerned Bash doesn't need to
change what it does. POSIX doesn't require that the shell printf
command be compiled with any particular environment. It would conform to
POSIX, for example, if Bash's printf were compiled with an IBM floating
point implementation rather than with an IEEE floating point
implementation, so long as Bash's printf parses floating-point strings
the way strtod is supposed to parse strings on an IBM mainframe.
Similarly, Bash's printf can use an 80-bit floating point format if
available; it will still conform to POSIX.
So this isn't a POSIX conformance issue; only a compatibility issue. Is
it more important for the Bash printf to behave like most other shells
and other programs, or is it more important for Bash printf to behave
like it has for the last 18 years or so?
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Sat, 30 Apr 2022 12:49:01 GMT)
Message #47 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 2022-04-29 16:16:28 -0700, Paul Eggert wrote:
> On 4/29/22 13:04, Chet Ramey wrote:
> > I think I'm going to stick with the behavior I proposed, fixing the POSIX
> > conformance issue and preserving backwards compatibility, until I hear more
> > about whether backwards compatibility is an issue here.
>
> Come to think of it, as far as POSIX is concerned Bash doesn't need to
> change what it does. POSIX doesn't require that the shell printf command be
> compiled with any particular environment. It would conform to POSIX, for
> example, if Bash's printf were compiled with an IBM floating point
> implementation rather than with an IEEE floating point implementation, so
> long as Bash's printf parses floating-point strings the way strtod is
> supposed to parse strings on an IBM mainframe. Similarly, Bash's printf can
> use an 80-bit floating point format if available; it will still conform to
> POSIX.
Yes, but to be clear, POSIX says:
shall be evaluated as if by the strtod() function if the
corresponding conversion specifier is a, A, e, E, f, F, g, or G
so the number should be regarded as a double-precision number
(type double). Then this number can be stored in a long double
since any double is representable exactly as a long double.
> So this isn't a POSIX conformance issue; only a compatibility issue. Is it
> more important for the Bash printf to behave like most other shells and
> other programs, or is it more important for Bash printf to behave like it
> has for the last 18 years or so?
Concerning the compatibility, the question is: with what?
* If the goal is to communicate with other tools (e.g. zsh,
XPath-based, but also programs that output values in decimal
from double, which is the most common type used in practice),
then double should be used.
* If the goal is to communicate with other machines, then double
is again preferable, since the long double type depends on the
platform (x86, powerpc and aarch64 using 3 different formats).
* Concerning just a bash script running on some machine:
- If printf (without a length modifier) switches to double,
the behavior will change.
- Note that the behavior of the script on different platforms
will be different if long double is used, but will be the
same if double is used.
Note that since bash doesn't support FP arithmetic in its arithmetic
expressions, it is very probable that FP values provided to printf
come from other tools (first point above), thus are probably in
double precision.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Information forwarded to bug-coreutils <at> gnu.org: bug#54785; Package coreutils. (Sat, 30 Apr 2022 16:59:02 GMT)
Message #50 received at 54785 <at> debbugs.gnu.org (full text, mbox):
On 4/30/22 05:48, Vincent Lefevre wrote:
> Yes, but to be clear, POSIX says:
>
> shall be evaluated as if by the strtod() function if the
> corresponding conversion specifier is a, A, e, E, f, F, g, or G
>
> so the number should be regarded as a double-precision number
> (type double).
Yes, but POSIX does not require the C type 'double' and the C function
strtod to be implemented via IEEE-754 64-bit floating point. POSIX
allows 'double' and 'strtod' to be implemented via x86-64
extended-precision (80-bit) floating point, or by any other
floating-point type that satisfies some (weak) properties. I see no
requirement that the shell must be implemented as if by the standard c99
command with the default options.
The POSIX requirements on the implementation of 'double' and 'strtod'
are so lax that Bash 'printf' could even use IEEE-754 32-bit floating
point, if it wanted to. One could build Bash with 'gcc -mlong-double=32
-mdouble=32' assuming these options work, and the result would conform
to POSIX. (Not that I'm suggesting this!)
> Concerning the compatibility, the question is: with what?
I agree that it'd be a net win for Bash to use plain 'double' here; your
discussion of the various compatibility plusses of doing that is
compelling to me.
This bug report was last modified 2 years and 2 days ago.