GNU bug report logs -
#69770
[PATCH] build: strengthen 16 bit float support checks
Previous Next
Reported by: Grisha Levit <grishalevit <at> gmail.com>
Date: Wed, 13 Mar 2024 02:27:02 UTC
Severity: normal
Tags: patch
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 69770 in the body.
You can then email your comments to 69770 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Wed, 13 Mar 2024 02:27:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Grisha Levit <grishalevit <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-coreutils <at> gnu.org
.
(Wed, 13 Mar 2024 02:27:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Recent clang provides __bf16 on aarch64 but it is broken.
If built with -O0, the conversion is wrong:
$ printf '\x3F\x80' | od --end=big -An -tfB | tr -d ' '
1.875
If built with -O1 or higher, compilation fails:
fatal error: error in backend: Cannot select: 0xb400007a58d29780: f32 = fp_extend 0xb400007a58d31720
0xb400007a58d31720: bf16,ch = CopyFromReg 0xb400007b78c53720, Register:bf16 %13
0xb400007a58d29470: bf16 = Register %13
In function: print_bfloat
The latter issue does not cause the existing configure test to fail
because the promotion is optimized out.
* configure.ac: Ensure 16 bit float promotion code does not get
optimized out, and produces an expected result.
---
configure.ac | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/configure.ac b/configure.ac
index 248e30ca2..21bee28d1 100644
--- a/configure.ac
+++ b/configure.ac
@@ -569,13 +569,14 @@ ac_c_werror_flag=$cu_save_c_werror_flag
# Test compiler support for half precision floating point types (for od)
AC_MSG_CHECKING([IEEE 16 bit floating point])
- AC_COMPILE_IFELSE(
+ AC_RUN_IFELSE(
[AC_LANG_SOURCE([[
int
main (void)
{
- _Float16 hf;
+ volatile _Float16 hf = 1;
float f = hf; /* Ensure compiler can promote to float. */
+ return !(f == 1.0f);
}
]])
],[
@@ -589,13 +590,14 @@ if test $ieee_16_bit_supported = yes; then
fi
AC_MSG_CHECKING([Brain 16 bit floating point])
- AC_COMPILE_IFELSE(
+ AC_RUN_IFELSE(
[AC_LANG_SOURCE([[
int
main (void)
{
- __bf16 hf;
+ volatile __bf16 hf = 1;
float f = hf; /* Ensure compiler can promote to float. */
+ return !(f == 1.0f);
}
]])
],[
--
2.44.0
Reply sent
to
Pádraig Brady <P <at> draigBrady.com>
:
You have taken responsibility.
(Wed, 13 Mar 2024 12:39:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Grisha Levit <grishalevit <at> gmail.com>
:
bug acknowledged by developer.
(Wed, 13 Mar 2024 12:39:02 GMT)
Full text and
rfc822 format available.
Message #10 received at 69770-done <at> debbugs.gnu.org (full text, mbox):
On 13/03/2024 02:24, Grisha Levit wrote:
> Recent clang provides __bf16 on aarch64 but it is broken.
>
> If built with -O0, the conversion is wrong:
>
> $ printf '\x3F\x80' | od --end=big -An -tfB | tr -d ' '
> 1.875
>
> If built with -O1 or higher, compilation fails:
>
> fatal error: error in backend: Cannot select: 0xb400007a58d29780: f32 = fp_extend 0xb400007a58d31720
> 0xb400007a58d31720: bf16,ch = CopyFromReg 0xb400007b78c53720, Register:bf16 %13
> 0xb400007a58d29470: bf16 = Register %13
> In function: print_bfloat
>
> The latter issue does not cause the existing configure test to fail
> because the promotion is optimized out.
>
> * configure.ac: Ensure 16 bit float promotion code does not get
> optimized out, and produces an expected result.
Looks good!
I'll follow up with another patch to all these involved checks to be cached / bypassed,
by wrapping them in AC_CACHE_VAL().
Marking this as done.
thanks!
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Thu, 14 Mar 2024 06:01:02 GMT)
Full text and
rfc822 format available.
Message #13 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 2024-03-12 19:24, Grisha Levit wrote:
> - AC_COMPILE_IFELSE(
> + AC_RUN_IFELSE(
This sort of change would break cross-compilation, no?
How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.
Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Thu, 14 Mar 2024 13:06:02 GMT)
Full text and
rfc822 format available.
Message #16 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 14/03/2024 05:59, Paul Eggert wrote:
> On 2024-03-12 19:24, Grisha Levit wrote:
>> - AC_COMPILE_IFELSE(
>> + AC_RUN_IFELSE(
>
> This sort of change would break cross-compilation, no?
It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.
> How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
> is used when the compiler supports -O1? That way we don't have to worry
> about running the program, because (with the "volatile") clang will
> error out.
>
> Alternatively perhaps there's some way to check for the bug using
> preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
> __clang_major__, and __aarch64__. (This could be more fragile, though,
> as clang presumably will fix the bug eventually.)
That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.
For example sse availability is in the mix here,
so this will now correctly avoid this feature with:
CFLAGS=-mno-sse ./configure --quiet
Whereas previously it would have built but produced wrong values.
cheers,
Pádraig.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Thu, 14 Mar 2024 13:37:01 GMT)
Full text and
rfc822 format available.
Message #19 received at submit <at> debbugs.gnu.org (full text, mbox):
On 3/14/24 6:03 AM, Pádraig Brady wrote:
> It would disable this feature for cross-compilation yes,
> but this isn't the first instance of AC_RUN_IFELSE we use.
Sorry if this is not the proper place to ask, but would it be possible
to make Autoconf use an emulator when cross-compiling? This issue
would be *less* of a problem in that case.
I am more familiar with CMake where check_c_source_runs is used
performs a similar task [1]. It is essentially a wrapper function
around try_run [2]. CMake allows you to set
CMAKE_CROSSCOMPILING_EMULATOR which will make try_run use an emulator
to run programs instead [3].
I've used this with Wine and QEMU User a few times and it makes
CMake's configure tests work fine (though much slower). If this
functionality seems beneficial then I can look into adding it to
Autoconf.
[1] https://cmake.org/cmake/help/latest/module/CheckCSourceRuns.html
[2] https://github.com/Kitware/CMake/blob/4285dec5f012260337193f29f18899d6cf2536a4/Modules/Internal/CheckSourceRuns.cmake#L93
[3] https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING_EMULATOR.html
Collin
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Thu, 14 Mar 2024 14:51:02 GMT)
Full text and
rfc822 format available.
Message #22 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 14/03/2024 13:35, Collin Funk wrote:
> On 3/14/24 6:03 AM, Pádraig Brady wrote:
>> It would disable this feature for cross-compilation yes,
>> but this isn't the first instance of AC_RUN_IFELSE we use.
For completeness I should add that the above check can be
overridden if cross-compiling or whatever like:
./configure utils_cv_ieee_16_bit_supported=yes utils_cv_brain_16_bit_supported=yes
> Sorry if this is not the proper place to ask, but would it be possible
> to make Autoconf use an emulator when cross-compiling? This issue
> would be *less* of a problem in that case.
>
> I am more familiar with CMake where check_c_source_runs is used
> performs a similar task [1]. It is essentially a wrapper function
> around try_run [2]. CMake allows you to set
> CMAKE_CROSSCOMPILING_EMULATOR which will make try_run use an emulator
> to run programs instead [3].
>
> I've used this with Wine and QEMU User a few times and it makes
> CMake's configure tests work fine (though much slower). If this
> functionality seems beneficial then I can look into adding it to
> Autoconf.
>
> [1] https://cmake.org/cmake/help/latest/module/CheckCSourceRuns.html
> [2] https://github.com/Kitware/CMake/blob/4285dec5f012260337193f29f18899d6cf2536a4/Modules/Internal/CheckSourceRuns.cmake#L93
> [3] https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING_EMULATOR.html
Interesting.
Perhaps this is something that could be configured separately
on the build system, through something like binfmt_misc ?
cheers,
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Thu, 14 Mar 2024 15:54:01 GMT)
Full text and
rfc822 format available.
Message #25 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 3/14/24 7:48 AM, Pádraig Brady wrote:
> For completeness I should add that the above check can be
> overridden if cross-compiling or whatever like:
>
> ./configure utils_cv_ieee_16_bit_supported=yes utils_cv_brain_16_bit_supported=yes
Ah, thanks. I wasn't aware of this.
> Interesting.
> Perhaps this is something that could be configured separately
> on the build system, through something like binfmt_misc ?
It has been a while, but IIRC on Debian GNU/Linux:
$ sudo apt install binfmt-support wine-binfmt
would allow you to run QEMU and Windows binaries similar to a normal
program, e.g. './emulated-program' would run QEMU or Wine. I assume
other distributions have similar packages.
I typically used this for CMake cross-compiling instead of setting
CMAKE_CROSSCOMPILING_EMULATOR. Then you could also run test programs
that were built as if they were native programs which was nice.
I'd have to look at Autoconf's sources, but I'd imagine something
similar could be done there. You'd just have to check if binfmt is
supported and that cross-compiled binaries could be run.
Binfmt is Linux specific though right? Allowing the user to pass an
emulator explicitly would be more portable in that case. I am willing
to look into it more if it seems beneficial.
Collin
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Fri, 15 Mar 2024 05:22:01 GMT)
Full text and
rfc822 format available.
Message #28 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 2024-03-14 06:03, Pádraig Brady wrote:
>
>> How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
>> is used when the compiler supports -O1? That way we don't have to worry
>> about running the program, because (with the "volatile") clang will
>> error out.
>>
>> Alternatively perhaps there's some way to check for the bug using
>> preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
>> __clang_major__, and __aarch64__. (This could be more fragile, though,
>> as clang presumably will fix the bug eventually.)
>
> That would probably work for this edge case, but it's brittle
> and I'm worried about other combinations of
> compiler versions, complier options, and architectures.
Sure, but one cannot resolve such worries completely, as compiler errors
are inherently brittle: even a runtime test cannot prove their absence.
As I understand it, __bf16 is a real zoo: sometimes hardware supports
it, sometimes it doesn't, sometimes the software emulation library is
there, sometimes it's not, and so forth. Running a program on a
development host is likely to not match the runtime behavior on a
production host, so AC_RUN_IFELSE is reasonably likely to produce a
false positive, which is worse than a false negative.
Another issue, which is somewhat related, is that coreutils's current
macro BF16_SUPPORTED is too broad. Because __bf16 has so many bugs,
currently it's probably a mistake to say __bf16 is fully supported
anywhere. Even GCC is buggy; see for example:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347
which I just now discovered when writing this email. The question isn't
so much whether __bf16 is supported, it's whether it's supported well
enough to compile and run GNU od.
Come to think of it, for 'od' we might be better off avoiding __bf16
entirely, and simply converting the bits by hand into a 'float'
variable. This would likely be more portable, as it would work even with
compilers that lack __bf16, or that have __bf16 bugs relevant to 'od'.
And that way we could remove the configure-time test entirely. (There
would be endianness issues but they're easily addressible.)
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#69770
; Package
coreutils
.
(Fri, 15 Mar 2024 11:52:02 GMT)
Full text and
rfc822 format available.
Message #31 received at 69770 <at> debbugs.gnu.org (full text, mbox):
On 15/03/2024 05:21, Paul Eggert wrote:
> On 2024-03-14 06:03, Pádraig Brady wrote:
>>
>>> How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
>>> is used when the compiler supports -O1? That way we don't have to worry
>>> about running the program, because (with the "volatile") clang will
>>> error out.
>>>
>>> Alternatively perhaps there's some way to check for the bug using
>>> preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
>>> __clang_major__, and __aarch64__. (This could be more fragile, though,
>>> as clang presumably will fix the bug eventually.)
>>
>> That would probably work for this edge case, but it's brittle
>> and I'm worried about other combinations of
>> compiler versions, complier options, and architectures.
>
> Sure, but one cannot resolve such worries completely, as compiler errors
> are inherently brittle: even a runtime test cannot prove their absence.
>
> As I understand it, __bf16 is a real zoo: sometimes hardware supports
> it, sometimes it doesn't, sometimes the software emulation library is
> there, sometimes it's not, and so forth. Running a program on a
> development host is likely to not match the runtime behavior on a
> production host, so AC_RUN_IFELSE is reasonably likely to produce a
> false positive, which is worse than a false negative.
>
> Another issue, which is somewhat related, is that coreutils's current
> macro BF16_SUPPORTED is too broad. Because __bf16 has so many bugs,
> currently it's probably a mistake to say __bf16 is fully supported
> anywhere. Even GCC is buggy; see for example:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347
>
> which I just now discovered when writing this email.
Nice one! (though that should not impact od)
> The question isn't
> so much whether __bf16 is supported, it's whether it's supported well
> enough to compile and run GNU od. >
>
> Come to think of it, for 'od' we might be better off avoiding __bf16
> entirely, and simply converting the bits by hand into a 'float'
> variable. This would likely be more portable, as it would work even with
> compilers that lack __bf16, or that have __bf16 bugs relevant to 'od'.
> And that way we could remove the configure-time test entirely. (There
> would be endianness issues but they're easily addressible.)
Right.
My thinking though was that it was nice code wise to treat __bf16 like
_Float16 (and like float etc.).
More importantly I only see __bf16 support improving going forward,
so preferred to keep the code simpler (and the configure check robust).
Currently both the main compilers, and main architectures are supported
(when not cross-compiling), and tested quite robustly in the configure check.
Another thing that dissuaded me from implementing the conversions ourselves
is the plethora of floating point config options for compilers,
which made me think in general that floating point conversion
was best left to the compiler.
I've just now realized that AC_RUN_IFELSE needs an explicit default
for cross-compiling, and I've just pushed a conservative default
(which may default the other way in future when support broadens).
thanks,
Pádraig.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sat, 13 Apr 2024 11:24:13 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 21 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.