GNU bug report logs - #64756
some frequent test failures


Package: automake;

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Thu, 20 Jul 2023 21:56:01 UTC

Severity: normal

Done: Karl Berry <karl <at> freefriends.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 64756 in the body.
You can then email your comments to 64756 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Thu, 20 Jul 2023 21:56:01 GMT)

Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>:
New bug report received and forwarded. Copy sent to bug-automake <at> gnu.org. (Thu, 20 Jul 2023 21:56:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: bug-automake <at> gnu.org
Subject: some frequent test failures
Date: Thu, 20 Jul 2023 23:55:33 +0200
Hi,

After checking out automake from git (master branch), then
  $ ./bootstrap
  $ ./configure
  $ make
  $ for n in 2 3 4 5 6 7 8 9; do
      make check ; mv test-suite.log test-suite.log.$n
    done

The various test-suite.log files show different test failures each
time:

$ grep ^FAIL: test-suite.log.2
FAIL: t/aclocal-I-and-install
FAIL: t/backcompat2
FAIL: t/python-prefix
FAIL: t/remake-after-aclocal-m4
FAIL: t/remake-include-configure
FAIL: t/subobj

$ grep ^FAIL: test-suite.log.3
FAIL: t/aclocal-I-and-install
FAIL: t/python-prefix
FAIL: t/remake-after-acinclude-m4
FAIL: t/remake-after-aclocal-m4
FAIL: t/remake-aclocal-version-mismatch

$ grep ^FAIL: test-suite.log.4
FAIL: t/aclocal-I-and-install
FAIL: t/backcompat2
FAIL: t/nodef
FAIL: t/python-prefix
FAIL: t/remake-after-acinclude-m4
FAIL: t/remake-after-aclocal-m4
FAIL: t/remake-include-configure

$ grep ^FAIL: test-suite.log.5
FAIL: t/aclocal-I-and-install
FAIL: t/nodef2
FAIL: t/python-prefix
FAIL: t/remake-after-aclocal-m4
FAIL: t/remake-all-1

$ grep ^FAIL: test-suite.log.6
FAIL: t/aclocal-I-and-install
FAIL: t/backcompat2
FAIL: t/testsuite-summary-reference-log
FAIL: t/python-prefix
FAIL: t/remake-after-acinclude-m4

$ grep ^FAIL: test-suite.log.7
FAIL: t/aclocal-I-and-install
FAIL: t/nodef
FAIL: t/python-prefix

$ grep ^FAIL: test-suite.log.8
FAIL: t/aclocal-I-and-install
FAIL: t/backcompat2
FAIL: t/nodef
FAIL: t/testsuite-summary-reference-log
FAIL: t/python-prefix
FAIL: t/remake-after-configure-ac
FAIL: t/remake-after-aclocal-m4

$ grep ^FAIL: test-suite.log.9
FAIL: t/aclocal-I-and-install
FAIL: t/nodef
FAIL: t/python-prefix
FAIL: t/remake-after-acinclude-m4
FAIL: t/remake-after-aclocal-m4
FAIL: t/remake-all-1

So:
  * t/aclocal-I-and-install and t/python-prefix fail always.
  * The following tests fail randomly:
    t/backcompat2
    t/nodef
    t/nodef2
    t/remake-after-configure-ac
    t/remake-after-acinclude-m4
    t/remake-after-aclocal-m4
    t/remake-aclocal-version-mismatch
    t/remake-include-configure
    t/remake-all-1
    t/subobj
    t/testsuite-summary-reference-log

This is on a system with Linux, glibc 2.35, autoconf 2.71, m4 1.4.19, make 4.3,
perl v5.34.0. No hardware issues (uptime: 20 days).

Any idea?

Bruno
[logs.tar.xz (application/x-xz-compressed-tar, attachment)]
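
The always/randomly split in the message above can be tallied mechanically. A minimal sketch, assuming the saved logs are named as in the loop above and that failing tests appear on lines starting with "FAIL:"; the function name classify_failures is hypothetical and not part of Automake:

```shell
# Hypothetical helper: count in how many of the given logs each test failed.
# Usage: classify_failures test-suite.log.2 test-suite.log.3 ...
classify_failures () {
  total=$#
  for log in "$@"; do
    # Count each failing test at most once per log.
    grep '^FAIL:' "$log" | sort -u
  done | sort | uniq -c |
    awk -v total="$total" '{
      # $1 is the count from uniq -c, $2 is "FAIL:", $3 is the test name.
      kind = ($1 == total) ? "always" : "sometimes"
      printf "%-9s %s/%s  %s\n", kind, $1, total, $3
    }'
}
```

Called as "classify_failures test-suite.log.*", it prints one line per test, marked "always" when the test failed in every log given and "sometimes" otherwise.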

Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Fri, 21 Jul 2023 22:36:02 GMT)

Message #8 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: bruno <at> clisp.org
Cc: 64756 <at> debbugs.gnu.org
Subject: Re: bug#64756: some frequent test failures
Date: Fri, 21 Jul 2023 16:35:07 -0600
    The various test-suite.log files show different test failures each

Yes. Painful.  I believe this is due to a timing problem with autom4te,
exposed by Automake using fractional second sleeps according to what the
filesystem supports.  It is fixed in the autoconf repository but hasn't
been released.

I also noted this and asked for help earlier this year. Bogdan and
many other contributors looked into it, and Bogdan eventually came up
with a (tiny) patch for autom4te, which Jacob forwarded to autoconf here:
  https://lists.gnu.org/archive/html/automake/2023-03/msg00039.html
and it got installed. I don't know if there is an associated autoconf
bug#, but probably?

For myself, I can report that when I made the change (< to <= in two
places) in my live autom4te, it did not completely fix the timing
failures. I don't know why not. I bootstrapped and installed autoconf
from its git (as of around June 11) and have used that version with
development automake ever since, and the timing problems have stayed gone.

(This fix is what has allowed me to get back to doing any Automake
maintenance at all, so thanks again, Bogdan & everyone!)

Jacob developed a change to allow automake to test whether the autom4te
fix is in place or not, to avoid requiring the latest autoconf, but I
haven't installed it yet -- see thread continuation starting at
https://lists.gnu.org/archive/html/automake/2023-04/msg00002.html.

(If anyone has time to turn the code into the full expected patch, that
would be great.)

Thanks,
Karl
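
Why a "<" versus "<=" comparison matters for timestamps can be shown in isolation. This is only an illustration of the general hazard, assuming timestamps truncated to whole seconds; it is not autom4te's code, and the thread does not show which two comparisons were changed. Both function names are hypothetical:

```shell
# Hypothetical staleness checks on whole-second mtimes (not autom4te's code).
# Arguments: cache_mtime dependency_mtime.
# Each prints "stale" if the cache should be rebuilt, else "fresh".
stale_if_strictly_older () {
  if [ "$1" -lt "$2" ]; then echo stale; else echo fresh; fi
}

stale_if_not_newer () {
  if [ "$1" -le "$2" ]; then echo stale; else echo fresh; fi
}

# Scenario: the cache was written during second 100, and an input file was
# edited again later within that same second, so both truncate to 100.
stale_if_strictly_older 100 100   # the strict test misses the edit
stale_if_not_newer 100 100        # the inclusive test rebuilds
```

With equal timestamps the strict check reports "fresh" and the inclusive one reports "stale". Shorter sleeps between test steps make such same-second ties more likely, which fits the intermittent failures described above.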




Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Mon, 24 Jul 2023 15:37:02 GMT)

Message #11 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Karl Berry <karl <at> freefriends.org>
Cc: 64756 <at> debbugs.gnu.org
Subject: Re: bug#64756: some frequent test failures
Date: Mon, 24 Jul 2023 17:36:01 +0200
Thanks for the detailed explanations, Karl.

> I bootstrapped and installed autoconf
> from its git (as of around June 11) and have used that version with
> development automake ever since, and the timing problems have stayed gone.

I confirm that with autoconf master, automake's "make check" is reliable:
I ran it 9 times and got the same result 9 times.

Bruno







Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Thu, 02 Nov 2023 22:05:01 GMT)

Message #14 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: 64756 <at> debbugs.gnu.org
Subject: Re: bug#64756: some frequent test failures
Date: Thu, 2 Nov 2023 16:03:48 -0600
Just to close this out: Bogdan worked up a patch to avoid the
fractional-second timestamps unless the new (current autoconf
development) autom4te is in use. Here is the basic change:
https://git.savannah.gnu.org/cgit/automake.git/commit/?id=b6fa73115d094c8d0da1d6759b6e7c7fca1f8a07
(also appended for the archive)

(Comments tweaked a little in subsequent changes.)

Jim is going to start working towards a release over the next few days.
--thanks, karl.


From b6fa73115d094c8d0da1d6759b6e7c7fca1f8a07 Mon Sep 17 00:00:00 2001
From: Bogdan <bogdro_rep <at> gmx.us>
Date: Wed, 1 Nov 2023 17:40:47 -0700
Subject: m4: fall back to non-fractional timestamps with older autom4te.

* m4/sanity.m4 (_AM_FILESYSTEM_TIMESTAMP_RESOLUTION): if
HiRes is not present in Autom4te/FileUtils.pm, do not consider
fractional sleeps.
---
 m4/sanity.m4 | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/m4/sanity.m4 b/m4/sanity.m4
index db9a1f5..445d1fb 100644
--- a/m4/sanity.m4
+++ b/m4/sanity.m4
@@ -16,10 +16,28 @@ AS_IF([sleep 0.001 2>/dev/null], [am_cv_sleep_fractional_seconds=true], [am_cv_s
 # _AM_FILESYSTEM_TIMESTAMP_RESOLUTION
 # -----------------------------------
 # Determine the filesystem timestamp resolution.  Modern systems are nanosecond
-# capable, but historical systems could be millisecond, second, or even 2-second
-# resolution.
+# capable, but historical systems could have millisecond, second, or even
+# 2-second resolution.
 AC_DEFUN([_AM_FILESYSTEM_TIMESTAMP_RESOLUTION], [dnl
 AC_REQUIRE([_AM_SLEEP_FRACTIONAL_SECONDS])
+#
+# Check if Autom4te uses Time::HiRes. If not, we cannot use fractional sleep,
+# because this sanity test and automated tests will be unreliable due to
+# Autom4te's caching of results and comparing timestamps.
+# More info: long thread around
+#     https://lists.gnu.org/archive/html/automake/2023-04/msg00002.html
+# and https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64756.  
+AC_PATH_PROG([AUTOM4TE], [autom4te])
+if test x"$autom4te_perllibdir" = x; then
+  autom4te_perllibdir=`sed -n \
+   '/autom4te_perllibdir/{s/^.*|| //;s/;$//;s/^.//;s/.$//;p;q}' <$AUTOM4TE`
+fi
+if grep HiRes "$autom4te_perllibdir"/Autom4te/FileUtils.pm >/dev/null; then
+  :
+else
+  am_cv_sleep_fractional_seconds=false
+fi
+
 AC_CACHE_CHECK([the filesystem timestamp resolution], am_cv_filesystem_timestamp_resolution, [dnl
 # Use names that lexically sort older-first when the timestamps are equal.
 rm -f conftest.file.a conftest.file.b
-- 
cgit v1.1
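
The grep-based check in the patch can also be tried by hand. A minimal sketch, assuming the Perl library directory that holds Autom4te/FileUtils.pm is already known; the function name autom4te_uses_hires is hypothetical and not part of Automake. Unlike the patch, it refuses to guess when the directory is empty or the file is unreadable, which is the situation behind the "grep: /Autom4te/FileUtils.pm: No such file or directory" message reported later in this thread:

```shell
# Hypothetical helper mirroring the grep in the patch.
# Exit status: 0 if FileUtils.pm mentions HiRes, 1 if it does not,
#              2 if the file cannot be located or read.
autom4te_uses_hires () {
  perllibdir=$1
  # An empty directory would collapse the path to /Autom4te/FileUtils.pm.
  [ -n "$perllibdir" ] || return 2
  [ -r "$perllibdir/Autom4te/FileUtils.pm" ] || return 2
  grep HiRes "$perllibdir/Autom4te/FileUtils.pm" >/dev/null
}
```

A caller would treat status 0 as "fractional sleeps are safe" and status 1 as "fall back to whole seconds"; what to do on status 2 is left open here.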





Reply sent to Karl Berry <karl <at> freefriends.org>:
You have taken responsibility. (Thu, 02 Nov 2023 22:05:02 GMT)

Notification sent to Bruno Haible <bruno <at> clisp.org>:
bug acknowledged by developer. (Thu, 02 Nov 2023 22:05:02 GMT)

Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Thu, 30 Nov 2023 10:39:01 GMT)

Message #22 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Karl Berry <karl <at> freefriends.org>
Cc: 64756 <at> debbugs.gnu.org
Subject: Re: bug#64756: some frequent test failures
Date: Thu, 30 Nov 2023 11:37:58 +0100
Karl Berry wrote:
>     The various test-suite.log files show different test failures each
> 
> Yes. Painful.  I believe this is due to a timing problem with autom4te,
> exposed by Automake using fractional second sleeps according to what the
> filesystem supports.  It is fixed in the autoconf repository but hasn't
> been released.
> 
> I also noted this and asked for help earlier this year. Bogdan and
> many other contributors looked into it, and Bogdan eventually came up
> with a (tiny) patch for autom4te, which Jacob forwarded to autoconf here:
>   https://lists.gnu.org/archive/html/automake/2023-03/msg00039.html
> and it got installed.

Thanks again for explaining. I encountered what I believe is another
manifestation of the same bug:

In some circumstances, the GNU gettext 'autopoint-3' test fails. More
exactly, an 'automake -a -c' invocation fails with the error messages:
  PREFIX/share/automake-1.16/am/depend2.am: error: am__fastdepCC does not appear in AM_CONDITIONAL
  PREFIX/share/automake-1.16/am/depend2.am:   The usual way to define 'am__fastdepCC' is to add 'AC_PROG_CC'
  PREFIX/share/automake-1.16/am/depend2.am:   to 'configure.ac' and run 'aclocal' and 'autoconf' again
  PREFIX/share/automake-1.16/am/depend2.am: error: AMDEP does not appear in AM_CONDITIONAL
  PREFIX/share/automake-1.16/am/depend2.am:   The usual way to define 'AMDEP' is to add one of the compiler tests
  PREFIX/share/automake-1.16/am/depend2.am:     AC_PROG_CC, AC_PROG_CXX, AC_PROG_OBJC, AC_PROG_OBJCXX,
  PREFIX/share/automake-1.16/am/depend2.am:     AM_PROG_AS, AM_PROG_GCJ, AM_PROG_UPC
  PREFIX/share/automake-1.16/am/depend2.am:   to 'configure.ac' and run 'aclocal' and 'autoconf' again
This is with Automake 1.16.5 and Autoconf 2.71 on an ext4 file system.
The failures occur with a probability of about 36%.
When I add an 'rm -rf autom4te.cache' command before 'automake -a -c',
the failures disappear.

This is consistent with the analysis from
  https://lists.gnu.org/archive/html/automake/2023-03/msg00039.html
"Bogdan appears to have traced the issue to autom4te caching".

Until a new Autoconf release is public, I'll go with this workaround
to remove the autom4te.cache before invoking automake.

Bruno
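
The workaround described above can be wrapped so the cache is always removed before the tool runs. A minimal sketch; the function name run_with_fresh_autom4te_cache is hypothetical, and it assumes the cache lives in a directory named autom4te.cache directly inside the source directory:

```shell
# Hypothetical wrapper: remove autom4te.cache in a source directory, then run
# the given command from inside that directory.
# Usage: run_with_fresh_autom4te_cache SRCDIR COMMAND [ARGS...]
run_with_fresh_autom4te_cache () {
  srcdir=$1
  shift
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$srcdir" && rm -rf autom4te.cache && "$@" )
}
```

For the case above, the call would be "run_with_fresh_autom4te_cache . automake -a -c". This costs a cache rebuild on every run, so it is a stopgap rather than a fix.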







Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Thu, 30 Nov 2023 22:42:02 GMT)

Message #25 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: bruno <at> clisp.org
Cc: 64756 <at> debbugs.gnu.org
Subject: Re: bug#64756: some frequent test failures
Date: Thu, 30 Nov 2023 15:40:56 -0700
Hi Bruno,

    This is with Automake 1.16.5 and Autoconf 2.71 on an ext4 file system.
    [...]
    When I add an 'rm -rf autom4te.cache' command before 'automake -a -c',
    the failures disappear.

FWIW, in the Automake development sources, there is now a test to see if
autom4te is the new version (thanks to Bogdan and Jacob). Thus it should
work with both old and new autom4te. I don't know when there will be a
new Automake release. Hopefully sooner rather than later ... --best, karl.

2023-11-01  Bogdan  <bogdro_rep <at> gmx.us>

	m4: fall back to non-fractional timestamps with older autom4te.

	* m4/sanity.m4 (_AM_FILESYSTEM_TIMESTAMP_RESOLUTION): if
	HiRes is not present in Autom4te/FileUtils.pm, do not consider
	fractional sleeps.






Information forwarded to bug-automake <at> gnu.org:
bug#64756; Package automake. (Sat, 02 Dec 2023 23:01:02 GMT)

Message #28 received at 64756 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: vapier <at> gentoo.org
Cc: bogdro_rep <at> gmx.us, jcb62281 <at> gmail.com, 64756 <at> debbugs.gnu.org
Subject: Re: rhel8 test failure confirmation?
Date: Sat, 2 Dec 2023 16:00:21 -0700
(Trying to switch to add to
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64756.)

    this doesn't work on systems that wrap `autom4te`.  

Had no idea anyone did that.

	grep: /Autom4te/FileUtils.pm: No such file or directory

Oops.

    seems like the only reliable option is to invoke autom4te.

If you can complete the patch ...

FWIW, I know that "version" (development sources)
autoconf (GNU Autoconf) 2.72c.24-8e728
has the fix, because that's what I've been using.
I guess checking for 2.72d and later would suffice, though,
since that release is more or less imminent. --thanks, karl.
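
A version-based check along the lines suggested above might look like the following. This is a rough sketch, not the code that was eventually installed. Both function names are hypothetical, the banner format is taken from the line quoted above, and the ordering of version strings (lettered snapshots sorting by letter, with an unlettered 2.72 and anything numerically higher counted as newer) is an assumption about Autoconf's numbering:

```shell
# Hypothetical helper: extract the version from a "--version" banner line
# such as "autoconf (GNU Autoconf) 2.72c.24-8e728".
version_from_banner () {
  printf '%s\n' "$1" | sed 's/^.*) //'
}

# Hypothetical helper: succeed if the version string is 2.72d or later.
# Assumption: 2.72a through 2.72c snapshots count as too old, even though a
# particular 2.72c development build may already contain the fix.
version_is_2_72d_or_later () {
  case $1 in
    2.72[a-c]*) return 1 ;;                      # snapshots before 2.72d
    2.72*|2.7[3-9]*|2.[89][0-9]*) return 0 ;;    # 2.72d and later 2.x
    [3-9].*|[1-9][0-9]*) return 0 ;;             # any later major version
    *) return 1 ;;
  esac
}
```

In use, the first line of "autom4te --version" would be passed to version_from_banner and the result to version_is_2_72d_or_later. Because this only reads the program's output, it does not depend on locating FileUtils.pm and so should also work when autom4te is wrapped.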




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 31 Dec 2023 12:24:08 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.