GNU bug report logs - #20597
GNU tar fails test suite

Package: guix

Reported by: Andrew Patterson <ajpatter <at> uwaterloo.ca>

Date: Sun, 17 May 2015 17:50:10 UTC

Severity: normal

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.


Report forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 17 May 2015 17:50:10 GMT)

Acknowledgement sent to Andrew Patterson <ajpatter <at> uwaterloo.ca>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 17 May 2015 17:50:12 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Andrew Patterson <ajpatter <at> uwaterloo.ca>
To: bug-guix <at> gnu.org
Subject: GNU tar fails test suite
Date: Sun, 17 May 2015 03:23:50 -0400
When trying to build tar using the installation medium, the build fails
when running the test suite.

The command given was:

guix build -K tar --no-substitutes

The testsuite.log is attached.
[testsuite.log (text/x-log, attachment)]

Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Tue, 19 May 2015 15:43:01 GMT)

Message #8 received at 20597 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andrew Patterson <ajpatter <at> uwaterloo.ca>
Cc: 20597 <at> debbugs.gnu.org
Subject: Re: bug#20597: GNU tar fails test suite
Date: Tue, 19 May 2015 17:42:45 +0200
Andrew Patterson <ajpatter <at> uwaterloo.ca> skribis:

> When trying to build tar using the installation medium, the build fails
> when running the test suite.

I suppose this is Guix 0.8.2 on top of another distribution, right?  Did
you install from source or from the binary tarball?  Did you enable
substitutes (info "(guix) Substitutes")?

I just rebuilt it on my laptop (on GuixSD) and
/gnu/store/hs7ldmg9ix8irdw5gj4vr55ml8m08723-tar-1.28 (x86_64-linux)
builds fine; same on Hydra.

> ## ---------------------- ##
> ## Detailed failed tests. ##
> ## ---------------------- ##
>
> #                             -*- compilation -*-
> 161. remfiles08a.at:28: testing remove-files deleting two subdirs in -c/non-incr. mode ...
> ./remfiles08a.at:31:
> mkdir gnu
> (cd gnu
> TEST_TAR_FORMAT=gnu
> export TEST_TAR_FORMAT
> TAR_OPTIONS="-H gnu"
> export TAR_OPTIONS
> rm -rf *
>
> mkdir foo
> mkdir bar
> echo foo/foo_file > foo/foo_file
> echo bar/bar_file > bar/bar_file
> decho A
> tar -cvf foo.tar --remove-files -C foo . -C ../bar .
> decho B
> find .
> )
> --- -	2015-05-17 06:46:28.337894209 +0000
> +++ /tmp/nix-build-tar-1.28.drv-0/tar-1.28/tests/testsuite.dir/at-groups/161/stdout	2015-05-17 06:46:28.329525786 +0000
> @@ -6,4 +6,5 @@
>  B
>  .
>  ./foo.tar
> +./bar

Sounds like ‘bar’ is expected to be removed but is not.

> 161. remfiles08a.at:28: 161. remove-files deleting two subdirs in -c/non-incr. mode (remfiles08a.at:28): FAILED (remfiles08a.at:31)
>
> #                             -*- compilation -*-
> 163. remfiles08c.at:28: testing remove-files deleting two subdirs in -r mode ...
> ./remfiles08c.at:31:
> mkdir gnu
> (cd gnu
> TEST_TAR_FORMAT=gnu
> export TEST_TAR_FORMAT
> TAR_OPTIONS="-H gnu"
> export TAR_OPTIONS
> rm -rf *
>
>
> test -z "`sort < /dev/null 2>&1`" || exit 77
>
> mkdir foo
> mkdir bar
> echo foo/foo_file > foo/foo_file
> echo bar/bar_file > bar/bar_file
> tar -cf foo.tar -C foo . -C ../bar .
> decho A
> find . | sort
> decho B
> tar -rvf foo.tar --remove-files -C foo . -C ../bar .
> decho C
> find .
> )
> --- -	2015-05-17 06:46:28.468176642 +0000
> +++ /tmp/nix-build-tar-1.28.drv-0/tar-1.28/tests/testsuite.dir/at-groups/163/stdout	2015-05-17 06:46:28.457525791 +0000
> @@ -13,4 +13,5 @@
>  C
>  .
>  ./foo.tar
> +./bar
>  
> 163. remfiles08c.at:28: 163. remove-files deleting two subdirs in -r mode (remfiles08c.at:28): FAILED (remfiles08c.at:31)

Same story here.

I don’t fully understand the tests, but they seem to be testing a
deterministic property.

Now, there are several tests creating files/directories called ‘bar’; they
may run in parallel, and it’s not clear to me whether or not they’re
using separate directories.

Does the build succeed if you run it another time with:

  guix build tar -K -c 1

Thanks in advance,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 24 May 2015 11:35:03 GMT)

Message #11 received at 20597 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Patterson <ajpatter <at> uwaterloo.ca>
Cc: bug-gnulib <at> gnu.org, 20597 <at> debbugs.gnu.org
Subject: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 13:33:49 +0200
(Please keep 20597 <at> debbugs.gnu.org Cc'd.)
(Gnulib: please scroll further down for the ‘unlinkat’ issue.)

Andy Patterson <ajpatter <at> uwaterloo.ca> skribis:

> > I suppose this is Guix 0.8.2 on top of another distribution, right?  Did
> > you install from source or from the binary tarball?  Did you enable
> > substitutes (info "(guix) Substitutes")?
> 
> I was using the USB install medium in a live environment.

So this is on GuixSD 0.8.2.  ‘testsuite.log’ indeed mentions
Linux-libre 4.0.2.

> I had substitutes enabled (I'm pretty sure they're enabled by default
> here, but I also enabled them manually just to be sure). I wasn't able
> to install anything with substitutes enabled; it would always stall
> while trying to update the substitutes list from hydra. When my
> network went down briefly, it informed me that it was still at 0.0%
> before exiting. I think that this is probably a separate issue, but
> one with which I was less concerned, since I didn't want to use
> substitutes anyway.

OK.

hydra.gnu.org is unfortunately too often overloaded these days, so you
probably arrived on a bad day.  Nevertheless, the solution to this
specific issue is for you to use substitutes to circumvent the bug
described below.

>> Does the build succeed if you run it another time with:
>>
>>   guix build tar -K -c 1
>
> I tried this (with --no-substitutes), but I don't think the test suite
> actually runs in parallel. I didn't notice any difference in that regard
> when it was running; it seemed to take up the same amount of time with
> or without -c 1. I had the same tests fail with the flag enabled.

Oh you must be right.  Looking at tests/Makefile.in, I see:

--8<---------------cut here---------------start------------->8---
check-local: atconfig atlocal $(TESTSUITE)
	$(SHELL) $(TESTSUITE) $(TESTSUITEFLAGS)
--8<---------------cut here---------------end--------------->8---

... which shows that ./testsuite is not automatically passed -j,
contrary to what I thought.

<http://lists.gnu.org/archive/html/bug-tar/2014-08/msg00010.html>
reports a similar issue but on a different OS.

I just tried this in a GuixSD VM with Linux-libre 4.0.2:

--8<---------------cut here---------------start------------->8---
  mkdir foo
  mkdir bar
  echo foo/foo_file > foo/foo_file
  echo bar/bar_file > bar/bar_file
  tar -cvf foo.tar --remove-files -C foo . -C ../bar .
  find .
  stat bar
--8<---------------cut here---------------end--------------->8---

And indeed, it fails (that is, ‘bar’ is left behind).  It works fine on
4.0.4-gnu though.

On 4.0.2-gnu, I strace’d the ‘tar’ command above:

--8<---------------cut here---------------start------------->8---
openat(AT_FDCWD, "foo", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4

[...]

openat(4, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 5

[...]

openat(5, "foo_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6

[...]

openat(4, "../bar", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
newfstatat(5, ".", {st_mode=S_IFDIR|0755, st_size=60, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(5, ".", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 6

[...]

openat(6, "bar_file", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
write(1, "./bar_file\n", 11)            = 11
read(7, "x\n", 2)                       = 2
fstat(7, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
close(7)                                = 0
fstat(6, {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0
brk(0x1a34000)                          = 0x1a34000
close(6)                                = 0
write(3, "./\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240
close(3)                                = 0
unlinkat(4, "foo_file", 0)              = 0
unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
unlinkat(5, "bar_file", 0)              = 0
unlinkat(4, "../bar", AT_REMOVEDIR)     = -1 ENOENT (No such file or directory)
--8<---------------cut here---------------end--------------->8---

Contrast this with the same thing on 4.0.4-gnu:

--8<---------------cut here---------------start------------->8---
unlinkat(4, "foo_file", 0)              = 0
unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
unlinkat(5, "bar_file", 0)              = 0
unlinkat(4, "../bar", AT_REMOVEDIR)     = 0
--8<---------------cut here---------------end--------------->8---

So this looks like a 4.0.2 kernel bug that Gnulib’s unlinkat should
perhaps work around.
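
The four unlinkat() calls at the end of the trace can be replayed outside of tar. What follows is only a minimal sketch in Python, not tar's actual code: the function name and the temporary directory layout are made up for illustration, and ‘foo’ is removed by absolute path instead of relative to AT_FDCWD. The last call resolves "../bar" through a descriptor for the already-removed ‘foo’, which is the step whose result differs between the two kernels above.

```python
import errno
import os
import tempfile


def replay_remove_sequence():
    """Replay the trace's final unlinkat() calls; report how the last one ends."""
    with tempfile.TemporaryDirectory() as top:
        foo = os.path.join(top, "foo")
        bar = os.path.join(top, "bar")
        os.mkdir(foo)
        os.mkdir(bar)
        for directory, name in ((foo, "foo_file"), (bar, "bar_file")):
            with open(os.path.join(directory, name), "w") as f:
                f.write("x\n")

        # As in the trace: one descriptor for foo, one for bar opened
        # relative to foo via "../bar".
        foo_fd = os.open(foo, os.O_RDONLY | os.O_DIRECTORY)
        bar_fd = os.open("../bar", os.O_RDONLY | os.O_DIRECTORY, dir_fd=foo_fd)
        try:
            os.unlink("foo_file", dir_fd=foo_fd)  # unlinkat(4, "foo_file", 0)
            os.rmdir(foo)                         # unlinkat(..., "foo", AT_REMOVEDIR)
            os.unlink("bar_file", dir_fd=bar_fd)  # unlinkat(5, "bar_file", 0)
            try:
                # unlinkat(4, "../bar", AT_REMOVEDIR): the lookup of ".."
                # starts from the directory that was just removed.
                os.rmdir("../bar", dir_fd=foo_fd)
            except OSError as e:
                return errno.errorcode.get(e.errno, str(e.errno))
            return "removed"
        finally:
            os.close(bar_fd)
            os.close(foo_fd)


if __name__ == "__main__":
    print(replay_remove_sequence())
```

Going by the traces above, this would be expected to print "ENOENT" where the last call fails and "removed" where it succeeds; the result may also depend on the file system in use.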

Thoughts?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 24 May 2015 11:59:02 GMT)

Message #14 received at 20597 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Ludovic Courtès <ludo <at> gnu.org>, 
 Andy Patterson <ajpatter <at> uwaterloo.ca>
Cc: 20597 <at> debbugs.gnu.org, bug-gnulib <at> gnu.org
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 12:57:56 +0100
On 24/05/15 12:33, Ludovic Courtès wrote:
> [...]
>
> So this looks like a 4.0.2 kernel bug that Gnulib’s unlinkat should
> perhaps work around.
> 
> Thoughts?

Maybe. How widely deployed was 4.0.2? (It's not used in Red Hat land, for example.)
How many versions was the bug present for?
If it was just a fleeting issue, then there is less incentive to work around it.

cheers,
Pádraig





Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 24 May 2015 13:54:02 GMT)

Message #17 received at 20597 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 20597 <at> debbugs.gnu.org, bug-gnulib <at> gnu.org,
 Andy Patterson <ajpatter <at> uwaterloo.ca>
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 15:53:07 +0200
Pádraig Brady <P <at> draigBrady.com> skribis:

> On 24/05/15 12:33, Ludovic Courtès wrote:

[...]

>> unlinkat(4, "foo_file", 0)              = 0
>> unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
>> unlinkat(5, "bar_file", 0)              = 0
>> unlinkat(4, "../bar", AT_REMOVEDIR)     = -1 ENOENT (No such file or directory)
>> --8<---------------cut here---------------end--------------->8---
>> 
>> Contrast this with the same thing on 4.0.4-gnu:
>> 
>> --8<---------------cut here---------------start------------->8---
>> unlinkat(4, "foo_file", 0)              = 0
>> unlinkat(AT_FDCWD, "foo", AT_REMOVEDIR) = 0
>> unlinkat(5, "bar_file", 0)              = 0
>> unlinkat(4, "../bar", AT_REMOVEDIR)     = 0
>> --8<---------------cut here---------------end--------------->8---
>> 
>> So this looks like a 4.0.2 kernel bug that Gnulib’s unlinkat should
>> perhaps work around.
>> 
>> Thoughts?
>
> Maybe. How widely deployed was 4.0.2? (It's not used in Red Hat land, for example.)
> How many versions was the bug present for?

I don’t know, and I haven’t been able to find traces of a fix in that
area in the kernel.

OTOH, after rereading the analysis at
<http://lists.gnu.org/archive/html/bug-tar/2014-08/msg00010.html>, it
may be that the 4.0.2 behavior is POSIX-conforming, in which case we’d
rather fix tar (or its tests) instead:

  The BSD behavior appears to be in line with POSIX.  unlinkat() with
  AT_REMOVEDIR is equivalent to rmdir(), whose specification says:

    If one or more processes have the directory open when the last
    link is removed, the dot and dot-dot entries, if present, shall
    be removed before rmdir() returns and no new entries may be created
    in the directory, but the directory shall not be removed until
    all references to the directory are closed.

  Without "..", the path resolution of the subsequent unlinkat() call
  should--or at least can--fail.
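
One part of the quoted rmdir() specification, "no new entries may be created in the directory", is easy to observe directly. The following is a small sketch under stated assumptions: the function name and the ‘victim’ and ‘new_entry’ names are hypothetical, and it only checks that creating an entry fails, without relying on a particular errno.

```python
import os
import tempfile


def create_in_removed_directory():
    """Try to create a file in a directory that was rmdir'ed while still open.

    Returns the OSError raised by the attempt, or None if it succeeded.
    """
    with tempfile.TemporaryDirectory() as top:
        victim = os.path.join(top, "victim")
        os.mkdir(victim)
        fd = os.open(victim, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.rmdir(victim)  # last link removed while fd is still open
            try:
                new_fd = os.open("new_entry", os.O_WRONLY | os.O_CREAT,
                                 0o644, dir_fd=fd)
            except OSError as e:
                return e
            os.close(new_fd)
            return None
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(create_in_removed_directory())
```

Whether ".." can still be resolved from such a directory is the part the specification leaves open, which is why the sketch does not test it.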

WDYT?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 24 May 2015 14:20:03 GMT)

Message #20 received at 20597 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 20597 <at> debbugs.gnu.org, bug-gnulib <at> gnu.org,
 Andy Patterson <ajpatter <at> uwaterloo.ca>
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 15:19:03 +0100
On 24/05/15 14:53, Ludovic Courtès wrote:
> [...]
>
> OTOH, after rereading the analysis at
> <http://lists.gnu.org/archive/html/bug-tar/2014-08/msg00010.html>, it
> may be that the 4.0.2 behavior is POSIX-conforming, in which case we’d
> rather fix tar (or its tests) instead:
>
> [...]
>
> WDYT?

Yes, I agree; either behavior is possible.

thanks,
Pádraig




Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Sun, 24 May 2015 14:49:02 GMT)

Message #23 received at 20597 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>, Ludovic Courtès <ludo <at> gnu.org>
Cc: bug-gnulib <at> gnu.org, Andy Patterson <ajpatter <at> uwaterloo.ca>,
 20597 <at> debbugs.gnu.org
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 07:48:26 -0700
Pádraig Brady wrote:
> Yes I agree, either behavior is possible

In that case let's change the test case to not run afoul of this
implementation-defined behavior.  I don't think we need to change tar, as it's a
contrived test case that users are not likely to run into.
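
To illustrate what "not running afoul" could look like at the system-call level, here is a hedged sketch in Python. It is not the change that was made to the tar test suite (the thread's final message says the tests were skipped in Guix instead), and the function name is hypothetical. Both directories are removed relative to a descriptor for their parent, so no lookup ever passes through ".." of a removed directory.

```python
import os
import tempfile


def remove_via_parent():
    """Remove foo and bar relative to their parent; return what is left."""
    with tempfile.TemporaryDirectory() as top:
        for name in ("foo", "bar"):
            os.mkdir(os.path.join(top, name))
            with open(os.path.join(top, name, name + "_file"), "w") as f:
                f.write("x\n")

        parent_fd = os.open(top, os.O_RDONLY | os.O_DIRECTORY)
        try:
            for name in ("foo", "bar"):
                # Every path is resolved from the parent, which still exists.
                os.unlink(os.path.join(name, name + "_file"), dir_fd=parent_fd)
                os.rmdir(name, dir_fd=parent_fd)
        finally:
            os.close(parent_fd)
        return sorted(os.listdir(top))


if __name__ == "__main__":
    print(remove_via_parent())  # prints []
```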




Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Mon, 25 May 2015 12:52:02 GMT)

Message #26 received at 20597 <at> debbugs.gnu.org:

From: Andy Patterson <ajpatter <at> uwaterloo.ca>
To: Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Sun, 24 May 2015 17:43:51 -0400
Hi Ludo,
> hydra.gnu.org is unfortunately too often overloaded these days, so you
> probably arrived on a bad day.  Nevertheless, the solution to this
> specific issue is for you to use substitutes to circumvent the bug
> described below.

It seems to always be busy for me; I tried on different days and at
different times of day, but always got stuck at the same place. I
believe that disabling substitutes will essentially be a requirement if
I want to install Guix, until the server becomes available again.

Thanks for your work on this.





Information forwarded to bug-guix <at> gnu.org:
bug#20597; Package guix. (Mon, 25 May 2015 12:56:02 GMT)

Message #29 received at 20597 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Patterson <ajpatter <at> uwaterloo.ca>
Cc: 20597 <at> debbugs.gnu.org
Subject: Re: ‘unlinkat’ bug in Linux 4.0.2 leads to tar test failure
Date: Mon, 25 May 2015 14:55:06 +0200
(Please keep 20597 <at> debbugs.gnu.org copied.)

Andy Patterson <ajpatter <at> uwaterloo.ca> skribis:

>> hydra.gnu.org is unfortunately too often overloaded these days, so you
>> probably arrived on a bad day.  Nevertheless, the solution to this
>> specific issue is for you to use substitutes to circumvent the bug
>> described below.
>
> It seems to always be busy for me; I tried on different days and at
> different times of day, but always got stuck at the same place. I
> believe that disabling substitutes will essentially be a requirement if
> I want to install Guix, until the server becomes available again.

hydra.gnu.org is occasionally very slow, but it’s available most of the
time.  If you never managed to get anything from it, could it be that
there’s something else preventing you from accessing it, such as an
incorrect network configuration or firewall rules?

Thanks,
Ludo’.




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Mon, 15 Jun 2015 22:31:04 GMT)

Notification sent to Andrew Patterson <ajpatter <at> uwaterloo.ca>:
bug acknowledged by developer. (Mon, 15 Jun 2015 22:31:06 GMT)

Message #34 received at 20597-done <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andrew Patterson <ajpatter <at> uwaterloo.ca>
Cc: 20597-done <at> debbugs.gnu.org
Subject: Re: bug#20597: GNU tar fails test suite
Date: Tue, 16 Jun 2015 00:30:41 +0200
Andrew Patterson <ajpatter <at> uwaterloo.ca> skribis:

> Failed tests:
> GNU tar 1.28 test suite test groups:
>
>  NUM: FILE-NAME:LINE     TEST-GROUP-NAME
>       KEYWORDS
>
>  161: remfiles08a.at:28  remove-files deleting two subdirs in -c/non-incr. mode
>       create remove-files remfiles08 remfiles08a
>  163: remfiles08c.at:28  remove-files deleting two subdirs in -r mode
>       create append remove-files remfiles08 remfiles08c

Commit d3b4c13 (in core-updates) skips these tests, as suggested at
<http://lists.gnu.org/archive/html/bug-tar/2015-06/msg00001.html>.

Thanks,
Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 14 Jul 2015 11:24:05 GMT)
