GNU bug report logs - #41607
Deleted store items are not actually deleted

Previous Next

Package: guix;

Reported by: Leo Famulari <leo <at> famulari.name>

Date: Fri, 29 May 2020 19:10:02 UTC

Severity: normal

Done: Chris Marusich <cmmarusich <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 41607 in the body.
You can then email your comments to 41607 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 29 May 2020 19:10:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Leo Famulari <leo <at> famulari.name>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 29 May 2020 19:10:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Stephen Scheck <singularsyntax <at> gmail.com>
Cc: bug-guix <at> gnu.org
Subject: Deleted store items are not actually deleted
Date: Fri, 29 May 2020 15:09:42 -0400
From the discussion "Guix Docker image inflation" [0] on help-guix:

On Fri, May 29, 2020 at 02:29:53PM -0400, Stephen Scheck wrote:
> # Now try to delete it...
> root <at> localhost /gnu/store# guix gc --delete
> /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
> finding garbage collector roots...
> [0 MiB] deleting
> '/gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules'
> deleting `/gnu/store/trash'
> deleting unused links...
> note: currently hard linking saves 1181.36 MiB
> 
> # Still there...
> root <at> localhost /gnu/store# du -hs
> /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
> 210M /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules

Okay, something is definitely not right.

[0] https://lists.gnu.org/archive/html/help-guix/2020-05/msg00235.html




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 29 May 2020 23:37:01 GMT) Full text and rfc822 format available.

Message #8 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Stephen Scheck <singularsyntax <at> gmail.com>, 41607 <at> debbugs.gnu.org
Cc: Marius Bakke <marius <at> gnu.org>, Leo Famulari <leo <at> famulari.name>
Subject: bug#41607: Deleted store items are not actually deleted
Date: Sat, 30 May 2020 01:36:06 +0200
-help-guix

On Sat, 30 May 2020 at 00:11, Stephen Scheck <singularsyntax <at> gmail.com> wrote:
>> Do you have '/var/' in your Docker image?  Because it looks like the same than:

> If you'd like, you can fetch the exact same image and look around yourself:
>
> docker pull singularsyntax/guix:master-a5374cd # same as singularsyntax/guix:latest

Thanks!
I have already did with singularsyntax/guix-bootstrap-alpine but then
I hit some issues... Well another story. :-)

Why there is (in both images) 13 "layers"?


> CONTAINER=`docker run --detach --tty --privileged singularsyntax/guix:master-a5374cd`
> docker exec --interactive --tty $CONTAINER /run/current-system/profile/bin/bash --login

--8<---------------cut here---------------start------------->8---
for f in $(guix gc --list-dead); do du -sh $f; done | sort -h
[...]
103M    /gnu/store/flcfw8pcs2s0wnm78lp1jwid5lglky3j-guix-packages-base
103M    /gnu/store/hl8h9krwiqwn51gydc13jz4846gq55pr-guix-packages-base
106M    /gnu/store/mnvkhj7crqcd0d1xa0zqa3hd1f5hmv83-guix-packages-base
107M    /gnu/store/6sggbpgg0zkbgxwf3wa2j15dis8z7cr1-guix-packages-base
107M    /gnu/store/m0fv2xmfif5pxnfb1bscfvgyfx0x6xdc-guix-packages-base
107M    /gnu/store/n339sr8c63f0nzja6yl8zfwy1jklj19j-guix-packages-base
107M    /gnu/store/plaay02w581vx9ilyiv93sl1lw54n7h5-guix-packages-base
205M    /gnu/store/5mhn1ynxvy7jihsknsnv3yspkkvc0r5s-guix-2e59ae238-modules
205M    /gnu/store/8z9qc2bvq8azc08p4miq77yf2agk07aq-guix-843e77205-modules
205M    /gnu/store/brbwlbnx56ms50kklyqk9fsf0xkwjjf9-guix-498e2e669-modules
205M    /gnu/store/d3h4b7nvnms8d03ddi9b481dlxpykl7l-guix-5e3d16994-modules
205M    /gnu/store/ibgjq1ampj8bldrabbsnwik2sr0gg3as-guix-a43fe7acd-modules
205M    /gnu/store/l3amdz5xyhflg5wdzlxr2685dq5glic2-guix-527ab3125-modules
210M    /gnu/store/0vwg9aqzs5xrk10vcs4dl105s3f42ilf-guix-b1affd477-modules
210M    /gnu/store/47aack48aczpzm635axsy4jf2pvmwrv0-guix-ef1d475b0-modules
210M    /gnu/store/hz2rn2l0jixg91q4rsdcwc489y71ll29-guix-05e1edf22-modules
210M    /gnu/store/mych9fchln22pbhpc5syxyymx4hz496y-guix-8bd0b533b-modules
210M    /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
--8<---------------cut here---------------end--------------->8---

Well, there is 11 generations which corresponds to

--8<---------------cut here---------------start------------->8---
guix describe
Generation 12   May 29 2020 22:51:28    (current)
  guix a5374cd
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: a5374cde918cfeae5c16b43b9f2dd2b24bc3564d
--8<---------------cut here---------------end--------------->8---


But "docker pull" downloaded 13 layers, well because these numbers are
closed, is it meaningful?  Really naive question.
Other said, is it not something from the Docker side?

Well, it is like Guix deletes something but this something is empty
(from the Guix side):

--8<---------------cut here---------------start------------->8---
guix gc -D /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
finding garbage collector roots...
[0 MiB] deleting
'/gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules'
deleting `/gnu/store/trash'
deleting unused links...
note: currently hard linking saves 1181.36 MiB

du -sh /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
210M    /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
--8<---------------cut here---------------end--------------->8---

abd the "real" target is still something on the filesytem side (du -sh).


Bah, I am not enough familiar with Docker...


Good night,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 31 May 2020 04:57:01 GMT) Full text and rfc822 format available.

Message #11 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Chris Marusich <cmmarusich <at> gmail.com>
To: Leo Famulari <leo <at> famulari.name>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Sat, 30 May 2020 21:56:38 -0700
[Message part 1 (text/plain, inline)]
Hi Leo, Stephen, and others,

I originally wrote a very detailed email describing my investigation of
this issue and the results.  However, I accidentally deleted it, so
please bear with me as I write a rough summary instead.

Leo Famulari <leo <at> famulari.name> writes:

>>From the discussion "Guix Docker image inflation" [0] on help-guix:
>
> On Fri, May 29, 2020 at 02:29:53PM -0400, Stephen Scheck wrote:
>> # Now try to delete it...
>> root <at> localhost /gnu/store# guix gc --delete
>> /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
>> finding garbage collector roots...
>> [0 MiB] deleting
>> '/gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules'
>> deleting `/gnu/store/trash'
>> deleting unused links...
>> note: currently hard linking saves 1181.36 MiB
>> 
>> # Still there...
>> root <at> localhost /gnu/store# du -hs
>> /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
>> 210M /gnu/store/x7ns2xcp8lfg24zq7gr3y8ffczn1nsxp-guix-d79c917f2-modules
>
> Okay, something is definitely not right.
>
> [0] https://lists.gnu.org/archive/html/help-guix/2020-05/msg00235.html

There are two problems.  One is that Stephen's images are getting larger
every day.  The other is that Guix is failing to GC dead store items in
the Docker container.  The latter does not cause the former.

The reason Guix is failing to GC dead items in the Docker container is
because those dead items are not on the "top layer", so Docker returns
an EXDEV error:

https://docs.docker.com/storage/storagedriver/overlayfs-driver/

"Renaming directories: Calling rename(2) for a directory is allowed only
when both the source and the destination path are on the top
layer. Otherwise, it returns EXDEV error ('cross-device link not
permitted'). Your application needs to be designed to handle EXDEV and
fall back to a 'copy and unlink' strategy."

You can observe this by running guix-daemon with strace in the
container, and watching what happens when you try to delete one of the
offending store items (make sure it is a directory).  For example:

--8<---------------cut here---------------start------------->8---
685   rename("/gnu/store/xib50iqk3w1gw9l770mad59m9bi3bcpc-manual-database", "/gnu/store/trash/xib50iqk3w1gw9l770mad59m9bi3bcpc-manual-database") = -1 EXDEV (Invalid cross-device link)
--8<---------------cut here---------------end--------------->8---

In most cases, when guix-daemon GC's a dead directory, it does this
(see: nix/libstore/gc.cc):

- Create a trash directory (usually /gnu/store/trash)
- Move dead directories into the trash directory.
- Delete the trash directory.

The trash directory is on the "top layer" because it gets created in the
running container.  However, in practice many store items from lower
layers are made dead when Stephen's script runs "guix pull" and deletes
the old profiles.  If any of those store items were directories,
guix-daemon will fail to GC them because of an XDEV error.  If this is
confusing to you, I suggest you experiment with Docker a little bit, and
look closely at the steps that Stephen's script is running.  I outlined
this in the email I accidentally deleted, but I'm a little too tired to
reproduce it all a second time.  I hope you'll understand.

Should Guix do anything about this?  We could change guix-daemon to take
correct action in the face of an XDEV error.  We could also improve the
logging, since currently it silently swallows the XDEV error.

However, even if we make those changes and Guix is able to GC the dead
store items, it would not prevent Stephen's images from growing in size
without bound.  There would still be many store items that came from a
prior image (i.e., a lower layer), became dead after running "guix pull"
and deleting old profile generations, and still exist in the prior
layer, even though they are not visible in the running container.  This
is due to Docker's design, in which the visible file system is the
result of stitching together all the layers with overlayfs.

To work around the issue, Stephen can build the images from the same
base image, rather than daisy-chaining new images from old ones.  That
way, they would not accumulate layers without bound.

-- 
Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 31 May 2020 08:25:02 GMT) Full text and rfc822 format available.

Message #14 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Stephen Scheck <singularsyntax <at> gmail.com>, 41607 <at> debbugs.gnu.org
Cc: Chris Marusich <cmmarusich <at> gmail.com>, help-guix <help-guix <at> gnu.org>
Subject: Re: Guix Docker image inflation
Date: Sun, 31 May 2020 10:24:25 +0200
Dear Stephen,

Follow ups of
https://lists.gnu.org/archive/html/help-guix/2020-05/msg00249.html
and bug#41607 CC http://issues.guix.gnu.org/41607


On Sat, 30 May 2020 at 19:02, Stephen Scheck <singularsyntax <at> gmail.com> wrote:

> You can convince yourself of this by doing something like the following:
>
>     docker run <some-linux-image>
>     docker exec <container-id> dd if=/dev/urandom of=/RANDOM-DATA
> bs=1048576 count=1024
>     docker commit <container-id>
>     docker exec <container-id> rm /RANDOM-DATA
>     docker commit <container-id>

It does not convince myself and maybe I am doing wrongly but it is not
what I am observing for an example with more than 2 'commits'.  Here
my session, based on your images rename "fresh" because it will happen
on any image.

--8<---------------cut here---------------start------------->8---
$ docker images
REPOSITORY                      TAG                 IMAGE ID
 CREATED             SIZE
fresh                           latest              c36cef8306d5
 3 weeks ago         1.06GB
singularsyntax/guix-bootstrap   1.1.0-alpine-3.11   c36cef8306d5
 3 weeks ago         1.06GB

$ CONTAINER=`docker run --detach --tty --privileged fresh`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # dd if=/dev/urandom of=/DATA bs=1234567 count=1024
dd if=/dev/urandom of=/DATA bs=1234567 count=1024
1024+0 records in
1024+0 records out
/ # exit
exit

$ HASH=`docker commit $CONTAINER` && docker tag $HASH add-data
$ docker stop $CONTAINER && docker rm $CONTAINER
cb89992b76ace2afe5dc6e082c8de121c483dfeeb688d89849713e2cf90b68c7
cb89992b76ace2afe5dc6e082c8de121c483dfeeb688d89849713e2cf90b68c7

$ CONTAINER=`docker run --detach --tty --privileged add-data`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # rm /DATA
rm /DATA
/ # dd if=/dev/urandom of=/OTHER bs=1234567 count=1024
dd if=/dev/urandom of=/OTHER bs=1234567 count=1024
1024+0 records in
1024+0 records out
/ # exit
exit

$ HASH=`docker commit $CONTAINER` && docker tag $HASH rm-data-add-other
$ docker stop $CONTAINER && docker rm $CONTAINER
93e9afe593226ec29669efe8515b47487f455d4ad5e012cc67372c2549ec7256
93e9afe593226ec29669efe8515b47487f455d4ad5e012cc67372c2549ec7256

$ CONTAINER=`docker run --detach --tty --privileged rm-data-add-other`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # rm /OTHER
rm /OTHER
/ # exit
exit

$ HASH=`docker commit $CONTAINER` && docker tag $HASH rm-other

$ docker stop $CONTAINER && docker rm $CONTAINER
469b341c2f394ef05f5f43a5d96239853e3552d979028a457a9bdd1096a8fab4
469b341c2f394ef05f5f43a5d96239853e3552d979028a457a9bdd1096a8fab4

$ docker images
REPOSITORY                      TAG                 IMAGE ID
 CREATED              SIZE
rm-other                        latest              b80d300aa755
 23 seconds ago       3.59GB
rm-data-add-other               latest              de551eac1d55
 About a minute ago   3.59GB
add-data                        latest              6a563daccccd
 3 minutes ago        2.32GB
fresh                           latest              c36cef8306d5
 3 weeks ago          1.06GB
singularsyntax/guix-bootstrap   1.1.0-alpine-3.11   c36cef8306d5
 3 weeks ago          1.06GB

$ CONTAINER=`docker run --detach --tty --privileged rm-other`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # ls /
ls /
bin    dev    etc    gnu    home   lib    media  mnt    opt    proc
root   run    sbin   srv    sys    tmp    usr    var
/ # exit
exit
--8<---------------cut here---------------end--------------->8---

> You'll end up with two new images - the first one should be about 1 GB
> larger than the base image,
> the second one the same size.

As you see, the image 'rm-other' does not container either /DATA or
/OTHER and its size is not the same than the initial one 'fresh'.  So
I do not know if the correct Docker terminology is "layer" because the
issue is definitely on the Docker side and not on the Guix side.


Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 31 May 2020 09:28:02 GMT) Full text and rfc822 format available.

Message #17 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Chris Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>,
 Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Sun, 31 May 2020 11:27:36 +0200
Hi Chris,

Thank you for the detailed explanations.

On Sun, 31 May 2020 at 06:57, Chris Marusich <cmmarusich <at> gmail.com> wrote:

> The trash directory is on the "top layer" because it gets created in the
> running container.  However, in practice many store items from lower
> layers are made dead when Stephen's script runs "guix pull" and deletes
> the old profiles.  If any of those store items were directories,
> guix-daemon will fail to GC them because of an XDEV error.  If this is
> confusing to you, I suggest you experiment with Docker a little bit, and
> look closely at the steps that Stephen's script is running.  I outlined
> this in the email I accidentally deleted, but I'm a little too tired to
> reproduce it all a second time.  I hope you'll understand.
>
> Should Guix do anything about this?  We could change guix-daemon to take
> correct action in the face of an XDEV error.  We could also improve the
> logging, since currently it silently swallows the XDEV error.

What is the correct action?  Because somehow these items cannot be
removed of the Docker chained images.
However, the logging could report the ``error", e.g., "guix gc" lists
all the items and their size and currently it says "[0MiB] deleting"
so instead it could say "[XDEV error] not valid" with some words about
that in the manual, section "Invking guix gc".  WDYT?



Cheers,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 31 May 2020 10:51:02 GMT) Full text and rfc822 format available.

Message #20 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Legoll <vincent.legoll <at> gmail.com>
To: zimoun <zimon.toutoune <at> gmail.com>,
 Stephen Scheck <singularsyntax <at> gmail.com>, 41607 <at> debbugs.gnu.org
Cc: help-guix <help-guix <at> gnu.org>
Subject: Re: Guix Docker image inflation
Date: Sun, 31 May 2020 12:50:26 +0200
Hello,

maybe you can try:

docker export <CONTAINER ID> | docker import - img_name

This should flatten the layers back to a single one.

-- 
Vincent Legoll





Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 31 May 2020 17:59:01 GMT) Full text and rfc822 format available.

Message #23 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Vincent Legoll <vincent.legoll <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, help-guix <help-guix <at> gnu.org>,
 Stephen Scheck <singularsyntax <at> gmail.com>
Subject: Re: Guix Docker image inflation
Date: Sun, 31 May 2020 19:58:13 +0200
Hi Vincent,

On Sun, 31 May 2020 at 12:50, Vincent Legoll <vincent.legoll <at> gmail.com> wrote:

> docker export <CONTAINER ID> | docker import - img_name

I do not know if it really works here.  Maybe I am doing incorrectly...

--8<---------------cut here---------------start------------->8---
$ docker images --format "{{.Size}}\t{{.Repository}}"
959MB    4reexport
960MB    3clean
960MB    2remove-hello
959MB    1install-hello
578MB    0new-fresh
1.06GB    fresh
1.06GB    singularsyntax/guix-bootstrap
--8<---------------cut here---------------end--------------->8---

Well, and the interesting part is:

--8<---------------cut here---------------start------------->8---
$ CONTAINER=`docker run --detach --tty --privileged 4reexport`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # /root/.config/guix/current/bin/guix gc --list-live | grep hello
/root/.config/guix/current/bin/guix gc --list-live | grep hello
finding garbage collector roots...
determining live/dead paths...
/ # /root/.config/guix/current/bin/guix gc --list-dead | grep hello
/root/.config/guix/current/bin/guix gc --list-dead | grep hello
finding garbage collector roots...
determining live/dead paths...
/gnu/store/kg9mirg6xbvzcp0a98v7326n1nvvwgsj-hello-2.10
/ # /root/.config/guix/current/bin/guix gc --references
/gnu/store/kg9mirg6xbvzcp0a98v7326n1nvvwgsj-hello-2.10
/root/.config/guix/current/bin/guix gc --references
/gnu/store/kg9mirg6xbvzcp0a98v7326n1nvvwgsj-he
llo-2.10
guix gc: error: path
`/gnu/store/kg9mirg6xbvzcp0a98v7326n1nvvwgsj-hello-2.10' is not valid
/ # exit
--8<---------------cut here---------------end--------------->8---



Just for the record, the commands run:

--8<---------------cut here---------------start------------->8---
$ CONTAINER=`docker run --detach --tty --privileged fresh`
$ CMD='CMD "/root/.config/guix/current/bin/guix-daemon"
"--build-users-group=guixbuild"'
$ docker export $CONTAINER \
   | docker import --change $CMD - 0new-fresh

$ CONTAINER=`docker run --detach --tty --privileged 0new-fresh`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # /root/.config/guix/current/bin/guix install hello
/ # exit

$ docker stop $CONTAINER
$ HASH=`docker commit $CONTAINER` && docker tag $HASH 1install-hello

$ CONTAINER=`docker run --detach --tty --privileged 1install-hello`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # /root/.config/guix/current/bin/guix remove hello
/ # exit

$ docker stop $CONTAINER
$ HASH=`docker commit $CONTAINER` && docker tag $HASH 2remove-hello

$ CONTAINER=`docker run --detach --tty --privileged 2remove-hello`
$ docker exec --interactive --tty $CONTAINER /bin/sh
/ # /root/.config/guix/current/bin/guix pull -d
/ # /root/.config/guix/current/bin/guix package -d
/ # /root/.config/guix/current/bin/guix gc
/ # exit
$ docker stop $CONTAINER
$ HASH=`docker commit $CONTAINER` && docker tag $HASH 3clean

$ CONTAINER=`docker run --detach --tty --privileged 3clean`
$ docker export $CONTAINER | docker import --change $CMD - 4reexport
--8<---------------cut here---------------end--------------->8---

where I cheated with $CMD which does not as is but the full 'CMD...'
has to be typed after '--change'.


All the best,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Thu, 04 Jun 2020 11:59:02 GMT) Full text and rfc822 format available.

Message #26 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Chris Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>,
 Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Thu, 04 Jun 2020 13:58:17 +0200
Hi,

Chris Marusich <cmmarusich <at> gmail.com> skribis:

> The reason Guix is failing to GC dead items in the Docker container is
> because those dead items are not on the "top layer", so Docker returns
> an EXDEV error:
>
> https://docs.docker.com/storage/storagedriver/overlayfs-driver/
>
> "Renaming directories: Calling rename(2) for a directory is allowed only
> when both the source and the destination path are on the top
> layer. Otherwise, it returns EXDEV error ('cross-device link not
> permitted'). Your application needs to be designed to handle EXDEV and
> fall back to a 'copy and unlink' strategy."
>
> You can observe this by running guix-daemon with strace in the
> container, and watching what happens when you try to delete one of the
> offending store items (make sure it is a directory).  For example:
>
> 685   rename("/gnu/store/xib50iqk3w1gw9l770mad59m9bi3bcpc-manual-database", "/gnu/store/trash/xib50iqk3w1gw9l770mad59m9bi3bcpc-manual-database") = -1 EXDEV (Invalid cross-device link)
>
> In most cases, when guix-daemon GC's a dead directory, it does this
> (see: nix/libstore/gc.cc):
>
> - Create a trash directory (usually /gnu/store/trash)
> - Move dead directories into the trash directory.
> - Delete the trash directory.
>
> The trash directory is on the "top layer" because it gets created in the
> running container.  However, in practice many store items from lower
> layers are made dead when Stephen's script runs "guix pull" and deletes
> the old profiles.  If any of those store items were directories,
> guix-daemon will fail to GC them because of an XDEV error.  If this is
> confusing to you, I suggest you experiment with Docker a little bit, and
> look closely at the steps that Stephen's script is running.  I outlined
> this in the email I accidentally deleted, but I'm a little too tired to
> reproduce it all a second time.  I hope you'll understand.

Interesting, thanks for the analysis!

> Should Guix do anything about this?  We could change guix-daemon to take
> correct action in the face of an XDEV error.  We could also improve the
> logging, since currently it silently swallows the XDEV error.

I guess we could delete recursively right away upon EXDEV.  It should be
just two lines of code, right?

> To work around the issue, Stephen can build the images from the same
> base image, rather than daisy-chaining new images from old ones.  That
> way, they would not accumulate layers without bound.

Maybe that too.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Thu, 04 Jun 2020 18:51:01 GMT) Full text and rfc822 format available.

Message #29 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Chris Marusich <cmmarusich <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>,
 Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Thu, 04 Jun 2020 11:50:05 -0700
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

>> Should Guix do anything about this?  We could change guix-daemon to take
>> correct action in the face of an XDEV error.  We could also improve the
>> logging, since currently it silently swallows the XDEV error.
>
> I guess we could delete recursively right away upon EXDEV.  It should be
> just two lines of code, right?

I'll try making the change and report back.  Yes, there are other cases
where we immediately delete without moving into the trash directory
(e.g., when the trash directory fails to be created), so it seems OK.

-- 
Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 05 Jun 2020 09:33:01 GMT) Full text and rfc822 format available.

Message #32 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Marusich <cmmarusich <at> gmail.com>
To: Chris Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Stephen Scheck <singularsyntax <at> gmail.com>, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Fri, 05 Jun 2020 02:32:20 -0700
[Message part 1 (text/plain, inline)]
Chris Marusich <cmmarusich <at> gmail.com> writes:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>>> Should Guix do anything about this?  We could change guix-daemon to take
>>> correct action in the face of an XDEV error.  We could also improve the
>>> logging, since currently it silently swallows the XDEV error.
>>
>> I guess we could delete recursively right away upon EXDEV.  It should be
>> just two lines of code, right?
>
> I'll try making the change and report back.  Yes, there are other cases
> where we immediately delete without moving into the trash directory
> (e.g., when the trash directory fails to be created), so it seems OK.

Here is a patch.  Turns out it's was just a one line change!  If nobody
has any further feedback on it, I'll go ahead and merge it to the master
branch in the next couple days.

I tested it in one of the Docker containers provided by Stephen which
exhibited the problem.  I built the new Guix inside the Docker container
and verified that a path which was previously unable to be GC'd due to
the EXDEV error, was now able to be successfully GC'd.

My understanding is that the only reason why the guix-daemon attempts to
move dead directories to the trash directory is to save time on
deleting, since large directories could take a while to fully delete.
If there is any reason why it might be unsafe to delete the directories
directly in case of EXDEV (I cannot think of any), please let me know.

-- 
Chris
[0001-daemon-Handle-EXDEV-when-moving-to-trash-directory.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 05 Jun 2020 09:37:02 GMT) Full text and rfc822 format available.

Message #35 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Christopher Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Fri, 5 Jun 2020 11:36:17 +0200
Hi Chris,

On Fri, 5 Jun 2020 at 11:33, Christopher Marusich <cmmarusich <at> gmail.com> wrote:
> Chris Marusich <cmmarusich <at> gmail.com> writes:
> > Ludovic Courtès <ludo <at> gnu.org> writes:

> Here is a patch.  Turns out it's was just a one line change!  If nobody
> has any further feedback on it, I'll go ahead and merge it to the master
> branch in the next couple days.

Really cool!  How can I test it?


All the best,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 05 Jun 2020 16:23:02 GMT) Full text and rfc822 format available.

Message #38 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Christopher Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Stephen Scheck <singularsyntax <at> gmail.com>,
 Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Fri, 05 Jun 2020 18:21:56 +0200
Howdy!

Christopher Marusich <cmmarusich <at> gmail.com> skribis:

> Here is a patch.  Turns out it's was just a one line change!  If nobody
> has any further feedback on it, I'll go ahead and merge it to the master
> branch in the next couple days.

Yay!

> From 505481a6a22819a42320f693988c3f8e13ded080 Mon Sep 17 00:00:00 2001
> From: Chris Marusich <cmmarusich <at> gmail.com>
> Date: Thu, 4 Jun 2020 23:26:19 -0700
> Subject: [PATCH] daemon: Handle EXDEV when moving to trash directory.
>
> Fixes <https://bugs.gnu.org/41607>.
> Reported by Stephen Scheck <singularsyntax <at> gmail.com>.
>
> * nix/libstore/gc.cc (LocalStore::deletePathRecursive): When we try to
> move a dead directory into the trashDir using rename(2) but it returns
> an EXDEV error, just delete the directory instead.  This can happen in a
> Docker container when the directory is not on the "top layer".
> ---
>  nix/libstore/gc.cc | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
> index 8bc4e01eb0..845fe278c7 100644
> --- a/nix/libstore/gc.cc
> +++ b/nix/libstore/gc.cc
> @@ -455,7 +455,10 @@ void LocalStore::deletePathRecursive(GCState & state, const Path & path)
>                  throw SysError(format("unable to rename `%1%' to `%2%'") % path % tmp);
>              state.bytesInvalidated += size;
>          } catch (SysError & e) {
> -            if (e.errNo == ENOSPC) {
> +            // In a Docker container, rename(2) returns EXDEV when the source
> +            // and destination are not both on the "top layer".  See:
> +            // https://bugs.gnu.org/41607
> +            if (e.errNo == ENOSPC || e.errNo == EXDEV) {
>                  printMsg(lvlInfo, format("note: can't create move `%1%': %2%") % path % e.msg());
>                  deleteGarbage(state, path);
>              }

For consistency with (most) of the code, I’d suggest a /* */ comment.

Otherwise LGTM, thank you!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Fri, 05 Jun 2020 17:06:01 GMT) Full text and rfc822 format available.

Message #41 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Scheck <singularsyntax <at> gmail.com>
To: Christopher Marusich <cmmarusich <at> gmail.com>
Cc: 41607 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 zimoun <zimon.toutoune <at> gmail.com>, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Fri, 5 Jun 2020 12:05:24 -0400
[Message part 1 (text/plain, inline)]
Very cool - thanks Chris!

In the meantime, I've updated my build script to externalize the Guix
environment from the Docker container.
So far the daily builds are staying nice and small at ~197MB and one layer.
The images and updated script are
here if anybody is curious:

https://hub.docker.com/r/singularsyntax/guix/tags
https://gitlab.com/singularsyntax-docker-hub/guix/-/blob/master/.gitlab-ci.yml

GitLab allows you to cache files between job stages and even full pipeline
runs. I first tried to cache /var/guix
and /gnu/store and mount them inside a container in which to perform `guix
pull` etc. However, it seems
that handling hard links on externally mounted file systems from within a
container is problematic. I think
passing `--disable-deduplication` to guix-daemon might resolve it, but I
couldn't figure out where/how to
change the Shepherd configuration to do that. So instead, I just copy the
directories into and out of the
container at start and end of the process. It seems to work. Mounting would
probably speed up the process
quite a bit if I could make it work.

Cheers,
-Steve

On Fri, Jun 5, 2020 at 5:32 AM Christopher Marusich <cmmarusich <at> gmail.com>
wrote:

> Chris Marusich <cmmarusich <at> gmail.com> writes:
>
> > Ludovic Courtès <ludo <at> gnu.org> writes:
> >
> >>> Should Guix do anything about this?  We could change guix-daemon to
> take
> >>> correct action in the face of an XDEV error.  We could also improve the
> >>> logging, since currently it silently swallows the XDEV error.
> >>
> >> I guess we could delete recursively right away upon EXDEV.  It should be
> >> just two lines of code, right?
> >
> > I'll try making the change and report back.  Yes, there are other cases
> > where we immediately delete without moving into the trash directory
> > (e.g., when the trash directory fails to be created), so it seems OK.
>
> Here is a patch.  Turns out it's was just a one line change!  If nobody
> has any further feedback on it, I'll go ahead and merge it to the master
> branch in the next couple days.
>
> I tested it in one of the Docker containers provided by Stephen which
> exhibited the problem.  I built the new Guix inside the Docker container
> and verified that a path which was previously unable to be GC'd due to
> the EXDEV error, was now able to be successfully GC'd.
>
> My understanding is that the only reason why the guix-daemon attempts to
> move dead directories to the trash directory is to save time on
> deleting, since large directories could take a while to fully delete.
> If there is any reason why it might be unsafe to delete the directories
> directly in case of EXDEV (I cannot think of any), please let me know.
>
> --
> Chris
>
[Message part 2 (text/html, inline)]

Reply sent to Chris Marusich <cmmarusich <at> gmail.com>:
You have taken responsibility. (Sun, 07 Jun 2020 01:32:02 GMT) Full text and rfc822 format available.

Notification sent to Leo Famulari <leo <at> famulari.name>:
bug acknowledged by developer. (Sun, 07 Jun 2020 01:32:02 GMT) Full text and rfc822 format available.

Message #46 received at 41607-close <at> debbugs.gnu.org (full text, mbox):

From: Chris Marusich <cmmarusich <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>,  Stephen Scheck
 <singularsyntax <at> gmail.com>,  zimoun <zimon.toutoune <at> gmail.com>
Cc: 41607-close <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Sat, 06 Jun 2020 18:31:08 -0700
[Message part 1 (text/plain, inline)]
Hi everyone,

I have committed the fix in d445c30ea6 and updated the guix package
definition in ecbde6505c to ensure that the next time "guix pull" is
run, the new guix-daemon version will be used.

Ludovic Courtès <ludo <at> gnu.org> writes:

> For consistency with (most) of the code, I’d suggest a /* */ comment.

OK - I did this in d445c30ea6.

Stephen Scheck <singularsyntax <at> gmail.com> writes:

> I've updated my build script to externalize the Guix environment from
> the Docker container.  So far the daily builds are staying nice and
> small at ~197MB and one layer.  The images and updated script are here
> if anybody is curious:
>
> https://hub.docker.com/r/singularsyntax/guix/tags
> https://gitlab.com/singularsyntax-docker-hub/guix/-/blob/master/.gitlab-ci.yml

Neat!  I'm glad you were figure out a way to to keep the image from
growing in size.  It's good to know of a way to do that.

Thank you for your help and have a nice day!  I'm closing this bug
report.

-- 
Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Sun, 07 Jun 2020 10:08:02 GMT) Full text and rfc822 format available.

Message #49 received at 41607-close <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Chris Marusich <cmmarusich <at> gmail.com>
Cc: 41607-close <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>,
 Stephen Scheck <singularsyntax <at> gmail.com>, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Sun, 7 Jun 2020 12:07:29 +0200
Hi Chris,

Thank you.

On Sun, 7 Jun 2020 at 03:31, Chris Marusich <cmmarusich <at> gmail.com> wrote:

> I have committed the fix in d445c30ea6 and updated the guix package
> definition in ecbde6505c to ensure that the next time "guix pull" is
> run, the new guix-daemon version will be used.

Sorry to be late, I have things like that:

-8<---------------cut here---------------start------------->8---
[548 MiB] deleting '/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f'
note: can't create move
`/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1 unable to
rename `/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f' to
`/gnu/store/trash/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f':
Invalid cross-device link
[553 MiB] deleting '/gnu/store/yaajjri3cks8rhkb29z08d3q5waww0dx-packages.scm'
--8<---------------cut here---------------end--------------->8---

I should do something wrong...


All the best,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Mon, 08 Jun 2020 07:44:02 GMT) Full text and rfc822 format available.

Message #52 received at 41607 <at> debbugs.gnu.org (full text, mbox):

From: Chris Marusich <cmmarusich <at> gmail.com>
To: zimoun <zimon.toutoune <at> gmail.com>
Cc: 41607-close <at> debbugs.gnu.org, 41607 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>,
 Stephen Scheck <singularsyntax <at> gmail.com>, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#41607: Deleted store items are not actually deleted
Date: Mon, 08 Jun 2020 00:43:48 -0700
[Message part 1 (text/plain, inline)]
Hi zimoun,

zimoun <zimon.toutoune <at> gmail.com> writes:

> On Sun, 7 Jun 2020 at 03:31, Chris Marusich <cmmarusich <at> gmail.com> wrote:
>
>> I have committed the fix in d445c30ea6 and updated the guix package
>> definition in ecbde6505c to ensure that the next time "guix pull" is
>> run, the new guix-daemon version will be used.
>
> Sorry to be late, I have things like that:
>
> [548 MiB] deleting '/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f'
> note: can't create move
> `/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1 unable to
> rename `/gnu/store/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f' to
> `/gnu/store/trash/dkzivzn17qilmqdfpyps62b395wxhshh-openssl-1.1.1f':
> Invalid cross-device link
> [553 MiB] deleting '/gnu/store/yaajjri3cks8rhkb29z08d3q5waww0dx-packages.scm'

That looks normal to me.  Previously, the cross-device link error would
be silently ignored and the directory would not be deleted - no message
would be printed about the cross-device link error.

Now, however, the cross-device link error is reported, and the directory
is deleted.  If you find that the directory is not actually deleted,
please let me know so I can look into it more.  When I tested it on my
end, the directory was in fact deleted.

-- 
Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#41607; Package guix. (Mon, 08 Jun 2020 07:44:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 06 Jul 2020 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 294 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.