GNU bug report logs - #44261
running a daemon with userns in relocateble pack breaks

Previous Next

Package: guix;

Reported by: Jan Nieuwenhuizen <janneke <at> gnu.org>

Date: Tue, 27 Oct 2020 19:50:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 44261 in the body.
You can then email your comments to 44261 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#44261; Package guix. (Tue, 27 Oct 2020 19:50:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jan Nieuwenhuizen <janneke <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 27 Oct 2020 19:50:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jan Nieuwenhuizen <janneke <at> gnu.org>
To: bug-guix <at> gnu.org
Subject: running a daemon with userns in relocateble pack breaks
Date: Tue, 27 Oct 2020 20:49:19 +0100
[Message part 1 (text/plain, inline)]
Hi!

As mentioned on IRC, running a daemon from a guix relocatable pack on a
foreign distro using the user namespace feature is troublesome: it looks
as if the daemon "loses" (its view of) the file-system once the parent
process that creates the daemon exits.

I'm attatching a package description for a test package "vork".  It
builds a program "test" that forks the program "daemon".

The daemon program reads a character from /dev/urandom, prints it,
and sleeps for a second; 10 times.

The "test" parent program exits after 5 seconds.  When the parent
program exits, the daemon crashes.

To reproduce, put "vork.scm" in a fresh directory and do something like:

--8<---------------cut here---------------start------------->8---
fakeroot tar xf $(GUIX_PACKAGE_PATH=. guix pack --relocatable\
  --symlink=/gnu/bin=bin guile shepherd vork --no-offload)
guix gc -D $(guix build -f vork.scm)
touch /tmp/daemon.log
tail -f /tmp/daemon.log &
GUILE_LOAD_COMPILED_PATH=$PWD/$(ls -1d gnu/store/*profile)/lib/guile/3.0/ccache\
:$PWD/$(ls -1d gnu/store/*profile)/lib/guile/3.0/site-ccache gnu/bin/test
--8<---------------cut here---------------end--------------->8---

this gives something like

--8<---------------cut here---------------start------------->8---
.daemon-start
daemon: 10 ?
.daemon: 9 ?
.daemon: 8 T
.daemon: 7 ^O
.daemon: 6 O

exit
20:42:38 janneke <at> dundal:~/src/guix/master/vork [env]
$ 20:42:38 janneke <at> dundal:~/src/guix/master/vork [env]
$ Backtrace:
Exception thrown while printing backtrace:
In procedure public-lookup: Module named (system repl debug) does not exist
--8<---------------cut here---------------end--------------->8---

Greetings,
Janneke

[vork.scm (application/octet-stream, attachment)]
[Message part 3 (text/plain, inline)]
-- 
Jan Nieuwenhuizen <janneke <at> gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

Information forwarded to bug-guix <at> gnu.org:
bug#44261; Package guix. (Tue, 27 Oct 2020 20:10:02 GMT) Full text and rfc822 format available.

Message #8 received at 44261 <at> debbugs.gnu.org (full text, mbox):

From: Jan Nieuwenhuizen <janneke <at> gnu.org>
To: 44261 <at> debbugs.gnu.org
Cc: ludo <at> gnu.org
Subject: Re: bug#44261: running a daemon with userns in relocateble pack breaks
Date: Tue, 27 Oct 2020 21:09:02 +0100
[Message part 1 (text/plain, inline)]
Jan Nieuwenhuizen writes:

Hi!

I tried the hint from Ludovic to use MS_PRIVATE in the attached patch
and that works for me; not sure if we want a test and even less sure how
to write that...

Janneke

[0001-pack-Support-running-of-daemons-in-user-namespace-ba.patch (text/x-patch, inline)]
From fd3104608c3fa6a2375b6c7df0862e5479976b39 Mon Sep 17 00:00:00 2001
From: "Jan (janneke) Nieuwenhuizen" <janneke <at> gnu.org>
Date: Tue, 27 Oct 2020 20:55:11 +0100
Subject: [PATCH] pack: Support running of daemons in user namespace-based
 relocation.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8

Add relocation via ld.so and fakechroot.
Fixes <https://bugs.gnu.org/44261>.

* gnu/packages/aux-files/run-in-namespace.c (bind_mount): Add 'MS_PRIVATE' to
avoid unmounting the bind mount when parent process exits.

Co-authored-by: Ludovic Courtès <ludo <at> gnu.org>
---
 gnu/packages/aux-files/run-in-namespace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/aux-files/run-in-namespace.c b/gnu/packages/aux-files/run-in-namespace.c
index 52a16a5362..67cea4fcd5 100644
--- a/gnu/packages/aux-files/run-in-namespace.c
+++ b/gnu/packages/aux-files/run-in-namespace.c
@@ -1,5 +1,6 @@
 /* GNU Guix --- Functional package management for GNU
    Copyright (C) 2018, 2019, 2020 Ludovic Courtès <ludo <at> gnu.org>
+   Copyright (C) 2020 Jan (janneke) Nieuwenhuizen <janneke <at> gnu.org>
 
    This file is part of GNU Guix.
 
@@ -138,7 +139,7 @@ bind_mount (const char *source, const struct dirent *entry,
     close (open (target, O_WRONLY | O_CREAT));
 
   return mount (source, target, "none",
-		MS_BIND | MS_REC | MS_RDONLY, NULL);
+		MS_BIND | MS_PRIVATE | MS_REC | MS_RDONLY, NULL);
 }
 
 #if HAVE_EXEC_WITH_LOADER
-- 
Jan Nieuwenhuizen <janneke <at> gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

[Message part 3 (text/plain, inline)]
-- 
Jan Nieuwenhuizen <janneke <at> gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

Severity set to 'important' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 30 Oct 2020 16:20:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#44261; Package guix. (Fri, 30 Oct 2020 21:34:02 GMT) Full text and rfc822 format available.

Message #13 received at 44261 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jan Nieuwenhuizen <janneke <at> gnu.org>
Cc: 44261 <at> debbugs.gnu.org
Subject: Re: bug#44261: running a daemon with userns in relocateble pack breaks
Date: Fri, 30 Oct 2020 22:33:42 +0100
[Message part 1 (text/plain, inline)]
Hello!

As discussed on IRC, my initial advice about MS_PRIVATE was misguided.
The real issue is the “rm_rf (new_root);” call, which removes the root
directory and thus leaves child processes (the daemon) with nothing.

The attached patch adds a test loosely based on yours and a fix for
that.  The fix (for the “userns” engine) is to make NEW_ROOT a tmpfs,
such that upon completion, all we need to do is to unmount it and remove
it; it lives on as the root file system of child processes.

In the “fakechroot” case, we have to leave NEW_ROOT behind, which is not
great but acceptable (it’s user-owned, #o700, and it’s under /tmp).  The
test only checks the “userns” engine.

If you confirm that it works for you and looks reasonable, we can apply
it.

Thanks,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/packages/aux-files/run-in-namespace.c b/gnu/packages/aux-files/run-in-namespace.c
index 52a16a5362..1d64ef9f44 100644
--- a/gnu/packages/aux-files/run-in-namespace.c
+++ b/gnu/packages/aux-files/run-in-namespace.c
@@ -41,6 +41,7 @@
 #include <fcntl.h>
 #include <dirent.h>
 #include <sys/syscall.h>
+#include <sys/prctl.h>
 
 /* Whether we're building the ld.so/libfakechroot wrapper.  */
 #define HAVE_EXEC_WITH_LOADER						\
@@ -258,11 +259,20 @@ exec_in_user_namespace (const char *store, int argc, char *argv[])
 {
   /* Spawn @WRAPPED_PROGRAM@ in a separate namespace where STORE is
      bind-mounted in the right place.  */
-  int err;
+  int err, is_tmpfs;
   char *new_root = mkdtemp (strdup ("/tmp/guix-exec-XXXXXX"));
   char *new_store = concat (new_root, original_store);
   char *cwd = get_current_dir_name ();
 
+  /* Become the new parent of grand-children when their parent dies.  */
+  prctl (PR_SET_CHILD_SUBREAPER, 1);
+
+  /* Optionally, make NEW_ROOT a tmpfs.  That way, if we have to leave it
+     behind because there are sub-processes still running when this wrapper
+     exits, it's OK.  */
+  err = mount ("none", new_root, "tmpfs", 0, NULL);
+  is_tmpfs = (err == 0);
+
   /* Create a child with separate namespaces and set up bind-mounts from
      there.  That way, bind-mounts automatically disappear when the child
      exits, which simplifies cleanup for the parent.  Note: clone is more
@@ -300,6 +310,7 @@ exec_in_user_namespace (const char *store, int argc, char *argv[])
       /* Failure: user namespaces not supported.  */
       fprintf (stderr, "%s: error: 'clone' failed: %m\n", argv[0]);
       rm_rf (new_root);
+      free (new_root);
       break;
 
     default:
@@ -312,10 +323,27 @@ exec_in_user_namespace (const char *store, int argc, char *argv[])
 	write_id_map (child, "uid_map", getuid ());
 	write_id_map (child, "gid_map", getgid ());
 
-	int status;
+	int status, status_other;
 	waitpid (child, &status, 0);
-	chdir ("/");			  /* avoid EBUSY */
-	rm_rf (new_root);
+
+	if (is_tmpfs)
+	  {
+	    /* NEW_ROOT lives on in child processes and we no longer need it
+	       to exist as an empty directory in the global namespace.  */
+	    umount (new_root);
+	    rmdir (new_root);
+	  }
+	/* Check whether there are child processes left.  If there are none,
+	   we can remove NEW_ROOT just fine.  Conversely, if there are
+	   processes left (for example because this wrapper's child forked),
+	   we have to leave NEW_ROOT behind so that those processes can still
+	   access their root file system (XXX).  */
+	else if (waitpid (-1 , &status_other, WNOHANG) == -1)
+	  {
+	    chdir ("/");			  /* avoid EBUSY */
+	    rm_rf (new_root);
+	  }
+
 	free (new_root);
 
 	if (WIFEXITED (status))
@@ -490,6 +518,9 @@ exec_with_loader (const char *store, int argc, char *argv[])
 
   setenv ("FAKECHROOT_BASE", new_root, 1);
 
+  /* Become the new parent of grand-children when their parent dies.  */
+  prctl (PR_SET_CHILD_SUBREAPER, 1);
+
   pid_t child = fork ();
   switch (child)
     {
@@ -507,11 +538,18 @@ exec_with_loader (const char *store, int argc, char *argv[])
 
     default:
       {
-  	int status;
+  	int status, status_other;
 	waitpid (child, &status, 0);
-	chdir ("/");			  /* avoid EBUSY */
-	rm_rf (new_root);
-	free (new_root);
+
+	/* If there are child processes still running, leave NEW_ROOT around
+	   so they can still access it.  XXX: In that case NEW_ROOT is left
+	   behind.  */
+	if (waitpid (-1 , &status_other, WNOHANG) == -1)
+	  {
+	    chdir ("/");			  /* avoid EBUSY */
+	    rm_rf (new_root);
+	    free (new_root);
+	  }
 
 	close (2);			/* flushing stderr should be silent */
 
diff --git a/tests/guix-pack-relocatable.sh b/tests/guix-pack-relocatable.sh
index a960ecd209..88cbe63b59 100644
--- a/tests/guix-pack-relocatable.sh
+++ b/tests/guix-pack-relocatable.sh
@@ -58,6 +58,19 @@ run_without_store ()
     fi
 }
 
+# Wait for the given file to show up.  Error out if it doesn't show up in a
+# timely fashion.
+wait_for_file ()
+{
+    i=0
+    while ! test -f "$1" && test $i -lt 20
+    do
+	sleep 0.3
+	i=`expr $i + 1`
+    done
+    test -f "$1"
+}
+
 test_directory="`mktemp -d`"
 export test_directory
 trap 'chmod -Rf +w "$test_directory"; rm -rf "$test_directory"' EXIT
@@ -129,6 +142,65 @@ case "`uname -m`" in
 	;;
 esac
 
+if unshare -r true
+then
+    # Check what happens if the wrapped binary forks and leaves child
+    # processes behind, like a daemon.  The root file system should remain
+    # available to those child processes.  See <https://bugs.gnu.org/44261>.
+    cat > "$test_directory/manifest.scm" <<EOF
+(use-modules (guix))
+
+(define daemon
+  (program-file "daemon"
+                #~(begin
+                    (use-modules (ice-9 match)
+                                 (ice-9 ftw))
+
+                    (call-with-output-file "parent-store"
+                      (lambda (port)
+                        (write (scandir (ungexp (%store-prefix)))
+                               port)))
+
+                    (match (primitive-fork)
+                      (0 (sigaction SIGHUP (const #t))
+                         (call-with-output-file "pid"
+                           (lambda (port)
+                             (display (getpid) port)))
+                         (pause)
+                         (call-with-output-file "child-store"
+                           (lambda (port)
+                             (write (scandir (ungexp (%store-prefix)))
+                                    port))))
+                      (_ #t)))))
+
+(define package
+  (computed-file "package"
+                 #~(let ((out (ungexp output)))
+                     (mkdir out)
+                     (mkdir (string-append out "/bin"))
+                     (symlink (ungexp daemon)
+                              (string-append out "/bin/daemon")))))
+
+(manifest (list (manifest-entry
+                  (name "daemon")
+                  (version "0")
+                  (item package))))
+EOF
+
+    tarball="$(guix pack -S /bin=bin -R -m "$test_directory/manifest.scm")"
+    (cd "$test_directory"; tar xf "$tarball")
+
+    # Run '/bin/daemon', which forks, then wait for the child, send it SIGHUP
+    # so that it dumps its view of the store, and make sure the child and
+    # parent both see the same store contents.
+    (cd "$test_directory"; run_without_store ./bin/daemon)
+    wait_for_file "$test_directory/pid"
+    kill -HUP $(cat "$test_directory/pid")
+    diff -u "$test_directory/parent-store" "$test_directory/child-store"
+
+    chmod -Rf +w "$test_directory"; rm -rf "$test_directory"/*
+fi
+
 # Ensure '-R' works with outputs other than "out".
 tarball="`guix pack -R -S /share=share groff:doc`"
 (cd "$test_directory"; tar xf "$tarball")

Information forwarded to bug-guix <at> gnu.org:
bug#44261; Package guix. (Fri, 30 Oct 2020 22:06:01 GMT) Full text and rfc822 format available.

Message #16 received at 44261 <at> debbugs.gnu.org (full text, mbox):

From: Jan Nieuwenhuizen <janneke <at> gnu.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 44261 <at> debbugs.gnu.org
Subject: Re: bug#44261: running a daemon with userns in relocateble pack breaks
Date: Fri, 30 Oct 2020 23:05:08 +0100
Ludovic Courtès writes:

Hi!

> As discussed on IRC, my initial advice about MS_PRIVATE was misguided.
> The real issue is the “rm_rf (new_root);” call, which removes the root
> directory and thus leaves child processes (the daemon) with nothing.

Yes, I'm not entirely sure what I thought to see yesterday; anyway the
rm_rf (new_root) is indeed the thing that makes the daemon crash.

> The attached patch adds a test loosely based on yours and a fix for
> that.  The fix (for the “userns” engine) is to make NEW_ROOT a tmpfs,
> such that upon completion, all we need to do is to unmount it and remove
> it; it lives on as the root file system of child processes.
>
> In the “fakechroot” case, we have to leave NEW_ROOT behind, which is not
> great but acceptable (it’s user-owned, #o700, and it’s under /tmp).  The
> test only checks the “userns” engine.

Yes, I think this is acceptable.

> If you confirm that it works for you and looks reasonable, we can apply
> it.

Yes, this works.  The test and also my reproducer now work fine.

Thanks a lot!
Janneke

-- 
Jan Nieuwenhuizen <janneke <at> gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sat, 31 Oct 2020 22:20:01 GMT) Full text and rfc822 format available.

Notification sent to Jan Nieuwenhuizen <janneke <at> gnu.org>:
bug acknowledged by developer. (Sat, 31 Oct 2020 22:20:02 GMT) Full text and rfc822 format available.

Message #21 received at 44261-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Jan Nieuwenhuizen <janneke <at> gnu.org>
Cc: 44261-done <at> debbugs.gnu.org
Subject: Re: bug#44261: running a daemon with userns in relocateble pack breaks
Date: Sat, 31 Oct 2020 23:19:49 +0100
Hi,

Jan Nieuwenhuizen <janneke <at> gnu.org> skribis:

> Ludovic Courtès writes:

[...]

>> The attached patch adds a test loosely based on yours and a fix for
>> that.  The fix (for the “userns” engine) is to make NEW_ROOT a tmpfs,
>> such that upon completion, all we need to do is to unmount it and remove
>> it; it lives on as the root file system of child processes.
>>
>> In the “fakechroot” case, we have to leave NEW_ROOT behind, which is not
>> great but acceptable (it’s user-owned, #o700, and it’s under /tmp).  The
>> test only checks the “userns” engine.
>
> Yes, I think this is acceptable.
>
>> If you confirm that it works for you and looks reasonable, we can apply
>> it.
>
> Yes, this works.  The test and also my reproducer now work fine.

Thanks for checking, I pushed the fix as
bfe82fe2f6e9f34c0774fe2114cdc7e937ba8bd2.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#44261; Package guix. (Sun, 01 Nov 2020 06:08:01 GMT) Full text and rfc822 format available.

Message #24 received at 44261-done <at> debbugs.gnu.org (full text, mbox):

From: Jan Nieuwenhuizen <janneke <at> gnu.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 44261-done <at> debbugs.gnu.org
Subject: Re: bug#44261: running a daemon with userns in relocateble pack breaks
Date: Sun, 01 Nov 2020 07:07:47 +0100
Ludovic Courtès writes:

Hello,

> Jan Nieuwenhuizen <janneke <at> gnu.org> skribis:
>
>> Ludovic Courtès writes:
>
> [...]
>
>>> If you confirm that it works for you and looks reasonable, we can apply
>>> it.
>>
>> Yes, this works.  The test and also my reproducer now work fine.
>
> Thanks for checking, I pushed the fix as
> bfe82fe2f6e9f34c0774fe2114cdc7e937ba8bd2.

\o/

Thank you
Janneke.

-- 
Jan Nieuwenhuizen <janneke <at> gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 29 Nov 2020 12:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 120 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.