GNU bug report logs - #34407
Shepherd won't close socket on exit

Package: guix

Reported by: nly <nly <at> disroot.org>

Date: Sat, 9 Feb 2019 19:58:01 UTC

Severity: normal

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34407 in the body.
You can then email your comments to 34407 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Sat, 09 Feb 2019 19:58:02 GMT)

Acknowledgement sent to nly <nly <at> disroot.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 09 Feb 2019 19:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: nly <nly <at> disroot.org>
To: bug-guix <at> gnu.org
Subject: Shepherd won't close socket on exit
Date: Sun, 10 Feb 2019 01:26:55 +0530
Shepherd does not remove its socket file when it exits via `herd stop root`,
so the next `shepherd` fails with "bind: Address already in use".

~$ rm /run/user/1000/shepherd/socket
~$ shepherd
Service root has been started.
~$ herd status
error: connect: /run/user/1000/shepherd/socket: Connection refused
~$ shepherd
Service root has been started.
~$ Backtrace:
           3 (primitive-load "/run/current-system/profile/bin/shepherd")
In shepherd.scm:
   250:24  2 (main . _)
     48:6  1 (open-server-socket _)
In unknown file:
           0 (bind #<input-output: socket 13> #(1 "/run/user/1000/shepherd/socket") #)

ERROR: In procedure bind:
In procedure bind: Address already in use
  C-c C-c
~$ 




Information forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Sat, 09 Feb 2019 20:22:02 GMT)

Message #8 received at 34407 <at> debbugs.gnu.org:

From: nly <nly <at> disroot.org>
To: 34407 <at> debbugs.gnu.org
Subject: Re: bug#34407: Acknowledgement (Shepherd won't close socket on exit)
Date: Sun, 10 Feb 2019 01:51:47 +0530
Looks like I pasted the wrong thing in the previous message. I realized
after I saw it in the mail.

This time I've checked it twice. `herd stop root` leaves shepherd in a weird
limbo: connections to the old socket are refused, and a new shepherd cannot
bind to it.

Of course, I can `rm /run/user/1000/shepherd/socket`.
--------------------------------------------------------------------------------
nly <at> uf ~$ herd status
error: connect: /run/user/1000/shepherd/socket: No such file or directory
nly <at> uf ~$ shepherd
Service root has been started.
nly <at> uf ~$ herd status
Started:
 + root
Stopped:
 - icecat
 - jack
 - mpv
 - mpv-jack
 - tor
 - transmission
nly <at> uf ~$ herd stop root
nly <at> uf ~$ herd status
error: connect: /run/user/1000/shepherd/socket: Connection refused
nly <at> uf ~$ shepherd
Service root has been started.
nly <at> uf ~$ Backtrace:
           3 (primitive-load "/run/current-system/profile/bin/shepherd")
In shepherd.scm:
   250:24  2 (main . _)
     48:6  1 (open-server-socket _)
In unknown file:
           0 (bind #<input-output: socket 13> #(1 "/run/user/1000/shepherd/socket") #)

ERROR: In procedure bind:
In procedure bind: Address already in use
  C-c C-c
nly <at> uf ~$ herd status
error: connect: /run/user/1000/shepherd/socket: Connection refused
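
The three states seen in this transcript (no socket file, a live shepherd,
and a socket file with nobody listening) can be told apart from a script.
This is only a sketch: `shepherd_socket_state` is a hypothetical helper, the
default path assumes the user socket lives under `$XDG_RUNTIME_DIR` (or
`/run/user/UID`), and it assumes `herd` exits with a non-zero status when it
cannot connect.

```shell
# Hypothetical helper: classify the state of a shepherd socket file.
# Prints "missing", "stale", or "live".
shepherd_socket_state() {
    # Assumed default location of the user socket.
    socket=${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/shepherd/socket}

    if [ ! -e "$socket" ]; then
        # Corresponds to "No such file or directory" in the transcript.
        echo missing
    elif herd --socket="$socket" status >/dev/null 2>&1; then
        # A shepherd answered on the socket.
        echo live
    else
        # The file exists but nobody is listening: "Connection refused".
        echo stale
    fi
}
```

In the limbo shown above it would print `stale`, which is the one case where
removing the file by hand is the right move.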




Information forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Wed, 13 Feb 2019 23:06:02 GMT)

Message #11 received at 34407 <at> debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: nly <nly <at> disroot.org>
Cc: 34407 <at> debbugs.gnu.org
Subject: Re: bug#34407: Acknowledgement (Shepherd won't close socket on exit)
Date: Wed, 13 Feb 2019 18:05:11 -0500
Hi,

nly <nly <at> disroot.org> writes:

> Looks like I pasted the wrong thing in the previous message. I realized
> after I saw it in the mail.
>
> This time I've checked it twice. `herd stop root` leaves shepherd in a weird
> limbo: connections to the old socket are refused, and a new shepherd cannot
> bind to it.
>
> Of course, I can `rm /run/user/1000/shepherd/socket`.
> --------------------------------------------------------------------------------
> nly <at> uf ~$ herd status
> error: connect: /run/user/1000/shepherd/socket: No such file or directory
> nly <at> uf ~$ shepherd
> Service root has been started.
> nly <at> uf ~$ herd status
> Started:
>  + root
> Stopped:
>  - icecat
>  - jack
>  - mpv
>  - mpv-jack
>  - tor
>  - transmission
> nly <at> uf ~$ herd stop root
> nly <at> uf ~$ herd status
> error: connect: /run/user/1000/shepherd/socket: Connection refused
> nly <at> uf ~$ shepherd
> Service root has been started.
> nly <at> uf ~$ Backtrace:
>            3 (primitive-load "/run/current-system/profile/bin/shepherd")
> In shepherd.scm:
>    250:24  2 (main . _)
>      48:6  1 (open-server-socket _)
> In unknown file:
>            0 (bind #<input-output: socket 13> #(1 "/run/user/1000/shepherd/socket") #)
>
> ERROR: In procedure bind:
> In procedure bind: Address already in use
>   C-c C-c
> nly <at> uf ~$ herd status
> error: connect: /run/user/1000/shepherd/socket: Connection refused

This has been annoying me as well; my current workaround is to put this
in my ~/.xsession:

--8<---------------cut here---------------start------------->8---
# Start user services
rm -f /run/user/1000/shepherd/socket
shepherd
--8<---------------cut here---------------end--------------->8---
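
A slightly more careful variant of this workaround removes the socket file
only when no shepherd answers on it, so that sourcing the script a second
time does not pull the socket out from under a running instance. Treat it as
a sketch under assumptions: `start_user_shepherd` is a hypothetical name, the
default path assumes the socket lives under `$XDG_RUNTIME_DIR` (or
`/run/user/UID`), and `herd` is assumed to exit non-zero when it cannot
connect.

```shell
# Hypothetical helper: start the user shepherd, removing the socket file
# only when nothing answers on it.
start_user_shepherd() {
    # Assumed default location of the user socket.
    socket=${1:-${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/shepherd/socket}

    if herd --socket="$socket" status >/dev/null 2>&1; then
        # A shepherd is already answering: leave it and its socket alone.
        return 0
    fi

    # Nobody answered, so any file left at this path is stale.
    rm -f -- "$socket"
    shepherd --socket="$socket"
}
```

Calling `start_user_shepherd` from ~/.xsession would replace the two lines
above. The check and the removal are not atomic, so two sessions starting at
the same instant could still race.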

Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Sun, 17 Feb 2019 03:39:02 GMT)

Message #14 received at 34407 <at> debbugs.gnu.org:

From: iyzsong <at> member.fsf.org (宋文武)
To: 34407 <at> debbugs.gnu.org
Cc: guix-devel <at> gnu.org
Subject: [PATCH] shepherd: Delete the socket file upon exit.
Date: Sun, 17 Feb 2019 11:38:16 +0800
Yes, I have the 'rm /run/user/1000/shepherd/socket' workaround in my session
script too...

According to 'man 2 bind', the socket pathname should be deleted when it is
no longer required, so here is a patch to fix this bug:

[0001-shepherd-Delete-the-socket-file-upon-exit.patch (text/x-patch, inline)]
From f171f6adb2fc6ee3bf4d25378c2e7bba109b43d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=AE=8B=E6=96=87=E6=AD=A6?= <iyzsong <at> member.fsf.org>
Date: Sun, 17 Feb 2019 11:27:28 +0800
Subject: [PATCH] shepherd: Delete the socket file upon exit.

Fixes <https://bugs.gnu.org/34407>.

* modules/shepherd.scm (call-with-server-socket): New procedure.
(main): Use it instead of 'open-server-socket'.
---
 modules/shepherd.scm | 65 ++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 26 deletions(-)

diff --git a/modules/shepherd.scm b/modules/shepherd.scm
index e241e7a..314b989 100644
--- a/modules/shepherd.scm
+++ b/modules/shepherd.scm
@@ -49,6 +49,17 @@
       (listen sock 10)
       sock)))
 
+(define (call-with-server-socket file-name proc)
+  "Call PROC, passing it a listening socket at FILE-NAME and deleting the
+socket file at FILE-NAME upon exit of PROC.  Return the values of PROC."
+  (let ((sock (open-server-socket file-name)))
+    (dynamic-wind
+      noop
+      (lambda () (proc sock))
+      (lambda ()
+        (close sock)
+        (delete-file file-name)))))
+
 
 ;; Main program.
 (define (main . args)
@@ -256,32 +267,34 @@
           ;; Get commands from the standard input port.
           (process-textual-commands (current-input-port))
           ;; Process the data arriving at a socket.
-          (let ((sock   (open-server-socket socket-file)))
-
-            ;; Possibly write out our PID, which means we're ready to accept
-            ;; connections.  XXX: What if we daemonized already?
-            (match pid-file
-              ((? string? file)
-               (with-atomic-file-output pid-file
-                 (cute display (getpid) <>)))
-              (#t (display (getpid)))
-              (_  #t))
-
-            (let next-command ()
-              (define (read-from sock)
-                (match (accept sock)
-                  ((command-source . client-address)
-                   (setvbuf command-source (buffering-mode block) 1024)
-                   (process-connection command-source))
-                  (_ #f)))
-              (match (select (list sock) (list) (list) (if poll-services? 0.5 #f))
-                (((sock) _ _)
-                 (read-from sock))
-                (_
-                 #f))
-              (when poll-services?
-                (check-for-dead-services))
-              (next-command)))))))
+          (call-with-server-socket
+           socket-file
+           (lambda (sock)
+
+             ;; Possibly write out our PID, which means we're ready to accept
+             ;; connections.  XXX: What if we daemonized already?
+             (match pid-file
+               ((? string? file)
+                (with-atomic-file-output pid-file
+                  (cute display (getpid) <>)))
+               (#t (display (getpid)))
+               (_  #t))
+
+             (let next-command ()
+               (define (read-from sock)
+                 (match (accept sock)
+                   ((command-source . client-address)
+                    (setvbuf command-source (buffering-mode block) 1024)
+                    (process-connection command-source))
+                   (_ #f)))
+               (match (select (list sock) (list) (list) (if poll-services? 0.5 #f))
+                 (((sock) _ _)
+                  (read-from sock))
+                 (_
+                  #f))
+               (when poll-services?
+                 (check-for-dead-services))
+               (next-command))))))))
 
 ;; Start all of SERVICES, which is a list of canonical names (FIXME?),
 ;; but in a order where all dependencies are fulfilled before we
-- 
2.19.2


Information forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Tue, 19 Feb 2019 19:10:01 GMT)

Message #17 received at 34407 <at> debbugs.gnu.org:

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: iyzsong <at> member.fsf.org (宋文武)
Cc: guix-devel <at> gnu.org, 34407 <at> debbugs.gnu.org
Subject: Re: [PATCH] shepherd: Delete the socket file upon exit.
Date: Tue, 19 Feb 2019 20:08:59 +0100
On Sun, 17 Feb 2019 11:38:16 +0800
iyzsong <at> member.fsf.org (宋文武) wrote:

> Yes, I have the 'rm /run/user/1000/shepherd/socket' workaround in my session
> script too...
> 
> According to 'man 2 bind', the socket pathname should be deleted when no
> longer required, so a patch to fix this bug:

Hmm, I guess you can do that.

But /run is supposed to be a tmpfs, and elogind is supposed to rm -rf
/run/user/1000 once all sessions of that user have terminated in any case, so
how is it left over in the first place?

If the deletion in the case above doesn't work, please report a bug.

If the patch is only meant to let users restart their shepherd without
exiting all their sessions, then I guess that's OK, although unusual.

Does your patch do the right thing if the user's shepherd is already
running? (i.e. keep the socket file)

Information forwarded to bug-guix <at> gnu.org:
bug#34407; Package guix. (Sat, 23 Feb 2019 08:54:01 GMT)

Message #20 received at 34407 <at> debbugs.gnu.org:

From: iyzsong <at> member.fsf.org (宋文武)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: guix-devel <at> gnu.org, 34407 <at> debbugs.gnu.org
Subject: Re: [PATCH] shepherd: Delete the socket file upon exit.
Date: Sat, 23 Feb 2019 16:53:00 +0800
Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> On Sun, 17 Feb 2019 11:38:16 +0800
> iyzsong <at> member.fsf.org (宋文武) wrote:
>
>> Yes, I have the 'rm /run/user/1000/shepherd/socket' workaround in my session
>> script too...
>> 
>> According to 'man 2 bind', the socket pathname should be deleted when no
>> longer required, so a patch to fix this bug:
>
> Hmm, I guess you can do that.
>
> But /run is supposed to be a tmpfs and elogind is supposed to rm -rf /run/user/1000
> after all sessions of that user terminated in any case, so how is it left over
> in the first place?
>

Well, maybe the elogind version I used didn't have this feature, or I
had another user session running...

> If the deletion in the case above doesn't work, please report a bug.

Thanks, good to know, and it indeed works.

>
> If that patch is only in order to enable users to restart user's shepherd
> without exiting all their sessions, then I guess that's ok--although unusual.
>
> Does your patch do the right thing if the user's shepherd is already
> running? (i.e. keep the socket file)

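Yes, it deletes the socket file at exit (not at startup).  If another
shepherd is already running, 'open-server-socket' fails in 'bind' before the
cleanup handler is installed, so the existing socket file is kept.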
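
For readers more at home in shell than in Scheme, the "delete at exit" idea
behind 'call-with-server-socket' can be approximated as follows. This is a
rough analogue rather than what the patch does: `with_socket_cleanup` is a
hypothetical helper, and unlike 'dynamic-wind' it only covers the wrapped
command returning, whether it succeeds or fails.

```shell
# Hypothetical helper: run a command, then delete SOCKET_FILE whether the
# command succeeded or failed, and return the command's exit status.
with_socket_cleanup() {
    socket_file=$1
    shift

    status=0
    "$@" || status=$?   # remember a failure instead of stopping here

    # Cleanup runs after the command, never before it starts.
    rm -f -- "$socket_file"
    return "$status"
}
```

Because the removal happens only after the wrapped command has returned, a
socket file that belongs to an instance that is still running is never
touched at startup.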




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Mon, 08 Apr 2019 09:00:03 GMT)

Notification sent to nly <nly <at> disroot.org>:
bug acknowledged by developer. (Mon, 08 Apr 2019 09:00:03 GMT)

Message #25 received at 34407-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: iyzsong <at> member.fsf.org (宋文武)
Cc: guix-devel <at> gnu.org, 34407-done <at> debbugs.gnu.org
Subject: Re: bug#34407: [PATCH] shepherd: Delete the socket file upon exit.
Date: Mon, 08 Apr 2019 10:58:57 +0200
Hello,

iyzsong <at> member.fsf.org (宋文武) skribis:

> Yes, I have the 'rm /run/user/1000/shepherd/socket' workaround in my session
> script too...

I never had to do that because /run is wiped at boot time, like Danny
wrote.

> According to 'man 2 bind', the socket pathname should be deleted when no
> longer required, so a patch to fix this bug:
>
> From f171f6adb2fc6ee3bf4d25378c2e7bba109b43d8 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?=E5=AE=8B=E6=96=87=E6=AD=A6?= <iyzsong <at> member.fsf.org>
> Date: Sun, 17 Feb 2019 11:27:28 +0800
> Subject: [PATCH] shepherd: Delete the socket file upon exit.
>
> Fixes <https://bugs.gnu.org/34407>.
>
> * modules/shepherd.scm (call-with-server-socket): New procedure.
> (main): Use it instead of 'open-server-socket'.

Pushed, thanks!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 06 May 2019 11:24:04 GMT)

This bug report was last modified 4 years and 356 days ago.


