GNU bug report logs - #44669
Shepherd loses track of elogind

Package: guix;

Reported by: Marius Bakke <marius <at> gnu.org>

Date: Sun, 15 Nov 2020 21:52:02 UTC

Severity: normal

Done: Marius Bakke <marius <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 44669 in the body.
You can then email your comments to 44669 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#44669; Package guix. (Sun, 15 Nov 2020 21:52:02 GMT)

Acknowledgement sent to Marius Bakke <marius <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sun, 15 Nov 2020 21:52:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <marius <at> gnu.org>
To: bug-guix <at> gnu.org
Subject: Shepherd loses track of elogind
Date: Sun, 15 Nov 2020 22:51:17 +0100
[Message part 1 (text/plain, inline)]
Hello,

On a newly-installed i7 system, Shepherd believes that the "elogind"
service is not running.  Yet there is an 'elogind-daemon' process,
spawned by PID 1, preventing subsequent "herd start elogind" invocations
from succeeding.

Information forwarded to bug-guix <at> gnu.org:
bug#44669; Package guix. (Mon, 16 Nov 2020 14:59:01 GMT)

Message #8 received at 44669 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Marius Bakke <marius <at> gnu.org>
Cc: 44669 <at> debbugs.gnu.org
Subject: Re: bug#44669: Shepherd loses track of elogind
Date: Mon, 16 Nov 2020 15:58:38 +0100
Hi Marius,

Marius Bakke <marius <at> gnu.org> writes:

> On a newly-installed i7 system, Shepherd believes that the "elogind"
> service is not running.  Yet there is an 'elogind-daemon' process,
> spawned by PID 1, preventing subsequent "herd start elogind" invocations
> from succeeding.

Could you show the relevant /var/log/messages bits?  That should show
when/why elogind stopped.

That’s from 1.2.0rc1?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#44669; Package guix. (Mon, 16 Nov 2020 17:38:02 GMT)

Message #11 received at 44669 <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <marius <at> gnu.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 44669 <at> debbugs.gnu.org
Subject: Re: bug#44669: Shepherd loses track of elogind
Date: Mon, 16 Nov 2020 18:37:20 +0100
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Marius,
>
> Marius Bakke <marius <at> gnu.org> writes:
>
>> On a newly-installed i7 system, Shepherd believes that the "elogind"
>> service is not running.  Yet there is an 'elogind-daemon' process,
>> spawned by PID 1, preventing subsequent "herd start elogind" invocations
>> from succeeding.
>
> Could you show the relevant /var/log/messages bits?  That should show
> when/why elogind stopped.

Indeed.  It was because I had 'sddm-service-type' configured, which
attempted to communicate with "org.freedesktop.login1" over D-Bus, which
in turn autostarted elogind before shepherd had gotten around to it.

[elogind.log (text/plain, inline)]
Nov 15 21:16:18 localhost dbus-daemon[427]: [system] Activating service name='org.freedesktop.login1' requested by ':1.2' (uid=0 pid=449 comm="/gnu/store/x577n8rs9zcf6ri4aka4pccyj74qxhwh-sddm-0") (using servicehelper)
Nov 15 21:16:18 localhost vmunix: [   46.137561] elogind-daemon[462]: New seat seat0.
Nov 15 21:16:18 localhost vmunix: [   46.138052] elogind-daemon[462]: Watching system buttons on /dev/input/event2 (Power Button)
Nov 15 21:16:18 localhost vmunix: [   46.193372] elogind-daemon[462]: Watching system buttons on /dev/input/event1 (Lid Switch)
Nov 15 21:16:18 localhost vmunix: [   46.193428] elogind-daemon[462]: Watching system buttons on /dev/input/event0 (Sleep Button)
Nov 15 21:16:18 localhost avahi-daemon[444]: Server startup complete. Host name is sirius.local. Local service cookie is 3083842416.
Nov 15 21:16:18 localhost vmunix: [   46.496547] elogind-daemon[462]: Watching system buttons on /dev/input/event3 (AT Translated Set 2 keyboard)
Nov 15 21:16:18 localhost vmunix: [   46.496598] elogind-daemon[462]: Watching system buttons on /dev/input/event4 (ThinkPad Extra Buttons)
Nov 15 21:16:18 localhost dbus-daemon[427]: [system] Successfully activated service 'org.freedesktop.login1'
Nov 15 21:16:18 localhost vmunix: [   46.498084] elogind-daemon[462]: New session c1 of user marius.
Nov 15 21:16:18 localhost shepherd[1]: Service avahi-daemon has been started.
Nov 15 21:16:18 localhost shepherd[1]: Service mcron has been started.
Nov 15 21:16:18 localhost shepherd[1]: Service elogind has been started.
Nov 15 21:16:18 localhost shepherd[1]: Respawning elogind.
[Message part 3 (text/plain, inline)]
> That’s from 1.2.0rc1?

Yes, and also 'master'.  The initial i3 install with 1.2.0rc1 went fine,
it was when I switched to SDDM + autologin (+ sway) that it failed.

Now I no longer use SDDM (or any DM), but I was able to work around it
by adding #:pid-file:

[diff (text/x-patch, inline)]
diff --git a/gnu/services/desktop.scm b/gnu/services/desktop.scm
index 265cf9f35f..6b7d832a44 100644
--- a/gnu/services/desktop.scm
+++ b/gnu/services/desktop.scm
@@ -770,7 +770,8 @@ seats.)"
                    #:environment-variables
                    (list (string-append "ELOGIND_CONF_FILE="
                                         #$(elogind-configuration-file
-                                           config)))))
+                                           config)))
+                   #:pid-file "/run/systemd/elogind.pid"))
          (stop #~(make-kill-destructor)))))
 
 (define elogind-service-type
[Message part 5 (text/plain, inline)]
The race between D-Bus and elogind should probably be handled by having
org.freedesktop.login1 consumers depend on the 'elogind' service instead?

Information forwarded to bug-guix <at> gnu.org:
bug#44669; Package guix. (Mon, 16 Nov 2020 17:50:02 GMT)

Message #14 received at 44669 <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <marius <at> gnu.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 44669 <at> debbugs.gnu.org
Subject: Re: bug#44669: Shepherd loses track of elogind
Date: Mon, 16 Nov 2020 18:49:28 +0100
[Message part 1 (text/plain, inline)]
Marius Bakke <marius <at> gnu.org> writes:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi Marius,
>>
>> Marius Bakke <marius <at> gnu.org> writes:
>>
>>> On a newly-installed i7 system, Shepherd believes that the "elogind"
>>> service is not running.  Yet there is an 'elogind-daemon' process,
>>> spawned by PID 1, preventing subsequent "herd start elogind" invocations
>>> from succeeding.
>>
>> Could you show the relevant /var/log/messages bits?  That should show
>> when/why elogind stopped.
>
> Indeed.  It was because I had 'sddm-service-type' configured, which
> attempted to communicate with "org.freedesktop.login1" over D-Bus, which
> in turn autostarted elogind before shepherd had gotten around to it.

Interestingly I suspected this exact scenario and checked the PPID of
the running elogind process, which was '1'.  When I then found that
adding #:pid-file worked, I did not bother checking the log ...

I would have expected D-Bus to be the parent PID.
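
For anyone wanting to repeat the PPID check described above, here is a minimal sketch. It assumes a Linux system with /proc mounted; the helper name `parent_pid` is made up for this illustration and is not part of Shepherd or elogind.

```python
import os


def parent_pid(pid: int) -> int:
    """Return the parent PID of `pid`, read from Linux's /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            # The kernel reports the parent as a line of the form "PPid:\t<n>".
            if line.startswith("PPid:"):
                return int(line.split()[1])
    raise LookupError(f"no PPid field found for process {pid}")
```

Calling `parent_pid(462)` on the machine from the log would be the equivalent of the check described here. As this message shows, a result of 1 does not by itself prove that shepherd spawned the process.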

Information forwarded to bug-guix <at> gnu.org:
bug#44669; Package guix. (Tue, 17 Nov 2020 08:32:01 GMT)

Message #17 received at 44669 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Marius Bakke <marius <at> gnu.org>
Cc: 44669 <at> debbugs.gnu.org
Subject: Re: bug#44669: Shepherd loses track of elogind
Date: Tue, 17 Nov 2020 09:31:32 +0100
Hi!

Marius Bakke <marius <at> gnu.org> writes:

> Indeed.  It was because I had 'sddm-service-type' configured, which
> attempted to communicate with "org.freedesktop.login1" over D-Bus, which
> in turn autostarted elogind before shepherd had gotten around to it.

Oh.

> Now I no longer use SDDM (or any DM), but I was able to work around it
> by adding #:pid-file:
>
> diff --git a/gnu/services/desktop.scm b/gnu/services/desktop.scm
> index 265cf9f35f..6b7d832a44 100644
> --- a/gnu/services/desktop.scm
> +++ b/gnu/services/desktop.scm
> @@ -770,7 +770,8 @@ seats.)"
>                     #:environment-variables
>                     (list (string-append "ELOGIND_CONF_FILE="
>                                          #$(elogind-configuration-file
> -                                           config)))))
> +                                           config)))
> +                   #:pid-file "/run/systemd/elogind.pid"))
>           (stop #~(make-kill-destructor)))))

LGTM.  Now, if elogind is started behind the shepherd’s back, there’s
still a race: shepherd removes the PID file before spawning the process,
and then waits for that PID file to show up.  Chances are shepherd will
not notice that another elogind is already running, and thus the service
will fail to start.
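
The remaining race can be illustrated with a small model. This is only a sketch of the remove-then-wait sequence described above, not Shepherd's actual code. It assumes that a second elogind instance exits without writing a PID file when another one is already running. The function names, the PID values, and the timeouts are all invented for the example.

```python
import os
import tempfile
import time


def start_and_wait_for_pid_file(pid_file, spawn, timeout=1.0, poll=0.05):
    """Model a PID-file-based start: clear the file, spawn, then poll for it."""
    # Step 1: remove any existing PID file, including one written by a
    # daemon that was started behind the supervisor's back.
    try:
        os.remove(pid_file)
    except FileNotFoundError:
        pass
    # Step 2: launch the daemon (here just a callable standing in for it).
    spawn()
    # Step 3: wait for the PID file to show up, giving up after `timeout`.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(pid_file) as f:
                return int(f.read().strip())
        except (FileNotFoundError, ValueError):
            time.sleep(poll)
    return None  # the start is reported as failed


def demo():
    """Return (result when the race is lost, result when there is no race)."""
    with tempfile.TemporaryDirectory() as tmp:
        pid_file = os.path.join(tmp, "elogind.pid")

        # A bus-activated daemon is already running and wrote its PID file.
        with open(pid_file, "w") as f:
            f.write("462\n")
        # Assumption: the second instance exits without writing a PID file.
        lost = start_and_wait_for_pid_file(pid_file, lambda: None, timeout=0.2)

        # No competing instance: the spawned daemon writes its own PID file.
        def spawn_ok():
            with open(pid_file, "w") as f:
                f.write("500\n")

        won = start_and_wait_for_pid_file(pid_file, spawn_ok, timeout=0.2)
        return lost, won
```

In this model `demo()` returns `(None, 500)`: the first start times out because the only PID file that existed was removed in step 1 and nothing recreates it.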

> The race between D-Bus and elogind should probably be handled by having
> org.freedesktop.login1 consumers depend on the 'elogind' service instead?

Yes, we could do that.  Note that the only reason we don’t simply let
elogind be bus-activated is so that it can handle events like lid close
even before someone has attempted to log in (commit
94a881178af9a9a918ce6de55641daa245c92e73,
<https://issues.guix.gnu.org/27580>).

Thanks,
Ludo’.




Reply sent to Marius Bakke <marius <at> gnu.org>:
You have taken responsibility. (Wed, 18 Nov 2020 21:42:02 GMT)

Notification sent to Marius Bakke <marius <at> gnu.org>:
bug acknowledged by developer. (Wed, 18 Nov 2020 21:42:03 GMT)

Message #22 received at 44669-done <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <marius <at> gnu.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 44669-done <at> debbugs.gnu.org
Subject: Re: bug#44669: Shepherd loses track of elogind
Date: Wed, 18 Nov 2020 22:41:43 +0100
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> writes:

>> Now I no longer use SDDM (or any DM), but I was able to work around it
>> by adding #:pid-file:
>>
>> diff --git a/gnu/services/desktop.scm b/gnu/services/desktop.scm
>> index 265cf9f35f..6b7d832a44 100644
>> --- a/gnu/services/desktop.scm
>> +++ b/gnu/services/desktop.scm
>> @@ -770,7 +770,8 @@ seats.)"
>>                     #:environment-variables
>>                     (list (string-append "ELOGIND_CONF_FILE="
>>                                          #$(elogind-configuration-file
>> -                                           config)))))
>> +                                           config)))
>> +                   #:pid-file "/run/systemd/elogind.pid"))
>>           (stop #~(make-kill-destructor)))))
>
> LGTM.  Now, if elogind is started behind the shepherd’s back, there’s
> still a race: shepherd removes the PID file before spawning the process,
> and then waits for that PID file to show up.  Chances are shepherd will
> not notice that another elogind is already running, and thus the service
> will fail to start.

Right.  If Shepherd actually deletes the PID file before attempting to
start the service, I think I just "won" the race in my testing...

>> The race between D-Bus and elogind should probably be handled by having
>> org.freedesktop.login1 consumers depend on the 'elogind' service instead?
>
> Yes, we could do that.  Note that the only reason we don’t simply let
> elogind be bus-activated is so that it can handle events like lid close
> even before someone has attempted to log in (commit
> 94a881178af9a9a918ce6de55641daa245c92e73,
> <https://issues.guix.gnu.org/27580>).

Interesting.  I wonder what other workarounds there are for this.

For now, I made SDDM simply depend on elogind in commit
0ae9bbe4f5f89e6f597bdb1f6df646fc5f504876.
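
To illustrate why declaring the dependency closes the window, here is a rough model of requirement-ordered startup. It is a sketch only: the requirement sets below are assumptions made for the example rather than a copy of what the commit contains, and Shepherd does not literally run a topological sort.

```python
from graphlib import TopologicalSorter

# Hypothetical requirement graph: each service maps to what must start first.
requirements = {
    "dbus-system": set(),
    "elogind": {"dbus-system"},
    "sddm": {"dbus-system", "elogind"},
}


def start_order(reqs):
    """Return one start order in which every requirement precedes its user."""
    return list(TopologicalSorter(reqs).static_order())
```

With these sets, `start_order(requirements)` gives `['dbus-system', 'elogind', 'sddm']`, so SDDM can no longer reach org.freedesktop.login1 before shepherd has started elogind itself.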

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 17 Dec 2020 12:24:09 GMT)

This bug report was last modified 3 years and 151 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.