GNU bug report logs - #22548
Kernel panic after system reconfiguration

Package: guix;

Reported by: Albin <albin <at> fripost.org>

Date: Wed, 3 Feb 2016 18:32:02 UTC

Severity: normal

Merged with 22545

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 22548 in the body.
You can then email your comments to 22548 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Wed, 03 Feb 2016 18:32:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Albin <albin <at> fripost.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 03 Feb 2016 18:32:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Albin <albin <at> fripost.org>
To: bug-guix <at> gnu.org
Subject: Kernel panic after system reconfiguration
Date: Wed, 3 Feb 2016 19:18:41 +0100
Hi,

With no other changes I just ran guix pull and guix system reconfigure
on my MacBook2,1 which created an unbootable system.

After having completed the reconfiguration I tried to halt and reboot
the system but got this error message each time:

> error: connect: /var/run/shepherd/socket: file or directory does not exist

I did a hard shutdown and rebooted.

Here is a picture of the kernel panic screen:
https://lut.im/h3kmF9hN8D/pnbWoVVQWj7QYPkr.jpg

This is my system configuration:
http://paste.lisp.org/display/306452

The OS was quite bootable after my last reconfiguration on January 26.

Albin







Merged 22545 22548. Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Wed, 03 Feb 2016 21:07:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Wed, 03 Feb 2016 21:29:02 GMT) Full text and rfc822 format available.

Message #10 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: Albin <albin <at> fripost.org>
To: 22548 <at> debbugs.gnu.org
Subject: Re: Kernel panic after system reconfiguration
Date: Wed, 3 Feb 2016 22:08:03 +0100
Hi again,

I got rid of the kernel panic by removing the following from the config
and reconfiguring (as suggested by Mark Weaver):

> (swap-devices '("/swapfile"))

It would be nice to be able to enable swap again though. On my system it
needs to be done with a swap file.

Albin

On 2016-02-03 at 19:18, Albin wrote:
> Hi,
> 
> With no other changes I just ran guix pull and guix system reconfigure
> on my MacBook2,1 which created an unbootable system.
> 
> After having completed the reconfiguration I tried to halt and reboot
> the system but got this error message each time :
> 
>> error: connect: /var/run/shepherd/socket: file or directory does not exist
> 
> I did a hard shutdown and rebooted.
> 
> Here is a picture of the kernel panic screen:
> https://lut.im/h3kmF9hN8D/pnbWoVVQWj7QYPkr.jpg
> 
> This is my system configuration:
> http://paste.lisp.org/display/306452
> 
> The OS was quite bootable after my last reconfiguration on January 26.
> 
> Albin
> 
> 
> 




Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Wed, 03 Feb 2016 22:06:01 GMT) Full text and rfc822 format available.

Message #13 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Albin <albin <at> fripost.org>
Cc: 22548 <at> debbugs.gnu.org
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Wed, 03 Feb 2016 23:05:18 +0100
Albin <albin <at> fripost.org> wrote:

> With no other changes I just ran guix pull and guix system reconfigure
> on my MacBook2,1 which created an unbootable system.
>
> After having completed the reconfiguration I tried to halt and reboot
> the system but got this error message each time :
>
>> error: connect: /var/run/shepherd/socket: file or directory does not exist
>
> I did a hard shutdown and rebooted.

Apologies, that’s the aftermath of the dmd → shepherd transition.

The solution was to run /run/booted-system/profile/sbin/reboot, which
would have been able to talk to the running dmd (whereas after
reconfiguration, ‘reboot’ was the new Shepherd client, which cannot talk
to the old dmd).
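
As a hedged illustration of the workaround above, the following shell
sketch prefers the reboot client from the booted system generation and
falls back to whatever ‘reboot’ is on PATH.  The function name
pick_reboot and its optional directory argument are assumptions made
for this example; only the /run/booted-system/profile/sbin/reboot path
comes from the message.

```shell
#!/bin/sh
# pick_reboot [BOOTED_DIR]
# Print the reboot client that belongs to the system that is currently
# running.  BOOTED_DIR defaults to /run/booted-system; the parameter is
# hypothetical and exists only so the function can be tried on a scratch
# directory.
pick_reboot() {
    booted="${1:-/run/booted-system}/profile/sbin/reboot"
    if [ -x "$booted" ]; then
        printf '%s\n' "$booted"
    else
        # Fall back to the first 'reboot' found on PATH, if any.
        command -v reboot
    fi
}
```

Running "$(pick_reboot)" as root would then invoke the client that can
talk to the init system that actually booted the machine.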

> Here is a picture of the kernel panic screen:
> https://lut.im/h3kmF9hN8D/pnbWoVVQWj7QYPkr.jpg
>
> This is my system configuration:
> http://paste.lisp.org/display/306452
>
> The OS was quite bootable after my last reconfiguration on January 26.

I don’t see anything immediately obvious.  I tried your config in ‘guix
system vm’ and it works fine, so perhaps the problem has to do with
device mapping or similar.

How did you set up the encrypted root partition?  Did you use ‘guix
system reconfigure --no-grub’ and a hand-made grub.cfg?

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Wed, 03 Feb 2016 22:15:02 GMT) Full text and rfc822 format available.

Message #16 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Albin <albin <at> fripost.org>
Cc: 22548 <at> debbugs.gnu.org
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Wed, 03 Feb 2016 17:14:02 -0500
Albin <albin <at> fripost.org> writes:

> Hi again,
>
> I got rid of the kernel panic by removing the following from the config
> and reconfiguring (as suggested by Mark Weaver):
>
>> (swap-devices '("/swapfile"))
>
> It would be nice to be able to enable swap again though. On my system it
> needs to be done with a swap file.

I suspect this never worked, but that previously the error was silently
ignored.  In my case, I had:

  (swap-devices '("/dev/disk/by-label/jojen-swap"))

and /dev/disk went away at some point due to another problem.  For a
long time, I simply had no swap.  With the dmd -> shepherd transition,
it started causing a fatal error during boot, leading to a kernel panic.
Unfortunately, the error message scrolled off the screen very quickly,
obscured by a useless kernel backtrace.
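
One way to catch this class of problem before rebooting is to check
that every path listed in swap-devices exists on the running system.
The sketch below is an assumption about how such a check could look;
check_swap_targets is a hypothetical helper, not part of Guix, and the
paths have to be passed to it by hand.

```shell
#!/bin/sh
# check_swap_targets PATH...
# Return 0 when every given path exists, 1 otherwise, naming each
# missing path on stderr.  Hypothetical helper, not a Guix command.
check_swap_targets() {
    status=0
    for target in "$@"; do
        if [ ! -e "$target" ]; then
            printf 'swap target not found: %s\n' "$target" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, "check_swap_targets /dev/disk/by-label/jojen-swap" would
report the missing path on a system without /dev/disk.  Existence alone
does not prove that the target is usable as swap.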

      Mark




Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Wed, 03 Feb 2016 23:12:02 GMT) Full text and rfc822 format available.

Message #19 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: Albin <albin <at> fripost.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 22548 <at> debbugs.gnu.org
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Wed, 3 Feb 2016 23:45:39 +0100
Hi,

On 2016-02-03 at 23:14, Mark H Weaver wrote:
> Albin <albin <at> fripost.org> writes:
> 
>> Hi again,
>>
>> I got rid of the kernel panic by removing the following from the config
>> and reconfiguring (as suggested by Mark Weaver):
>>
>>> (swap-devices '("/swapfile"))
>>
>> It would be nice to be able to enable swap again though. On my system it
>> needs to be done with a swap file.
> 
> I suspect this never worked, but that before the error was silently
> ignored.  In my case, I had:
> 
>   (swap-devices '("/dev/disk/by-label/jojen-swap"))
> 
> and /dev/disk went away at some point due to another problem.  For a
> long time, I simply had no swap.  With the dmd -> shepherd transition,
> it started causing a fatal error during boot, leading to a kernel panic.
> Unfortunately, the error message scrolled off the screen very quickly,
> obscured by a useless kernel backtrace.
> 
>       Mark
> 

Mark is correct: swap was never enabled in the first place.  I tested
this by booting an old configuration and entering `cat /proc/swaps`,
which returned an empty table.
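
The same check can be scripted.  The sketch below treats /proc/swaps as
a header line followed by one line per active swap area, and reports
whether any area is listed; the function name swap_active and its
optional file argument are assumptions made for illustration.

```shell
#!/bin/sh
# swap_active [FILE]
# Return 0 if FILE (default /proc/swaps) lists at least one swap area
# after its header line, 1 if it lists none, 2 if FILE is unreadable.
swap_active() {
    file="${1:-/proc/swaps}"
    [ -r "$file" ] || return 2
    awk 'NR > 1 && NF > 0 { found = 1 } END { exit !found }' "$file"
}
```

With this, "swap_active || echo 'no swap enabled'" prints the warning
whenever the table is empty.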

Case closed, I guess!

Albin




Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Thu, 04 Feb 2016 12:56:02 GMT) Full text and rfc822 format available.

Message #22 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: Alex Kost <alezost <at> gmail.com>
To: Mark H Weaver <mhw <at> netris.org>
Cc: Albin <albin <at> fripost.org>, 22548 <at> debbugs.gnu.org
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Thu, 04 Feb 2016 15:55:37 +0300
Mark H Weaver (2016-02-04 01:14 +0300) wrote:

> Albin <albin <at> fripost.org> writes:
>
>> Hi again,
>>
>> I got rid of the kernel panic by removing the following from the config
>> and reconfiguring (as suggested by Mark Weaver):
>>
>>> (swap-devices '("/swapfile"))
>>
>> It would be nice to be able to enable swap again though. On my system it
>> needs to be done with a swap file.
>
> I suspect this never worked, but that before the error was silently
> ignored.  In my case, I had:
>
>   (swap-devices '("/dev/disk/by-label/jojen-swap"))
>
> and /dev/disk went away at some point due to another problem.  For a
> long time, I simply had no swap.  With the dmd -> shepherd transition,
> it started causing a fatal error during boot, leading to a kernel panic.
> Unfortunately, the error message scrolled off the screen very quickly,
> obscured by a useless kernel backtrace.

I faced the same kernel panic as I also had "/dev/disk/..." swap device.

Obviously it didn't work for some time when dmd was the init system
(because on GuixSD there is no "/dev/disk/" since… I don't know when as
I've never noticed it before).

And as reported by several people on #guix (I count at least 4 including
me and Mark) a wrong swap device leads to a kernel panic if shepherd is
used as the init system.

Before I realized that it was a wrong swap device, I bisected shepherd
to find out which commit introduced this bug.  It gave me commit
852341e¹: when I reconfigured my system (with a wrong swap) using
shepherd on this commit, I had a kernel panic, while with shepherd on
the previous commit the system booted successfully.

¹ http://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=852341ed0c08941cbdd022135f8bef7be2d7ec54

-- 
Alex




Information forwarded to bug-guix <at> gnu.org:
bug#22548; Package guix. (Thu, 04 Feb 2016 22:51:01 GMT) Full text and rfc822 format available.

Message #25 received at 22548 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Alex Kost <alezost <at> gmail.com>
Cc: Albin <albin <at> fripost.org>, Mark H Weaver <mhw <at> netris.org>,
 22548 <at> debbugs.gnu.org
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Thu, 04 Feb 2016 23:50:29 +0100
Alex Kost <alezost <at> gmail.com> wrote:

> And as reported by several people on #guix (I count at least 4 including
> me and Mark) a wrong swap device leads to a kernel panic if shepherd is
> used as the init system.
>
> Until I realized that it was a wrong swap, I made bisecting on shepherd
> to find out which commit introduced this bug.  It gave me commit
> 852341e¹: when I reconfigured my system (with a wrong swap) using
> shepherd on this commit, I had a kernel panic, while with shepherd on
> the previous commit the system booted successfully.
>
> ¹ http://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=852341ed0c08941cbdd022135f8bef7be2d7ec54

Ooooh, it took me a while but I see how this happens.  This is because
we start services directly from the config file, and anything that goes
wrong there is uncaught, which leads to this:

--8<---------------cut here---------------start------------->8---
Service udev has been started.
srfi-34(#<condition &action-runtime-error [service: #<<service> 184b150> action: start key: system-error arguments: ("swapon" "~S: ~A" ("/dev/disk/foobar" "No such file or directory") (2))] 1ea24c0>)
[    6.856167] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[    6.856167] 
[    6.856869] CPU: 0 PID: 1 Comm: shepherd Not tainted 4.4.1-gnu #1
[    6.857319] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
--8<---------------cut here---------------end--------------->8---

Ludo’.




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Fri, 05 Feb 2016 13:06:02 GMT) Full text and rfc822 format available.

Notification sent to Albin <albin <at> fripost.org>:
bug acknowledged by developer. (Fri, 05 Feb 2016 13:06:02 GMT) Full text and rfc822 format available.

Message #30 received at 22548-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Alex Kost <alezost <at> gmail.com>
Cc: Albin <albin <at> fripost.org>, 22545-done <at> debbugs.gnu.org,
 22548-done <at> debbugs.gnu.org, Mark H Weaver <mhw <at> netris.org>
Subject: Re: bug#22548: Kernel panic after system reconfiguration
Date: Fri, 05 Feb 2016 14:05:25 +0100
ludo <at> gnu.org (Ludovic Courtès) wrote:

> Alex Kost <alezost <at> gmail.com> skribis:
>
>> And as reported by several people on #guix (I count at least 4 including
>> me and Mark) a wrong swap device leads to a kernel panic if shepherd is
>> used as the init system.
>>
>> Until I realized that it was a wrong swap, I made bisecting on shepherd
>> to find out which commit introduced this bug.  It gave me commit
>> 852341e¹: when I reconfigured my system (with a wrong swap) using
>> shepherd on this commit, I had a kernel panic, while with shepherd on
>> the previous commit the system booted successfully.
>>
>> ¹ http://git.savannah.gnu.org/cgit/shepherd.git/commit/?id=852341ed0c08941cbdd022135f8bef7be2d7ec54
>
> Ooooh, it took me a while but I see how this happens.  This is because
> we start services directly from the config file, and anything that goes
> wrong there is uncaught, which leads to this:
>
> Service udev has been started.
> srfi-34(#<condition &action-runtime-error [service: #<<service> 184b150> action: start key: system-error arguments: ("swapon" "~S: ~A" ("/dev/disk/foobar" "No such file or directory") (2))] 1ea24c0>)
> [    6.856167] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
> [    6.856167] 
> [    6.856869] CPU: 0 PID: 1 Comm: shepherd Not tainted 4.4.1-gnu #1
> [    6.857319] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014

Commit 081bd3b fixes it.  Commit 234ea8a defensively wraps the whole
configuration file in ‘call-with-error-handling’, which spawns a REPL
upon error.
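
The general idea behind the defensive fix, catching a failure while
services are started instead of letting it terminate PID 1, can be
sketched in shell.  This is only an analogy under stated assumptions:
the real code is Scheme inside the Shepherd and spawns a REPL on error,
whereas the hypothetical start_all below merely reports each failing
command and carries on.

```shell
#!/bin/sh
# start_all COMMAND...
# Run each COMMAND string in turn.  A failing command is reported on
# stderr and counted, but does not stop the remaining ones.  Prints the
# number of failures.  Illustrative only; not how the Shepherd works.
start_all() {
    failures=0
    for cmd in "$@"; do
        if ! sh -c "$cmd"; then
            printf 'could not start: %s\n' "$cmd" >&2
            failures=$((failures + 1))
        fi
    done
    printf '%s\n' "$failures"
}
```

Called as "start_all true false true", it prints 1 and still runs the
third command, which is the property that matters for an init process.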

Thanks for the detailed investigation!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 05 Mar 2016 12:24:04 GMT) Full text and rfc822 format available.

