GNU bug report logs - #27212
203a9455c4695152fc5d0085bffeead9ce3216c2 broke system boot


Package: guix

Reported by: ng0 <ng0 <at> pragmatique.xyz>

Date: Sat, 3 Jun 2017 16:54:01 UTC

Severity: normal

Done: ng0 <ng0 <at> pragmatique.xyz>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 27212 in the body.
You can then email your comments to 27212 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 16:54:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to ng0 <ng0 <at> pragmatique.xyz>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 03 Jun 2017 16:54:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> pragmatique.xyz>
To: bug-guix <at> gnu.org
Subject: 203a9455c4695152fc5d0085bffeead9ce3216c2 broke system boot
Date: Sat, 3 Jun 2017 16:53:21 +0000
*All* of my otherwise perfectly working systems break after
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=203a9455c4695152fc5d0085bffeead9ce3216c2

and the last thing they display before the guile repl is the
message in that commit.

Yes, I use labels. No, I don't use complicated setups.
Please test such changes before applying them in the wild.
I just hope this happened here and the break is a mistake.
-- 
ng0
OpenPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 17:02:02 GMT) Full text and rfc822 format available.

Message #8 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Marius Bakke <mbakke <at> fastmail.com>
To: ng0 <ng0 <at> pragmatique.xyz>, 27212 <at> debbugs.gnu.org
Cc: Danny Milosavljevic <dannym <at> scratchpost.org>
Subject: Re: bug#27212: 203a9455c4695152fc5d0085bffeead9ce3216c2 broke system
 boot
Date: Sat, 03 Jun 2017 19:01:10 +0200
ng0 <ng0 <at> pragmatique.xyz> writes:

> Otherwise perfectly working systems I have *all* break after
> https://git.savannah.gnu.org/cgit/guix.git/commit/?id=203a9455c4695152fc5d0085bffeead9ce3216c2
>
> and the last thing  they display before guile repl is the
> message in that commit.
>
> Yes, I use labels. No, I don't use complicated setups.
> Please test such changes before applying them in the wild..
> I just hope this happened here and the break is a mistake.

I *just* noticed the same thing. I also tried reverting this commit, but
that just entered a (re)boot loop. Not sure what's up, but two of my
GuixSD systems are affected.

On my way out now, but will investigate more tomorrow. CC Danny in case
of insight (which report by Chris did this address?).

Older generations are fine.

Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 17:32:02 GMT) Full text and rfc822 format available.

Message #11 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> pragmatique.xyz>
To: Marius Bakke <mbakke <at> fastmail.com>
Cc: Danny Milosavljevic <dannym <at> scratchpost.org>, 27212 <at> debbugs.gnu.org,
 ng0 <ng0 <at> pragmatique.xyz>
Subject: Re: bug#27212: 203a9455c4695152fc5d0085bffeead9ce3216c2 broke system
 boot
Date: Sat, 3 Jun 2017 17:30:56 +0000
Marius Bakke transcribed 1.4K bytes:
> ng0 <ng0 <at> pragmatique.xyz> writes:
> 
> > Otherwise perfectly working systems I have *all* break after
> > https://git.savannah.gnu.org/cgit/guix.git/commit/?id=203a9455c4695152fc5d0085bffeead9ce3216c2
> >
> > and the last thing  they display before guile repl is the
> > message in that commit.
> >
> > Yes, I use labels. No, I don't use complicated setups.
> > Please test such changes before applying them in the wild..
> > I just hope this happened here and the break is a mistake.
> 
> I *just* noticed the same thing. I also tried reverting this commit, but
> that just entered a (re)boot loop. Not sure what's up, but two of my
> GuixSD systems are affected.
> 
> On my way out now, but will investigate more tomorrow. CC Danny in case
> of insight (which report by Chris did this address?).
> 
> Older generations are fine.

Thanks.

Results like this should be avoided. As long as this happens,
there is a reason for the 'beta' label.

Guix might be usable in all kinds of environments, but we
need rules and guidelines for when a change can be applied
directly after QA and when a certain set of people has to
approve it after testing that it works.

I consider file systems one of these cases.
If a commit touches the boot path, approval
is the way to go.

-- 
ng0
OpenPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588

Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 18:02:02 GMT) Full text and rfc822 format available.

Message #14 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: 27212 <at> debbugs.gnu.org
Cc: Danny Milosavljevic <dannym <at> scratchpost.org>
Subject: [PATCH] file-systems: Improve error handling in the iso9660 case -
 fixes boot problem.
Date: Sat,  3 Jun 2017 20:00:13 +0200
* gnu/build/file-systems.scm (read-iso9660-superblock): Modify.
---
 gnu/build/file-systems.scm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gnu/build/file-systems.scm b/gnu/build/file-systems.scm
index 3e0873377..740c37124 100644
--- a/gnu/build/file-systems.scm
+++ b/gnu/build/file-systems.scm
@@ -260,7 +260,11 @@ volume descriptor from ~s"
   "Return the raw contents of DEVICE's iso9660 primary volume descriptor
 as a bytevector, or #f if DEVICE does not contain an iso9660 file system."
   ;; Start reading at sector 16.
-  (read-iso9660-primary-volume-descriptor device (* 2048 16)))
+  ;; Since we are not sure that the device contains an ISO9660 filesystem,
+  ;; we have to find that out first.
+  (if (read-superblock device (* 2048 16) 2048 iso9660-superblock?)
+       (read-iso9660-primary-volume-descriptor device (* 2048 16))
+       #f)) ; Device does not contain an iso9660 filesystem.
 
 (define (iso9660-superblock-uuid sblock)
   "Return the modification time of an iso9660 primary volume descriptor




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 18:26:02 GMT) Full text and rfc822 format available.

Message #17 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: 27212 <at> debbugs.gnu.org, ng0 <ng0 <at> pragmatique.xyz>, Marius Bakke
 <mbakke <at> fastmail.com>
Subject: Re: [PATCH] file-systems: Improve error handling in the iso9660
 case - fixes boot problem.
Date: Sat, 3 Jun 2017 20:25:00 +0200
Hi,

sorry for the slip-up.

The error handling in 203a9455c4695152fc5d0085bffeead9ce3216c2 was improved for the case when there's no iso9660 primary volume descriptor anywhere and no terminator either.  In that case the CD is broken.

But if there's no iso9660 volume descriptor AT ALL (primary or not) it's not a fatal error for guix gnu/build/file-systems.scm - it just means we picked the wrong filesystem type and should try the next one.  This case was not handled correctly and this patch addresses this.
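
The two cases above can be sketched roughly as follows. This is not the Guix code (which is Guile Scheme) but a small Python approximation; the names read_iso9660_primary, probe, make_descriptor and BrokenVolumeError are hypothetical and exist only for this illustration. It relies on the ISO 9660 layout: volume descriptors start at sector 16, each is 2048 bytes, byte 0 holds the type (1 = primary, 255 = terminator) and bytes 1..5 hold the identifier "CD001".

```python
import io

SECTOR_SIZE = 2048        # ISO 9660 logical sector size
FIRST_DESCRIPTOR = 16     # volume descriptors start at sector 16
MAGIC = b"CD001"          # standard identifier, bytes 1..5 of each descriptor
PRIMARY, TERMINATOR = 1, 255


class BrokenVolumeError(Exception):
    """ISO 9660 descriptors exist, but no primary descriptor was found."""


def read_iso9660_primary(device):
    """Return the primary volume descriptor as bytes.

    Return None when the device holds no ISO 9660 descriptor at all
    (wrong file system type, not an error).  Raise BrokenVolumeError
    when descriptors exist but none of them is the primary one.
    """
    sector = FIRST_DESCRIPTOR
    while True:
        device.seek(sector * SECTOR_SIZE)
        block = device.read(SECTOR_SIZE)
        if len(block) < SECTOR_SIZE or block[1:6] != MAGIC:
            if sector == FIRST_DESCRIPTOR:
                return None  # nothing at sector 16: try the next type
            raise BrokenVolumeError("descriptor list has no terminator")
        if block[0] == PRIMARY:
            return block
        if block[0] == TERMINATOR:
            raise BrokenVolumeError("no primary volume descriptor")
        sector += 1


def probe(device, probes):
    """Try each (name, reader) pair in order; return the first match or None."""
    for name, reader in probes:
        superblock = reader(device)
        if superblock is not None:
            return name, superblock
    return None


def make_descriptor(kind):
    """Build a minimal fake volume descriptor of the given type (demo only)."""
    return bytes([kind]) + MAGIC + b"\x01" + bytes(SECTOR_SIZE - 7)


if __name__ == "__main__":
    not_iso = io.BytesIO(bytes(64 * 1024))
    print(read_iso9660_primary(not_iso))  # a non-ISO device is not an error
```

The point of the design is that only a damaged ISO 9660 volume raises an error; a device that simply has some other file system yields None, so the caller can move on to the next probe.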

I'd like help with testing this patch.  If someone with a fast computer could apply this patch, run make check-system, and then, if everything works, run guix system reconfigure /etc/config.scm, that would be great (my computer has been building guix master for the last 3 days and probably needs 4 more, so I have no idea whether the patch works).

@Marius: I don't understand why reverting 203a9455c4695152fc5d0085bffeead9ce3216c2 didn't fix it.  It should have.  (I can't try it myself because I'm in forced-system-update purgatory - argh).




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 19:47:01 GMT) Full text and rfc822 format available.

Message #20 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Leo Famulari <leo <at> famulari.name>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 27212 <at> debbugs.gnu.org,
 ng0 <ng0 <at> pragmatique.xyz>
Subject: Re: bug#27212: [PATCH] file-systems: Improve error handling in the
 iso9660 case - fixes boot problem.
Date: Sat, 3 Jun 2017 15:45:55 -0400
On Sat, Jun 03, 2017 at 08:25:00PM +0200, Danny Milosavljevic wrote:
> Hi,
> 
> sorry for the slip-up.
> 
> The error handling in 203a9455c4695152fc5d0085bffeead9ce3216c2 was improved
> for the case when there's no iso9660 primary volume descriptor anywhere and no
> terminator either.  In that case the CD is broken.
> 
> But if there's no iso9660 volume descriptor AT ALL (primary or not) it's not a
> fatal error for guix gnu/build/file-systems.scm - it just means we picked the
> wrong filesystem type and should try the next one.  This case was not handled
> correctly and this patch addresses this.
> 
> I'd like help with testing this patch.  If someone with a fast computer could
> apply this patch, then run make check-system, and then, if everything worked,
> run guix system reconfigure /etc/config.scm , that would be great (my computer
> is building guix master for the last 3 days and probably needs 4 more - so no
> idea whether the patch works).

I don't have access to a fast GuixSD system today.

How about reverting the original change now, so that we can reduce the number of
people who hit the bug?  Then we can test the revised commit afterwards.

Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 19:58:01 GMT) Full text and rfc822 format available.

Message #23 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Jelle Licht <jlicht <at> fsfe.org>
To: Leo Famulari <leo <at> famulari.name>
Cc: Danny Milosavljevic <dannym <at> scratchpost.org>, 27212 <at> debbugs.gnu.org
Subject: Re: bug#27212: [PATCH] file-systems: Improve error handling in the
 iso9660 case - fixes boot problem.
Date: Sat, 3 Jun 2017 21:57:36 +0200
Leo Famulari <leo <at> famulari.name> writes:

> On Sat, Jun 03, 2017 at 08:25:00PM +0200, Danny Milosavljevic wrote:
>> Hi,
>>
>> sorry for the slip-up.
>>
>> The error handling in 203a9455c4695152fc5d0085bffeead9ce3216c2 was improved
>> for the case when there's no iso9660 primary volume descriptor anywhere and no
>> terminator either.  In that case the CD is broken.
>>
>> But if there's no iso9660 volume descriptor AT ALL (primary or not) it's not a
>> fatal error for guix gnu/build/file-systems.scm - it just means we picked the
>> wrong filesystem type and should try the next one.  This case was not handled
>> correctly and this patch addresses this.
>>
>> I'd like help with testing this patch.  If someone with a fast computer could
>> apply this patch, then run make check-system, and then, if everything worked,
>> run guix system reconfigure /etc/config.scm , that would be great (my computer
>> is building guix master for the last 3 days and probably needs 4 more - so no
>> idea whether the patch works).
>
> I don't have access to a fast GuixSD system today.
>
> How about reverting the original change now, so that we can reduce the number of
> people who hit the bug. Then we can test the revised commit afterwards.

Good news! My system boots again after applying your patch and
reconfiguring. make check-system fails a network-related test, but
that is an unrelated problem: it was already failing before
any of these problems started.

- Jelle




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 20:00:04 GMT) Full text and rfc822 format available.

Message #26 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: jah <jah <at> jahboite.co.uk>
To: 27212 <at> debbugs.gnu.org
Subject: Re: bug#27212: 203a9455c4695152fc5d0085bffeead9ce3216c2 broke system
 boot
Date: Sat, 3 Jun 2017 20:38:45 +0100
The same issue affects a GuixSD system installed on a GPT-partitioned Qemu image.  It appeared after running guix pull and guix system reconfigure on a fresh install of 0.13.0.x86_64-linux.

The following is a screenshot of the backtrace:

https://jahboite.co.uk/files/bab/backtrace_file-systems-scm.png

jah




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 20:12:01 GMT) Full text and rfc822 format available.

Message #29 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Jelle Licht <jlicht <at> fsfe.org>
Cc: 27212 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#27212: [PATCH] file-systems: Improve error handling in the
 iso9660 case - fixes boot problem.
Date: Sat, 3 Jun 2017 22:11:01 +0200
Pushed as fb03f44bb117226e7d67a85401ffbb54ad8858ed.




Information forwarded to bug-guix <at> gnu.org:
bug#27212; Package guix. (Sat, 03 Jun 2017 20:14:01 GMT) Full text and rfc822 format available.

Message #32 received at 27212 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Leo Famulari <leo <at> famulari.name>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 27212 <at> debbugs.gnu.org,
 ng0 <ng0 <at> pragmatique.xyz>
Subject: Re: bug#27212: [PATCH] file-systems: Improve error handling in the
 iso9660 case - fixes boot problem.
Date: Sat, 3 Jun 2017 22:12:57 +0200
Hi Leo,

On Sat, 3 Jun 2017 15:45:55 -0400
Leo Famulari <leo <at> famulari.name> wrote:

> How about reverting the original change now

According to Marius that doesn't work either so it wouldn't help.

So I've applied the new patch to master after Jelle's successful test.  Should be fixed now.






Reply sent to ng0 <ng0 <at> pragmatique.xyz>:
You have taken responsibility. (Sun, 04 Jun 2017 11:16:01 GMT) Full text and rfc822 format available.

Notification sent to ng0 <ng0 <at> pragmatique.xyz>:
bug acknowledged by developer. (Sun, 04 Jun 2017 11:16:02 GMT) Full text and rfc822 format available.

Message #37 received at 27212-done <at> debbugs.gnu.org (full text, mbox):

From: ng0 <ng0 <at> pragmatique.xyz>
To: 27212-done <at> debbugs.gnu.org
Subject: Re: bug#27212: 203a9455c4695152fc5d0085bffeead9ce3216c2 broke system
 boot
Date: Sun, 4 Jun 2017 11:14:37 +0000
ng0 transcribed 0.5K bytes:
> Otherwise perfectly working systems I have *all* break after
> https://git.savannah.gnu.org/cgit/guix.git/commit/?id=203a9455c4695152fc5d0085bffeead9ce3216c2
> 
> and the last thing  they display before guile repl is the
> message in that commit.
> 
> Yes, I use labels. No, I don't use complicated setups.
> Please test such changes before applying them in the wild..
> I just hope this happened here and the break is a mistake.
> -- 
> ng0
> OpenPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588

Okay, this has been fixed. Thanks!
I just booted into a functional generation.

I'm closing this bug. In the future, let's come up
with ideas for how to prevent this.
It is not good PR to collectively send everyone into
the guile repl.

I have already started discussing solutions to this offline
and in other channels.

Thanks!
-- 
ng0
OpenPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 02 Jul 2017 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 272 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.