GNU bug report logs - #30820
Chunked store references in compiled code break grafting (again)

Previous Next

Package: guix;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Wed, 14 Mar 2018 15:48:01 UTC

Severity: serious

Merged with 30395

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30820 in the body.
You can then email your comments to 30820 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 14 Mar 2018 15:48:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 14 Mar 2018 15:48:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-guix <at> gnu.org
Cc: Ricardo Wurmus <rekado <at> elephly.net>
Subject: Chunked store references in compiled code break grafting (again)
Date: Wed, 14 Mar 2018 16:47:04 +0100
Hello Guix,

The recently added glibc grafts triggered issues that, in the end, show
the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
chunks in compiled code”).

Specifically on commit 2b5c5f03c2f0a84f84a5517e2e6f5fa9f276ffa5:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build -e '((@ (guix packages) package-replacement) (@@ (gnu packages commencement) glibc-final))' --no-grafts
/gnu/store/m07pz38dvizwx2bl9aik6wcbbqbhz6j6-glibc-2.26.105-g0890d5379c-debug
/gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c
/gnu/store/m8m005z2jbvqrj3s5b052camzk2qxpz5-glibc-2.26.105-g0890d5379c-static
$ ./pre-inst-env guix build -e '((@ (guix packages) package-replacement) (@@ (gnu packages commencement) glibc-final))' 
/gnu/store/bdgcd723b8l1h8cg8wx4vjfypip29dsn-glibc-2.26.105-g0890d5379c-debug
/gnu/store/l2wzs674z5ac5ccrvp73gdqw02mmzr22-glibc-2.26.105-g0890d5379c
/gnu/store/2rp8zmymxi32wrw4s44f2dc67ci9kxgs-glibc-2.26.105-g0890d5379c-static
$ grep -r 4sqai /gnu/store/l2wzs674z5ac5ccrvp73gdqw02mmzr22-glibc-2.26.105-g0890d5379c
Duuma dosiero /gnu/store/l2wzs674z5ac5ccrvp73gdqw02mmzr22-glibc-2.26.105-g0890d5379c/sbin/sln kongruas
Duuma dosiero /gnu/store/l2wzs674z5ac5ccrvp73gdqw02mmzr22-glibc-2.26.105-g0890d5379c/sbin/nscd kongruas
--8<---------------cut here---------------end--------------->8---

If we look with hexl-mode, we see that libc-2.26.so contains some of
these too:

--8<---------------cut here---------------start------------->8---
000236a0: b92f 676e 752f 7374 6f48 8d50 01c6 003a  ./gnu/stoH.P...:
000236b0: 4889 4801 48b8 7265 2f34 7371 6169 31f6  H.H.H.re/4sqai1.
000236c0: 4889 4208 48b8 6237 6332 6466 6a76 31ff  H.B.H.b7c2dfjv1.
000236d0: 4889 4210 48b8 3632 6976 7267 3962 c642  H.B.H.62ivrg9b.B
000236e0: 5000 4889 4218 48b8 3877 6137 6268 3232  P.H.B.H.8wa7bh22
000236f0: 4889 4220 48b8 366c 612d 676c 6962 4889  H.B H.6la-glibH.
00023700: 4228 48b8 632d 322e 3236 2e31 4889 4230  B(H.c-2.26.1H.B0
00023710: 48b8 3035 2d67 3038 3930 4889 4238 48b8  H.05-g0890H.B8H.
00023720: 6435 3337 3963 2f6c 4889 4240 48b8 6962  d5379c/lH.B <at> H.ib
00023730: 2f67 636f 6e76 4889 4248 e881 f70b 0048  /gconvH.BH.....H
--8<---------------cut here---------------end--------------->8---

That in turns means that gconv-modules won’t be found, and that guile
with crash during startup with “Uncaught exception” (because early on it
fails to convert file names to UTF-8 through iconv).

To be continued…

Ludo’.




Severity set to 'serious' from 'normal' Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Wed, 14 Mar 2018 16:32:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 14 Mar 2018 17:25:02 GMT) Full text and rfc822 format available.

Message #10 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: 30820 <at> debbugs.gnu.org
Cc: Ricardo Wurmus <rekado <at> elephly.net>, Mark H Weaver <mhw <at> netris.org>
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 14 Mar 2018 18:24:48 +0100
There are two issues.  First the gcc-strmov-store-file-names.patch
stopped working.  When we introduced it, we were at 5.3.0 and 6.2.0 (see
<https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/gcc.scm?id=8033772363b287ca529461e575ceb4a69d7af942>).

Today, with 5.5.0 and 6.3.0 it seems to have no effect (I use ‘ltrace’
here, which shows the getenv("NIX_STORE") call, which confirms that the
‘store_reference_p’ function in that patch gets called):

--8<---------------cut here---------------start------------->8---
$ cat strmov.c
#define _GNU_SOURCE
#include <string.h>
static const char str[] = "MEMpCPY /gnu/store/THIS IS A LONG STRING, A VERY, VERY, VERY LOOOOONG STRING";

extern char *p, *q;

#ifndef MEMCPY
# define MEMCPY memcpy
#endif

void foo (char *x, char *y)
{
  /* MEMCPY (x, str, sizeof str); */
  MEMCPY (y, "this is a literal /gnu/store string", 35);
}
$ guix environment --no-grafts --pure --ad-hoc -C ltrace gcc-toolchain <at> 5 -- ltrace -f -e getenv gcc -O2 -c strmov.c 
[pid 2] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 2] gcc->getenv("TERM")                                                                                    = nil
[pid 2] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/c312mxd0ykdh3zi3zj13m"...
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/c312mxd0ykdh3zi3zj13m"...
[pid 2] gcc->getenv("COMPILER_PATH")                                                                           = nil
[pid 2] gcc->getenv("LIBRARY_PATH")                                                                            = "/gnu/store/c312mxd0ykdh3zi3zj13m"...
[pid 2] gcc->getenv("LPATH")                                                                                   = nil
[pid 2] gcc->getenv("GCC_COMPARE_DEBUG")                                                                       = nil
[pid 2] gcc->getenv("GCC_ROOT")                                                                                = nil
[pid 2] gcc->getenv("BINUTILS_ROOT")                                                                           = nil
[pid 2] gcc->getenv("TMPDIR")                                                                                  = "/tmp"
[pid 2] gcc->getenv("TMP")                                                                                     = "/tmp"
[pid 2] gcc->getenv("TEMP")                                                                                    = "/tmp"
[pid 3] --- Called exec() ---
[pid 3] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 3] gcc->getenv("TERM")                                                                                    = nil
[pid 3] gcc->getenv("DEPENDENCIES_OUTPUT")                                                                     = nil
[pid 3] gcc->getenv("SUNPRO_DEPENDENCIES")                                                                     = nil
[pid 3] gcc->getenv("CPATH")                                                                                   = nil
[pid 3] gcc->getenv("C_INCLUDE_PATH")                                                                          = "/gnu/store/c312mxd0ykdh3zi3zj13m"...
[pid 3] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 3] gcc->getenv("PWD")                                                                                     = nil
[pid 3] gcc->getenv("NIX_STORE")                                                                               = nil
[pid 3] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 4] --- Called exec() ---
[pid 4] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 2] +++ exited (status 0) +++
$ objdump -S strmov.o | grep movabs
   0:	48 b8 74 68 69 73 20 	movabs $0x2073692073696874,%rax
  11:	48 b8 61 20 6c 69 74 	movabs $0x61726574696c2061,%rax
  1f:	48 b8 6c 20 2f 67 6e 	movabs $0x732f756e672f206c,%rax
  2d:	48 b8 74 6f 72 65 20 	movabs $0x7274732065726f74,%rax
$ guix environment --no-grafts --pure --ad-hoc -C ltrace gcc-toolchain <at> 6 -- ltrace -f -e getenv gcc -O2 -c strmov.c 
[pid 2] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 2] gcc->getenv("TERM")                                                                                    = nil
[pid 2] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 2] gcc->getenv("COMPILER_PATH")                                                                           = nil
[pid 2] gcc->getenv("LIBRARY_PATH")                                                                            = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 2] gcc->getenv("LPATH")                                                                                   = nil
[pid 2] gcc->getenv("GCC_COMPARE_DEBUG")                                                                       = nil
[pid 2] gcc->getenv("GCC_ROOT")                                                                                = nil
[pid 2] gcc->getenv("BINUTILS_ROOT")                                                                           = nil
[pid 2] gcc->getenv("TMPDIR")                                                                                  = "/tmp"
[pid 2] gcc->getenv("TMP")                                                                                     = "/tmp"
[pid 2] gcc->getenv("TEMP")                                                                                    = "/tmp"
[pid 3] --- Called exec() ---
[pid 3] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 3] gcc->getenv("TERM")                                                                                    = nil
[pid 3] gcc->getenv("DEPENDENCIES_OUTPUT")                                                                     = nil
[pid 3] gcc->getenv("SUNPRO_DEPENDENCIES")                                                                     = nil
[pid 3] gcc->getenv("CPATH")                                                                                   = nil
[pid 3] gcc->getenv("C_INCLUDE_PATH")                                                                          = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 3] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 3] gcc->getenv("PWD")                                                                                     = nil
[pid 3] gcc->getenv("NIX_STORE")                                                                               = nil
[pid 3] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 4] --- Called exec() ---
[pid 4] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 2] +++ exited (status 0) +++
$ objdump -S strmov.o | grep movabs
   0:	48 b8 74 68 69 73 20 	movabs $0x2073692073696874,%rax
  11:	48 b8 61 20 6c 69 74 	movabs $0x61726574696c2061,%rax
  1f:	48 b8 6c 20 2f 67 6e 	movabs $0x732f756e672f206c,%rax
  2d:	48 b8 74 6f 72 65 20 	movabs $0x7274732065726f74,%rax
--8<---------------cut here---------------end--------------->8---

On GCC 7.3.0, it works as intended:

--8<---------------cut here---------------start------------->8---
$ guix environment --no-grafts --pure --ad-hoc -C ltrace gcc-toolchain <at> 7 -- ltrace -f -e getenv gcc -O2 -c strmov.c 
[pid 2] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 2] gcc->getenv("TERM")                                                                                    = nil
[pid 2] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/zahi3qjfnfq5z0bsxkggq"...
[pid 2] gcc->getenv("PATH")                                                                                    = "/gnu/store/zahi3qjfnfq5z0bsxkggq"...
[pid 2] gcc->getenv("COMPILER_PATH")                                                                           = nil
[pid 2] gcc->getenv("LIBRARY_PATH")                                                                            = "/gnu/store/zahi3qjfnfq5z0bsxkggq"...
[pid 2] gcc->getenv("LPATH")                                                                                   = nil
[pid 2] gcc->getenv("GCC_COMPARE_DEBUG")                                                                       = nil
[pid 2] gcc->getenv("GCC_ROOT")                                                                                = nil
[pid 2] gcc->getenv("BINUTILS_ROOT")                                                                           = nil
[pid 2] gcc->getenv("TMPDIR")                                                                                  = "/tmp"
[pid 2] gcc->getenv("TMP")                                                                                     = "/tmp"
[pid 2] gcc->getenv("TEMP")                                                                                    = "/tmp"
[pid 3] --- Called exec() ---
[pid 3] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 3] gcc->getenv("TERM")                                                                                    = nil
[pid 3] gcc->getenv("DEPENDENCIES_OUTPUT")                                                                     = nil
[pid 3] gcc->getenv("SUNPRO_DEPENDENCIES")                                                                     = nil
[pid 3] gcc->getenv("CPATH")                                                                                   = nil
[pid 3] gcc->getenv("C_INCLUDE_PATH")                                                                          = "/gnu/store/zahi3qjfnfq5z0bsxkggq"...
[pid 3] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 3] gcc->getenv("PWD")                                                                                     = nil
[pid 3] gcc->getenv("NIX_STORE")                                                                               = nil
[pid 3] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 4] --- Called exec() ---
[pid 4] +++ exited (status 0) +++
[pid 2] --- SIGCHLD (Child exited) ---
[pid 2] +++ exited (status 0) +++
$ objdump -S strmov.o | grep movabs
$ guix environment --no-grafts --pure --ad-hoc -C ltrace gcc-toolchain <at> 6 -- /bin/sh -c 'export NIX_STORE=/foo ; ltrace -f -e getenv   gcc -O2 -c strmov.c '
[pid 3] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 3] gcc->getenv("TERM")                                                                                    = nil
[pid 3] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 3] gcc->getenv("PATH")                                                                                    = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 3] gcc->getenv("PATH")                                                                                    = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 3] gcc->getenv("COMPILER_PATH")                                                                           = nil
[pid 3] gcc->getenv("LIBRARY_PATH")                                                                            = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 3] gcc->getenv("LPATH")                                                                                   = nil
[pid 3] gcc->getenv("GCC_COMPARE_DEBUG")                                                                       = nil
[pid 3] gcc->getenv("GCC_ROOT")                                                                                = nil
[pid 3] gcc->getenv("BINUTILS_ROOT")                                                                           = nil
[pid 3] gcc->getenv("TMPDIR")                                                                                  = "/tmp"
[pid 3] gcc->getenv("TMP")                                                                                     = "/tmp"
[pid 3] gcc->getenv("TEMP")                                                                                    = "/tmp"
[pid 4] --- Called exec() ---
[pid 4] gcc->getenv("COLUMNS")                                                                                 = nil
[pid 4] gcc->getenv("TERM")                                                                                    = nil
[pid 4] gcc->getenv("DEPENDENCIES_OUTPUT")                                                                     = nil
[pid 4] gcc->getenv("SUNPRO_DEPENDENCIES")                                                                     = nil
[pid 4] gcc->getenv("CPATH")                                                                                   = nil
[pid 4] gcc->getenv("C_INCLUDE_PATH")                                                                          = "/gnu/store/5141337yvhhjj5fhdx660"...
[pid 4] gcc->getenv("GCC_EXEC_PREFIX")                                                                         = nil
[pid 4] gcc->getenv("PWD")                                                                                     = "/home/ludo/src/guix"
[pid 4] gcc->getenv("NIX_STORE")                                                                               = "/foo"
[pid 4] gcc->getenv("NIX_STORE")                                                                               = "/foo"
[pid 4] +++ exited (status 0) +++
[pid 3] --- SIGCHLD (Child exited) ---
[pid 5] --- Called exec() ---
[pid 5] +++ exited (status 0) +++
[pid 3] --- SIGCHLD (Child exited) ---
[pid 3] +++ exited (status 0) +++
$ objdump -S strmov.o | grep movabs
   0:	48 b8 74 68 69 73 20 	movabs $0x2073692073696874,%rax
  11:	48 b8 61 20 6c 69 74 	movabs $0x61726574696c2061,%rax
  1f:	48 b8 6c 20 2f 67 6e 	movabs $0x732f756e672f206c,%rax
  2d:	48 b8 74 6f 72 65 20 	movabs $0x7274732065726f74,%rax
--8<---------------cut here---------------end--------------->8---

The second issue is that the patch only ever worked with literal
strings.  It does not “see” strings in constant arrays like the ‘str’
array in the example above.

The gconv-module file name mentioned in the first message in this bug
report is an example of a string assigned to a static array, in
iconv/gconv_conf.c:

  /* This is the default path where we look for module lists.  */
  static const char default_gconv_path[] = GCONV_PATH;

Ludo’.




Merged 30395 30820. Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Wed, 14 Mar 2018 17:38:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Fri, 16 Mar 2018 08:55:02 GMT) Full text and rfc822 format available.

Message #15 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: 30820 <at> debbugs.gnu.org
Cc: Ricardo Wurmus <rekado <at> elephly.net>, 30395 <at> debbugs.gnu.org,
 Mark H Weaver <mhw <at> netris.org>
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Fri, 16 Mar 2018 09:54:31 +0100
ludo <at> gnu.org (Ludovic Courtès) skribis:

> void foo (char *x, char *y)
> {
>   /* MEMCPY (x, str, sizeof str); */
>   MEMCPY (y, "this is a literal /gnu/store string", 35);
> }

This was not a correct example because “/gnu/store” must be followed by
at least 34 chars for the patch to work.  And indeed, it does work in
this case:

--8<---------------cut here---------------start------------->8---
$ cat strmov.c
#define _GNU_SOURCE
#include <string.h>
static const char str[] = "MEMpCPY /gnu/store/THIS IS A LONG STRING, A VERY, VERY, VERY LOOOOONG STRING";

extern char *p, *q;

#ifndef MEMCPY
# define MEMCPY memcpy
#endif

void foo (char *x, char *y)
{
  /* MEMCPY (x, str, sizeof str); */
  MEMCPY (y, "this is a literal /gnu/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee string", 35);
}
$ guix environment --ad-hoc gcc-toolchain <at> 5 -C -- gcc -O2 -c strmov.c
$ objdump -S strmov.o |grep movabs
--8<---------------cut here---------------end--------------->8---

So the real issue is this:

> The second issue is that the patch only ever worked with literal
> strings.  It does not “see” strings in constant arrays like the ‘str’
> array in the example above.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 19:07:01 GMT) Full text and rfc822 format available.

Message #18 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 15:05:26 -0400
ludo <at> gnu.org (Ludovic Courtès) writes:

> The recently added glibc grafts triggered issues that, in the end, show
> the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
> chunks in compiled code”).

I think that we should generalize our reference scanning and grafting
code to support store references broken into pieces, as long as each
piece containing part of the hash is at least 8 bytes long.

Here's my preliminary proposal:

(1) The reference scanner should recognize any 8-byte substring of a
    hash as a valid reference to that hash.

(2) To enable reliable grafting of chunked references, we should impose
    the following new restrictions: (a) the store prefix must be at
    least 6 bytes, (b) grafting can change only the hash, not the
    readable part of the store name, and (c) the readable part of the
    store name must be at least 6 bytes.

(3) The grafter should recognize and replace any 8-byte subsequence of
    the absolute store file name.

The rationale for the restrictions is to ensure that any byte that needs
to be modified by the grafter should be part of an 8-byte substring of
the absolute store file name.  This requires that there be at least 7
bytes of known text before the first changed byte and after the last
changed byte.  This is needed to provide a reasonable upper bound on the
probability of grafting a matching sequence of bytes that is not a store
reference.

I'd be willing to work on implementing this soon.

What do you think?

      Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 19:18:01 GMT) Full text and rfc822 format available.

Message #21 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 15:16:30 -0400
Mark H Weaver <mhw <at> netris.org> writes:

> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> The recently added glibc grafts triggered issues that, in the end, show
>> the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
>> chunks in compiled code”).
>
> I think that we should generalize our reference scanning and grafting
> code to support store references broken into pieces, as long as each
> piece containing part of the hash is at least 8 bytes long.
>
> Here's my preliminary proposal:
>
> (1) The reference scanner should recognize any 8-byte substring of a
>     hash as a valid reference to that hash.

To facilitate the transition: to support older versions of the Guix
daemon (or Nix daemon), we could add a final phase to gnu-build-system
which would "unhide" these references as follows: It would scan each
output directory for 8-byte substrings of hashes.  If it finds any
references that the old scanner is unable to see, it would install a
file to the output directory containing the full store names of the
hidden references.

This only works for output directories, though.  I don't know what to do
about output files containing hidden references.  I guess the final
phase should raise an error in this case, and hopefully it rarely
happens.

      Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 21:23:02 GMT) Full text and rfc822 format available.

Message #24 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 22:22:05 +0100
[Message part 1 (text/plain, inline)]
Hi Ludo,

> The second issue is that the patch only ever worked with literal
> strings.  It does not “see” strings in constant arrays like the ‘str’
> array in the example above.
> 
> The gconv-module file name mentioned in the first message in this bug
> report is an example of a string assigned to a static array, in
> iconv/gconv_conf.c:
> 
>   /* This is the default path where we look for module lists.  */
>   static const char default_gconv_path[] = GCONV_PATH;

I don't understand why this is a problem.  Grafting would just
mutate default_gconv_path, right?  Who cares how the runtime memcpy
works (if there's no literal as source)?

Or do you mean that it memcpys at compile time?
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 21:35:02 GMT) Full text and rfc822 format available.

Message #27 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 22:34:02 +0100
[Message part 1 (text/plain, inline)]
Hi Mark,

On Mon, 19 Mar 2018 15:05:26 -0400
Mark H Weaver <mhw <at> netris.org> wrote:

> I think that we should generalize our reference scanning and grafting
> code to support store references broken into pieces, as long as each
> piece containing part of the hash is at least 8 bytes long.

I don't think it's possible to get that to work reliably over all possible
optimizations.  I mean why 8 Byte?  That's probably because of the
8 Byte registers on x86_64.

What would happen on ARM? (32 bit registers)?

And on MIPS (immediates only smaller than 32 bits)?

What if gcc finds some repetition in the hash and decides not to load
the register again the second time?

I think the best way - since we have control over the entire toolchain anyway -
is to make gcc not do the data inlining in the first place.

As far as I understand the gcc patches work after all.  Ludo just made
a mistake testing it.

Although when we do this patching of gcc we should add a test to gcc
that makes sure that the data inlining is indeed not there anymore
for store paths.

> Here's my preliminary proposal:
> 
> (1) The reference scanner should recognize any 8-byte substring of a
>     hash as a valid reference to that hash.
> 
> (2) To enable reliable grafting of chunked references, we should impose
>     the following new restrictions: (a) the store prefix must be at
>     least 6 bytes, (b) grafting can change only the hash, not the
>     readable part of the store name, and (c) the readable part of the
>     store name must be at least 6 bytes.
> 
> (3) The grafter should recognize and replace any 8-byte subsequence of
>     the absolute store file name.
> 
> The rationale for the restrictions is to ensure that any byte that needs
> to be modified by the grafter should be part of an 8-byte substring of
> the absolute store file name.  This requires that there be at least 7
> bytes of known text before the first changed byte and after the last
> changed byte.  This is needed to provide a reasonable upper bound on the
> probability of grafting a matching sequence of bytes that is not a store
> reference.

I think that is a good way to get the risk down - but it will not actually
fix the problem for good.

Also, complexity like the above makes the reference scanner more brittle and
it's easier for bugs to hide in it (and it's slower).

I think the gcc patch is actually a better way.
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 22:28:01 GMT) Full text and rfc822 format available.

Message #30 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: Mark H Weaver <mhw <at> netris.org>, 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 23:27:39 +0100
Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

> As far as I understand the gcc patches work after all.  Ludo just made
> a mistake testing it.

They work except when the string is not directly a literal, as in:

  static const char foo[] = "/gnu/store/…";

  …

    strcpy (x, foo);

I think that’s workable though.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 22:30:02 GMT) Full text and rfc822 format available.

Message #33 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 23:29:10 +0100
Heya,

Danny Milosavljevic <dannym <at> scratchpost.org> skribis:

>> The second issue is that the patch only ever worked with literal
>> strings.  It does not “see” strings in constant arrays like the ‘str’
>> array in the example above.
>> 
>> The gconv-module file name mentioned in the first message in this bug
>> report is an example of a string assigned to a static array, in
>> iconv/gconv_conf.c:
>> 
>>   /* This is the default path where we look for module lists.  */
>>   static const char default_gconv_path[] = GCONV_PATH;
>
> I don't understand why this is a problem.  Grafting would just
> mutate default_gconv_path, right?  Who cares how the runtime memcpy
> works (if there's no literal as source)?

At compile-time, GCC finds out that ‘default_gconv_path’ is used only
in one place, in an strcpy call.  Thus, it chooses to use the movabs
optimization, and as a consequence, to split ‘default_gconv_path’ in
8-byte chunks.  It can do so because it’s ‘static’.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Mon, 19 Mar 2018 22:35:01 GMT) Full text and rfc822 format available.

Message #36 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 23:34:38 +0100
Hi Mark,

Mark H Weaver <mhw <at> netris.org> skribis:

> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> The recently added glibc grafts triggered issues that, in the end, show
>> the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
>> chunks in compiled code”).
>
> I think that we should generalize our reference scanning and grafting
> code to support store references broken into pieces, as long as each
> piece containing part of the hash is at least 8 bytes long.
>
> Here's my preliminary proposal:
>
> (1) The reference scanner should recognize any 8-byte substring of a
>     hash as a valid reference to that hash.
>
> (2) To enable reliable grafting of chunked references, we should impose
>     the following new restrictions: (a) the store prefix must be at
>     least 6 bytes, (b) grafting can change only the hash, not the
>     readable part of the store name, and (c) the readable part of the
>     store name must be at least 6 bytes.
>
> (3) The grafter should recognize and replace any 8-byte subsequence of
>     the absolute store file name.

I’m quite reluctant because it would add complexity, it will probably
slow things down, and yet it may not handle all the cases, as Danny
suggests.

Mind you, the GCC patches are not perfect either, but they’re relatively
easy to deal with (well, so far at least).  In theory we would need
similar patches for LLVM and maybe a couple other native compilers,
though, which is obviously a downside, although we haven’t had any
problems so far.

WDYT?

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Tue, 20 Mar 2018 00:54:01 GMT) Full text and rfc822 format available.

Message #39 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 20:52:37 -0400
ludo <at> gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw <at> netris.org> skribis:
>
>> ludo <at> gnu.org (Ludovic Courtès) writes:
>>
>>> The recently added glibc grafts triggered issues that, in the end, show
>>> the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
>>> chunks in compiled code”).
>>
>> I think that we should generalize our reference scanning and grafting
>> code to support store references broken into pieces, as long as each
>> piece containing part of the hash is at least 8 bytes long.
>>
>> Here's my preliminary proposal:
>>
>> (1) The reference scanner should recognize any 8-byte substring of a
>>     hash as a valid reference to that hash.
>>
>> (2) To enable reliable grafting of chunked references, we should impose
>>     the following new restrictions: (a) the store prefix must be at
>>     least 6 bytes, (b) grafting can change only the hash, not the
>>     readable part of the store name, and (c) the readable part of the
>>     store name must be at least 6 bytes.
>>
>> (3) The grafter should recognize and replace any 8-byte subsequence of
>>     the absolute store file name.
>
> I’m quite reluctant because it would add complexity, it will probably
> slow things down, and yet it may not handle all the cases, as Danny
> suggests.
>
> Mind you, the GCC patches are not perfect either, but they’re relatively
> easy to deal with (well, so far at least).  In theory we would need
> similar patches for LLVM and maybe a couple other native compilers,
> though, which is obviously a downside, although we haven’t had any
> problems so far.

We would also need to find a solution to the problem described in the
thread "broken references in jar manifests" on guix-devel started by
Ricardo, which still has not found a satifactory solution.

  https://lists.gnu.org/archive/html/guix-devel/2018-03/msg00006.html

My opinion is that I consider Guix's current expectations for how
software must store its data on disk to be far too onerous, in cases
where that data might include a store reference.  I don't see sufficient
justification for imposing such an onerous requirement on the software
in Guix.

Ultimately, I would prefer to see the scanning and grafting operations
completely generalized, so that in general each package can specify how
to scan and graft that particular package, making use of libraries in
(guix build ...) to cover the usual cases.  In most cases, that code
would be within build-systems.

       Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Tue, 20 Mar 2018 01:06:02 GMT) Full text and rfc822 format available.

Message #42 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Mon, 19 Mar 2018 21:04:09 -0400
Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> On Mon, 19 Mar 2018 15:05:26 -0400
> Mark H Weaver <mhw <at> netris.org> wrote:
>
>> I think that we should generalize our reference scanning and grafting
>> code to support store references broken into pieces, as long as each
>> piece containing part of the hash is at least 8 bytes long.
>
> I don't think it's possible to get that to work reliably over all possible
> optimizations.  I mean why 8 Byte?  That's probably because of the
> 8 Byte registers on x86_64.
>
> What would happen on ARM? (32 bit registers)?
>
> And on MIPS (immediates only smaller than 32 bits)?

It wouldn't make sense to do this kind of optimization with 4-byte
chunks, because the overhead for the instructions would be too large to
justify it.  In any case, I've not seen any reports of any compiler
generating code like this with 4-byte chunks.

> What if gcc finds some repetition in the hash and decides not to load
> the register again the second time?

The probability of that happening with 8-byte chunks is on the order of
1 in 10^14, even if the GCC developers implemented such a thing.

> I think the best way - since we have control over the entire toolchain anyway -
> is to make gcc not do the data inlining in the first place.
>
> As far as I understand the gcc patches work after all.  Ludo just made
> a mistake testing it.
>
> Although when we do this patching of gcc we should add a test to gcc
> that makes sure that the data inlining is indeed not there anymore
> for store paths.
>
>> Here's my preliminary proposal:
>> 
>> (1) The reference scanner should recognize any 8-byte substring of a
>>     hash as a valid reference to that hash.
>> 
>> (2) To enable reliable grafting of chunked references, we should impose
>>     the following new restrictions: (a) the store prefix must be at
>>     least 6 bytes, (b) grafting can change only the hash, not the
>>     readable part of the store name, and (c) the readable part of the
>>     store name must be at least 6 bytes.
>> 
>> (3) The grafter should recognize and replace any 8-byte subsequence of
>>     the absolute store file name.
>> 
>> The rationale for the restrictions is to ensure that any byte that needs
>> to be modified by the grafter should be part of an 8-byte substring of
>> the absolute store file name.  This requires that there be at least 7
>> bytes of known text before the first changed byte and after the last
>> changed byte.  This is needed to provide a reasonable upper bound on the
>> probability of grafting a matching sequence of bytes that is not a store
>> reference.
>
> I think that is a good way to get the risk down - but it will not actually
> fix the problem for good.

Nothing that we do will ever fix this problem for good.  Don't pretend
that patching GCC would fix the problem for good.  There are problems in
other software as well (e.g. in JAR manifests), and we already patched
GCC once, and it broke some time later without anyone noticing.

> Also, complexity like the above makes the reference scanner more brittle and
> it's easier for bugs to hide in it (and it's slower).

I think it could be implemented about as fast in practice, although it
would require more memory.  The hash table would be bigger, containing
all 8-byte substrings of the hashes.

As for bugs in the reference scanner: it's still simple enough that it
could be formally proved correct without much difficulty.

       Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Tue, 20 Mar 2018 08:51:01 GMT) Full text and rfc822 format available.

Message #45 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: Danny Milosavljevic <dannym <at> scratchpost.org>, 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Tue, 20 Mar 2018 09:50:29 +0100
Mark H Weaver <mhw <at> netris.org> skribis:

> Nothing that we do will ever fix this problem for good.  Don't pretend
> that patching GCC would fix the problem for good.  There are problems in
> other software as well (e.g. in JAR manifests), and we already patched
> GCC once, and it broke some time later without anyone noticing.

My initial message spread some confusion: the GCC patch still works as
before, but there’s always been a corner case that was improperly
handled.

Now, we currently don’t have tests that would allow us to detect
breakage before it bites, which is a problem.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Tue, 20 Mar 2018 08:57:02 GMT) Full text and rfc822 format available.

Message #48 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Tue, 20 Mar 2018 09:56:48 +0100
Mark H Weaver <mhw <at> netris.org> skribis:

> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw <at> netris.org> skribis:
>>
>>> ludo <at> gnu.org (Ludovic Courtès) writes:
>>>
>>>> The recently added glibc grafts triggered issues that, in the end, show
>>>> the return of <http://bugs.gnu.org/24703> (“Store references in 8-byte
>>>> chunks in compiled code”).
>>>
>>> I think that we should generalize our reference scanning and grafting
>>> code to support store references broken into pieces, as long as each
>>> piece containing part of the hash is at least 8 bytes long.
>>>
>>> Here's my preliminary proposal:
>>>
>>> (1) The reference scanner should recognize any 8-byte substring of a
>>>     hash as a valid reference to that hash.
>>>
>>> (2) To enable reliable grafting of chunked references, we should impose
>>>     the following new restrictions: (a) the store prefix must be at
>>>     least 6 bytes, (b) grafting can change only the hash, not the
>>>     readable part of the store name, and (c) the readable part of the
>>>     store name must be at least 6 bytes.
>>>
>>> (3) The grafter should recognize and replace any 8-byte subsequence of
>>>     the absolute store file name.
>>
>> I’m quite reluctant because it would add complexity, it will probably
>> slow things down, and yet it may not handle all the cases, as Danny
>> suggests.
>>
>> Mind you, the GCC patches are not perfect either, but they’re relatively
>> easy to deal with (well, so far at least).  In theory we would need
>> similar patches for LLVM and maybe a couple other native compilers,
>> though, which is obviously a downside, although we haven’t had any
>> problems so far.
>
> We would also need to find a solution to the problem described in the
> thread "broken references in jar manifests" on guix-devel started by
> Ricardo, which still has not found a satifactory solution.
>
>   https://lists.gnu.org/archive/html/guix-devel/2018-03/msg00006.html
>
> My opinion is that I consider Guix's current expectations for how
> software must store its data on disk to be far too onerous, in cases
> where that data might include a store reference.  I don't see sufficient
> justification for imposing such an onerous requirement on the software
> in Guix.

In practice Guix and Nix have been living fine under these constraints,
and with almost no modifications to upstream software, so it’s not that
bad.  Nix doesn’t have grafts though, which is why this problem was less
visible there.

> Ultimately, I would prefer to see the scanning and grafting operations
> completely generalized, so that in general each package can specify how
> to scan and graft that particular package, making use of libraries in
> (guix build ...) to cover the usual cases.  In most cases, that code
> would be within build-systems.

That would be precise GC instead of conservative GC in a way, right?
So in essence we’d have, say, a scanner for ELF files (like ‘dh_shdep’
in Debian or whatever it’s called), a scanner for jars, and so on?
Still, how would we deal with strings embedded in the middle of
binaries, as in this case?  It seems to remain an open issue, no?

I’m interested in experiments in that direction.  I think that’s a
longer-term goal, though, and there are open questions: we have no idea
how well that would work in practice.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Tue, 20 Mar 2018 23:08:03 GMT) Full text and rfc822 format available.

Message #51 received at 30820-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: 30820-done <at> debbugs.gnu.org
Cc: Ricardo Wurmus <rekado <at> elephly.net>, Mark H Weaver <mhw <at> netris.org>,
 30395-done <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 21 Mar 2018 00:07:30 +0100
Hello,

ludo <at> gnu.org (Ludovic Courtès) skribis:

> So the real issue is this:
>
>> The second issue is that the patch only ever worked with literal
>> strings.  It does not “see” strings in constant arrays like the ‘str’
>> array in the example above.

Good news!  Commit e288572710250bcd2aa0f69ce88154d98ac69b29 adjusts
‘gcc-strmov-store-file-names.patch’ in ‘core-updates’ to correctly deal
with this case:

--8<---------------cut here---------------start------------->8---
$ cat strmov.c 
#define _GNU_SOURCE
#include <string.h>
static const char str[] =
  "This is a /gnu/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee string in a global variable.";

extern char *p, *q;

#ifndef MEMCPY
# define MEMCPY memcpy
#endif

void foo (char *x, char *y)
{
  MEMCPY (x, str, sizeof str);
  MEMCPY (y, "this is a literal /gnu/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee string", 35);
}
$ ./pre-inst-env guix build -e '(@@ (gnu packages commencement) gcc-final)'
/gnu/store/wzdyqkdslk1s6f0vi9qw1xha8cniijzs-gcc-5.5.0-lib
/gnu/store/46ww5s9zvsw04id438c4drpnwd9m6vl8-gcc-5.5.0
$ /gnu/store/46ww5s9zvsw04id438c4drpnwd9m6vl8-gcc-5.5.0/bin/gcc -O2 -c strmov.c
$ objdump -S strmov.o |grep movabs
$ NIX_STORE=/foo /gnu/store/46ww5s9zvsw04id438c4drpnwd9m6vl8-gcc-5.5.0/bin/gcc -O2 -c strmov.c
$ objdump -S strmov.o |grep movabs
   0:	48 b8 54 68 69 73 20 	movabs $0x2073692073696854,%rax
   a:	48 ba 74 6f 72 65 2f 	movabs $0x6565652f65726f74,%rdx
  1e:	48 b8 61 20 2f 67 6e 	movabs $0x732f756e672f2061,%rax
  30:	48 b8 65 65 65 65 65 	movabs $0x6565656565656565,%rax
  4a:	48 b8 65 65 65 65 65 	movabs $0x2065656565656565,%rax
  58:	48 b8 73 74 72 69 6e 	movabs $0x6920676e69727473,%rax
  66:	48 b8 6e 20 61 20 67 	movabs $0x626f6c672061206e,%rax
  74:	48 b8 61 6c 20 76 61 	movabs $0x6169726176206c61,%rax
  82:	48 b8 74 68 69 73 20 	movabs $0x2073692073696874,%rax
  93:	48 b8 61 20 6c 69 74 	movabs $0x61726574696c2061,%rax
  a5:	48 b8 6c 20 2f 67 6e 	movabs $0x732f756e672f206c,%rax
--8<---------------cut here---------------end--------------->8---

I built everything about to ‘gcc-final’ in ‘core-updates’.  I checked
manually that none of the /gnu/store references in libc-2.26.so were
chunked.

For the record, what the patch initially did was to skip code that would
otherwise emit a “block move” when expanding __builtin_memcpy & co.
This patch additionally skips similar code that would replace
__builtin_memcpy calls with memory moves early on, in
‘gimple_fold_builtin_memory_op’, before ‘expand_builtin’ is called.

In the example above, this transformation would lead to the code below
(as seen with ‘-fdump-tree-all’ in the ‘gimple’ phase output):

--8<---------------cut here---------------start------------->8---
foo (char * x, char * y)
{
  MEM[(char * {ref-all})x] = MEM[(char * {ref-all})&str];
  memcpy (y, "this is a literal /gnu/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee string", 35);
}
--8<---------------cut here---------------end--------------->8---

With the patch we get:

--8<---------------cut here---------------start------------->8---
foo (char * x, char * y)
{
  memcpy (x, &str, 85);
  memcpy (y, "this is a literal /gnu/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee string", 35);
}
--8<---------------cut here---------------end--------------->8---

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 21 Mar 2018 04:20:02 GMT) Full text and rfc822 format available.

Message #54 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 21 Mar 2018 00:17:53 -0400
Hi Ludovic,

ludo <at> gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw <at> netris.org> skribis:
>
>> We would also need to find a solution to the problem described in the
>> thread "broken references in jar manifests" on guix-devel started by
>> Ricardo, which still has not found a satifactory solution.
>>
>>   https://lists.gnu.org/archive/html/guix-devel/2018-03/msg00006.html

Okay, do you have a proposed fix for the issue of jar manifests?

There's a specification for that file format which mandates that "No
line may be longer than 72 bytes (not characters), in its UTF8-encoded
form.  If a value would make the initial line longer than this, it
should be continued on extra lines (each starting with a single SPACE)."

>> My opinion is that I consider Guix's current expectations for how
>> software must store its data on disk to be far too onerous, in cases
>> where that data might include a store reference.  I don't see sufficient
>> justification for imposing such an onerous requirement on the software
>> in Guix.
>
> In practice Guix and Nix have been living fine under these constraints,
> and with almost no modifications to upstream software, so it’s not that
> bad.  Nix doesn’t have grafts though, which is why this problem was less
> visible there.
>
>> Ultimately, I would prefer to see the scanning and grafting operations
>> completely generalized, so that in general each package can specify how
>> to scan and graft that particular package, making use of libraries in
>> (guix build ...) to cover the usual cases.  In most cases, that code
>> would be within build-systems.
>
> That would be precise GC instead of conservative GC in a way, right?
> So in essence we’d have, say, a scanner for ELF files (like ‘dh_shdep’
> in Debian or whatever it’s called), a scanner for jars, and so on?

No, I wasn't thinking along those lines.  While I'd very much prefer
precise GC, it seems wholly infeasible for us to write precise scanners
and grafters for every file format of every package in Guix.

My thought was that supporting scanning and grafting of 8-byte-or-longer
substrings of hashes would cover both GCC's inlined strings and jar
manifests, the two issues that we currently know about, and that it
would be nice if we could add further methods in the future.  For
example, some software might store its data in UTF-16, or compressed.

> Still, how would we deal with strings embedded in the middle of
> binaries, as in this case?  It seems to remain an open issue, no?

I believe that I addressed that case in my original proposal, no?

> I’m interested in experiments in that direction.  I think that’s a
> longer-term goal, though, and there are open questions: we have no idea
> how well that would work in practice.

Thanks for discussing it.  I'm willing to drop it and go with your
decision for now, but the "jar manifest" issue still needs a solution.

     Regards,
       Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 21 Mar 2018 05:45:01 GMT) Full text and rfc822 format available.

Message #57 received at 30820 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès), Danny Milosavljevic
 <dannym <at> scratchpost.org>
Cc: 30820 <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 21 Mar 2018 01:43:15 -0400
I just realized that my proposal is unworkable.  If we allow strings
containing store paths to be split into pieces, then some of those
pieces may contain as little as one character of the hash.  For example,
the grafter might find "/store/c", which is likely not enough to
determine which of the transitive inputs is being referenced, and
therefore the grafter cannot decide what to write in place of the "c".

Sorry for the noise.

       Mark




Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 21 Mar 2018 06:40:03 GMT) Full text and rfc822 format available.

Message #60 received at 30820-done <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Mark H Weaver <mhw <at> netris.org>, 30395-done <at> debbugs.gnu.org,
 30820-done <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 21 Mar 2018 07:39:20 +0100
Hi Ludo,

> Good news!  Commit e288572710250bcd2aa0f69ce88154d98ac69b29 adjusts
> ‘gcc-strmov-store-file-names.patch’ in ‘core-updates’ to correctly deal
> with this case:
[…]
> I built everything about to ‘gcc-final’ in ‘core-updates’.  I checked
> manually that none of the /gnu/store references in libc-2.26.so were
> chunked.

Wow, thank you so much for fixing this!

Is this an option that we can suggest to the GCC developers to
officially support?

-- 
Ricardo






Information forwarded to bug-guix <at> gnu.org:
bug#30820; Package guix. (Wed, 21 Mar 2018 21:00:03 GMT) Full text and rfc822 format available.

Message #63 received at 30820-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Mark H Weaver <mhw <at> netris.org>, 30395-done <at> debbugs.gnu.org,
 30820-done <at> debbugs.gnu.org
Subject: Re: bug#30820: Chunked store references in compiled code break
 grafting (again)
Date: Wed, 21 Mar 2018 21:59:39 +0100
Hello,

Ricardo Wurmus <rekado <at> elephly.net> skribis:

>> Good news!  Commit e288572710250bcd2aa0f69ce88154d98ac69b29 adjusts
>> ‘gcc-strmov-store-file-names.patch’ in ‘core-updates’ to correctly deal
>> with this case:
> […]
>> I built everything about to ‘gcc-final’ in ‘core-updates’.  I checked
>> manually that none of the /gnu/store references in libc-2.26.so were
>> chunked.
>
> Wow, thank you so much for fixing this!

It turned out to be easier than the first time.  ;-)

> Is this an option that we can suggest to the GCC developers to
> officially support?

I don’t know, it’s a Guix-specific hack, and just explaining the
rationale to GCC people may be tricky: they’ll think we’re all mad.  ;-)

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 19 Apr 2018 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 18 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.