GNU bug report logs - #25598
R packages are not bit-reproducible


Package: guix;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Wed, 1 Feb 2017 09:56:01 UTC

Owned by: Ricardo Wurmus <rekado <at> elephly.net>

Severity: normal

Done: Ricardo Wurmus <rekado <at> elephly.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 25598 in the body.
You can then email your comments to 25598 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Wed, 01 Feb 2017 09:56:02 GMT)

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 01 Feb 2017 09:56:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-guix <at> gnu.org
Subject: R packages are not bit-reproducible
Date: Wed, 01 Feb 2017 10:55:09 +0100

R packages build non-deterministically:

  https://www.gnu.org/software/guix/packages/reproducibility.html

--8<---------------cut here---------------start------------->8---
$ wget -q -O - https://mirror.hydra.gnu.org/nar/imiwif0wn7dxcc7f4zdq09y1l1132pqj-r-zoo-1.7-14 | bunzip2 | guix archive -x one
$ wget -q -O - https://bayfront.guixsd.org/nar/gzip/imiwif0wn7dxcc7f4zdq09y1l1132pqj-r-zoo-1.7-14 | gunzip | guix archive -x two
$ diff -ru one two
diff -ru one/site-library/zoo/DESCRIPTION two/site-library/zoo/DESCRIPTION
--- one/site-library/zoo/DESCRIPTION	2017-02-01 10:49:49.700423133 +0100
+++ two/site-library/zoo/DESCRIPTION	2017-02-01 10:49:57.224462007 +0100
@@ -28,4 +28,4 @@
 Maintainer: Achim Zeileis <Achim.Zeileis <at> R-project.org>
 Repository: CRAN
 Date/Publication: 2016-12-19 09:38:14
-Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-15 03:12:57 UTC; unix
+Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-23 21:48:44 UTC; unix
Binary files one/site-library/zoo/Meta/package.rds and two/site-library/zoo/Meta/package.rds differ
--8<---------------cut here---------------end--------------->8---

First, there’s a timestamp in ‘DESCRIPTION’ (this is discussed at
<https://bugs.debian.org/782764>).

The .rds differences seem less trivial, but there’s apparently a fix at
<https://bugs.debian.org/774031>.

Ludo’.
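
The timestamp difference reported above can be isolated with a small shell sketch. The helper `built_stamp` and the two sample files are hypothetical; only the `Built:` lines are copied from the diff, and the field layout (R version; platform; timestamp; OS type) is an assumption read off those two lines.

```shell
set -e

# Hypothetical helper: print the timestamp (third field) of the "Built:" line.
built_stamp() {
    sed -n 's/^Built: [^;]*; [^;]*; \([^;]*\); .*$/\1/p' "$1"
}

# Two cut-down stand-ins for the DESCRIPTION files of the two builds.
tmp=$(mktemp -d)
cat > "$tmp/one" <<'EOF'
Repository: CRAN
Date/Publication: 2016-12-19 09:38:14
Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-15 03:12:57 UTC; unix
EOF
cat > "$tmp/two" <<'EOF'
Repository: CRAN
Date/Publication: 2016-12-19 09:38:14
Built: R 3.3.2; x86_64-unknown-linux-gnu; 2017-01-23 21:48:44 UTC; unix
EOF

stamp_one=$(built_stamp "$tmp/one")
stamp_two=$(built_stamp "$tmp/two")
echo "one: $stamp_one"
echo "two: $stamp_two"

# Check whether anything other than the "Built:" line differs.
if [ "$(grep -v '^Built:' "$tmp/one")" = "$(grep -v '^Built:' "$tmp/two")" ]; then
    rest=same
else
    rest=differ
fi
echo "other fields: $rest"
rm -rf "$tmp"
```

In these samples the build timestamp is the only varying field, which is why pinning it is enough for ‘DESCRIPTION’; the binary .rds files need a separate look.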





Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Wed, 01 Feb 2017 11:09:01 GMT)

Message #8 received at 25598 <at> debbugs.gnu.org:

From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 25598 <at> debbugs.gnu.org
Subject: Re: bug#25598: R packages are not bit-reproducible
Date: Wed, 1 Feb 2017 12:08:45 +0100

It looks like R 3.3.2 already includes the fixes, but they need to be
explicitly requested when installing packages.

Attached is a patch that seems to fix this.

[0001-build-r-build-system-Use-deterministic-built-date.patch (text/x-patch, inline)]
From fa42971cb7099e3b370565de5d3f454faecf0369 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Date: Wed, 1 Feb 2017 11:42:34 +0100
Subject: [PATCH] build: r-build-system: Use deterministic built date.

Fixes <http://bugs.gnu.org/25598>.

* guix/build/r-build-system.scm (install): Pass "--built-timestamp"
option to make build deterministic.
---
 guix/build/r-build-system.scm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/guix/build/r-build-system.scm b/guix/build/r-build-system.scm
index 3fc13eb83..24aa73d4f 100644
--- a/guix/build/r-build-system.scm
+++ b/guix/build/r-build-system.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2015 Ricardo Wurmus <rekado <at> elephly.net>
+;;; Copyright © 2015, 2017 Ricardo Wurmus <rekado <at> elephly.net>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -84,6 +84,7 @@
          (params       (append configure-flags
                                (list "--install-tests"
                                      (string-append "--library=" site-library)
+                                     "--built-timestamp=1970-01-01"
                                      ".")))
          (site-path    (string-append site-library ":"
                                       (generate-site-path inputs))))
-- 
2.11.0
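
As a rough illustration of what the patch changes, the sketch below prints the command line that the install phase would presumably end up running. It assumes the build system invokes `R CMD INSTALL` with the `params` list shown in the diff; the function name `r_install_command` and the library path are hypothetical, and the command is only printed, never executed.

```shell
# Hypothetical stand-in for the install phase: print the command
# instead of running R.
r_install_command() {
    site_library=$1
    shift
    # "$@" holds any extra configure flags; they come first, as in the patch.
    echo R CMD INSTALL "$@" --install-tests "--library=$site_library" \
         --built-timestamp=1970-01-01 .
}

r_install_command /tmp/site-library
```

Pinning the stamp to a constant means two builds of the same source write the same `Built:` field, whatever the wall-clock time of the build.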


Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Wed, 01 Feb 2017 13:01:01 GMT)

Message #11 received at 25598 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Cc: 25598 <at> debbugs.gnu.org
Subject: Re: bug#25598: R packages are not bit-reproducible
Date: Wed, 01 Feb 2017 14:00:25 +0100

Hi!

Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:

> From fa42971cb7099e3b370565de5d3f454faecf0369 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
> Date: Wed, 1 Feb 2017 11:42:34 +0100
> Subject: [PATCH] build: r-build-system: Use deterministic built date.
>
> Fixes <http://bugs.gnu.org/25598>.
>
> * guix/build/r-build-system.scm (install): Pass "--built-timestamp"
> option to make build deterministic.

Great.  I think it’s fine for master; that’s 276 packages, but they don’t
take long to build.

Does that also help with the .rds discrepancies?

Thank you for the super-fast reply!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Fri, 10 Feb 2017 12:39:01 GMT)

Message #14 received at 25598 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Cc: Guix-devel <guix-devel <at> gnu.org>, 25598 <at> debbugs.gnu.org
Subject: Re: [PATCH] More reproducibility fixes for R.
Date: Fri, 10 Feb 2017 13:38:07 +0100

Hi!

Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:

> attached are more reproducibility fixes for R.  Unfortunately, it seems
> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
> This leaves us with the following files in R that are currently not
> reproducible:

Could it be that --built-timestamp is not honored for R modules within R?
Do the Debian patches mentioned in #25598 help?

> From e8cd2114b824ab6fed671c2214956ee22deeaedf Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
> Date: Thu, 9 Feb 2017 14:34:57 +0100
> Subject: [PATCH 1/2] gnu: r: Fix syntax for INSTALL_OPTS.
>
> This is a follow-up to commit 4621acfd8272fa93d0530faa5f015b26a194b587.
>
> * gnu/packages/statistics.scm (r)[arguments]: Ensure that
> "--built-timestamp" appears on the same line as the other INSTALL_OPTS.

So the previous attempt had no effect, right?

LGTM.

> From 95b939f662a29b3cc6973a2fba286f32faf010c1 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
> Date: Thu, 9 Feb 2017 15:40:02 +0100
> Subject: [PATCH 2/2] gnu: r: Fix more reproducibility problems.
>
> * gnu/packages/statistics.scm (r)[arguments]: Patch locations in the
> build system that need special treatment for reproducibility.

LGTM, thanks!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Wed, 08 Mar 2017 11:55:02 GMT)

Message #17 received at 25598 <at> debbugs.gnu.org:

From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Guix-devel <guix-devel <at> gnu.org>, 25598 <at> debbugs.gnu.org
Subject: Re: [PATCH] More reproducibility fixes for R.
Date: Wed, 8 Mar 2017 12:53:43 +0100

Ludovic Courtès <ludo <at> gnu.org> writes:

> Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:
>
>> attached are more reproducibility fixes for R.  Unfortunately, it seems
>> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
>> This leaves us with the following files in R that are currently not
>> reproducible:
>
> Could it be that --built-timestamp is not honored for R modules within
> R?

With these two patches the flag *should* be honoured.  I don’t
understand yet where the rds differences come from, but I’ll
investigate this now.

> Do the Debian patches mentioned in #25598 help?

R 3.3.2 already includes the patches that were posted on Debian bug
#774031.  The patch at #782764 is the equivalent of our change to the
r-build-system to pass down the flag to R packages.


>> From e8cd2114b824ab6fed671c2214956ee22deeaedf Mon Sep 17 00:00:00 2001
>> From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
>> Date: Thu, 9 Feb 2017 14:34:57 +0100
>> Subject: [PATCH 1/2] gnu: r: Fix syntax for INSTALL_OPTS.
>>
>> This is a follow-up to commit 4621acfd8272fa93d0530faa5f015b26a194b587.
>>
>> * gnu/packages/statistics.scm (r)[arguments]: Ensure that
>> "--built-timestamp" appears on the same line as the other INSTALL_OPTS.
>
> So the previous attempt had no effect, right?

Yeah, it was not effective and I failed to use “guix build --check”
properly (without grafts), so I thought everything was fine already.

>> From 95b939f662a29b3cc6973a2fba286f32faf010c1 Mon Sep 17 00:00:00 2001
>> From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
>> Date: Thu, 9 Feb 2017 15:40:02 +0100
>> Subject: [PATCH 2/2] gnu: r: Fix more reproducibility problems.
>>
>> * gnu/packages/statistics.scm (r)[arguments]: Patch locations in the
>> build system that need special treatment for reproducibility.
>
> LGTM, thanks!

I pushed both to master.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net
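
The check mentioned in this message amounts to building twice and comparing the results. On a Guix system the actual invocation would presumably be along the lines of `guix build --check --no-grafts r`; the sketch below only mimics the idea with a hypothetical helper, `builds_identically`, that runs an arbitrary command twice and compares what it prints.

```shell
# Hypothetical helper: run a command twice and succeed only if both runs
# print exactly the same output.  Returns 2 if the command itself fails.
builds_identically() {
    first=$("$@") || return 2
    second=$("$@") || return 2
    [ "$first" = "$second" ]
}

# A command with fixed output passes the check.
if builds_identically echo "fixed output"; then
    echo "echo: reproducible"
fi

# "mktemp -u" prints a fresh random name on every call, so it fails.
if ! builds_identically mktemp -u; then
    echo "mktemp -u: not reproducible"
fi
```

The comparison is only meaningful if both runs exercise the same thing, which is the point of the remark about grafts in the message.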




Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Wed, 08 Mar 2017 17:57:01 GMT)

Message #20 received at 25598 <at> debbugs.gnu.org:

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Guix-devel <guix-devel <at> gnu.org>
Cc: 25598 <at> debbugs.gnu.org
Subject: Re: [PATCH] More reproducibility fixes for R.
Date: Wed, 08 Mar 2017 18:56:34 +0100

Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> writes:

> attached are more reproducibility fixes for R.  Unfortunately, it seems
> that files of type “rdb”, “rdx”, and “rds” are still not reproducible.
> This leaves us with the following files in R that are currently not
> reproducible:

[…]
> /lib/R/library/boot/help/paths.rds
> /lib/R/library/class/help/paths.rds
> /lib/R/library/cluster/help/paths.rds
> /lib/R/library/codetools/help/paths.rds
> /lib/R/library/foreign/help/paths.rds
> /lib/R/library/KernSmooth/help/paths.rds
> /lib/R/library/lattice/help/paths.rds
> /lib/R/library/MASS/help/paths.rds
> /lib/R/library/Matrix/help/paths.rds
> /lib/R/library/mgcv/help/paths.rds
> /lib/R/library/nlme/help/paths.rds
> /lib/R/library/nnet/help/paths.rds
> /lib/R/library/rpart/help/paths.rds
> /lib/R/library/spatial/help/paths.rds
> /lib/R/library/survival/help/paths.rds
[…]
>
> I’ll try to figure out if there’s something we can do to make them
> reproducible (there’s a Debian bug report with relevant information).  I
> had originally assumed that 3.3.2 already included fixes for this.

The paths.rds files contain temporary paths like this:

    /tmp/guix-build-r-3.3.2.drv-0/RtmpCmeE9W/R.INSTALL43fb733deccc/survival/

These paths contain the random strings produced by “mkdtemp”.  This
happens in “src/main/sysutils.c”.

I don’t know if we need these files.  All of them are part of the
recommended packages.  I don’t know if these are also built by Debian.

I patched the package in a previous commit to override the built
timestamp, and it does seem to have an effect on the DESCRIPTION file,
but it does not affect the .rd* files.  More investigation required.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net
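
How a random temporary directory name leaks into build output can be shown with a short shell sketch. It uses the `mktemp` command rather than R’s own call to “mkdtemp”, and a plain text file as a stand-in for the serialized paths.rds; the `Rtmp` template and the `R.INSTALL/survival/` suffix are simplified, hypothetical versions of the path quoted in the message.

```shell
set -e

# Each call creates a fresh directory whose name ends in a random suffix.
first=$(mktemp -d "${TMPDIR:-/tmp}/RtmpXXXXXX")
second=$(mktemp -d "${TMPDIR:-/tmp}/RtmpXXXXXX")
echo "first:  $first"
echo "second: $second"

# A build step that records its working directory in an output file
# inherits that randomness.
printf '%s/R.INSTALL/survival/\n' "$first"  > "$first/paths.txt"
printf '%s/R.INSTALL/survival/\n' "$second" > "$second/paths.txt"

if cmp -s "$first/paths.txt" "$second/paths.txt"; then
    recorded=same
else
    recorded=differ
fi
echo "recorded paths: $recorded"
rm -rf "$first" "$second"
```

Overriding the built timestamp cannot help with this kind of difference, since the varying bytes come from the directory name and not from a clock.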





Owner recorded as Ricardo Wurmus <rekado <at> elephly.net>. Request was from Ricardo Wurmus <rekado <at> elephly.net> to control <at> debbugs.gnu.org. (Tue, 14 Mar 2017 08:25:01 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#25598; Package guix. (Thu, 16 Mar 2017 07:55:02 GMT)

Message #25 received at 25598 <at> debbugs.gnu.org:

From: Ricardo Wurmus <rekado <at> elephly.net>
To: 25598 <at> debbugs.gnu.org
Cc: Ricardo Wurmus <rekado <at> elephly.net>
Subject: [PATCH] gnu: r: Fix remaining reproducibility problems.
Date: Thu, 16 Mar 2017 08:54:02 +0100

Fixes <https://bugs.gnu.org/25598>.

* gnu/packages/statistics.scm (r)[arguments]: Add remaining reproducibility
fixes to "build-reproducibly" phase.
---
 gnu/packages/statistics.scm | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/statistics.scm b/gnu/packages/statistics.scm
index 656895273..2a20abd86 100644
--- a/gnu/packages/statistics.scm
+++ b/gnu/packages/statistics.scm
@@ -134,11 +134,44 @@ be output in text, PostScript, PDF or HTML.")
              #t))
          (add-after 'unpack 'build-reproducibly
            (lambda _
-             ;; Ensure that gzipped files are reproducible
+             ;; The documentation contains time stamps to demonstrate
+             ;; documentation generation in different phases.
+             (substitute* "src/library/tools/man/Rd2HTML.Rd"
+               (("\\\\%Y-\\\\%m-\\\\%d at \\\\%H:\\\\%M:\\\\%S")
+                "(removed for reproducibility)"))
+
+             ;; Remove timestamp from tracing environment.  This fixes
+             ;; reproducibility of "methods.rd{b,x}".
+             (substitute* "src/library/methods/R/trace.R"
+               (("dateCreated = Sys.time\\(\\)")
+                "dateCreated = as.POSIXct(\"1970-1-1 00:00:00\", tz = \"UTC\")"))
+
+             ;; Ensure that gzipped files are reproducible.
              (substitute* '("src/library/grDevices/Makefile.in"
                             "doc/manual/Makefile.in")
                (("R_GZIPCMD\\)" line)
                 (string-append line " -n")))
+
+             ;; The "srcfile" procedure in "src/library/base/R/srcfile.R"
+             ;; queries the mtime of a given file and records it in an object.
+             ;; This is acceptable at runtime to detect stale source files,
+             ;; but it destroys reproducibility at build time.
+             ;;
+             ;; Instead of disabling this feature, which may have unexpected
+             ;; consequences, we reset the mtime of generated files before
+             ;; passing them to the "srcfile" procedure.
+             (substitute* "src/library/Makefile.in"
+               (("@\\(cd base && \\$\\(MAKE\\) mkdesc\\)" line)
+                (string-append line "\n	find $(top_builddir)/library/tools | xargs touch -d '1970-01-01'; \n"))
+               (("@\\$\\(MAKE\\) Rdobjects" line)
+                (string-append "@find $(srcdir)/tools | xargs touch -d '1970-01-01'; \n	"
+                               line)))
+             (substitute* "src/library/tools/Makefile.in"
+               (("@\\$\\(INSTALL_DATA\\) all.R \\$\\(top_builddir\\)/library/\\$\\(pkg\\)/R/\\$\\(pkg\\)" line)
+                (string-append
+                 line
+                 "\n	find $(srcdir)/$(pkg) $(top_builddir)/library/$(pkg) | xargs touch -d \"1970-01-01\"; \n")))
+
              ;; This library is installed using "install_package_description",
              ;; so we need to pass the "builtStamp" argument.
              (substitute* "src/library/tools/Makefile.in"
-- 
2.12.0
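
Two of the techniques in this patch, appending `-n` to the gzip command and resetting mtimes with `find | xargs touch`, can be tried outside the R build tree. The sketch below assumes GNU gzip and GNU touch; the directory layout, the file name `NEWS` and the helper `compare` are hypothetical and do not come from the R sources.

```shell
set -e
work=$(mktemp -d)
mkdir "$work/one" "$work/two"
printf 'same contents\n' > "$work/one/NEWS"
printf 'same contents\n' > "$work/two/NEWS"

# Simulate two builds that produced the same file at different times.
touch -t 201701150312.57 "$work/one/NEWS"
touch -t 201701232148.44 "$work/two/NEWS"

# Hypothetical helper: report whether two files are byte-identical.
compare() {
    if cmp -s "$1" "$2"; then echo same; else echo differ; fi
}

# By default gzip records the input file's mtime in the archive header.
gzip -c "$work/one/NEWS" > "$work/one/plain.gz"
gzip -c "$work/two/NEWS" > "$work/two/plain.gz"
plain=$(compare "$work/one/plain.gz" "$work/two/plain.gz")

# With -n the original name and timestamp are left out of the header.
gzip -n -c "$work/one/NEWS" > "$work/one/fixed.gz"
gzip -n -c "$work/two/NEWS" > "$work/two/fixed.gz"
fixed=$(compare "$work/one/fixed.gz" "$work/two/fixed.gz")
echo "plain gzip: $plain; gzip -n: $fixed"

# Before the reset, the second file is newer than the first.
if [ "$work/two/NEWS" -nt "$work/one/NEWS" ]; then before=differ; else before=same; fi

# Reset the mtimes in the same way as the Makefile fragments in the patch.
find "$work/one" "$work/two" -name NEWS | xargs touch -d '1970-01-01'
if [ "$work/one/NEWS" -nt "$work/two/NEWS" ] || [ "$work/two/NEWS" -nt "$work/one/NEWS" ]; then
    after=differ
else
    after=same
fi
echo "mtimes before reset: $before; after reset: $after"
rm -rf "$work"
```

With GNU gzip the plain archives are expected to differ while the `-n` archives match, and after the reset neither file is newer than the other, so anything that records their mtime sees the same value in every build.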






Information forwarded to bug-guix <at> gnu.org, Ricardo Wurmus <rekado <at> elephly.net>:
bug#25598; Package guix. (Thu, 16 Mar 2017 09:01:02 GMT)

Message #28 received at 25598 <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: 25598 <at> debbugs.gnu.org
Subject: Re: bug#25598: [PATCH] gnu: r: Fix remaining reproducibility problems.
Date: Thu, 16 Mar 2017 10:00:32 +0100

Hello!

Ricardo Wurmus <rekado <at> elephly.net> skribis:

> Fixes <https://bugs.gnu.org/25598>.
>
> * gnu/packages/statistics.scm (r)[arguments]: Add remaining reproducibility
> fixes to "build-reproducibly" phase.

Wow, impressive work.  You’re a reproducibility hero!

You’re welcome to push to master.

Thank you!

Ludo’.




Reply sent to Ricardo Wurmus <rekado <at> elephly.net>:
You have taken responsibility. (Fri, 17 Mar 2017 09:18:01 GMT)

Notification sent to ludo <at> gnu.org (Ludovic Courtès):
bug acknowledged by developer. (Fri, 17 Mar 2017 09:18:01 GMT)

Message #33 received at 25598-done <at> debbugs.gnu.org:

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 25598-done <at> debbugs.gnu.org
Subject: Re: bug#25598: [PATCH] gnu: r: Fix remaining reproducibility problems.
Date: Fri, 17 Mar 2017 10:17:46 +0100

Ludovic Courtès <ludo <at> gnu.org> writes:

> You’re welcome to push to master.

Done with commit 60c9190e21edfaa3a18be857b9a906b8521e948b.  Thanks for
the quick review!

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 14 Apr 2017 11:24:05 GMT)

This bug report was last modified 7 years and 13 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.