Package: guix;
Reported by: Leo Famulari <leo <at> famulari.name>
Date: Tue, 2 Feb 2016 05:17:02 UTC
Severity: important
Done: Ricardo Wurmus <rekado <at> elephly.net>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 22533 in the body.
You can then email your comments to 22533 AT debbugs.gnu.org in the normal way.
bug-guix <at> gnu.org: bug#22533; Package guix. (Tue, 02 Feb 2016 05:17:02 GMT)
Leo Famulari <leo <at> famulari.name>: bug-guix <at> gnu.org. (Tue, 02 Feb 2016 05:17:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:
From: Leo Famulari <leo <at> famulari.name>
To: bug-guix <at> gnu.org
Subject: Non-determinism in python-3 ".pyc" bytecode
Date: Tue, 2 Feb 2016 00:15:44 -0500
While preparing a package for borg [0], I found that the built output
was not reproducible. The problem is that the bytecode compiler [1] for
Python 3.4.3 (our current version) encodes the mtime of the
corresponding Python source file in the output. This is described in
PEP-3147 [2], and the responsible Python code is referenced below [3].

I tested a few of our existing python-3 packages: python-ccm,
python-pysam, and python-scripttest all exhibit the same problem.

We fixed this in python-2 with the patch
python-2.7-source-date-epoch.patch, but I don't know how to write this
patch for python-3.

Can somebody write this patch?

I asked about this on #debian-reproducible and they said that it wasn't
an issue for Debian since they don't ship bytecode, but instead generate
it at install time. Of course, that doesn't really apply to Guix.

I used diffoscope-34 to inspect the build outputs to find this, and you
can see the report here:
https://famulari.name/misc/7c55c9e97f668234ddea50299d986f14/borg-diffoscope-report.html

It's first demonstrated in the file
...-borg-0.30.0/lib/python3.4/site-packages/__pycache__/site.cpython-34.pyc.

The first 4 bytes are the "magic number" described in PEP-3147, which
specifies the version of the bytecode format. The next 4 bytes are the
problematic timestamp, as described in PEP-3147.

[0] http://borgbackup.github.io/
[1] https://docs.python.org/3/library/py_compile.html
[2] https://www.python.org/dev/peps/pep-3147/
[3] Check out the Guix git commit 4efc8eb27502c, and from there:
    $ tar xf $(./pre-inst-env guix build --source python-3)
    $ sed -n 139,140p Python-3.4.3/Lib/py_compile.py
            bytecode = importlib._bootstrap._code_to_bytecode(
                    code, source_stats['mtime'], source_stats['size'])
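The header layout described in the report can be inspected with a few lines of Python. This is only a sketch: it assumes the 12-byte layout that `_code_to_bytecode` writes in Python 3.4 (4-byte magic, 4-byte little-endian mtime, 4-byte little-endian source size), and the function name and sample values below are hypothetical, not taken from the borg build.

```python
import struct


def parse_legacy_pyc_header(data):
    """Split the 12-byte header that Python 3.4 writes at the start of a .pyc."""
    if len(data) < 12:
        raise ValueError("need at least 12 bytes of .pyc data")
    magic = data[:4]
    # mtime and source size are stored as little-endian unsigned 32-bit integers.
    mtime, source_size = struct.unpack("<II", data[4:12])
    return magic, mtime, source_size


# Hypothetical header: an illustrative magic, an arbitrary mtime, a 42-byte source.
sample = b"\xee\x0c\r\n" + struct.pack("<II", 1454390144, 42)
magic, mtime, source_size = parse_legacy_pyc_header(sample)
print(magic, mtime, source_size)
```

Two builds of the same source that differ only in the second field are exactly the kind of difference diffoscope reports here.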
Message #8 received at 22533 <at> debbugs.gnu.org (Tue, 02 Feb 2016 08:55:02 GMT):
From: Leo Famulari <leo <at> famulari.name>
To: 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Tue, 2 Feb 2016 03:54:39 -0500
On Tue, Feb 02, 2016 at 12:15:44AM -0500, Leo Famulari wrote:
> We fixed this in python-2 with the patch
> python-2.7-source-date-epoch.patch, but I don't know how to write this
> patch for python-3.

mark_weaver suggested setting the timestamps of the source files before
building. I think this is a better option if it doesn't break anything.
It would allow the bytecode "staleness" check to work as expected while
keeping the output consistent.

> Can somebody write this patch?
> [...]
Message #11 received at 22533 <at> debbugs.gnu.org (Tue, 02 Feb 2016 20:42:01 GMT):
From: ludo <at> gnu.org (Ludovic Courtès)
To: Leo Famulari <leo <at> famulari.name>
Cc: 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Tue, 02 Feb 2016 21:41:19 +0100
[Message part 1 (text/plain, inline)]
Leo Famulari <leo <at> famulari.name> writes:

> We fixed this in python-2 with the patch
> python-2.7-source-date-epoch.patch, but I don't know how to write this
> patch for python-3.

I would imagine something like this (untested):
[Message part 2 (text/x-patch, inline)]
--- Python-3.4.3/Lib/importlib/_bootstrap.py	2016-02-02 21:38:48.655809055 +0100
+++ Python-3.4.3/Lib/importlib/_bootstrap.py.new	2016-02-02 21:38:43.659769251 +0100
@@ -667,7 +667,10 @@ def _code_to_bytecode(code, mtime=0, sou
     """Compile a code object into bytecode for writing out to a byte-compiled
     file."""
     data = bytearray(MAGIC_NUMBER)
-    data.extend(_w_long(mtime))
+    if 'SOURCE_DATE_EPOCH' in _os.environ:
+        data.extend(_w_long(string.atoi(_os.environ['SOURCE_DATE_EPOCH'])))
+    else:
+        data.extend(_w_long(mtime))
     data.extend(_w_long(source_size))
     data.extend(marshal.dumps(code))
     return data
[Message part 3 (text/plain, inline)]
Could you give it a try and refine as needed? :-)

> I asked about this on #debian-reproducible and they said that it wasn't
> an issue for Debian since they don't ship bytecode, but instead generate
> it at install time. Of course, that doesn't really apply to Guix.

I’d recommend trying #reproducible-builds on OFTC, which is more
generic. Also, in some cases, it’s useful to look at
<git://git.debian.org/git/reproducible/notes.git>, which contains notes
about non-reproducible packages (currently partly Debian-specific, but
we need to lobby to make it more generic. ;-))

Thanks,
Ludo’.
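The idea behind the untested patch can be tried outside of CPython's internals. The sketch below is a standalone approximation, not the patch itself: `code_to_bytecode` and its `environ` parameter are hypothetical names, the header uses the Python 3.4 layout (so the result is not loadable by Python 3.7 or later), and `int()` stands in for `string.atoi`, which does not exist in Python 3.

```python
import importlib.util
import marshal
import os
import struct


def code_to_bytecode(code, mtime=0, source_size=0, environ=None):
    """Build legacy-layout .pyc data, preferring SOURCE_DATE_EPOCH over mtime."""
    environ = os.environ if environ is None else environ
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        mtime = int(epoch)  # the environment variable wins over the file mtime
    data = bytearray(importlib.util.MAGIC_NUMBER)
    data.extend(struct.pack("<I", int(mtime) & 0xFFFFFFFF))
    data.extend(struct.pack("<I", source_size & 0xFFFFFFFF))
    data.extend(marshal.dumps(code))
    return bytes(data)


code = compile("x = 1\n", "<example>", "exec")
fixed = {"SOURCE_DATE_EPOCH": "1"}
first = code_to_bytecode(code, mtime=1454390144, source_size=6, environ=fixed)
second = code_to_bytecode(code, mtime=1454476544, source_size=6, environ=fixed)
unfixed = code_to_bytecode(code, mtime=1454390144, source_size=6, environ={})
print(first == second, first[4:8], unfixed[4:8])
```

With the variable set, the two outputs should be byte-for-byte identical even though the source mtimes differ; with it unset, the mtime still ends up in bytes 4 to 7.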
Message #14 received at 22533 <at> debbugs.gnu.org (Thu, 04 Feb 2016 23:18:02 GMT):
From: Leo Famulari <leo <at> famulari.name>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Thu, 4 Feb 2016 18:17:08 -0500
[Message part 1 (text/plain, inline)]
On Tue, Feb 02, 2016 at 09:41:19PM +0100, Ludovic Courtès wrote:
> Could you give it a try and refine as needed? :-)

I altered your example as shown in the attached patch. It causes some
tests related to timestamps to fail, so I disabled them in a very crude
way. The final patch should address those tests more carefully.

But the patch doesn't seem to have the desired effect, so I'm asking for
help!

Here is how I tested the patch: I built python-3 with it, then ran
`export SOURCE_DATE_EPOCH=1` and entered the resulting Python shell. I
manually defined the '_w_long' function used by the patched function.
Then:

    print (_w_long(locale.atoi(os.getenv('SOURCE_DATE_EPOCH'))))
    b'\x01\x00\x00\x00'

But when I leave the Python shell and issue `python3 -m compileall
helloworld.py`, the timestamps are still present in the compiled
bytecode. I can watch the clock "tick" by doing this repeatedly:

    $ touch helloworld.py && rm -r __pycache__ && \
      python3 -m compileall helloworld.py && \
      hexdump __pycache__/helloworld.cpython-34.pyc | head -n1

I'm not much of a Python programmer, so I'm stumped.
[0001-SOURCE_DATE_EPOCH.patch (text/x-diff, attachment)]
ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Fri, 25 Mar 2016 08:47:02 GMT)

Message #19 received at submit <at> debbugs.gnu.org (Tue, 29 Mar 2016 23:13:02 GMT):
From: Cyril Roelandt <tipecaml <at> gmail.com>
To: bug-guix <at> gnu.org
Subject: Re: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Wed, 30 Mar 2016 01:11:47 +0200
[Message part 1 (text/plain, inline)]
Here is a version of the patch that works with the upstream Python, but that I cannot get to work with our Guix recipe. Could you test it and tell me what you think? I intend to push this to CPython. Cyril.
[upstream.patch (text/x-diff, attachment)]
Message #25 received at 22533 <at> debbugs.gnu.org (Wed, 06 Apr 2016 08:31:02 GMT):
From: ludo <at> gnu.org (Ludovic Courtès)
To: Cyril Roelandt <tipecaml <at> gmail.com>
Cc: 22533 <at> debbugs.gnu.org, Leo Famulari <leo <at> famulari.name>
Subject: Re: bug#22533: Non-determinism in python-3 ".pyc" bytecode
Date: Wed, 06 Apr 2016 10:29:57 +0200
[Message part 1 (text/plain, inline)]
Cyril Roelandt <tipecaml <at> gmail.com> writes:

> Here is a version of the patch that works with the upstream Python, but
> that I cannot get to work with our Guix recipe.

At first sight the patch LGTM. How does it not work for you? :-)

I applied this:
[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/packages/patches/python-3-deterministic-build-info.patch b/gnu/packages/patches/python-3-deterministic-build-info.patch
index 22c372a..bdf9f20 100644
--- a/gnu/packages/patches/python-3-deterministic-build-info.patch
+++ b/gnu/packages/patches/python-3-deterministic-build-info.patch
@@ -15,3 +15,28 @@ We cannot pass it in CPPFLAGS due to whitespace in the DATE string.
  #ifndef DATE
  #ifdef __DATE__
  #define DATE __DATE__
+
+--- Lib/importlib/_bootstrap.py
++++ Lib/importlib/_bootstrap.py
+@@ -1443,7 +1443,8 @@ class SourceLoader(_LoaderBasics):
+         Implementing this method allows the loader to read bytecode files.
+         Raises IOError when the path cannot be handled.
+         """
+-        return {'mtime': self.path_mtime(path)}
++        return {'mtime': float(_os.environ.get(b'SOURCE_DATE_EPOCH',
++            st.st_mtime))}
+ 
+     def _cache_bytecode(self, source_path, cache_path, data):
+         """Optional method which writes data (bytes) to a file path (a str).
+@@ -1580,7 +1581,10 @@ class SourceFileLoader(FileLoader, SourceLoader):
+     def path_stats(self, path):
+         """Return the metadata for the path."""
+         st = _path_stat(path)
+-        return {'mtime': st.st_mtime, 'size': st.st_size}
++        return {
++            'mtime': float(_os.environ.get(b'SOURCE_DATE_EPOCH', st.st_mtime)),
++            'size': st.st_size
++        }
+ 
+     def _cache_bytecode(self, source_path, bytecode_path, data):
+         # Adapt between the two APIs
[Message part 3 (text/plain, inline)]
… and that leads to these test failures:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build python <at> 3 --rounds=2 -K
[...]
======================================================================
FAIL: test_bad_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 452, in test_bad_marshal
    self._test_bad_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 342, in _test_bad_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_no_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 441, in test_no_marshal
    self._test_no_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 322, in _test_no_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_non_code_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 445, in test_non_code_marshal
    self._test_non_code_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 331, in _test_non_code_marshal
    self.import_(file_path, '_temp')
AssertionError: ImportError not raised

======================================================================
FAIL: test_old_timestamp (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP302)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 471, in test_old_timestamp
    self.assertEqual(bytecode_file.read(4), source_timestamp)
AssertionError: b'\x01\x00\x00\x00' != b'\x7f\xc7\x04W'

======================================================================
FAIL: test_bad_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 452, in test_bad_marshal
    self._test_bad_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 342, in _test_bad_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_no_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 441, in test_no_marshal
    self._test_no_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 322, in _test_no_marshal
    self.import_(file_path, '_temp')
AssertionError: EOFError not raised

======================================================================
FAIL: test_non_code_marshal (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 445, in test_non_code_marshal
    self._test_non_code_marshal()
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 331, in _test_non_code_marshal
    self.import_(file_path, '_temp')
AssertionError: ImportError not raised

======================================================================
FAIL: test_old_timestamp (test.test_importlib.source.test_file_loader.Source_SourceLoaderBadBytecodeTestPEP451)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/util.py", line 22, in wrapper
    to_return = fxn(*args, **kwargs)
  File "/tmp/guix-build-python-minimal-3.4.3.drv-0/Python-3.4.3/Lib/test/test_importlib/source/test_file_loader.py", line 471, in test_old_timestamp
    self.assertEqual(bytecode_file.read(4), source_timestamp)
AssertionError: b'\x01\x00\x00\x00' != b'\x7f\xc7\x04W'

----------------------------------------------------------------------
Ran 951 tests in 1.102s

FAILED (failures=8, skipped=19, expected failures=1)
Makefile:958: recipe for target 'test' failed
--8<---------------cut here---------------end--------------->8---

‘test_old_timestamp’ clearly needs to be adjusted to account for the
change. The others have to do with the bytecode loader, so it’s
probably a similar story.

Could you look into it? Perhaps you tested with SOURCE_DATE_EPOCH
unset?

Thanks for working on this, it’s an important bug to fix!

Ludo’.
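The two byte strings in the `test_old_timestamp` failure are easier to read once decoded. The helper below is a small sketch, assuming the field is a 4-byte little-endian unsigned integer as in the Python 3.4 header; `r_long` is a hypothetical name chosen to mirror `_w_long`.

```python
import datetime
import struct


def r_long(raw):
    """Decode a 4-byte little-endian unsigned integer, as stored in a .pyc header."""
    (value,) = struct.unpack("<I", raw)
    return value


written = r_long(b"\x01\x00\x00\x00")   # what the patched loader wrote
expected = r_long(b"\x7f\xc7\x04W")     # what the test computed from the source mtime
when = datetime.datetime.fromtimestamp(expected, datetime.timezone.utc)
print(written, expected, when.isoformat())
```

The first value is the SOURCE_DATE_EPOCH of 1, and the second should decode to a time on the morning of 6 April 2016, which suggests the test is still comparing against the real mtime of the freshly written source file.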
Message #28 received at 22533 <at> debbugs.gnu.org (Fri, 26 May 2017 13:42:02 GMT):
From: Marius Bakke <mbakke <at> fastmail.com>
To: 22533 <at> debbugs.gnu.org
Subject: Python bytecode reproducibility
Date: Fri, 26 May 2017 15:41:39 +0200
[Message part 1 (text/plain, inline)]
Hello!

I stumbled across this bug after re-discovering that Python bytecode is
not reproducible (through "glib"). Just sharing some notes..

Nix recently made an effort to fix this. AFAICT the ".pyc" files are
still a problem, but at least they got the interpreters building
reproducibly:

https://github.com/NixOS/nixpkgs/issues/22570
https://github.com/NixOS/nixpkgs/pull/22585

It would be great to revive this longstanding bug!

*walks away slowly before anyone notices*
[signature.asc (application/pgp-signature, inline)]
Message #31 received at 22533 <at> debbugs.gnu.org (Sat, 03 Mar 2018 22:38:02 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Marius Bakke <mbakke <at> fastmail.com>
Cc: 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sat, 03 Mar 2018 23:37:29 +0100
Hi Guix,

Marius Bakke <mbakke <at> fastmail.com> writes:

> It would be great to revive this longstanding bug!

Indeed.

Here’s another attempt. As far as I understand, the timestamp in the
pyc files only affects the header.

Up until Python 3.6 (incl) the header looks like this:

    magic | timestamp | size

Since Python 3.7 the header may either contain a timestamp or a hash:

    magic | 00000000000000000000000000000000 | timestamp | size
    magic | 00000000000000000000000000000001 | hash | size

This means we likely won’t have this problem any more with Python 3.7.
For Python 3.6 I guess we could add a final build phase that overwrites
the timestamp in the *binary*. This needs to happen before any of the
compiled files are wrapped up in a wheel.

Should we just wait for Python 3.7 which is expected to be released in
June 2018? We’d still have to deal with this problem in Python 2,
though.

Is it a bad idea to override the timestamps in the generated binaries?
I think that we could avoid the recency check then, which was an
obstacle to resetting the timestamps of the source files.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
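The expectation about Python 3.7 can be checked with a short experiment. This is a sketch that assumes Python 3.7 or later, where `py_compile.compile` accepts an `invalidation_mode` argument; the helper name `compile_at_mtime`, the file name, and the mtime values are made up for the illustration.

```python
import os
import py_compile
import tempfile


def compile_at_mtime(source, mtime, mode):
    """Set the source file's mtime, compile it, and return the raw .pyc bytes."""
    os.utime(source, (mtime, mtime))
    target = source + "c"
    py_compile.compile(source, cfile=target, doraise=True, invalidation_mode=mode)
    with open(target, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "helloworld.py")
    with open(source, "w") as f:
        f.write("print('hello world')\n")
    modes = py_compile.PycInvalidationMode
    # Same source, two different mtimes, two invalidation modes.
    ts_a = compile_at_mtime(source, 1454390144, modes.TIMESTAMP)
    ts_b = compile_at_mtime(source, 1454390145, modes.TIMESTAMP)
    hash_a = compile_at_mtime(source, 1454390144, modes.CHECKED_HASH)
    hash_b = compile_at_mtime(source, 1454390145, modes.CHECKED_HASH)

print("timestamp pycs identical:", ts_a == ts_b)
print("hash-based pycs identical:", hash_a == hash_b)
```

Timestamp-based output should change whenever the source mtime changes, while hash-based output should depend only on the source contents, which is the property a reproducible build needs.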
Message #34 received at 22533 <at> debbugs.gnu.org (Sun, 04 Mar 2018 09:22:01 GMT):
From: Gábor Boskovits <boskovits <at> gmail.com>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sun, 4 Mar 2018 10:21:17 +0100
[Message part 1 (text/plain, inline)]
2018-03-03 23:37 GMT+01:00 Ricardo Wurmus <rekado <at> elephly.net>:

> Should we just wait for Python 3.7 which is expected to be released in
> June 2018? We’d still have to deal with this problem in Python 2,
> though.
> [...]

Nix had this issue; it seems they have a Python 3.5 solution, which
should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.

WDYT?
[Message part 2 (text/html, inline)]
Message #37 received at 22533 <at> debbugs.gnu.org (Sun, 04 Mar 2018 12:47:02 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sun, 04 Mar 2018 13:46:07 +0100
Hi Gábor,

> Nix had this issue, it seems they have a python 3.5 solution, which
> should be easy to adopt: https://github.com/NixOS/nixpkgs/issues/22570.
> WDYT?

Here’s the patch for Nix:

    https://patch-diff.githubusercontent.com/raw/NixOS/nixpkgs/pull/22585.diff

Here are the relevant changes to the Python packages:

* Python 3.4

    substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
    substituteInPlace "Lib/importlib/_bootstrap.py" --replace "source_mtime = int(source_stats['mtime'])" "source_mtime = 1"

* Python 3.5

    substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
    substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

* Python 3.6

    substituteInPlace "Lib/py_compile.py" --replace "source_stats['mtime']" "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"
    substituteInPlace "Lib/importlib/_bootstrap_external.py" --replace "source_mtime = int(st['mtime'])" "source_mtime = 1"

For all packages they set these environment variables:

- set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)

- set DETERMINISTIC_BUILD; for conditional patching of the timestamp
  for package builds. The timestamp is not patched in ad-hoc
  environments, because that would mess with Python’s ability to
  determine whether to compile source files.

They also rebuild all bytecode (with the exception of lib2to3 because it
is Python 2 code) three times, once for each optimization level.

--8<---------------cut here---------------start------------->8---
+    # Determinism: rebuild all bytecode
+    # We exclude lib2to3 because that's Python 2 code which fails
+    # We rebuild three times, once for each optimization level
+    find $out -name "*.py" | $out/bin/python -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -O -m compileall -q -f -x "lib2to3" -i -
+    find $out -name "*.py" | $out/bin/python -OO -m compileall -q -f -x "lib2to3" -i -
--8<---------------cut here---------------end--------------->8---

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
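The three `compileall` invocations have a rough in-process equivalent. The sketch below is an approximation rather than what Nix runs: it uses the `optimize` argument of `compileall.compile_dir` in place of starting the interpreter with `-O` and `-OO`, and `rebuild_bytecode` plus the directory and file names are hypothetical.

```python
import compileall
import importlib.util
import os
import re
import tempfile


def rebuild_bytecode(root, exclude="lib2to3"):
    """Force-recompile every .py file under root at optimization levels 0, 1 and 2."""
    skip = re.compile(re.escape(exclude))  # paths matching this are not compiled
    ok = True
    for level in (0, 1, 2):
        result = compileall.compile_dir(
            root, force=True, quiet=1, rx=skip, optimize=level)
        ok = bool(result) and ok
    return ok


with tempfile.TemporaryDirectory() as root:
    kept = os.path.join(root, "pkg", "mod.py")
    skipped = os.path.join(root, "lib2to3", "old.py")
    for path in (kept, skipped):
        os.makedirs(os.path.dirname(path))
        with open(path, "w") as f:
            f.write("VALUE = 1\n")
    all_ok = rebuild_bytecode(root)
    # One cache file per optimization level is expected next to the kept module.
    kept_levels = [
        os.path.exists(importlib.util.cache_from_source(kept, optimization=opt))
        for opt in ("", 1, 2)
    ]
    skipped_compiled = os.path.exists(
        os.path.join(os.path.dirname(skipped), "__pycache__"))

print(all_ok, kept_levels, skipped_compiled)
```

Forcing the rebuild matters because any bytecode written earlier in the build may still carry a real timestamp.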
Message #40 received at 22533 <at> debbugs.gnu.org (Sun, 04 Mar 2018 15:32:02 GMT):
From: Gábor Boskovits <boskovits <at> gmail.com>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sun, 4 Mar 2018 16:30:59 +0100
[Message part 1 (text/plain, inline)]
2018-03-04 13:46 GMT+01:00 Ricardo Wurmus <rekado <at> elephly.net>:

> Here are the relevant changes to the Python packages:
> [...]

Nice, thanks for the summary. Can we adopt this as is? Do we need the
3.4 and 3.5 fixes, or is the 3.6 one enough?

> For all packages they set these environment variables:
>
> - set PYTHONHASHSEED=0 (for hashes of str, bytes and datetime objects)
>
> - set DETERMINISTIC_BUILD; for conditional patching of the timestamp
>   for package builds. The timestamp is not patched in ad-hoc
>   environments, because that would mess with Python’s ability to
>   determine whether to compile source files.

Should we set these in python-build-system? What about the Python
bootstrap? I guess we use gnu-build-system there, so bootstrap packages
might need to set these explicitly?

> They also rebuild all bytecode (with the exception of lib2to3 because it
> is Python 2 code) three times, once for each optimization level.
> [...]

Do we also have to do this, or should we settle on one optimization
level? Which one?
[Message part 2 (text/html, inline)]
Message #43 received at 22533 <at> debbugs.gnu.org (Sun, 04 Mar 2018 19:19:01 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sun, 04 Mar 2018 20:18:23 +0100
[Message part 1 (text/plain, inline)]
I have applied this patch locally:
[1.diff (text/x-patch, inline)]
diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
index 5f701701a..0d1ecc3c6 100644
--- a/gnu/packages/python.scm
+++ b/gnu/packages/python.scm
@@ -359,8 +359,42 @@ data types.")
                        "Lib/ctypes/test/test_win32.py" ; fails on aarch64
                        "Lib/test/test_fcntl.py")) ; fails on aarch64
                      #t))))
-    (arguments (substitute-keyword-arguments (package-arguments python-2)
-                 ((#:tests? _) #t)))
+    (arguments
+     (substitute-keyword-arguments (package-arguments python-2)
+       ((#:tests? _) #t)
+       ((#:phases phases)
+        `(modify-phases ,phases
+           (add-after 'unpack 'patch-timestamp-for-pyc-files
+             (lambda _
+               ;; We set DETERMINISTIC_BUILD to only override the mtime when
+               ;; building with Guix, lest we break auto-compilation in
+               ;; environments.
+               (setenv "DETERMINISTIC_BUILD" "1")
+               (substitute* "Lib/py_compile.py"
+                 (("source_stats\\['mtime'\\]")
+                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
+
+               ;; Use deterministic hashes for strings, bytes, and datetime
+               ;; objects.
+               (setenv "PYTHONHASHSEED" "0")
+
+               ;; Reset mtime when validating bytecode header.
+               (substitute* "Lib/importlib/_bootstrap_external.py"
+                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
+                  "source_mtime = 1"))
+               #t))
+           (add-after 'unpack 'disable-timestamp-tests
+             (lambda _
+               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
+                 (("test_bad_marshal")
+                  "disable_test_bad_marshal")
+                 (("test_no_marshal")
+                  "disable_test_no_marshal")
+                 (("test_non_code_marshal")
+                  "disable_test_non_code_marshal"))
+               #t))
+           (add-before 'check 'allow-non-deterministic-compilation
+             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
     (native-search-paths
      (list (search-path-specification
             (variable "PYTHONPATH")
[Message part 3 (text/plain, inline)]
It allows me to build python-six and python-sip reproducibly. It does not fix problems with Python 2, and I haven’t yet tested if it causes any new problems.

It’s a little worrying that I had to disable three more tests that I think shouldn’t have failed.

What do you think?

--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
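For readers who want to see what the substituted expression in Lib/py_compile.py evaluates to, here is a minimal Python sketch. The helper name `header_mtime` and the explicit `environ` parameter are assumptions made for illustration; the patch itself only rewrites the expression in place and reads `os.environ` directly.

```python
import os


def header_mtime(source_stats, environ=os.environ):
    """Return the mtime that would be written into the .pyc header.

    Mirrors the expression the patch substitutes into py_compile.py:
    a fixed value of 1 when DETERMINISTIC_BUILD is set, otherwise the
    real source mtime. The function wrapper is hypothetical.
    """
    return 1 if "DETERMINISTIC_BUILD" in environ else source_stats["mtime"]


# Example stats as py_compile would pass them (values are illustrative).
stats = {"mtime": 1520207866.0, "size": 294}

inside_build = header_mtime(stats, {"DETERMINISTIC_BUILD": "1"})
outside_build = header_mtime(stats, {})
print(inside_build, outside_build)
```

Only the presence of the variable matters, not its value, which is why the later phase uses unsetenv rather than setting it to "0".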
Message #46 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 00:03:01 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net> To: Gábor Boskovits <boskovits <at> gmail.com> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 01:02:15 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> I have applied this patch locally:
> [...]
> It allows me to build python-six and python-sip reproducibly. It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.

I tested importing modules in an ad-hoc environment and saw no problems.

Unfortunately, this doesn’t fix all reproducibility problems with numpy:

--8<---------------cut here---------------start------------->8---
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/__config__.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/exec_command.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/distutils/__pycache__/system_info.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/__config__.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc differ
Binary files /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc and /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc differ
--8<---------------cut here---------------end--------------->8---

But the successes with simpler Python packages are promising.

--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
Message #49 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 00:06:01 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net> To: Gábor Boskovits <boskovits <at> gmail.com> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 01:05:04 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Unfortunately, this doesn’t fix all reproducibility problems with numpy:
> [...]

Here’s what diffoscope says:

--8<---------------cut here---------------start------------->8---
diffoscope /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0{-check,}/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
--- /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0-check/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
+++ /gnu/store/kd06ql8fynlydymzhhnwk2lh0778dwcc-python-numpy-1.14.0/lib/python3.6/site-packages/numpy/__pycache__/version.cpython-36.pyc
@@ -1,8 +1,8 @@
-00000000: 330d 0d0a fa87 9c5a 2601 0000 e300 0000  3......Z&.......
+00000000: 330d 0d0a c485 9c5a 2601 0000 e300 0000  3......Z&.......
 00000010: 0000 0000 0000 0000 0001 0000 0040 0000  .............@..
 00000020: 0073 2000 0000 6400 5a00 6400 5a01 6400  .s ...d.Z.d.Z.d.
 00000030: 5a02 6401 5a03 6402 5a04 6504 731c 6502  Z.d.Z.d.Z.e.s.e.
 00000040: 5a01 6403 5300 2904 7a06 312e 3134 2e30  Z.d.S.).z.1.14.0
 00000050: da28 3639 3134 6262 3431 6630 6662 3363  .(6914bb41f0fb3c
 00000060: 3162 6135 3030 6261 6534 6537 6436 3731  1ba500bae4e7d671
 00000070: 6461 3935 3336 3738 3666 544e 2905 da0d  da9536786fTN)...
--8<---------------cut here---------------end--------------->8---

In other words: this is the timestamp field of the pyc file.

Maybe this can be avoided by setting DETERMINISTIC_BUILD in the python-build-system?

--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
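To make the "timestamp field" reading of the hex dump concrete, here is a small Python sketch that decodes the first 12 bytes of the two files. It assumes the header layout used by CPython 3.3 through 3.6 (4-byte magic, 4-byte little-endian source mtime, 4-byte little-endian source size); the function and constant names are made up for illustration.

```python
import struct


def parse_pyc_header(header: bytes) -> dict:
    """Split a 12-byte .pyc header (CPython 3.3 to 3.6 layout)."""
    magic = header[:4]
    # Both fields are unsigned 32-bit little-endian integers.
    mtime, size = struct.unpack("<II", header[4:12])
    return {"magic": magic, "mtime": mtime, "source_size": size}


# First 12 bytes of each version.cpython-36.pyc, copied from the
# diffoscope output ("-" line is the -check build, "+" line the other).
CHECK_BUILD = bytes.fromhex("330d0d0a" "fa879c5a" "26010000")
FIRST_BUILD = bytes.fromhex("330d0d0a" "c4859c5a" "26010000")

check = parse_pyc_header(CHECK_BUILD)
first = parse_pyc_header(FIRST_BUILD)
print(check["mtime"], first["mtime"], check["mtime"] - first["mtime"])
```

The magic number and the recorded source size are identical in both files; only the mtime field differs, by 566 seconds, which is consistent with the two builds simply having run a few minutes apart.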
Message #52 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 09:26:02 GMT):
From: ludo <at> gnu.org (Ludovic Courtès) To: Ricardo Wurmus <rekado <at> elephly.net> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 10:25:40 +0100
Hello!

Ricardo Wurmus <rekado <at> elephly.net> writes:

> Is it a bad idea to override the timestamps in the generated binaries?
> I think that we could avoid the recency check then, which was an
> obstacle to resetting the timestamps of the source files.

I think it’s good if we can fix Python itself to honor SOURCE_DATE_EPOCH for its timestamps, but it’s also OK to patch timestamps in generated binaries. We do that already in gzip headers, with ‘reset-gzip-timestamp’.

Thanks for tackling this!

Ludo’.
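As an illustration of what patching the timestamp in a generated .pyc file could look like, here is a hedged Python sketch. It assumes the Python 3.6 header layout (magic, mtime, size, 4 bytes each) and uses the value 1 only because the patch in this thread does; `reset_pyc_mtime` is a hypothetical helper, not an existing Guix phase or Python API.

```python
import struct


def reset_pyc_mtime(data: bytes, mtime: int = 1) -> bytes:
    """Return a copy of a .pyc file's bytes with the mtime field replaced.

    Assumes the CPython 3.3 to 3.6 layout, where bytes 4..8 hold the
    source mtime as an unsigned little-endian 32-bit integer.
    """
    if len(data) < 12:
        raise ValueError("too short to be a Python 3.6 .pyc file")
    return data[:4] + struct.pack("<I", mtime) + data[8:]


# The two differing header lines from the diffoscope report in this thread.
a = bytes.fromhex("330d0d0afa879c5a26010000e3000000")
b = bytes.fromhex("330d0d0ac4859c5a26010000e3000000")

print(reset_pyc_mtime(a) == reset_pyc_mtime(b))
```

Note that the importer compares this field against the source file's mtime, so rewriting it alone may cause Python to consider the bytecode stale unless that check is also adjusted, as the patch earlier in the thread attempts in _bootstrap_external.py.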
Message #55 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 15:37:02 GMT):
From: Gábor Boskovits <boskovits <at> gmail.com> To: Ricardo Wurmus <rekado <at> elephly.net> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 5 Mar 2018 16:36:31 +0100
2018-03-05 1:05 GMT+01:00 Ricardo Wurmus <rekado <at> elephly.net>:

> In other words: this is the timestamp field of the pyc file.
>
> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
> python-build-system?

It seems that the deterministic build patch has already landed upstream (https://github.com/python/cpython/pull/5200), so we might consider applying the upstream patches. WDYT?
Message #58 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 20:34:02 GMT):
From: Gábor Boskovits <boskovits <at> gmail.com> To: Ricardo Wurmus <rekado <at> elephly.net> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 5 Mar 2018 21:33:02 +0100
2018-03-05 16:36 GMT+01:00 Gábor Boskovits <boskovits <at> gmail.com>:

> It seems that the deterministic build patch already landed upstream
> https://github.com/python/cpython/pull/5200, so we might consider
> applying the upstream patches. WDYT?

And also this: https://github.com/python/cpython/pull/4575. I'm now having a look at this approach. However, this second one seems quite invasive...
Message #61 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 21:48:02 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net> To: Gábor Boskovits <boskovits <at> gmail.com> Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 22:46:38 +0100
Gábor Boskovits <boskovits <at> gmail.com> writes:

>> It seems that the deterministic build patch already landed upstream
>> https://github.com/python/cpython/pull/5200, so we might consider
>> applying the upstream patches. WDYT?
>
> And also this: https://github.com/python/cpython/pull/4575.
> I'm now having a look at this approach. However this second one
> seems quite invasive...

These patches are for what will become Python 3.7. Python 3.6 does not have support for “invalidation_mode”, so at least the first patch would not work for us.

--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
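For context on what “invalidation_mode” provides once it is available, here is a short Python sketch. It assumes an interpreter of version 3.7 or later, where `py_compile.compile` accepts `invalidation_mode` and `py_compile.PycInvalidationMode.CHECKED_HASH` records a hash of the source in the header instead of its mtime. It does not apply to the Python 3.6 packaged at the time of this thread, and the file names are illustrative.

```python
import os
import py_compile
import tempfile


def compile_hash_based(source_path, pyc_path):
    """Compile source_path to pyc_path with a hash-based header (3.7+)."""
    py_compile.compile(
        source_path,
        cfile=pyc_path,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )
    with open(pyc_path, "rb") as f:
        return f.read()


def same_bytecode_despite_mtime_change():
    """Compile one source twice with different mtimes and compare output."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        with open(src, "w") as f:
            f.write("VALUE = 42\n")
        first = compile_hash_based(src, os.path.join(tmp, "a.pyc"))
        # Change the source mtime; the contents stay the same.
        os.utime(src, (1, 1))
        second = compile_hash_based(src, os.path.join(tmp, "b.pyc"))
        return first == second


print(same_bytecode_despite_mtime_change())
```

Because the header depends only on the source contents in this mode, touching the source file does not change the compiled output, which is the property the mtime-based header lacks.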
Message #64 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 22:03:01 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net> To: Gábor Boskovits <boskovits <at> gmail.com> Cc: 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 23:02:29 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> In other words: this is the timestamp field of the pyc file.
>
> Maybe this can be avoided by setting DETERMINISTIC_BUILD in the
> python-build-system?

It cannot. So, something’s still missing from my patch. Does anyone see what might be missing?

--
Ricardo
Message #67 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 22:08:01 GMT):
From: Ricardo Wurmus <rekado <at> elephly.net> To: Gábor Boskovits <boskovits <at> gmail.com> Cc: 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Mon, 05 Mar 2018 23:06:51 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Ricardo Wurmus <rekado <at> elephly.net> writes:
>
>> I have applied this patch locally:
>> [...]
>> It allows me to build python-six and python-sip reproducibly. It does
>> not fix problems with Python 2, and I haven’t yet tested if it causes
>> any new problems.

I should also note that Python 3 itself still contains pyc files with timestamps. This could be the reason why in Nix all pyc files are rebuilt (more than once).

--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
Message #70 received at 22533 <at> debbugs.gnu.org (Mon, 05 Mar 2018 23:22:01 GMT):
From: Marius Bakke <mbakke <at> fastmail.com> To: Ricardo Wurmus <rekado <at> elephly.net>, Gábor Boskovits <boskovits <at> gmail.com> Cc: 22533 <at> debbugs.gnu.org Subject: Re: bug#22533: Python bytecode reproducibility Date: Tue, 06 Mar 2018 00:21:21 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> I have applied this patch locally:
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                    "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                    "Lib/test/test_fcntl.py")) ; fails on aarch64
>                 #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -      ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
> It allows me to build python-six and python-sip reproducibly.  It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.
>
> It’s a little worrying that I had to disable three more tests that I
> think shouldn’t have failed.

Wow, nice work!

I can't tell what's going on with the tests; they do some bytecode
manipulation stuff.  Maybe they do not expect the low timestamp somehow?

https://github.com/python/cpython/blob/374c6e178a7599aae46c857b17c6c8bc19dfe4c2/Lib/test/test_importlib/source/test_file_loader.py#L457-L484

I guess we'll do at least one 'core-updates' before 3.7 is released, so
it makes sense to include this.  It should also give us some experience
that might be relevant for 2.7, since it probably won't get the upstream
reproducibility patch that relies on 3.7 features.

The only remark I have is: is introducing a new variable necessary?
SOURCE_DATE_EPOCH implies that the user wants a deterministic build;
the upstream patch doesn't actually honor it outside of making the
hashing method deterministic.  So, I think it might be enough to just
test for SOURCE_DATE_EPOCH instead of DETERMINISTIC_BUILD.  The former
is also already set in the build environment.

However, I just noticed that you unset DETERMINISTIC_BUILD before the
'check' phase.  Did it break more things?

I suppose we'll have to set PYTHONHASHSEED somewhere in
python-build-system as well.  Did you check if that makes a difference
for numpy?  Perhaps it's enough to set it if we add an auto-compilation
step?
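
The effect of PYTHONHASHSEED can be seen in isolation with a small
experiment.  This is only a sketch, and child_hash is a hypothetical
helper name: it starts a fresh interpreter with a chosen seed and
reports the hash of a string.  With the seed fixed to 0, as in the
patch, string hashes are the same in every process; with the variable
unset they are randomized per process.  My understanding is that this
can reach pyc files through hash-dependent iteration order (for example
of set constants), but that link is an assumption, not something
established in this thread.

```python
import os
import subprocess
import sys


def child_hash(text, seed):
    """Return hash(text) as computed by a fresh interpreter.

    seed is written to PYTHONHASHSEED; pass None to leave the variable
    unset so that the child uses a random seed.
    """
    env = dict(os.environ)
    if seed is None:
        env.pop("PYTHONHASHSEED", None)
    else:
        env["PYTHONHASHSEED"] = str(seed)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    # Two separate processes with the same fixed seed agree on the hash.
    print(child_hash("guix", 0) == child_hash("guix", 0))
```

Setting the variable once in the build environment, as suggested above
for python-build-system, would apply the same fixed seed to every
interpreter started during a build.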

(Tue, 06 Mar 2018 13:30:02 GMT) Message #73 received at 22533 <at> debbugs.gnu.org:
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Marius Bakke <mbakke <at> fastmail.com>
Cc: Gábor Boskovits <boskovits <at> gmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Tue, 06 Mar 2018 14:28:49 +0100
Marius Bakke <mbakke <at> fastmail.com> writes:

> The only remark I have is: is introducing a new variable necessary?
> SOURCE_DATE_EPOCH implies that the user wants a deterministic build;
> the upstream patch doesn't actually honor it outside of making the
> hashing method deterministic.  So, I think it might be enough to just
> test for SOURCE_DATE_EPOCH instead of DETERMINISTIC_BUILD.  The former
> is also already set in the build environment.
> However, I just noticed that you unset DETERMINISTIC_BUILD before the
> 'check' phase.  Did it break more things?

Yes, it broke a bunch of tests that are all about recompiling files when
they are considered stale.

> I suppose we'll have to set PYTHONHASHSEED somewhere in
> python-build-system as well.  Did you check if that makes a difference
> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
> step?

Right, I’m going to test this with numpy now.  Thanks for the hint!

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net

(Tue, 06 Mar 2018 14:44:02 GMT) Message #76 received at 22533 <at> debbugs.gnu.org:
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Marius Bakke <mbakke <at> fastmail.com>
Cc: Gábor Boskovits <boskovits <at> gmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Tue, 06 Mar 2018 15:43:11 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Marius Bakke <mbakke <at> fastmail.com> writes:
>
>> I suppose we'll have to set PYTHONHASHSEED somewhere in
>> python-build-system as well.  Did you check if that makes a difference
>> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
>> step?
>
> Right, I’m going to test this with numpy now.  Thanks for the hint!

It did help with one file, which is now built reproducibly, namely

  lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc

This leaves five files in numpy that shouldn’t be but unfortunately are
different.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net

(Tue, 06 Mar 2018 14:58:01 GMT) Message #79 received at 22533 <at> debbugs.gnu.org:
From: Gábor Boskovits <boskovits <at> gmail.com>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Tue, 6 Mar 2018 15:57:02 +0100
2018-03-06 15:43 GMT+01:00 Ricardo Wurmus <rekado <at> elephly.net>:

> Ricardo Wurmus <rekado <at> elephly.net> writes:
>
> > Marius Bakke <mbakke <at> fastmail.com> writes:
> >
> >> I suppose we'll have to set PYTHONHASHSEED somewhere in
> >> python-build-system as well.  Did you check if that makes a difference
> >> for numpy?  Perhaps it's enough to set it if we add an auto-compilation
> >> step?
> >
> > Right, I’m going to test this with numpy now.  Thanks for the hint!
>
> It did help with one file, which is now built reproducibly, namely
>
>   lib/python3.6/site-packages/numpy/testing/nose_tools/__pycache__/utils.cpython-36.pyc
>
> This leaves five files in numpy that shouldn’t be but unfortunately are
> different.

Unfortunately backporting the upstream version is not straightforward at
all.  There are too many changes.  I will have a look at those test
failures instead.

> --
> Ricardo
>
> GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
> https://elephly.net

(Thu, 08 Mar 2018 10:40:02 GMT) Message #82 received at 22533 <at> debbugs.gnu.org:
From: Gábor Boskovits <boskovits <at> gmail.com>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Thu, 8 Mar 2018 11:39:52 +0100
2018-03-04 20:18 GMT+01:00 Ricardo Wurmus <rekado <at> elephly.net>:

> I have applied this patch locally:
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 5f701701a..0d1ecc3c6 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -359,8 +359,42 @@ data types.")
>                    "Lib/ctypes/test/test_win32.py" ; fails on aarch64
>                    "Lib/test/test_fcntl.py")) ; fails on aarch64
>                 #t))))
> -    (arguments (substitute-keyword-arguments (package-arguments python-2)
> -      ((#:tests? _) #t)))
> +    (arguments
> +     (substitute-keyword-arguments (package-arguments python-2)
> +       ((#:tests? _) #t)
> +       ((#:phases phases)
> +        `(modify-phases ,phases
> +           (add-after 'unpack 'patch-timestamp-for-pyc-files
> +             (lambda _
> +               ;; We set DETERMINISTIC_BUILD to only override the mtime when
> +               ;; building with Guix, lest we break auto-compilation in
> +               ;; environments.
> +               (setenv "DETERMINISTIC_BUILD" "1")
> +               (substitute* "Lib/py_compile.py"
> +                 (("source_stats\\['mtime'\\]")
> +                  "(1 if 'DETERMINISTIC_BUILD' in os.environ else source_stats['mtime'])"))
> +
> +               ;; Use deterministic hashes for strings, bytes, and datetime
> +               ;; objects.
> +               (setenv "PYTHONHASHSEED" "0")
> +
> +               ;; Reset mtime when validating bytecode header.
> +               (substitute* "Lib/importlib/_bootstrap_external.py"
> +                 (("source_mtime = int\\(source_stats\\['mtime'\\]\\)")
> +                  "source_mtime = 1"))
> +               #t))
> +           (add-after 'unpack 'disable-timestamp-tests
> +             (lambda _
> +               (substitute* "Lib/test/test_importlib/source/test_file_loader.py"
> +                 (("test_bad_marshal")
> +                  "disable_test_bad_marshal")
> +                 (("test_no_marshal")
> +                  "disable_test_no_marshal")
> +                 (("test_non_code_marshal")
> +                  "disable_test_non_code_marshal"))
> +               #t))
> +           (add-before 'check 'allow-non-deterministic-compilation
> +             (lambda _ (unsetenv "DETERMINISTIC_BUILD") #t))))))
>      (native-search-paths
>       (list (search-path-specification
>              (variable "PYTHONPATH")
>
> It allows me to build python-six and python-sip reproducibly.  It does
> not fix problems with Python 2, and I haven’t yet tested if it causes
> any new problems.
>
> It’s a little worrying that I had to disable three more tests that I
> think shouldn’t have failed.

Ok, I've checked the test issue again.  If we change the
_bootstrap_external.py substitution to:

  "source_mtime = 1 if 'DETERMINISTIC_BUILD' in _os.environ else int(source_stats['mtime'])"

the tests do not fail any more.  WDYT?

> What do you think?
>
> --
> Ricardo
>
> GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
> https://elephly.net
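
The logic of the proposed substitution can be written out as a
standalone function.  This is a sketch only: effective_source_mtime is
a hypothetical name, and the real change is a one-line text
substitution inside Lib/importlib/_bootstrap_external.py, which uses
its own _os module rather than os.  Presumably the tests pass again
because DETERMINISTIC_BUILD is unset before the 'check' phase, so the
loader then compares against the real mtime as upstream does.

```python
import os


def effective_source_mtime(source_stats, environ=os.environ):
    """Return the mtime to compare against the pyc header.

    Mirrors the proposed expression: a fixed value of 1 when
    DETERMINISTIC_BUILD is present in the environment, otherwise the
    real source mtime truncated to an integer.
    """
    if "DETERMINISTIC_BUILD" in environ:
        return 1
    return int(source_stats["mtime"])


if __name__ == "__main__":
    stats = {"mtime": 1520505592.25, "size": 120}
    # During the build (variable set) the mtime is pinned to 1.
    print(effective_source_mtime(stats, {"DETERMINISTIC_BUILD": "1"}))
    # During 'check' (variable unset) the real mtime is used.
    print(effective_source_mtime(stats, {}))
```

Only the presence of the variable matters in this expression, not its
value, which matches the 'DETERMINISTIC_BUILD' in os.environ test used
in the py_compile.py substitution.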

(Mon, 14 Jan 2019 13:41:01 GMT) Message #85 received at 22533 <at> debbugs.gnu.org:
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533 <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Mon, 14 Jan 2019 14:40:18 +0100
Now that we’re using Python 3.7 and this version supports hash-based pyc
files, is this still an issue?  Do we need to do anything to enable
hash-based pyc compilation?

See:
  https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
  https://www.python.org/dev/peps/pep-0552/

--
Ricardo

Message #90 received at 22533-done <at> debbugs.gnu.org:
From: Ricardo Wurmus <rekado <at> elephly.net>
To: Gábor Boskovits <boskovits <at> gmail.com>
Cc: Marius Bakke <mbakke <at> fastmail.com>, 22533-done <at> debbugs.gnu.org
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Sun, 03 Feb 2019 22:22:23 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Now that we’re using Python 3.7 and this version supports hash-based pyc
> files, is this still an issue?  Do we need to do anything to enable
> hash-based pyc compilation?
>
> See:
>   https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
>   https://www.python.org/dev/peps/pep-0552/

It looks like this is no longer a problem.  I built borg just now and
the pyc files are reproducible.

(The man pages include a date stamp, though, which I’m trying to patch
now.)

--
Ricardo
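
The difference between timestamp-based and hash-based pyc files can be
checked with py_compile from Python 3.7 or later.  The sketch below
compiles the same source twice with different mtimes and compares the
resulting bytes; the helper names are made up for this illustration.
As far as I can tell from the py_compile documentation, hash-based
invalidation also becomes the default when SOURCE_DATE_EPOCH is set,
but whether that is what makes the Guix builds reproducible is not
confirmed in this thread.

```python
import os
import py_compile
import tempfile


def compile_bytes(src, mode, mtime):
    """Set the source mtime, compile src, and return the raw pyc bytes."""
    os.utime(src, (mtime, mtime))
    pyc = src + "c"
    py_compile.compile(src, cfile=pyc, doraise=True, invalidation_mode=mode)
    with open(pyc, "rb") as f:
        return f.read()


def pyc_is_mtime_independent(mode):
    """True if the same source under two mtimes yields identical pyc bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        with open(src, "w") as f:
            f.write("def answer():\n    return 42\n")
        first = compile_bytes(src, mode, 1_000_000_000)
        second = compile_bytes(src, mode, 1_500_000_000)
        return first == second


if __name__ == "__main__":
    for mode in py_compile.PycInvalidationMode:
        print(mode.name, pyc_is_mtime_independent(mode))
```

The source path is kept identical between the two compilations because
the compiled code object records its file name, which would otherwise
introduce an unrelated difference.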

(Mon, 04 Feb 2019 22:40:02 GMT) Message #93 received at 22533 <at> debbugs.gnu.org:
From: Ludovic Courtès <ludo <at> gnu.org>
To: 22533 <at> debbugs.gnu.org
Cc: rekado <at> elephly.net, leo <at> famulari.name
Subject: Re: bug#22533: Python bytecode reproducibility
Date: Mon, 04 Feb 2019 23:39:21 +0100
Ricardo Wurmus <rekado <at> elephly.net> writes:

> Ricardo Wurmus <rekado <at> elephly.net> writes:
>
>> Now that we’re using Python 3.7 and this version supports hash-based pyc
>> files, is this still an issue?  Do we need to do anything to enable
>> hash-based pyc compilation?
>>
>> See:
>>   https://docs.python.org/3/whatsnew/3.7.html#pep-552-hash-based-pyc-files
>>   https://www.python.org/dev/peps/pep-0552/
>
> It looks like this is no longer a problem.  I built borg just now and
> the pyc files are reproducible.

Yay!  \o/

Ludo'.

Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 05 Mar 2019 12:24:06 GMT)