GNU bug report logs - #36487
Evaluations do not complete since commit c30d117822a8ca26cd8c06c0a3974955bef68eac

Previous Next

Package: guix;

Reported by: Ludovic Courtès <ludo <at> gnu.org>

Date: Wed, 3 Jul 2019 10:30:02 UTC

Severity: serious

Done: Mathieu Othacehe <mathieu <at> cervin.i-did-not-set--mail-host-address--so-tickle-me>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 36487 in the body.
You can then email your comments to 36487 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#36487; Package guix. (Wed, 03 Jul 2019 10:30:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ludovic Courtès <ludo <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Wed, 03 Jul 2019 10:30:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: bug-Guix <at> gnu.org
Cc: Christopher Baines <mail <at> cbaines.net>
Subject: Evaluations do not complete since commit
 c30d117822a8ca26cd8c06c0a3974955bef68eac
Date: Wed, 03 Jul 2019 12:28:53 +0200
Hello,

Chris reported that evaluations “hang” since commit
c30d117822a8ca26cd8c06c0a3974955bef68eac, as can be seen at:

  https://prototype-guix-data-service.cbaines.net/branch/master

as well as:

  https://berlin.guixsd.org/jobset/guix-master (ca. eval. 6171)

The log at <https://prototype-guix-data-service.cbaines.net/job/2674>
shows that it hangs while computing i686 derivations:

--8<---------------cut here---------------start------------->8---
[…]
debug: Starting getting inferior derivations
heap size: 1579.0 MiB
debug: Starting getting derivations for ((x86_64-linux . x86_64-linux))
debug: Finished getting derivations for ((x86_64-linux . x86_64-linux)), took 252 seconds
heap size: 1579.0 MiB
debug: Starting getting derivations for ((i686-linux . i686-linux))
--8<---------------cut here---------------end--------------->8---

If I attach to the ‘guix repl -t machine’ process started by the Cuirass
evaluation process, I see this:

--8<---------------cut here---------------start------------->8---
#0  0x00007fc5248d5fb8 in _dl_update_slotinfo (req_modid=1) at ../elf/dl-tls.c:623
#1  0x00007fc5248d620c in update_get_addr (ti=0x7fc5248ab240) at ../elf/dl-tls.c:799
#2  0x00007fc5248da838 in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#3  0x00007fc5247ddbe0 in scm_equal_p () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#4  0x00007fc5247cda18 in array_compare () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#5  0x00007fc5247cd9c7 in array_compare () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#6  0x00007fc5247ce675 in scm_array_equal_p () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#7  0x00007fc52484891b in scm_i_struct_equalp () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#8  0x00007fc5247ddc82 in scm_equal_p () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#9  0x00007fc52484891b in scm_i_struct_equalp () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#10 0x00007fc52484891b in scm_i_struct_equalp () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#11 0x00007fc5247ddc82 in scm_equal_p () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
…
#101 0x00007fc5247ddc82 in scm_equal_p () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#102 0x00007fc52484891b in scm_i_struct_equalp () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#103 0x00007fc5247cbb38 in scm_sloppy_assoc () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#104 0x00007fc5247f3a92 in scm_hash_fn_ref () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#105 0x00007fc524859715 in vm_regular_engine () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#106 0x00007fc52485cd6a in scm_call_n () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
#107 0x00007fc5247e0da7 in scm_primitive_eval () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
--8<---------------cut here---------------end--------------->8---

and with debugging symbols:

--8<---------------cut here---------------start------------->8---
#3  0x00007fc5247ddc23 in scm_equal_p (x=x <at> entry="/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0/bin/guile", 
    y=y <at> entry="/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0/bin/guile") at eq.c:290
#4  0x00007fc52484891b in scm_i_struct_equalp (s1="#<struct <derivation>>" = {...}, s2="#<struct <derivation>>" = {...}) at struct.c:693
#5  0x00007fc52484891b in scm_i_struct_equalp (s1="#<struct <derivation-input>>" = {...}, s2="#<struct <derivation-input>>" = {...}) at struct.c:693
#6  0x00007fc5247ddc82 in scm_equal_p (
    x=(#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/jc5q4i55hrilibrmacs38skk3cpg2zyb-glibc-bootstrap-0" #f #f #f 3ade740>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/nbi0mzk1dw37nnnahw9akn3bwzwi4gdk-module-import-compiled" #f #f #f 3ae6e00>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0" #f #f #f 3af1b40>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/dgncc5wmw8prxq09y71hqjc6g7rxqvvb-guile-2.0.9.tar.xz" sha256 #vu8(183 87 205 70 191 19 236 172 131 251 142 149 95 181 0 150 172 45 23 187 97 12 168 235 129 111 41 48 42 0 168 70) #f 3af10c0>)) () ("/gnu/store/7f1alh9qj2h0wwy2220npgnmw6pbrkwx-mirrors" "/gnu/store/vwyxp1dq4lb97n6b20w5cqxasy2dai79-content-addressed-mirrors") "i686-linux" "builtin:download" () (("content-addressed-mirrors" . "/gnu/store/vwyxp1dq4lb97n6b20w5cqxasy2dai79-content-addressed-mirrors") ("impureEnvVars" . "http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS") ("mirrors" . "/gnu/store/7f1alh9qj2h0wwy2220npgnmw6pbrkwx-mirrors") ("out" . "/gnu/store/dgncc5wmw8prxq09y71hqjc6g7rxqvvb-guile-2.0.9.tar.xz") ("preferLocalBuild" . "1") ("url" . "(\"https://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"ftp://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://www.fdn.fr/~lcourtes/software/guix/packages/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://flashner.co.il/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\")")) "/gnu/store/jagg4l53ar1m4hsgqlpisw9i4rq2whzd-guile-2.0.9.tar.xz.drv" 3af14e0> ("out") 3af1780>) ("/gnu/store/d32948vi9bs5mmbnvlczqcgwwdqvygiw-build-bootstrap-guile.sh" "/gnu/store/mzfkrxd4w8vqrmyrx169wj8wyw7r8i37-bash") "i686-linux" "/gnu/store/mzfkrxd4w8 […]
--8<---------------cut here---------------end--------------->8---

So this process is busy comparing deep structures, probably as a
consequence of 5cf4b26d52bcea382d98fb4becce89be9ee37b55.

Indeed, the load on berlin is at above 60 currently, with all these
evaluations processes consuming CPU…

To be continued…

Ludo’.




Severity set to 'serious' from 'normal' Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 03 Jul 2019 10:37:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#36487; Package guix. (Wed, 03 Jul 2019 17:02:02 GMT) Full text and rfc822 format available.

Message #10 received at 36487 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 36487 <at> debbugs.gnu.org,  Christopher Baines <mail <at> cbaines.net>
Subject: Re: bug#36487: Evaluations do not complete since commit
 c30d117822a8ca26cd8c06c0a3974955bef68eac
Date: Wed, 03 Jul 2019 19:01:06 +0200
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo <at> gnu.org> skribis:

> #104 0x00007fc5247f3a92 in scm_hash_fn_ref () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
> #105 0x00007fc524859715 in vm_regular_engine () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
> #106 0x00007fc52485cd6a in scm_call_n () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
> #107 0x00007fc5247e0da7 in scm_primitive_eval () from /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/lib/libguile-2.2.so.1
>
>
> and with debugging symbols:
>
> #3  0x00007fc5247ddc23 in scm_equal_p (x=x <at> entry="/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0/bin/guile", 
>     y=y <at> entry="/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0/bin/guile") at eq.c:290
> #4  0x00007fc52484891b in scm_i_struct_equalp (s1="#<struct <derivation>>" = {...}, s2="#<struct <derivation>>" = {...}) at struct.c:693
> #5  0x00007fc52484891b in scm_i_struct_equalp (s1="#<struct <derivation-input>>" = {...}, s2="#<struct <derivation-input>>" = {...}) at struct.c:693
> #6  0x00007fc5247ddc82 in scm_equal_p (
>     x=(#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/jc5q4i55hrilibrmacs38skk3cpg2zyb-glibc-bootstrap-0" #f #f #f 3ade740>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/nbi0mzk1dw37nnnahw9akn3bwzwi4gdk-module-import-compiled" #f #f #f 3ae6e00>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/djh3drjx3hnxlx1bsdnixdm3xjbg5v2c-guile-bootstrap-2.0" #f #f #f 3af1b40>)) (#<struct <derivation-input> #<struct <derivation> (("out" . #<struct <derivation-output> "/gnu/store/dgncc5wmw8prxq09y71hqjc6g7rxqvvb-guile-2.0.9.tar.xz" sha256 #vu8(183 87 205 70 191 19 236 172 131 251 142 149 95 181 0 150 172 45 23 187 97 12 168 235 129 111 41 48 42 0 168 70) #f 3af10c0>)) () ("/gnu/store/7f1alh9qj2h0wwy2220npgnmw6pbrkwx-mirrors" "/gnu/store/vwyxp1dq4lb97n6b20w5cqxasy2dai79-content-addressed-mirrors") "i686-linux" "builtin:download" () (("content-addressed-mirrors" . "/gnu/store/vwyxp1dq4lb97n6b20w5cqxasy2dai79-content-addressed-mirrors") ("impureEnvVars" . "http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS") ("mirrors" . "/gnu/store/7f1alh9qj2h0wwy2220npgnmw6pbrkwx-mirrors") ("out" . "/gnu/store/dgncc5wmw8prxq09y71hqjc6g7rxqvvb-guile-2.0.9.tar.xz") ("preferLocalBuild" . "1") ("url" . "(\"https://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"ftp://alpha.gnu.org/gnu/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://www.fdn.fr/~lcourtes/software/guix/packages/i686-linux/20131110/guile-2.0.9.tar.xz\" \"http://flashner.co.il/guix/bootstrap/i686-linux/20131110/guile-2.0.9.tar.xz\")")) "/gnu/store/jagg4l53ar1m4hsgqlpisw9i4rq2whzd-guile-2.0.9.tar.xz.drv" 3af14e0> ("out") 3af1780>) ("/gnu/store/d32948vi9bs5mmbnvlczqcgwwdqvygiw-build-bootstrap-guile.sh" "/gnu/store/mzfkrxd4w8vqrmyrx169wj8wyw7r8i37-bash") "i686-linux" "/gnu/store/mzfkrxd4w8 […]
>
> So this process is busy comparing deep structures, probably as a
> consequence of 5cf4b26d52bcea382d98fb4becce89be9ee37b55.

With some debugging, I found that the pathological behavior comes from
the memoization hash table of ‘derivation->bytevector’, specifically
after ‘invalidate-derivation-caches!’ has been called.

Namely, after ‘invalidate-derivation-caches!’ has been called, that hash
table has 24K buckets, including 72 buckets of size 6 or more.  One
bucket has 18 entries, which are all alike:

  #<derivation #f => /gnu/store/…-module-import-compiled 94283c0>

The ‘equal?’ calls come from those buckets, and each of them involves
traversing the whole <derivation> tree.

I’m looking for a workaround.

Ludo’.

PS: Below is some debugging code.

[Message part 2 (text/plain, inline)]
(use-modules (system foreign)
             (ice-9 match))

;;       typedef struct {
;;    0    scm_t_weak_entry **buckets;   /* the data */
;;    8    scm_i_pthread_mutex_t lock;   /* the lock */
;;   48    scm_t_weak_table_kind kind;   /* what kind of table it is */
;;   56    unsigned long n_buckets;    	/* total number of buckets. */
;;   64    unsigned long n_items;	/* number of items in table */
;;   72    unsigned long lower;		/* when to shrink */
;;   80    unsigned long upper;		/* when to grow */
;;   88    int size_index;		/* index into hashtable_size */
;;   92    int min_size_index;		/* minimum size_index */
;;   96    GC_word last_gc_no;
;;       } scm_t_weak_table;

(define (weak-table-data table)
  "Return an alist describing TABLE."
  (unless (or (weak-key-hash-table? table)
              (weak-value-hash-table? table))
    (error "wrong type argument" table))

  (let* ((address (object-address table))
         (ptr     (dereference-pointer
                   (make-pointer (+ address 8)))))
    (match (parse-c-struct (make-pointer (+ (pointer-address ptr) 48))
                           (list int
                                 unsigned-long unsigned-long
                                 unsigned-long unsigned-long
                                 int
                                 int
                                 unsigned-long))
      ((kind buckets items lower upper size-index min-size-index last-gc-no)
       `((kind . ,kind) (items . ,items) (buckets . ,buckets)
         (lower . ,lower) (upper . ,upper)
         (size-index . ,size-index) (min-size-index . ,min-size-index)
         (last-gc-no . ,last-gc-no))))))

;; typedef struct scm_t_hashtable {
;;   unsigned long n_items;	/* number of items in table */
;;   unsigned long lower;		/* when to shrink */
;;   unsigned long upper;		/* when to grow */
;;   int size_index;		/* index into hashtable_size */
;;   int min_size_index;		/* minimum size_index */
;;   scm_t_hash_fn hash_fn;  /* for rehashing after a GC. */
;; } scm_t_hashtable;

(define (hash-table-data table)
  (unless (hash-table? table)
    (error "wrong type argument" table))

  (let* ((address (object-address table))
         (ptr     (dereference-pointer
                   (make-pointer (+ address (* 2 (sizeof '*)))))))
    (match (parse-c-struct ptr
                           (list unsigned-long
                                 unsigned-long unsigned-long
                                 int
                                 int))
      ((items lower upper size-index min-size-index)
       `((items . ,items)
         (lower . ,lower) (upper . ,upper)
         (size-index . ,size-index) (min-size-index . ,min-size-index))))))

(define (hash-table-buckets table)
  (unless (hash-table? table)
    (error "wrong type argument" table))

  (let* ((address (object-address table))
         (ptr     (dereference-pointer
                   (make-pointer (+ address (* 1 (sizeof '*)))))))
    (pointer->scm ptr)))

(define (hash-table-bucket-sizes table)
  (let ((buckets (hash-table-buckets table)))
    (let loop ((index 0)
               (result '()))
      (if (< index (vector-length buckets))
          (loop (+ 1 index)
                (cons (length (vector-ref buckets index))
                      result))
          (reverse result)))))


Information forwarded to bug-guix <at> gnu.org:
bug#36487; Package guix. (Wed, 03 Jul 2019 18:11:01 GMT) Full text and rfc822 format available.

Message #13 received at 36487 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: 36487 <at> debbugs.gnu.org
Cc: Christopher Baines <mail <at> cbaines.net>
Subject: Re: bug#36487: Evaluations do not complete since commit
 c30d117822a8ca26cd8c06c0a3974955bef68eac
Date: Wed, 03 Jul 2019 20:10:37 +0200
I pushed a workaround as e79281be105b16153c375af5506db31fd1e32698,
though more work is still needed to fully understand what’s happening.

Note that you can test things like this:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guile
GNU Guile 2.2.5
Copyright (C) 1995-2019 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use(guix store)
scheme@(guile-user)> (use-modules (gnu ci) (ice-9 match))
scheme@(guile-user)> (define store (open-connection))
scheme@(guile-user)> (define jobs  (hydra-jobs store '((systems "x86_64-linux" "i686-linux"))))
evaluating for 'x86_64-linux' (heap size: 35.0 MiB)...
random seed for tests: 1562154761
evaluating for 'i686-linux' (heap size: 467.0 MiB)...
--8<---------------cut here---------------end--------------->8---

Ludo’.




bug closed, send any further explanations to 36487 <at> debbugs.gnu.org and Ludovic Courtès <ludo <at> gnu.org> Request was from Mathieu Othacehe <mathieu <at> cervin.i-did-not-set--mail-host-address--so-tickle-me> to control <at> debbugs.gnu.org. (Sun, 06 Sep 2020 07:11:01 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 04 Oct 2020 11:24:10 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 177 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.