GNU bug report logs - #19180
vacuum_weak_hash_table error

Previous Next

Package: guile;

Reported by: Anand Mohanadoss <anand108 <at> gmail.com>

Date: Tue, 25 Nov 2014 16:38:02 UTC

Severity: normal

Done: Andy Wingo <wingo <at> pobox.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19180 in the body.
You can then email your comments to 19180 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Tue, 25 Nov 2014 16:38:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Anand Mohanadoss <anand108 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Tue, 25 Nov 2014 16:38:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: vacuum_weak_hash_table error
Date: Tue, 25 Nov 2014 16:53:28 +0530
[Message part 1 (text/plain, inline)]
Hi,

We have observed the following error a few times with guile 2.0.11 (32-bit)
on x86_64 Linux while processing large binary files (1.5GB+) and comparing
messages contained therein.

fish: hashtab.c:137: vacuum_weak_hash_table: Assertion `removed <= len'
failed.

I am appending the stack trace below.  There is one instance where we can
consistently reproduce this problem.  We see this issue only when we have
threading enabled (using future construct) to compare the messages in
parallel.  Reading of data from the binary files is on a single thread
only.  We don't get this error when we run everything on a single thread,
even though the memory footprint does grow.

I found references to a similar problem in the links below –

http://wingolog.org/archives/2011/02/25/ports-weaks-gc-and-dark-matter
https://www.gnunet.org/bot/log/guile/2014-02-04

In this case also we see consistent memory footprint increase, even though
we probably are not opening/closing many ports and should not be leaking
memory.  We also confirmed that in this case we are not running out of
virtual address space.

Given that we can consistently reproduce this problem on a particular
system, is there something you would like us to try to find the underlying
cause of this problem?  Do you have an explanation why using future
construct appears to trigger this problem (at least sooner)?

Thanks,
Anand


#0  0x00c8c430 in __kernel_vsyscall ()
#1  0x003bfb01 in raise () from /lib/libc.so.6
#2  0x003c13da in abort () from /lib/libc.so.6
#3  0x003b8ddb in __assert_fail_base () from /lib/libc.so.6
#4  0x003b8e96 in __assert_fail () from /lib/libc.so.6
#5  0x080e93d7 in vacuum_weak_hash_table (table=0x97f8b10) at hashtab.c:137
#6  0x080e9857 in weak_gc_callback (hook_data=0x0, fn_data=0x6, data=0x0)
at hashtab.c:437
#7  weak_gc_hook (hook_data=0x0, fn_data=0x6, data=0x0) at hashtab.c:446
#8  0x080eaa2d in scm_c_hook_run (hook=0x895acc0, data=0x0) at hooks.c:103
#9  0x080dfda6 in run_before_gc_c_hook () at gc.c:240
#10 0x08187719 in GC_notify_full_gc (stop_func=0x81868e0
<GC_never_stop_func>) at alloc.c:334
#11 GC_try_to_collect_inner (stop_func=0x81868e0 <GC_never_stop_func>) at
alloc.c:429
#12 0x08187d1b in GC_collect_or_expand (needed_blocks=4, ignore_off_page=0,
retry=0) at alloc.c:1242
#13 0x0818a64f in GC_alloc_large (lb=14080, k=1, flags=0) at malloc.c:63
#14 0x0818a9c2 in GC_generic_malloc (lb=14076, k=1) at malloc.c:175
#15 0x0818ace7 in GC_core_malloc (lb=14076) at malloc.c:263
#16 0x080df572 in do_gc_malloc (size=0, what=0x6 <Address 0x6 out of
bounds>) at gc-malloc.c:100
#17 0x0813a565 in scm_c_make_vector (k=3517, fill=0x304) at vectors.c:408
#18 0x080e9471 in scm_i_rehash (table=0x97f8b10, hash_fn=0x816c1d0
<scm_ihashq>, closure=0x0, func_name=0x854f8bd
"scm_hash_fn_create_handle_x") at hashtab.c:344
#19 0x080e9e23 in scm_hash_fn_create_handle_x (table=0x97f8b10,
obj=0xc41c638, init=0x4, hash_fn=0x816c1d0 <scm_ihashq>, assoc_fn=0x80c7eb0
<scm_sloppy_assq>, closure=0x0)
    at hashtab.c:748
#20 0x080ea11d in scm_hash_fn_set_x (table=0x97f8b10, obj=0xc41c638,
val=0x4, hash_fn=0x816c1d0 <scm_ihashq>, assoc_fn=0x80c7eb0
<scm_sloppy_assq>, closure=0x0)
    at hashtab.c:804
#21 0x080ea2a7 in scm_hashq_set_x (table=0x97f8b10, key=0xc41c638, val=0x4)
at hashtab.c:948
#22 0x08108aee in scm_new_port_table_entry (tag=381) at ports.c:688
#23 0x0813207e in scm_mkstrport (pos=0x2, str=0x4, modes=327680,
caller=0x855b842 "call-with-output-string") at strports.c:288
#24 0x08132393 in scm_call_with_output_string (proc=0xbbf49e0) at
strports.c:427
#25 0x08143d21 in vm_regular_engine (vm=0x9821060, program=0x0, argv=0x1,
nargs=0) at vm-i-system.c:855
#26 0x0813c367 in scm_c_vm_run (vm=0x9821060, program=0xa267de0, argv=0x0,
nargs=0) at vm.c:768
#27 0x0813c3cd in scm_load_compiled_with_vm (file=0xa267e60) at vm.c:1079
#28 0x08143d21 in vm_regular_engine (vm=0x9821060, program=0x0, argv=0x1,
nargs=1) at vm-i-system.c:855
#29 0x0813c367 in scm_c_vm_run (vm=0x9821060, program=0x9868470,
argv=0xffe5ec50, nargs=1) at vm.c:768
#30 0x080d9036 in scm_primitive_eval (exp=0xa261938) at eval.c:692
#31 0x080d908d in scm_eval (exp=0xa261938, module_or_state=0x98b4630) at
eval.c:726
#32 0x08117ac2 in scm_shell (argc=12, argv=0xffe5ed94) at script.c:439
#33 0x08068c4f in main (argc=12, argv=0xffe5ed94) at fish.cpp:90
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Fri, 28 Nov 2014 16:57:02 GMT) Full text and rfc822 format available.

Message #8 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180 <at> debbugs.gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Fri, 28 Nov 2014 17:56:33 +0100
Anand Mohanadoss <anand108 <at> gmail.com> skribis:

> We have observed the following error a few times with guile 2.0.11 (32-bit)
> on x86_64 Linux while processing large binary files (1.5GB+) and comparing
> messages contained therein.
>
> fish: hashtab.c:137: vacuum_weak_hash_table: Assertion `removed <= len'
> failed.

[...]

> Given that we can consistently reproduce this problem on a particular
> system, is there something you would like us to try to find the underlying
> cause of this problem?

Could you try to come up with a reduced test case to reproduce the
problem?  That would help find out what’s going on.  Let us know if you
need help to do that.

Thanks for the report!

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 01 Dec 2014 12:44:01 GMT) Full text and rfc822 format available.

Message #11 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 19180 <at> debbugs.gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Mon, 1 Dec 2014 18:13:25 +0530
[Message part 1 (text/plain, inline)]
Dear Ludovic,

Thank you very much for your help!  Given that the problem happens for me
only while processing huge binary files after around an hour of processing
(and the problem in general is not consistent), I am not sure how to come
up with a reduced test case that would still reproduce the problem.  So, I
would appreciate any help with that.

I was also wondering that given we can reproduce the problem under a
particular circumstance, if we can approach the problem from the other side
- e.g. by making changes on guile side to print information on the hash
table for which the assertion is failing, print stats on this hash table
over a period of time, make changes to hash table code to try to resolve
the problem, etc.

Thanks,
Anand

On Fri, Nov 28, 2014 at 10:26 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:

> Anand Mohanadoss <anand108 <at> gmail.com> skribis:
>
> > We have observed the following error a few times with guile 2.0.11
> (32-bit)
> > on x86_64 Linux while processing large binary files (1.5GB+) and
> comparing
> > messages contained therein.
> >
> > fish: hashtab.c:137: vacuum_weak_hash_table: Assertion `removed <= len'
> > failed.
>
> [...]
>
> > Given that we can consistently reproduce this problem on a particular
> > system, is there something you would like us to try to find the
> underlying
> > cause of this problem?
>
> Could you try to come up with a reduced test case to reproduce the
> problem?  That would help find out what’s going on.  Let us know if you
> need help to do that.
>
> Thanks for the report!
>
> Ludo’.
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Tue, 02 Dec 2014 19:48:02 GMT) Full text and rfc822 format available.

Message #14 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180 <at> debbugs.gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Tue, 02 Dec 2014 20:47:22 +0100
[Message part 1 (text/plain, inline)]
Anand Mohanadoss <anand108 <at> gmail.com> skribis:

> #4  0x003b8e96 in __assert_fail () from /lib/libc.so.6
> #5  0x080e93d7 in vacuum_weak_hash_table (table=0x97f8b10) at hashtab.c:137
> #6  0x080e9857 in weak_gc_callback (hook_data=0x0, fn_data=0x6, data=0x0)
> at hashtab.c:437
> #7  weak_gc_hook (hook_data=0x0, fn_data=0x6, data=0x0) at hashtab.c:446
> #8  0x080eaa2d in scm_c_hook_run (hook=0x895acc0, data=0x0) at hooks.c:103
> #9  0x080dfda6 in run_before_gc_c_hook () at gc.c:240
> #10 0x08187719 in GC_notify_full_gc (stop_func=0x81868e0
> <GC_never_stop_func>) at alloc.c:334
> #11 GC_try_to_collect_inner (stop_func=0x81868e0 <GC_never_stop_func>) at
> alloc.c:429
> #12 0x08187d1b in GC_collect_or_expand (needed_blocks=4, ignore_off_page=0,
> retry=0) at alloc.c:1242
> #13 0x0818a64f in GC_alloc_large (lb=14080, k=1, flags=0) at malloc.c:63
> #14 0x0818a9c2 in GC_generic_malloc (lb=14076, k=1) at malloc.c:175
> #15 0x0818ace7 in GC_core_malloc (lb=14076) at malloc.c:263
> #16 0x080df572 in do_gc_malloc (size=0, what=0x6 <Address 0x6 out of
> bounds>) at gc-malloc.c:100
> #17 0x0813a565 in scm_c_make_vector (k=3517, fill=0x304) at vectors.c:408
> #18 0x080e9471 in scm_i_rehash (table=0x97f8b10, hash_fn=0x816c1d0
> <scm_ihashq>, closure=0x0, func_name=0x854f8bd
> "scm_hash_fn_create_handle_x") at hashtab.c:344

Looking at this stack trace, it seems ‘table’ is being concurrently
modified: a GC occurs while it is being rehashed.

Could you apply with attached patch (with “patch -p1 < the-patch” run
from the top of the source tree), and report the lines that are printed
before the assertion failure?

Also, please report the corresponding backtrace, as you did above (just
to make sure this is the same scenario.)

Thanks in advance,
Ludo’.

[Message part 2 (text/x-patch, inline)]
diff --git a/libguile/hashtab.c b/libguile/hashtab.c
index 44db051..8e7c9bc 100644
--- a/libguile/hashtab.c
+++ b/libguile/hashtab.c
@@ -1,5 +1,5 @@
 /* Copyright (C) 1995, 1996, 1998, 1999, 2000, 2001, 2003, 2004, 2006,
- *   2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
+ *   2008, 2009, 2010, 2011, 2012, 2014 Free Software Foundation, Inc.
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -134,6 +134,7 @@ vacuum_weak_hash_table (SCM table)
       size_t removed;
       SCM alist = SCM_SIMPLE_VECTOR_REF (buckets, k);
       alist = scm_fixup_weak_alist (alist, &removed);
+      printf ("%i %p l=%zi r=%zi\n", __LINE__, table, len, removed);
       assert (removed <= len);
       len -= removed;
       SCM_SIMPLE_VECTOR_SET (buckets, k, alist);
@@ -341,7 +342,9 @@ scm_i_rehash (SCM table,
   SCM_HASHTABLE (table)->upper = 9 * new_size / 10;
   buckets = SCM_HASHTABLE_VECTOR (table);
 
+  printf ("%i %p l=%zi\n", __LINE__, table, SCM_HASHTABLE_N_ITEMS (table));
   new_buckets = scm_c_make_vector (new_size, SCM_EOL);
+  printf ("%i %p l=%zi\n", __LINE__, table, SCM_HASHTABLE_N_ITEMS (table));
 
   /* When this is a weak hashtable, running the GC might change it.
      We need to cope with this while rehashing its elements.  We do

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 03 Dec 2014 23:28:01 GMT) Full text and rfc822 format available.

Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Wed, 3 Dec 2014 22:46:25 +0000 (UTC)
Anand Mohanadoss <anand108 <at> gmail.com> writes:

> 
> Hi,We have observed the following error a few times with guile 2.0.11 
(32-bit) on x86_64 Linux while processing large binary files (1.5GB+) and 
comparing messages contained therein.  fish: hashtab.c:137: 
vacuum_weak_hash_table: Assertion `removed <= len' failed.  

I'm seeing the same bug, intermittently, in a unit test.  The unit test 
creates a dozen c++ threads, tries to enter guile in each, and then tries 
to do racey things. It pounds away at it for 50 seconds, passes, and then 
maybe 1-of-5 times crashes when calling destructors to shut everything 
down, including guile.

fwiw, this is with stock unmodified guile-2.0.9 as shipped with linux mint 
qiana 17 (same as ubuntu 14.04)

I'm attaching a stack trace here.  Ludovic, I will try your patch, might 
take me a few days.

-- Linas

#0  0x00007ffff69a7bb9 in __GI_raise (sig=sig <at> entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff69aafc8 in __GI_abort () at abort.c:89
#2  0x00007ffff69a0a76 in __assert_fail_base (
    fmt=0x7ffff6af2370 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion <at> entry=0x7ffff50e0482 "removed <= len", 
    file=file <at> entry=0x7ffff50e0478 "hashtab.c", line=line <at> entry=137, 
    function=function <at> entry=0x7ffff50e0ba0 "vacuum_weak_hash_table")
    at assert.c:92
#3  0x00007ffff69a0b22 in __GI___assert_fail (
    assertion=0x7ffff50e0482 "removed <= len", file=0x7ffff50e0478 
"hashtab.c", 
    line=137, function=0x7ffff50e0ba0 "vacuum_weak_hash_table") at 
assert.c:101
#4  0x00007ffff504dd4f in ?? () from /usr/lib/libguile-2.0.so.22
#5  0x00007ffff504de02 in ?? () from /usr/lib/libguile-2.0.so.22
#6  0x00007ffff505056c in scm_c_hook_run () from /usr/lib/libguile-
2.0.so.22
#7  0x00007ffff3dcfc65 in GC_try_to_collect_inner ()
   from /usr/lib/x86_64-linux-gnu/libgc.so.1
#8  0x00007ffff3dcfedc in GC_try_to_collect_general ()
   from /usr/lib/x86_64-linux-gnu/libgc.so.1
#9  0x00007ffff3dcffad in GC_gcollect () from /usr/lib/x86_64-linux-
gnu/libgc.so.1
#10 0x00007ffff50440c9 in scm_gc () from /usr/lib/libguile-2.0.so.22

#11 0x00007ffff76be8c9 in opencog::SchemeEval::c_wrap_finish (p=0x80abe0)
    at /home/linas/src/novamente/src/opencog-
git/opencog/guile/SchemeEval.cc:119
#12 0x00007ffff502d2ca in ?? () from /usr/lib/libguile-2.0.so.22
#13 0x00007ffff50bec00 in ?? () from /usr/lib/libguile-2.0.so.22
#14 0x00007ffff5036863 in scm_call_4 () from /usr/lib/libguile-2.0.so.22
#15 0x00007ffff502d9ff in ?? () from /usr/lib/libguile-2.0.so.22
#16 0x00007ffff502da95 in scm_c_with_continuation_barrier ()
   from /usr/lib/libguile-2.0.so.22
#17 0x00007ffff3ddf950 in GC_call_with_gc_active ()
   from /usr/lib/x86_64-linux-gnu/libgc.so.1
#18 0x00007ffff50a67d1 in ?? () from /usr/lib/libguile-2.0.so.22
#19 0x00007ffff3dd9fe2 in GC_call_with_stack_base ()
   from /usr/lib/x86_64-linux-gnu/libgc.so.1
#20 0x00007ffff50a6b48 in scm_with_guile () from /usr/lib/libguile-
2.0.so.22
#21 0x00007ffff76be568 in opencog::SchemeEval::~SchemeEval (this=0x80abe0, 
    __in_chrg=<optimized out>)
    at /home/linas/src/novamente/src/opencog-
git/opencog/guile/SchemeEval.cc:273







Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Sat, 06 Dec 2014 19:09:03 GMT) Full text and rfc822 format available.

Message #20 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 19180 <at> debbugs.gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Sat, 6 Dec 2014 14:27:40 +0530
[Message part 1 (text/plain, inline)]
Dear Ludovic,

I apologize for the late response.  We rebuilt guile from scratch with the
latest GC (7.2f) and the patch you provided.  We got a 73GB output file in
reproducing the problem with the extra logging!  I am attaching the last
100,000 lines of output as a file.  Appended is the stack trace.  Can you
kindly take a look when you get a chance?

It appears that len becomes 0 after successive removals and then we hit the
assertion failure when len is 0 and removed is 1.

Thanks,
Anand

#0  0x00e93430 in __kernel_vsyscall ()
#1  0x001e4b01 in raise () from /lib/libc.so.6
#2  0x001e63da in abort () from /lib/libc.so.6
#3  0x001ddddb in __assert_fail_base () from /lib/libc.so.6
#4  0x001dde96 in __assert_fail () from /lib/libc.so.6
#5  0x080e9ad0 in vacuum_weak_hash_table (table=0xa86c840) at hashtab.c:138
#6  0x080e97f7 in weak_gc_callback (hook_data=0x0, fn_data=0x6, data=0x0)
at hashtab.c:440
#7  weak_gc_hook (hook_data=0x0, fn_data=0x6, data=0x0) at hashtab.c:449
#8  0x080eac7d in scm_c_hook_run (hook=0x8961ba0, data=0x0) at hooks.c:103
#9  0x080dff76 in run_before_gc_c_hook () at gc.c:240
#10 0x08188ed9 in GC_notify_full_gc (stop_func=0x8188090
<GC_never_stop_func>) at alloc.c:335
#11 GC_try_to_collect_inner (stop_func=0x8188090 <GC_never_stop_func>) at
alloc.c:430
#12 0x081894db in GC_collect_or_expand (needed_blocks=1, ignore_off_page=0,
retry=0) at alloc.c:1253
#13 0x0818963c in GC_allocobj (gran=2, kind=0) at alloc.c:1340
#14 0x0818c066 in GC_generic_malloc_inner (lb=16, k=0) at malloc.c:121
#15 0x0818ccb6 in GC_generic_malloc_many (lb=16, k=0, result=0xb4c372c) at
mallocx.c:423
#16 0x08194ee8 in GC_malloc_atomic (bytes=16) at thread_local_alloc.c:195
#17 0x080df522 in do_gc_malloc_atomic (size=0, what=0x6 <Address 0x6 out of
bounds>) at gc-malloc.c:106
#18 0x080f612b in make_bignum () at numbers.c:253
#19 scm_i_mkbig () at numbers.c:267
#20 0x080f8ae5 in scm_difference (x=0xbc8f260, y=0xd3412b0) at
numbers.c:7861
#21 0x08143434 in vm_regular_engine (vm=0xc001b50, program=0x0,
argv=0xbc8f260, nargs=0) at vm-i-scheme.c:414
#22 0x0813c5b7 in scm_c_vm_run (vm=0xc001b50, program=0xb44f400, argv=0x0,
nargs=0) at vm.c:768
#23 0x080d8f7a in scm_call_0 (proc=0xb44f400) at eval.c:480
#24 0x08137c9b in really_launch (d=0xfface2a0) at threads.c:1005
#25 0x08163dc2 in c_body (d=0xf61d620c) at continuations.c:517
#26 0x08143d7d in vm_regular_engine (vm=0xc001b50, program=0x0, argv=0x2,
nargs=4) at vm-i-system.c:858
#27 0x0813c5b7 in scm_c_vm_run (vm=0xc001b50, program=0xaa0ddc8,
argv=0xf61d6168, nargs=4) at vm.c:768
#28 0x080d8e91 in scm_call_4 (proc=0xaa0ddc8, arg1=0x404, arg2=0xc005ef0,
arg3=0xc005ee0, arg4=0xc005ed0) at eval.c:507
#29 0x08138c5e in scm_catch_with_pre_unwind_handler (key=0x404,
thunk=0xc005ef0, handler=0xc005ee0, pre_unwind_handler=0xc005ed0) at
throw.c:73
#30 0x08164067 in scm_i_with_continuation_barrier (body=0x8163db0 <c_body>,
body_data=0xf61d620c, handler=0x8163fb0 <c_handler>,
handler_data=0xf61d620c,
    pre_unwind_handler=0x8163e00 <pre_unwind_handler>,
pre_unwind_handler_data=0xa9daff8) at continuations.c:455
#31 0x08164102 in scm_c_with_continuation_barrier (func=0x8137c10
<really_launch>, data=0xfface2a0) at continuations.c:551
#32 0x08137a52 in with_guile_and_parent (base=0xf61d626c, data=0xf61d628c)
at threads.c:906
#33 0x081900ff in GC_call_with_stack_base (fn=0x8137a10
<with_guile_and_parent>, arg=0xf61d628c) at misc.c:1573
#34 0x081375c2 in scm_i_with_guile_and_parent (func=<value optimized out>,
data=0x6, parent=0x2740) at threads.c:949
#35 0x08137625 in launch_thread (d=0xfface2a0) at threads.c:1017
#36 0x08199cdb in GC_inner_start_routine (sb=0xf61d633c, arg=0xbc87fc0) at
pthread_start.c:56
#37 0x081900ff in GC_call_with_stack_base (fn=0x8199c80
<GC_inner_start_routine>, arg=0xbc87fc0) at misc.c:1573
#38 0x08195319 in GC_start_routine (arg=0xbc87fc0) at pthread_support.c:1549
#39 0x00146a49 in start_thread () from /lib/libpthread.so.0
#40 0x00298e1e in clone () from /lib/libc.so.6


On Wed, Dec 3, 2014 at 1:17 AM, Ludovic Courtès <ludo <at> gnu.org> wrote:

> Anand Mohanadoss <anand108 <at> gmail.com> skribis:
>
> > #4  0x003b8e96 in __assert_fail () from /lib/libc.so.6
> > #5  0x080e93d7 in vacuum_weak_hash_table (table=0x97f8b10) at
> hashtab.c:137
> > #6  0x080e9857 in weak_gc_callback (hook_data=0x0, fn_data=0x6, data=0x0)
> > at hashtab.c:437
> > #7  weak_gc_hook (hook_data=0x0, fn_data=0x6, data=0x0) at hashtab.c:446
> > #8  0x080eaa2d in scm_c_hook_run (hook=0x895acc0, data=0x0) at
> hooks.c:103
> > #9  0x080dfda6 in run_before_gc_c_hook () at gc.c:240
> > #10 0x08187719 in GC_notify_full_gc (stop_func=0x81868e0
> > <GC_never_stop_func>) at alloc.c:334
> > #11 GC_try_to_collect_inner (stop_func=0x81868e0 <GC_never_stop_func>) at
> > alloc.c:429
> > #12 0x08187d1b in GC_collect_or_expand (needed_blocks=4,
> ignore_off_page=0,
> > retry=0) at alloc.c:1242
> > #13 0x0818a64f in GC_alloc_large (lb=14080, k=1, flags=0) at malloc.c:63
> > #14 0x0818a9c2 in GC_generic_malloc (lb=14076, k=1) at malloc.c:175
> > #15 0x0818ace7 in GC_core_malloc (lb=14076) at malloc.c:263
> > #16 0x080df572 in do_gc_malloc (size=0, what=0x6 <Address 0x6 out of
> > bounds>) at gc-malloc.c:100
> > #17 0x0813a565 in scm_c_make_vector (k=3517, fill=0x304) at vectors.c:408
> > #18 0x080e9471 in scm_i_rehash (table=0x97f8b10, hash_fn=0x816c1d0
> > <scm_ihashq>, closure=0x0, func_name=0x854f8bd
> > "scm_hash_fn_create_handle_x") at hashtab.c:344
>
> Looking at this stack trace, it seems ‘table’ is being concurrently
> modified: a GC occurs while it is being rehashed.
>
> Could you apply with attached patch (with “patch -p1 < the-patch” run
> from the top of the source tree), and report the lines that are printed
> before the assertion failure?
>
> Also, please report the corresponding backtrace, as you did above (just
> to make sure this is the same scenario.)
>
> Thanks in advance,
> Ludo’.
>
>
[Message part 2 (text/html, inline)]
[guile.patch.txt (text/plain, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Sun, 07 Dec 2014 01:44:02 GMT) Full text and rfc822 format available.

Message #23 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: Re: bug#19180: vacuum_weak_hash_table error
Date: Sun, 7 Dec 2014 01:42:44 +0000 (UTC)
Ludovic Courtès <ludo <at> gnu.org> writes:
> 
> Could you apply with attached patch (with “patch -p1 < the-patch” run
> from the top of the source tree), and report the lines that are printed
> before the assertion failure?

Not to be stupid, but: I pulled the latest guile from git, went to
apply the patch, and discovered that hashtab.c has changed: there
is no scm_fixup_weak_alist in it any more, nor is there a
vacuum_weak_hash_table.  I suppose that its also not surprise that
after a few dozen runs, I don't get any errors, crashes.   So ..
yayy, guile-2.2!

Should I attempt to reproduce on guile-2.0?

--linas





Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 08 Dec 2014 07:35:02 GMT) Full text and rfc822 format available.

Message #26 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: linasvepstas <at> gmail.com, Ludovic Courtès <ludo <at> gnu.org>, 
 19180 <at> debbugs.gnu.org
Subject: Re: guile bug#19180: vacuum_weak_hash_table error
Date: Mon, 8 Dec 2014 13:04:30 +0530
[Message part 1 (text/plain, inline)]
Dear Linas,

Thank you so much!  I did indeed miss your earlier responses and I
appreciate your reaching out to me!

I downloaded the current guile code in git and will give it a try.  But,
given that this is not a stable release, I am not sure if we can really use
it for our purposes.

Do you or Ludovic know when a stable 2.2 version will be released?  From
stability point of you, would it be better to use a patched version of
2.0.11 that handles the case we noticed (e.g. don't attempt to remove items
from hash table if len = 0)?

Thanks,
Anand



On Sun, Dec 7, 2014 at 7:22 AM, Linas Vepstas <linasvepstas <at> gmail.com>
wrote:

> emailing you directly, in case you did not see it:
>
> I've seen the same bug; I tried to apply Ludo's patch, and discovered that
> the current guile code, in git, no longer even has the
>  scm_fixup_weak_alist or the vacuum_weak_hash_table calls anywhere in
> them.  Also .. my version of the bug goes away.
>
> The code in git seems to be guile-2.2, which has not yet been released --
> so your mileage may vary..  but you certainly won't get that error print
> message any more.
>
> -- Linas
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 08 Dec 2014 09:35:01 GMT) Full text and rfc822 format available.

Message #29 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180 <at> debbugs.gnu.org, linasvepstas <at> gmail.com
Subject: Re: guile bug#19180: vacuum_weak_hash_table error
Date: Mon, 08 Dec 2014 10:34:01 +0100
Anand Mohanadoss <anand108 <at> gmail.com> skribis:

> I downloaded the current guile code in git and will give it a try.  But,
> given that this is not a stable release, I am not sure if we can really use
> it for our purposes.
>
> Do you or Ludovic know when a stable 2.2 version will be released?  From
> stability point of you, would it be better to use a patched version of
> 2.0.11 that handles the case we noticed (e.g. don't attempt to remove items
> from hash table if len = 0)?

There is not ETA for 2.2.

For now, I would recommend using the ‘stable-2.0’ branch, which
corresponds to what will (hopefully soon) become 2.0.12.

Thanks,
Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 15 Dec 2014 06:37:01 GMT) Full text and rfc822 format available.

Message #32 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 19180 <at> debbugs.gnu.org, Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: guile bug#19180: vacuum_weak_hash_table error
Date: Mon, 15 Dec 2014 12:06:44 +0530
[Message part 1 (text/plain, inline)]
Dear Ludovic,

Given your recommendation to use the 'stable-2.0' branch, we tried to
create a fix for the issue with our limited knowledge (assuming that the
size of the hash is getting corrupted by multiple threads).  The process
ran to completion, but, we got different results compared to what we get
with threading disabled.  So, it didn't really help and our assumption
appears to be incorrect. Is there something else we can try, to help you
with creating a proper patch for this issue?

Here is what we changed in hashtab.c -

130a131
>   size_t orig_len = len;
137,138c138,144
<       assert (removed <= len);
<       len -= removed;
---
>       if (removed <= len)
>         len -= removed;
>       else
>       {
>         printf ("Vacuum weak hash table assert Table=%p len=%zi
removed=%zi orig_len=%zi n_items=%zi\n", table, len, removed, orig_len,
SCM_HASHTABLE_N_ITEMS (table));
>         len = 0;
>       }

With this change, we got lines similar to the following printed
periodically -

Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=2 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=2 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=3 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=2 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=4 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=2 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2321
n_items=2321
......
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=205
n_items=205
.......
Vacuum weak hash table assert Table=0x9bdb840 len=1 removed=2 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=2 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274
Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1 orig_len=2274
n_items=2274

Thanks,
Anand



On Mon, Dec 8, 2014 at 3:04 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:

> Anand Mohanadoss <anand108 <at> gmail.com> skribis:
>
> > I downloaded the current guile code in git and will give it a try.  But,
> > given that this is not a stable release, I am not sure if we can really
> use
> > it for our purposes.
> >
> > Do you or Ludovic know when a stable 2.2 version will be released?  From
> > stability point of you, would it be better to use a patched version of
> > 2.0.11 that handles the case we noticed (e.g. don't attempt to remove
> items
> > from hash table if len = 0)?
>
> There is not ETA for 2.2.
>
> For now, I would recommend using the ‘stable-2.0’ branch, which
> corresponds to what will (hopefully soon) become 2.0.12.
>
> Thanks,
> Ludo’.
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 22 Jun 2016 14:56:02 GMT) Full text and rfc822 format available.

Message #35 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Wed, 22 Jun 2016 16:55:38 +0200
Hi :)

On Mon 15 Dec 2014 07:36, Anand Mohanadoss <anand108 <at> gmail.com> writes:

> Here is what we changed in hashtab.c -
>
> 130a131
>> size_t orig_len = len;
> 137,138c138,144
> < assert (removed <= len);
> < len -= removed;
> ---
>> if (removed <= len)
>> len -= removed;
>> else
>> {
>> printf ("Vacuum weak hash table assert Table=%p len=%zi removed=%zi
> orig_len=%zi n_items=%zi\n", table, len, removed, orig_len,
> SCM_HASHTABLE_N_ITEMS (table));
>> len = 0;
>> }
>
> With this change, we got lines similar to the following printed
> periodically -
>
> Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1
> orig_len=2321 n_items=2321

I guess printing a warning is not worse than crashing.  I was unable to
make this table work in a reliable way in 2.0 without rewriting it, so
in 2.2 there's a new implementation with hopefully no bug in this
regard.

Ludovic what do you thing, should we just be sloppy in 2.0 and remove
the assertion?  I don't think it's fixable.  The other option I see is
to close as WONTFIX.

Andy




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 22 Jun 2016 15:44:01 GMT) Full text and rfc822 format available.

Message #38 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: 19180 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Wed, 22 Jun 2016 21:13:05 +0530
[Message part 1 (text/plain, inline)]
Hi Andy,

Thanks a lot for looking into this and your response!  Any idea when we
will have a stable 2.2 release that we can move to given that 2.1 has been
out for a few months.

Thanks,
Anand

On Wed, Jun 22, 2016 at 8:25 PM, Andy Wingo <wingo <at> pobox.com> wrote:

> Hi :)
>
> On Mon 15 Dec 2014 07:36, Anand Mohanadoss <anand108 <at> gmail.com> writes:
>
> > Here is what we changed in hashtab.c -
> >
> > 130a131
> >> size_t orig_len = len;
> > 137,138c138,144
> > < assert (removed <= len);
> > < len -= removed;
> > ---
> >> if (removed <= len)
> >> len -= removed;
> >> else
> >> {
> >> printf ("Vacuum weak hash table assert Table=%p len=%zi removed=%zi
> > orig_len=%zi n_items=%zi\n", table, len, removed, orig_len,
> > SCM_HASHTABLE_N_ITEMS (table));
> >> len = 0;
> >> }
> >
> > With this change, we got lines similar to the following printed
> > periodically -
> >
> > Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1
> > orig_len=2321 n_items=2321
>
> I guess printing a warning is not worse than crashing.  I was unable to
> make this table work in a reliable way in 2.0 without rewriting it, so
> in 2.2 there's a new implementation with hopefully no bug in this
> regard.
>
> Ludovic what do you thing, should we just be sloppy in 2.0 and remove
> the assertion?  I don't think it's fixable.  The other option I see is
> to close as WONTFIX.
>
> Andy
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 22 Jun 2016 21:12:02 GMT) Full text and rfc822 format available.

Message #41 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Linas Vepstas <linasvepstas <at> gmail.com>
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: Andy Wingo <wingo <at> pobox.com>, 19180 <at> debbugs.gnu.org,
 Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Wed, 22 Jun 2016 16:11:31 -0500
[Message part 1 (text/plain, inline)]
Anand, I've been using guile, straight from git for maybe almost 2
years(??) in a semi-harsh environment (lots of threads, lots of C++ smob
jiggery-pokery, entering and exiting guile, redirecting ports, using fluids
at the scheme/c++ boundary, catching and throwing exceptions to/from C++,
interleaving all this with python, too, -- and medium cpu burn over many
days/week) and it seems to work fine.  Try it -- fix a tag, run system
test, I bet it will work for you.  I think Ludo and Andy have done a good
job.

The only recent glitch is that the setvbuf API changed.  An old quasi-issue
is that garbage collection seems to not be aggressive enough.  After
several layers of C calling guile calling C and so on, mem usage seems to
bloat (a lot -- many many GB's) unless I forcibly run GC every 50th time I
re-enter guile. But I think it does that in guile-2.0 too.

--linas

On Wed, Jun 22, 2016 at 10:43 AM, Anand Mohanadoss <anand108 <at> gmail.com>
wrote:

> Hi Andy,
>
> Thanks a lot for looking into this and your response!  Any idea when we
> will have a stable 2.2 release that we can move to given that 2.1 has been
> out for a few months.
>
> Thanks,
> Anand
>
> On Wed, Jun 22, 2016 at 8:25 PM, Andy Wingo <wingo <at> pobox.com> wrote:
>
>> Hi :)
>>
>> On Mon 15 Dec 2014 07:36, Anand Mohanadoss <anand108 <at> gmail.com> writes:
>>
>> > Here is what we changed in hashtab.c -
>> >
>> > 130a131
>> >> size_t orig_len = len;
>> > 137,138c138,144
>> > < assert (removed <= len);
>> > < len -= removed;
>> > ---
>> >> if (removed <= len)
>> >> len -= removed;
>> >> else
>> >> {
>> >> printf ("Vacuum weak hash table assert Table=%p len=%zi removed=%zi
>> > orig_len=%zi n_items=%zi\n", table, len, removed, orig_len,
>> > SCM_HASHTABLE_N_ITEMS (table));
>> >> len = 0;
>> >> }
>> >
>> > With this change, we got lines similar to the following printed
>> > periodically -
>> >
>> > Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1
>> > orig_len=2321 n_items=2321
>>
>> I guess printing a warning is not worse than crashing.  I was unable to
>> make this table work in a reliable way in 2.0 without rewriting it, so
>> in 2.2 there's a new implementation with hopefully no bug in this
>> regard.
>>
>> Ludovic what do you thing, should we just be sloppy in 2.0 and remove
>> the assertion?  I don't think it's fixable.  The other option I see is
>> to close as WONTFIX.
>>
>> Andy
>>
>
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 22 Jun 2016 21:35:02 GMT) Full text and rfc822 format available.

Message #44 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Wed, 22 Jun 2016 23:33:54 +0200
On Wed 22 Jun 2016 17:43, Anand Mohanadoss <anand108 <at> gmail.com> writes:

> Any idea when we will have a stable 2.2 release that we can move to
> given that 2.1 has been out for a few months.

I hope for 2.2 within the next couple months, but we'll see :)

Andy




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Thu, 23 Jun 2016 04:25:01 GMT) Full text and rfc822 format available.

Message #47 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Anand Mohanadoss <anand108 <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: 19180 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Thu, 23 Jun 2016 09:54:11 +0530
[Message part 1 (text/plain, inline)]
Hi Andy and Linas,

Thank you very much for your responses!

Anand

On Thu, Jun 23, 2016 at 3:03 AM, Andy Wingo <wingo <at> pobox.com> wrote:

> On Wed 22 Jun 2016 17:43, Anand Mohanadoss <anand108 <at> gmail.com> writes:
>
> > Any idea when we will have a stable 2.2 release that we can move to
> > given that 2.1 has been out for a few months.
>
> I hope for 2.2 within the next couple months, but we'll see :)
>
> Andy
>
[Message part 2 (text/html, inline)]

Reply sent to Andy Wingo <wingo <at> pobox.com>:
You have taken responsibility. (Tue, 12 Jul 2016 07:33:02 GMT) Full text and rfc822 format available.

Notification sent to Anand Mohanadoss <anand108 <at> gmail.com>:
bug acknowledged by developer. (Tue, 12 Jul 2016 07:33:02 GMT) Full text and rfc822 format available.

Message #52 received at 19180-done <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Anand Mohanadoss <anand108 <at> gmail.com>
Cc: 19180-done <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>,
 Linas Vepstas <linasvepstas <at> gmail.com>
Subject: Re: bug#19180: guile bug#19180: vacuum_weak_hash_table error
Date: Tue, 12 Jul 2016 09:32:16 +0200
Hi,

On Wed 22 Jun 2016 16:55, Andy Wingo <wingo <at> pobox.com> writes:

> On Mon 15 Dec 2014 07:36, Anand Mohanadoss <anand108 <at> gmail.com> writes:
>
>> Vacuum weak hash table assert Table=0x9bdb840 len=0 removed=1
>> orig_len=2321 n_items=2321
>
> I guess printing a warning is not worse than crashing.  I was unable to
> make this table work in a reliable way in 2.0 without rewriting it, so
> in 2.2 there's a new implementation with hopefully no bug in this
> regard.

I changed this assert to a warning and added a comment like this:

          /* The move to BDW-GC with Guile 2.0 introduced some bugs
             related to weak hash tables, threads, memory usage, and the
             alloc lock.  We were unable to fix these issues
             satisfactorily in 2.0 but have addressed them via a rewrite
             in 2.2.  If you see this message often, you probably want
             to upgrade to 2.2.  */

Andy




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 09 Aug 2016 11:24:04 GMT) Full text and rfc822 format available.

bug unarchived. Request was from ludovic.courtes <at> inria.fr (Ludovic Courtès) to control <at> debbugs.gnu.org. (Tue, 24 Oct 2017 23:41:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 25 Oct 2017 00:51:01 GMT) Full text and rfc822 format available.

Message #59 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Allan Webber <cwebber <at> dustycloud.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Tue, 24 Oct 2017 19:50:01 -0500
Ludovic Courtès writes:

> Christopher Allan Webber <cwebber <at> dustycloud.org> skribis:
>
>> Ludovic Courtès writes:
>>
>>> Also, it no longer displays the pathological behavior shown in
>>> <https://bugs.gnu.org/28590>.
>>>
>>> Of course, even better if people could test the two patches and confirm
>>> that it works for them.
>>>
>>> Then if there are no objections I’d like to merge them in ‘stable-2.2’.
>>
>> Sounds great indeed, but it didn't apply to master or stable-2.2 for me?
>
> Really?  The two patches should apply to stable-2.2, though you need to
> apply them in the right order (I have it applied over
> 80696023620eae12f9b2f167aee834f632a32739.)
>
> Ludo’.

Huh?  What object is this?  I don't see it in my git repo.

This is the latest commit I see to stable-2.2, which is also what
Savannah sees:

  https://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.2&id=a74d4ee4f6e062ff640f2532c9cfc9977bb68a49




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 25 Oct 2017 08:26:01 GMT) Full text and rfc822 format available.

Message #62 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Wed, 25 Oct 2017 08:38:35 +0200
[resending this because it may not have arrived]

Ricardo Wurmus <rekado <at> elephly.net> writes:

> Hi Ludo,
>
> does this apply to the latest release of Guile 2.2.2?  I’ve created this
> package definition:
>
> --8<---------------cut here---------------start------------->8---
> (define-public guile-2.2-awesome
>   (package (inherit guile-2.2)
>     (name "guile-awesome")
>     (source (origin (inherit (package-source guile-2.2))
>        (patches (list "0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch"
>                       "0002-Keep-weak-hash-table-item-count-consistent.patch"))))))
>
> --8<---------------cut here---------------end--------------->8---
>
> But the build fails:
>
> --8<---------------cut here---------------start------------->8---
> In file included from weak-table.c:37:0:
> ../libguile/weak-table.h:34:3: error: redeclaration of enumerator 'SCM_WEAK_TABLE_KIND_KEY'
>    SCM_WEAK_TABLE_KIND_KEY,
>    ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:36:3: note: previous definition of 'SCM_WEAK_TABLE_KIND_KEY' was here
>    SCM_WEAK_TABLE_KIND_KEY = 1,
>    ^
> In file included from weak-table.c:37:0:
> ../libguile/weak-table.h:35:3: error: redeclaration of enumerator 'SCM_WEAK_TABLE_KIND_VALUE'
>    SCM_WEAK_TABLE_KIND_VALUE,
>    ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:37:3: note: previous definition of 'SCM_WEAK_TABLE_KIND_VALUE' was here
>    SCM_WEAK_TABLE_KIND_VALUE = 2,
>    ^
> In file included from weak-table.c:37:0:
> ../libguile/weak-table.h:36:3: error: redeclaration of enumerator 'SCM_WEAK_TABLE_KIND_BOTH'
>    SCM_WEAK_TABLE_KIND_BOTH,
>    ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:38:3: note: previous definition of 'SCM_WEAK_TABLE_KIND_BOTH' was here
>    SCM_WEAK_TABLE_KIND_BOTH = 1 | 2
>    ^
> In file included from weak-table.c:37:0:
> ../libguile/weak-table.h:37:3: error: conflicting types for 'scm_t_weak_table_kind'
>  } scm_t_weak_table_kind;
>    ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:39:3: note: previous declaration of 'scm_t_weak_table_kind' was here
>  } scm_t_weak_table_kind;
>    ^
> In file included from weak-table.c:37:0:
> ../libguile/weak-table.h:46:18: error: conflicting types for 'scm_c_make_weak_table'
>  SCM_INTERNAL SCM scm_c_make_weak_table (unsigned long k,
>                   ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:179:18: note: previous declaration of 'scm_c_make_weak_table' was here
>  SCM_INTERNAL SCM scm_c_make_weak_table (unsigned long k,
>                   ^
> weak-table.c: In function 'make_weak_table':
> weak-table.c:798:20: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>    return scm_cell (scm_tc7_weak_table, (scm_t_bits)table);
>                     ^
> weak-table.c:798:20: note: each undeclared identifier is reported only once for each function it appears in
> weak-table.c: At top level:
> weak-table.c:848:1: error: conflicting types for 'scm_c_make_weak_table'
>  scm_c_make_weak_table (unsigned long k, scm_t_weak_table_kind kind)
>  ^
> In file included from ../libguile.h:65:0,
>                  from ../libguile/programs.h:22,
>                  from ../libguile/_scm.h:85,
>                  from weak-table.c:30:
> ../libguile/hashtab.h:179:18: note: previous declaration of 'scm_c_make_weak_table' was here
>  SCM_INTERNAL SCM scm_c_make_weak_table (unsigned long k,
>                   ^
> In file included from ../libguile/_scm.h:81:0,
>                  from weak-table.c:30:
> weak-table.c: In function 'scm_weak_table_p':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../libguile/boolean.h:88:28: note: in definition of macro 'scm_from_bool'
>  #define scm_from_bool(x) ((x) ? SCM_BOOL_T : SCM_BOOL_F)
>                             ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> weak-table.c:864:25: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>    return scm_from_bool (SCM_WEAK_TABLE_P (obj));
>                          ^
> In file included from ../libguile/_scm.h:40:0,
>                  from weak-table.c:30:
> weak-table.c: In function 'scm_c_weak_table_ref':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:876:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (1, table);
>    ^
> weak-table.c: In function 'scm_c_weak_table_put_x':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:898:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (1, table);
>    ^
> weak-table.c: In function 'scm_c_weak_table_remove_x':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:918:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (1, table);
>    ^
> weak-table.c: In function 'scm_weak_table_clear_x':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:965:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (1, table);
>    ^
> weak-table.c: In function 'scm_weak_table_fold':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:1028:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (3, table);
>    ^
>   SNARF  variable.doc
> weak-table.c: In function 'scm_weak_table_for_each':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:1046:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (2, table);
>    ^
> weak-table.c: In function 'scm_weak_table_map_to_list':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../config.h:2584:34: note: in definition of macro '__builtin_expect'
>  # define __builtin_expect(e, c) (e)
>                                   ^
> ../libguile/error.h:42:12: note: in expansion of macro 'SCM_UNLIKELY'
>    do { if (SCM_UNLIKELY (!(_cond)))                                     \
>             ^
> ../libguile/validate.h:124:5: note: in expansion of macro 'SCM_ASSERT_TYPE'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>      ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> ../libguile/validate.h:124:22: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>      SCM_ASSERT_TYPE (pred (var), var, pos, FUNC_NAME, msg); \
>                       ^
> ../libguile/validate.h:128:3: note: in expansion of macro 'SCM_I_MAKE_VALIDATE_MSG2'
>    SCM_I_MAKE_VALIDATE_MSG2 (pos, var, SCM_ ## pred, msg)
>    ^
> weak-table.c:218:3: note: in expansion of macro 'SCM_MAKE_VALIDATE_MSG'
>    SCM_MAKE_VALIDATE_MSG (pos, arg, WEAK_TABLE_P, "weak-table")
>    ^
> weak-table.c:1063:3: note: in expansion of macro 'SCM_VALIDATE_WEAK_TABLE'
>    SCM_VALIDATE_WEAK_TABLE (2, table);
>    ^
> In file included from ../libguile/_scm.h:81:0,
>                  from weak-table.c:30:
> weak-table.c: In function 'scm_weak_key_hash_table_p':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../libguile/boolean.h:88:28: note: in definition of macro 'scm_from_bool'
>  #define scm_from_bool(x) ((x) ? SCM_BOOL_T : SCM_BOOL_F)
>                             ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> weak-table.c:1124:25: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>    return scm_from_bool (SCM_WEAK_TABLE_P (obj) &&
>                          ^
> weak-table.c: In function 'scm_weak_value_hash_table_p':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../libguile/boolean.h:88:28: note: in definition of macro 'scm_from_bool'
>  #define scm_from_bool(x) ((x) ? SCM_BOOL_T : SCM_BOOL_F)
>                             ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> weak-table.c:1135:25: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>    return scm_from_bool (SCM_WEAK_TABLE_P (obj) &&
>                          ^
> weak-table.c: In function 'scm_doubly_weak_hash_table_p':
> weak-table.c:216:47: error: 'scm_tc7_weak_table' undeclared (first use in this function)
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                                                ^
> ../libguile/boolean.h:88:28: note: in definition of macro 'scm_from_bool'
>  #define scm_from_bool(x) ((x) ? SCM_BOOL_T : SCM_BOOL_F)
>                             ^
> ../libguile/tags.h:396:34: note: in expansion of macro 'SCM_HAS_HEAP_TYPE'
>  #define SCM_HAS_TYP7(x, tag)    (SCM_HAS_HEAP_TYPE (x, SCM_TYP7, tag))
>                                   ^
> weak-table.c:216:30: note: in expansion of macro 'SCM_HAS_TYP7'
>  #define SCM_WEAK_TABLE_P(x) (SCM_HAS_TYP7 (x, scm_tc7_weak_table))
>                               ^
> weak-table.c:1146:25: note: in expansion of macro 'SCM_WEAK_TABLE_P'
>    return scm_from_bool (SCM_WEAK_TABLE_P (obj) &&
>                          ^
>   SNARF  vectors.doc
> weak-table.c: In function 'make_weak_table':
> weak-table.c:799:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
> weak-table.c: In function 'scm_weak_table_p':
> weak-table.c:865:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
>   SNARF  version.doc
>   SNARF  vports.doc
>   SNARF  weak-set.doc
>   SNARF  weak-table.doc
> weak-table.c: In function 'scm_weak_key_hash_table_p':
> weak-table.c:1126:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
> weak-table.c: In function 'scm_weak_value_hash_table_p':
> weak-table.c:1137:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
> weak-table.c: In function 'scm_doubly_weak_hash_table_p':
> weak-table.c:1148:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
>   SNARF  weak-vector.doc
>   SNARF  posix.doc
>   SNARF  net_db.doc
>   SNARF  socket.doc
>   SNARF  regex-posix.doc
> make[3]: *** [Makefile:3368: libguile_2.2_la-weak-table.lo] Error 1
> make[3]: *** Waiting for unfinished jobs....
> make[3]: Leaving directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/libguile'
> make[2]: *** [Makefile:2299: all] Error 2
> make[2]: Leaving directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/libguile'
> make[1]: *** [Makefile:1857: all-recursive] Error 1
> make[1]: Leaving directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2'
> make: *** [Makefile:1743: all] Error 2
> phase `build' failed after 7.7 seconds
> builder for `/gnu/store/hgkdv4gx0rvhi8aawkh9ykyr50vi3lv0-guile-awesome-2.2.2.drv' failed with exit code 1
> @ build-failed /gnu/store/hgkdv4gx0rvhi8aawkh9ykyr50vi3lv0-guile-awesome-2.2.2.drv - 1 builder for `/gnu/store/hgkdv4gx0rvhi8aawkh9ykyr50vi3lv0-guile-awesome-2.2.2.drv' failed with exit code 1
> guix build: error: build failed: build of
> `/gnu/store/hgkdv4gx0rvhi8aawkh9ykyr50vi3lv0-guile-awesome-2.2.2.drv' failed
> --8<---------------cut here---------------end--------------->8---
>
> Am I doing something wrong?


-- 
Ricardo
  
  GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
  https://elephly.net





Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Wed, 25 Oct 2017 17:13:02 GMT) Full text and rfc822 format available.

Message #65 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Christopher Allan Webber <cwebber <at> dustycloud.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Wed, 25 Oct 2017 10:11:58 -0700
Christopher Allan Webber <cwebber <at> dustycloud.org> skribis:

> Ludovic Courtès writes:
>
>> Christopher Allan Webber <cwebber <at> dustycloud.org> skribis:
>>
>>> Ludovic Courtès writes:
>>>
>>>> Also, it no longer displays the pathological behavior shown in
>>>> <https://bugs.gnu.org/28590>.
>>>>
>>>> Of course, even better if people could test the two patches and confirm
>>>> that it works for them.
>>>>
>>>> Then if there are no objections I’d like to merge them in ‘stable-2.2’.
>>>
>>> Sounds great indeed, but it didn't apply to master or stable-2.2 for me?
>>
>> Really?  The two patches should apply to stable-2.2, though you need to
>> apply them in the right order (I have it applied over
>> 80696023620eae12f9b2f167aee834f632a32739.)
>>
>> Ludo’.
>
> Huh?  What object is this?  I don't see it in my git repo.

My bad, it’s actually ac0d3dcc533850d25f3e533c04c1c238a83f190b.

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Thu, 26 Oct 2017 07:04:01 GMT) Full text and rfc822 format available.

Message #68 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Thu, 26 Oct 2017 00:03:32 -0700
[Message part 1 (text/plain, inline)]
Hello!

Here’s an updated patch set (tested on top of
1008ea315483d1fb41b2a8c10680e511238836d0).

Let me know if things still go wrong.

Thanks,
Ludo’.

[0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch (text/x-patch, attachment)]
[0002-Keep-weak-hash-table-item-count-consistent.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Thu, 26 Oct 2017 10:24:02 GMT) Full text and rfc822 format available.

Message #71 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Thu, 26 Oct 2017 10:35:06 +0200
Hi Ludo,

I tried building Guile with the following Guix package definition:

--8<---------------cut here---------------start------------->8---
(define-public guile-2.2-awesome
  (package (inherit guile-2.2)
    (name "guile-awesome")
    (source (origin (inherit (package-source guile-2.2))
       (patches (list "/home/rwurmus/0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch"
                      "/home/rwurmus/0002-Keep-weak-hash-table-item-count-consistent.patch"))))
    (arguments
      (substitute-keyword-arguments (package-arguments guile-2.2)
        ((#:phases phases)
         `(modify-phases ,phases
            (add-before 'pre-configure 'bootstrap
              (lambda _
                (zero? (system* "autoreconf" "-vif"))))))))
    (native-inputs
     `(("autoconf" ,autoconf)
       ("automake" ,automake)
       ("libtool" ,libtool)
       ("flex" ,flex)
       ("texinfo" ,texinfo)
       ("gettext" ,gettext-minimal)
       ,@(package-native-inputs guile-2.2)))))
--8<---------------cut here---------------end--------------->8---

Unfortunately, I cannot bootstrap Guile on this 1.5 TB RAM server:

--8<---------------cut here---------------start------------->8---
…
  BOOTSTRAP GUILEC system/vm/program.go
  BOOTSTRAP GUILEC system/vm/vm.go
  BOOTSTRAP GUILEC system/foreign.go
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30796 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/scheme/compile-tree-il.go" "../module/language/scheme/compile-tree-il.scm"
make[2]: *** [Makefile:1928: language/scheme/compile-tree-il.go] Error 134
make[2]: *** Waiting for unfinished jobs....
^GGC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30386 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/tree-il/fix-letrec.go" "../module/language/tree-il/fix-letrec.scm"
make[2]: *** [Makefile:1928: language/tree-il/fix-letrec.go] Error 134
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30839 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/value/spec.go" "../module/language/value/spec.scm"
make[2]: *** [Makefile:1928: language/value/spec.go] Error 134
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30917 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "system/base/syntax.go" "../module/system/base/syntax.scm"
make[2]: *** [Makefile:1928: system/base/syntax.go] Error 134
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30344 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "ice-9/psyntax-pp.go" "../module/ice-9/psyntax-pp.scm"
make[2]: *** [Makefile:1928: ice-9/psyntax-pp.go] Error 134
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30354 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "srfi/srfi-1.go" "../module/srfi/srfi-1.scm"
make[2]: *** [Makefile:1928: srfi/srfi-1.go] Error 134
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30548 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/cps/peel-loops.go" "../module/language/cps/peel-loops.scm"
make[2]: *** [Makefile:1928: language/cps/peel-loops.go] Error 134
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 31410 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "system/vm/dwarf.go" "../module/system/vm/dwarf.scm"
make[2]: *** [Makefile:1928: system/vm/dwarf.go] Error 134
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 31415 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "system/vm/elf.go" "../module/system/vm/elf.scm"
make[2]: *** [Makefile:1928: system/vm/elf.go] Error 134
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 31028 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "ice-9/boot-9.go" "../module/ice-9/boot-9.scm"
make[2]: *** [Makefile:1928: ice-9/boot-9.go] Error 134
GC Warning: Repeated allocation of very large block (appr. size 230096896):
        May lead to memory leak and poor performance
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30348 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/cps/intset.go" "../module/language/cps/intset.scm"
make[2]: *** [Makefile:1928: language/cps/intset.go] Error 134
…
--8<---------------cut here---------------end--------------->8---

I will try this on a workstation with fewer cores and less memory later.

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Thu, 26 Oct 2017 16:54:02 GMT) Full text and rfc822 format available.

Message #74 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Thu, 26 Oct 2017 18:52:25 +0200
Hi again,

I tried building this on my workstation with 32GB RAM and the bootstrap
compilation got killed after consuming too much memory.

--8<---------------cut here---------------start------------->8---
…
Making all in bootstrap
make[2]: Entering directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/bootstrap'
  BOOTSTRAP GUILEC ice-9/eval.go
wrote `ice-9/eval.go'
  BOOTSTRAP GUILEC ice-9/psyntax-pp.go
  BOOTSTRAP GUILEC language/cps/intmap.go
  BOOTSTRAP GUILEC language/cps/intset.go
  BOOTSTRAP GUILEC language/cps/utils.go
  BOOTSTRAP GUILEC ice-9/vlist.go
  BOOTSTRAP GUILEC srfi/srfi-1.go
  BOOTSTRAP GUILEC language/tree-il.go
  BOOTSTRAP GUILEC language/tree-il/analyze.go
  BOOTSTRAP GUILEC language/tree-il/compile-cps.go
  BOOTSTRAP GUILEC language/tree-il/canonicalize.go
  BOOTSTRAP GUILEC language/tree-il/debug.go
  BOOTSTRAP GUILEC language/tree-il/effects.go
  BOOTSTRAP GUILEC language/tree-il/fix-letrec.go
  BOOTSTRAP GUILEC language/tree-il/optimize.go
  BOOTSTRAP GUILEC language/tree-il/peval.go
  BOOTSTRAP GUILEC language/tree-il/primitives.go
/gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30173 Killed                  GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "ice-9/vlist.go" "../module/ice-9/vlist.scm"
make[2]: *** [Makefile:1928: ice-9/vlist.go] Error 137
make[2]: *** Waiting for unfinished jobs....
…
--8<---------------cut here---------------end--------------->8---

This is still with the same Guix package definition as before:

--8<---------------cut here---------------start------------->8---
(define-public guile-2.2-awesome
  (package (inherit guile-2.2)
    (name "guile-awesome")
    (source (origin (inherit (package-source guile-2.2))
       (patches (list "/home/rwurmus/0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch"
                      "/home/rwurmus/0002-Keep-weak-hash-table-item-count-consistent.patch"))))
    (arguments
      (substitute-keyword-arguments (package-arguments guile-2.2)
        ((#:phases phases)
         `(modify-phases ,phases
            (add-before 'pre-configure 'bootstrap
              (lambda _
                (zero? (system* "autoreconf" "-vif"))))))))
    (native-inputs
     `(("autoconf" ,autoconf)
       ("automake" ,automake)
       ("libtool" ,libtool)
       ("flex" ,flex)
       ("texinfo" ,texinfo)
       ("gettext" ,gettext-minimal)
       ,@(package-native-inputs guile-2.2)))))
--8<---------------cut here---------------end--------------->8---


--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net





Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Thu, 26 Oct 2017 17:18:01 GMT) Full text and rfc822 format available.

Message #77 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Thu, 26 Oct 2017 10:17:15 -0700
Hi Ricardo,

Ricardo Wurmus <rekado <at> elephly.net> skribis:

> I tried building Guile with the following Guix package definition:
>
> (define-public guile-2.2-awesome
>   (package (inherit guile-2.2)
>     (name "guile-awesome")
>     (source (origin (inherit (package-source guile-2.2))
>        (patches (list "/home/rwurmus/0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch"
>                       "/home/rwurmus/0002-Keep-weak-hash-table-item-count-consistent.patch"))))

[...]

>   BOOTSTRAP GUILEC system/foreign.go
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> GC Warning: Repeated allocation of very large block (appr. size 230096896):
>         May lead to memory leak and poor performance
> Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
> /gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30796 Aborted                 GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "language/scheme/compile-tree-il.go" "../module/language/scheme/compile-tree-il.scm"
> make[2]: *** [Makefile:1928: language/scheme/compile-tree-il.go] Error 134

Blech, that doesn’t sound like an improvement.

“make clean -C module && make” works for me, but now that I think of it
it might be reusing stuff from the prebuilt/ directory.

I’ll try again later.

Thanks for testing,
Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Fri, 27 Oct 2017 05:29:01 GMT) Full text and rfc822 format available.

Message #80 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Thu, 26 Oct 2017 22:28:37 -0700
[Message part 1 (text/plain, inline)]
Hi,

Ricardo Wurmus <rekado <at> elephly.net> skribis:

>   BOOTSTRAP GUILEC language/tree-il/primitives.go
> /gnu/store/kpxi8h3669afr9r1bgvaf9ij3y4wdyyn-bash-minimal-4.4.12/bin/bash: line 6: 30173 Killed                  GUILE_AUTO_COMPILE=0 ../meta/build-env guild compile --target="x86_64-unknown-linux-gnu" -O1 -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module" -L "/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/guile-readline" -o "ice-9/vlist

Funny: with the “cleanup” that led to the patches you tried, those weak
hash tables were not weak at all, because SCM_WEAK_TABLE_KIND_KEY was
now zero, and thus SCM_HASHTABLE_WEAK_P would always return false.  The
fix:

[Message part 2 (text/x-patch, inline)]
diff --git a/libguile/hashtab.h b/libguile/hashtab.h
index 8f422b0b5..1705cf744 100644
--- a/libguile/hashtab.h
+++ b/libguile/hashtab.h
@@ -33,6 +33,7 @@
 
 /* Types of weak hash tables.  */
 typedef enum {
+  SCM_WEAK_TABLE_KIND_NONE = 0,
   SCM_WEAK_TABLE_KIND_KEY,
   SCM_WEAK_TABLE_KIND_VALUE,
   SCM_WEAK_TABLE_KIND_BOTH
@@ -51,7 +52,9 @@ typedef enum {
 #define SCM_HASHTABLE_DOUBLY_WEAK_P(x) \
   (SCM_HASHTABLE_FLAGS (x) == SCM_WEAK_TABLE_KIND_BOTH)
 
-#define SCM_HASHTABLE_WEAK_P(x)	   SCM_HASHTABLE_FLAGS (x)
+#define SCM_HASHTABLE_WEAK_P(x)				\
+  (SCM_HASHTABLE_FLAGS (x) != SCM_WEAK_TABLE_KIND_NONE)
+
 #define SCM_HASHTABLE_N_ITEMS(x)   (SCM_HASHTABLE (x)->n_items)
 #define SCM_SET_HASHTABLE_N_ITEMS(x, n)   (SCM_HASHTABLE (x)->n_items = n)
 #define SCM_HASHTABLE_INCREMENT(x) (SCM_HASHTABLE_N_ITEMS(x)++)
[Message part 3 (text/plain, inline)]
(Updated patches below.)

With your package definition, I see the first few Guile processes peak
at ~100 MiB resident (would be interesting to compare with stock 2.2.2).

Let me know if it’s better this time!

Thanks again for testing,
Ludo’.

[0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch (text/x-patch, attachment)]
[0002-Keep-weak-hash-table-item-count-consistent.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Sat, 28 Oct 2017 09:57:02 GMT) Full text and rfc822 format available.

Message #83 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Sat, 28 Oct 2017 11:56:43 +0200
Hi Ludo,

the bootstrap phase now succeeds but the build crashes:

--8<---------------cut here---------------start------------->8---
…
make[2]: Leaving directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/bootstrap'
Making all in module
make[2]: Entering directory '/tmp/guix-build-guile-awesome-2.2.2.drv-0/guile-2.2.2/module'
…
wrote `language/tree-il/spec.go'
  GUILEC srfi/srfi-37.go
wrote `ice-9/textual-ports.go'
wrote `ice-9/time.go'
wrote `ice-9/q.go'
  GUILEC srfi/srfi-38.go
wrote `ice-9/hash-table.go'
wrote `rnrs/r5rs.go'
Backtrace:
In ice-9/eval.scm:
    163:9 19 (_ _)
In ice-9/boot-9.scm:
    152:2 18 (with-fluid* _ _ _)
In system/base/target.scm:
     57:6 17 (with-target _ _)
In system/base/compile.scm:
    152:6 16 (compile-file _ #:output-file _ #:from _ #:to _ #:env _ ?)
     43:4 15 (call-once _)
In ice-9/boot-9.scm:
    849:4 14 (with-throw-handler _ _ _)
In system/base/compile.scm:
    59:11 13 (_)
   155:11 12 (_ #<closed: file a77e00>)
    219:8 11 (read-and-compile _ #:from _ #:to _ #:env _ #:opts _)
    255:6 10 (compile _ #:from _ #:to _ #:env _ #:opts _)
   183:32  9 (lp (#<procedure compile-cps (exp env opts)> #<proce?>) ?)
In language/tree-il/compile-cps.scm:
  1084:25  8 (compile-cps #<tree-il (seq (let (m) (m-1ccde3380cfd0b?> ?)
    974:4  7 (optimize-tree-il #<tree-il (seq (let (m) (m-1ccde3380?> ?)
In language/tree-il/analyze.scm:
    563:4  6 (analyze-tree (#<<tree-analysis> down: #<procedure ?> ?) ?)
In srfi/srfi-1.scm:
   656:11  5 (for-each2 (#<<tree-analysis> down: #<procedure 7ff?> ?) ?)
In ice-9/vlist.scm:
   267:16  4 (loop _ _ _)
In language/tree-il/analyze.scm:
  1053:33  3 Exception thrown while printing backtrace:
ERROR: In procedure assq: Wrong type argument in position 2 (expecting association list): ((system base pmatch) car . #f)

ice-9/boot-9.scm:760:25: In procedure dispatch-exception:
ice-9/boot-9.scm:760:25: In procedure assq: Wrong type argument in position 2 (expecting association list): ((system base pmatch) car . #f)
make[2]: *** [Makefile:2258: rnrs/records/procedural.go] Error 1
make[2]: *** Waiting for unfinished jobs....
  GUILEC srfi/srfi-41.go
wrote `rnrs/programs.go'
wrote `ice-9/history.go'
wrote `language/elisp/spec.go'
wrote `language/tree-il/optimize.go'
Backtrace:
In ice-9/eval.scm:
    163:9 19 (_ _)
In ice-9/boot-9.scm:
    152:2 18 (with-fluid* _ _ _)
In system/base/target.scm:
     57:6 17 (with-target _ _)
In system/base/compile.scm:
    152:6 16 (compile-file _ #:output-file _ #:from _ #:to _ #:env _ ?)
     43:4 15 (call-once _)
In ice-9/boot-9.scm:
    849:4 14 (with-throw-handler _ _ _)
In system/base/compile.scm:
    59:11 13 (_)
   155:11 12 (_ #<closed: file 9cfaf0>)
    219:8 11 (read-and-compile _ #:from _ #:to _ #:env _ #:opts _)
    255:6 10 (compile _ #:from _ #:to _ #:env _ #:opts _)
   183:32  9 (lp (#<procedure compile-cps (exp env opts)> #<proce?>) ?)
In language/tree-il/compile-cps.scm:
  1084:25  8 (compile-cps _ #<module (#{ g132}#) a5f320> _)
    974:4  7 (optimize-tree-il #<tree-il (define primitive-eval (la?> ?)
In language/tree-il/analyze.scm:
    563:4  6 (analyze-tree (#<<tree-analysis> down: #<procedure ?> ?) ?)
In srfi/srfi-1.scm:
   656:11  5 (for-each2 (#<<tree-analysis> down: #<procedure 7ff?> ?) ?)
In ice-9/vlist.scm:
   267:16  4 (loop _ _ #t)
In language/tree-il/analyze.scm:
  1053:33  3 Exception thrown while printing backtrace:
ERROR: In procedure assq: Wrong type argument in position 2 (expecting association list): 36
wrote `rnrs/unicode.go'

ice-9/boot-9.scm:760:25: In procedure dispatch-exception:
ice-9/boot-9.scm:760:25: In procedure assq: Wrong type argument in position 2 (expecting association list): 36
wrote `ice-9/curried-definitions.go'
make[2]: *** [Makefile:2258: ice-9/eval.go] Error 1
…
--8<---------------cut here---------------end--------------->8---

This is on the machine with 1.5TB RAM with the same package definition
but using your new patches.

[I’m sending this from my work address, because zoho.com currently has
problems delivering mail to gnu.org.]

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
https://elephly.net




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 12:37:01 GMT) Full text and rfc822 format available.

Message #86 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Mon, 30 Oct 2017 13:35:56 +0100
[Message part 1 (text/plain, inline)]
Hi Ricardo,

Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:

> In language/tree-il/analyze.scm:
>   1053:33  3 Exception thrown while printing backtrace:
> ERROR: In procedure assq: Wrong type argument in position 2 (expecting association list): ((system base pmatch) car . #f)
>
> ice-9/boot-9.scm:760:25: In procedure dispatch-exception:
> ice-9/boot-9.scm:760:25: In procedure assq: Wrong type argument in position 2 (expecting association list): ((system base pmatch) car . #f)

It’s a sign that the weak tables were too weak, this time.  :-)

The problem stems from the fact that weak pairs were initialized too
late.  Thus, the first calls to ‘scm_weak_car_pair’ were happening
before the weak-car pair GC descriptor had been initialized; they were
therefore using 0 as their descriptor, and ended up not being traced at
all by the GC.

The fix is to initialize weak pairs before symbols, as in 2.0:

[Message part 2 (text/x-patch, inline)]
modified   libguile/hashtab.c
@@ -1608,10 +1608,11 @@ scm_c_weak_table_fold (scm_t_hash_fold_fn fn, void *closure,
 
 
 
+/* Initialize weak pairs, used by weak hash tables.  This needs to be
+   done early on because it's used by interned symbols etc.  */
 void
-scm_init_hashtab ()
+scm_init_weak_pairs ()
 {
-  /* Initialize weak pairs.  */
   GC_word wcar_pair_bitmap[GC_BITMAP_SIZE (scm_t_cell)] = { 0 };
   GC_word wcdr_pair_bitmap[GC_BITMAP_SIZE (scm_t_cell)] = { 0 };
 
@@ -1627,6 +1628,11 @@ scm_init_hashtab ()
   wcdr_pair_descr = GC_make_descriptor (wcdr_pair_bitmap,
 					GC_WORD_LEN (scm_t_cell));
 
+}
+
+void
+scm_init_hashtab ()
+{
 #include "libguile/hashtab.x"
 }
 
modified   libguile/hashtab.h
@@ -174,6 +174,7 @@ SCM_API SCM scm_hash_map_to_list (SCM proc, SCM hash);
 SCM_API SCM scm_hash_count (SCM hash, SCM pred);
 SCM_INTERNAL void scm_i_hashtable_print (SCM exp, SCM port, scm_print_state *pstate);
 SCM_INTERNAL void scm_init_hashtab (void);
+SCM_INTERNAL void scm_init_weak_pairs (void);
 
 
 /* Guile 2.2.x (x <= 2) weak-table API.  */
modified   libguile/init.c
@@ -390,7 +390,8 @@ scm_i_init_guile (void *base)
 #ifdef GUILE_DEBUG_MALLOC
   scm_debug_malloc_prehistory ();
 #endif
-  scm_symbols_prehistory ();      /* requires weak_table_prehistory */
+  scm_init_weak_pairs ();
+  scm_symbols_prehistory ();      /* requires weak_pairs */
   scm_modules_prehistory ();
   scm_init_array_handle ();
[Message part 3 (text/plain, inline)]
I’m attaching updated patches.  I’ve let the Guix build run to
completion this time.  Let me know if it works for you!

Ludo’.

[0001-Remove-weak-tables-and-revert-to-weak-hash-tables.patch (text/x-patch, attachment)]
[0002-Keep-weak-hash-table-item-count-consistent.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 14:49:02 GMT) Full text and rfc822 format available.

Message #89 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Mon, 30 Oct 2017 15:48:31 +0100
Hi Ludo,

> I’m attaching updated patches.  I’ve let the Guix build run to
> completion this time.  Let me know if it works for you!

The “guile-awesome” package finished compiling (after about 46 minutes).
I’m now testing “guix pull” with a version of Guix that uses
“guile-awesome”.

I’m very hopeful :)

--
Ricardo




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 17:22:02 GMT) Full text and rfc822 format available.

Message #92 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Mon, 30 Oct 2017 18:20:58 +0100
Hi again,

previously I wrote:

> The “guile-awesome” package finished compiling (after about 46 minutes).
> I’m now testing “guix pull” with a version of Guix that uses
> “guile-awesome”.

I’m sure I’m doing something wrong (see below for guesses).  Here’s what
I get:

--8<---------------cut here---------------start------------->8---
./pre-inst-env guix pull
loading...       26.0% of 645 filesrandom seed for tests: 1509382171
compiling...     18.9% of 645 filesIn thread:
ERROR: In procedure return: return used outside of 'with-monad'Error while printing exception.
compiling...     54.7% of 645 files^C
--8<---------------cut here---------------end--------------->8---

I modified build-self.scm to use the modified Guile:

--8<---------------cut here---------------start------------->8---
diff --git a/build-aux/build-self.scm b/build-aux/build-self.scm
index ed8ff5f..9af6504 100644
--- a/build-aux/build-self.scm
+++ b/build-aux/build-self.scm
@@ -126,7 +126,7 @@ running Guile."
   (package->derivation (cond-expand
                          (guile-2.2
                           (canonical-package
-                           (specification->package "guile <at> 2.2")))
+                           (specification->package "guile-awesome <at> 2.2")))
                          (else
                           (canonical-package
                            (specification->package "guile <at> 2.0"))))))
--8<---------------cut here---------------end--------------->8---

I also confirmed that the Guile process that is spawned as “bin/guile
--no-auto-compile /home/rwurmus/guix/scripts/guix pull” is indeed the
modified Guile, but I noticed that it spawns yet another Guile process
to load and compile Guix.

I guess that comes from the daemon?  If that’s the case I can’t really
test this on this big server, because the daemon is currently in use, so
I can’t just reconfigure it to use the modified Guile.

When compiling Guix from source with “make -j 32” using that version of
Guile I got a segfault.

--
Ricardo




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 17:30:02 GMT) Full text and rfc822 format available.

Message #95 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> igalia.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 19180 <at> debbugs.gnu.org, Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Mon, 30 Oct 2017 18:29:45 +0100
[Message part 1 (text/plain, inline)]
Hi!

As discussed on IRC, what do you think of this patch?  It preserves the
thread-safety properties of weak tables and just adapts them to be
bucket-and-chain tables.  Let me know how it works for you.  If it
works, we'll need to adapt weak sets as well.

Andy

[0001-Weak-tables-are-now-bucket-and-chain-tables.patch (text/x-patch, inline)]
From 6ec4642516eaabf7a63644463a7836eb3efbcd60 Mon Sep 17 00:00:00 2001
From: Andy Wingo <wingo <at> pobox.com>
Date: Mon, 30 Oct 2017 18:19:37 +0100
Subject: [PATCH] Weak tables are now bucket-and-chain tables

This change should make weak tables work better with libgc, as the weak
components that need mark functions are smaller, so they don't overflow
the mark queue.  Also this prevents the need to move disappearing
links.

* libguile/weak-table.c (scm_t_weak_entry): Change to be a hash table
  chain entry.
  (struct weak_entry_data, do_read_weak_entry, read_weak_entry): Read
  out the key and value directly.
  (GC_move_disappearing_link, move_disappearing_links, move_weak_entry):
  Remove.
  (scm_t_weak_table): Rename "entries" member to "buckets", and "size" to
  "n_buckets".
  (hash_to_index, entry_distance, rob_from_rich, give_to_poor): Remove.
  (mark_weak_key_entry, mark_weak_value_entry): Mark a single link, and
  the next link.
  (mark_doubly_weak_entry): New kind.
  (allocate_entry): Allocate a single entry.
  (add_entry): New helper.
  (resize_table): Reimplement more like normal hash tables.
  (vacuum_weak_table): Adapt to new implementation.
  (weak_table_ref, weak_table_put_x, weak_table_remove_x): Adapt.
  (make_weak_table): Adapt.
  (scm_weak_table_clear_x): Actually unregister the links to prevent a
  memory leak.
  (scm_c_weak_table_fold): Collect items in an alist, then fold outside
  the lock.
  (scm_weak_table_prehistory): Initialize doubly_weak_gc_kind.
---
 libguile/weak-table.c | 723 +++++++++++++++-----------------------------------
 1 file changed, 212 insertions(+), 511 deletions(-)

diff --git a/libguile/weak-table.c b/libguile/weak-table.c
index 599c4cf0e..ff8a01fb0 100644
--- a/libguile/weak-table.c
+++ b/libguile/weak-table.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+/* Copyright (C) 2011, 2012, 2013, 2014, 2017 Free Software Foundation, Inc.
  * 
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -44,83 +44,62 @@
    data, but when you don't have space to store the data in the object.
    For example, procedure properties are implemented with weak tables.
 
-   Weak tables are implemented using an open-addressed hash table.
-   Basically this means that there is an array of entries, and the item
-   is expected to be found the slot corresponding to its hash code,
-   modulo the length of the array.
-
-   Collisions are handled using linear probing with the Robin Hood
-   technique.  See Pedro Celis' paper, "Robin Hood Hashing":
-
-     http://www.cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf
-
-   The vector of entries is allocated in such a way that the GC doesn't
-   trace the weak values.  For doubly-weak tables, this means that the
-   entries are allocated as an "atomic" piece of memory.  Key-weak and
-   value-weak tables use a special GC kind with a custom mark procedure.
-   When items are added weakly into table, a disappearing link is
-   registered to their locations.  If the referent is collected, then
-   that link will be zeroed out.
+   This is a normal bucket-and-chain hash table, except that the chain
+   entries are allocated in such a way that the GC doesn't trace the
+   weak values.  For doubly-weak tables, this means that the entries are
+   allocated as an "atomic" piece of memory.  Key-weak and value-weak
+   tables use a special GC kind with a custom mark procedure.  When
+   items are added weakly into table, a disappearing link is registered
+   to their locations.  If the referent is collected, then that link
+   will be zeroed out.
 
    An entry in the table consists of the key and the value, together
-   with the hash code of the key.  We munge hash codes so that they are
-   never 0.  In this way we can detect removed entries (key of zero but
-   nonzero hash code), and can then reshuffle elements as needed to
-   maintain the robin hood ordering.
-
-   Compared to buckets-and-chains hash tables, open addressing has the
-   advantage that it is very cache-friendly.  It also uses less memory.
-
-   Implementation-wise, there are two things to note.
-
-     1. We assume that hash codes are evenly distributed across the
-        range of unsigned longs.  The actual hash code stored in the
-        entry is left-shifted by 1 bit (losing 1 bit of hash precision),
-        and then or'd with 1.  In this way we ensure that the hash field
-        of an occupied entry is nonzero.  To map to an index, we
-        right-shift the hash by one, divide by the size, and take the
-        remainder.
-
-     2. Since the weak references are stored in an atomic region with
-        disappearing links, they need to be accessed with the GC alloc
-        lock.  `copy_weak_entry' will do that for you.  The hash code
-        itself can be read outside the lock, though.
+   with the hash code of the key.
+
+   Note that since the weak references are stored in an atomic region
+   with disappearing links, they need to be accessed with the GC alloc
+   lock.  `read_weak_entry' will do that for you.  The hash code itself
+   can be read outside the lock, though.
   */
 
 
-typedef struct {
+typedef struct scm_weak_entry scm_t_weak_entry;
+
+struct scm_weak_entry {
   unsigned long hash;
+  scm_t_weak_entry *next;
   scm_t_bits key;
   scm_t_bits value;
-} scm_t_weak_entry;
+};
 
 
 struct weak_entry_data {
-  scm_t_weak_entry *in;
-  scm_t_weak_entry *out;
+  scm_t_weak_entry *entry;
+  scm_t_bits key;
+  scm_t_bits value;
 };
   
 static void*
-do_copy_weak_entry (void *data)
+do_read_weak_entry (void *data)
 {
   struct weak_entry_data *e = data;
 
-  e->out->hash = e->in->hash;
-  e->out->key = e->in->key;
-  e->out->value = e->in->value;
+  e->key = e->entry->key;
+  e->value = e->entry->value;
 
   return NULL;
 }
 
 static void
-copy_weak_entry (scm_t_weak_entry *src, scm_t_weak_entry *dst)
+read_weak_entry (scm_t_weak_entry *entry, scm_t_bits *key, scm_t_bits *value)
 {
   struct weak_entry_data data;
 
-  data.in = src;
-  data.out = dst;
-      
-  GC_call_with_alloc_lock (do_copy_weak_entry, &data);
+  data.entry = entry;
+  GC_call_with_alloc_lock (do_read_weak_entry, &data);
+
+  *key = data.key;
+  *value = data.value;
 }
   
 static void
@@ -152,59 +131,11 @@ unregister_disappearing_links (scm_t_weak_entry *entry,
     GC_unregister_disappearing_link ((void **) &entry->value);
 }
 
-#ifndef HAVE_GC_MOVE_DISAPPEARING_LINK
-static void
-GC_move_disappearing_link (void **from, void **to)
-{
-  GC_unregister_disappearing_link (from);
-  SCM_I_REGISTER_DISAPPEARING_LINK (to, *to);
-}
-#endif
-
-static void
-move_disappearing_links (scm_t_weak_entry *from, scm_t_weak_entry *to,
-                         SCM key, SCM value, scm_t_weak_table_kind kind)
-{
-  if ((kind == SCM_WEAK_TABLE_KIND_KEY || kind == SCM_WEAK_TABLE_KIND_BOTH)
-      && SCM_HEAP_OBJECT_P (key))
-    GC_move_disappearing_link ((void **) &from->key, (void **) &to->key);
-
-  if ((kind == SCM_WEAK_TABLE_KIND_VALUE || kind == SCM_WEAK_TABLE_KIND_BOTH)
-      && SCM_HEAP_OBJECT_P (value))
-    GC_move_disappearing_link ((void **) &from->value, (void **) &to->value);
-}
-
-static void
-move_weak_entry (scm_t_weak_entry *from, scm_t_weak_entry *to,
-                 scm_t_weak_table_kind kind)
-{
-  if (from->hash)
-    {
-      scm_t_weak_entry copy;
-      
-      copy_weak_entry (from, &copy);
-      to->hash = copy.hash;
-      to->key = copy.key;
-      to->value = copy.value;
-
-      move_disappearing_links (from, to,
-                               SCM_PACK (copy.key), SCM_PACK (copy.value),
-                               kind);
-    }
-  else
-    {
-      to->hash = 0;
-      to->key = 0;
-      to->value = 0;
-    }
-}
-
-
 typedef struct {
-  scm_t_weak_entry *entries;    /* the data */
+  scm_t_weak_entry **buckets;   /* the data */
   scm_i_pthread_mutex_t lock;   /* the lock */
   scm_t_weak_table_kind kind;   /* what kind of table it is */
-  unsigned long size;    	/* total number of slots. */
+  unsigned long n_buckets;    	/* total number of buckets. */
   unsigned long n_items;	/* number of items in table */
   unsigned long lower;		/* when to shrink */
   unsigned long upper;		/* when to grow */
@@ -219,171 +150,114 @@ typedef struct {
 #define SCM_WEAK_TABLE(x) ((scm_t_weak_table *) SCM_CELL_WORD_1 (x))
 
 
-static unsigned long
-hash_to_index (unsigned long hash, unsigned long size)
-{
-  return (hash >> 1) % size;
-}
-
-static unsigned long
-entry_distance (unsigned long hash, unsigned long k, unsigned long size)
-{
-  unsigned long origin = hash_to_index (hash, size);
-
-  if (k >= origin)
-    return k - origin;
-  else
-    /* The other key was displaced and wrapped around.  */
-    return size - origin + k;
-}
-
-static void
-rob_from_rich (scm_t_weak_table *table, unsigned long k)
-{
-  unsigned long empty, size;
-
-  size = table->size;
-
-  /* If we are to free up slot K in the table, we need room to do so.  */
-  assert (table->n_items < size);
-  
-  empty = k;
-  do 
-    empty = (empty + 1) % size;
-  while (table->entries[empty].hash);
-
-  do
-    {
-      unsigned long last = empty ? (empty - 1) : (size - 1);
-      move_weak_entry (&table->entries[last], &table->entries[empty],
-                       table->kind);
-      empty = last;
-    }
-  while (empty != k);
-
-  table->entries[empty].hash = 0;
-  table->entries[empty].key = 0;
-  table->entries[empty].value = 0;
-}
-
-static void
-give_to_poor (scm_t_weak_table *table, unsigned long k)
-{
-  /* Slot K was just freed up; possibly shuffle others down.  */
-  unsigned long size = table->size;
-
-  while (1)
-    {
-      unsigned long next = (k + 1) % size;
-      unsigned long hash;
-      scm_t_weak_entry copy;
-
-      hash = table->entries[next].hash;
-
-      if (!hash || hash_to_index (hash, size) == next)
-        break;
-
-      copy_weak_entry (&table->entries[next], &copy);
-
-      if (!copy.key || !copy.value)
-        /* Lost weak reference.  */
-        {
-          give_to_poor (table, next);
-          table->n_items--;
-          continue;
-        }
-
-      move_weak_entry (&table->entries[next], &table->entries[k],
-                       table->kind);
-
-      k = next;
-    }
-
-  /* We have shuffled down any entries that should be shuffled down; now
-     free the end.  */
-  table->entries[k].hash = 0;
-  table->entries[k].key = 0;
-  table->entries[k].value = 0;
-}
-
-
 
 
 /* The GC "kinds" for singly-weak tables.  */
 static int weak_key_gc_kind;
 static int weak_value_gc_kind;
+static int doubly_weak_gc_kind;
 
 static struct GC_ms_entry *
-mark_weak_key_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
+mark_weak_key_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
                      struct GC_ms_entry *mark_stack_limit, GC_word env)
 {
-  scm_t_weak_entry *entries = (scm_t_weak_entry*) addr;
-  unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry);
+  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
 
-  for (k = 0; k < size; k++)
-    if (entries[k].hash && entries[k].key)
-      {
-        SCM value = SCM_PACK (entries[k].value);
+  if (entry->next)
+    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
+                                       mark_stack_ptr, mark_stack_limit,
+                                       NULL);
+
+  if (entry->hash && entry->key)
+    {
+      SCM value = SCM_PACK (entry->value);
+      if (SCM_HEAP_OBJECT_P (value))
         mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (value),
                                            mark_stack_ptr, mark_stack_limit,
                                            NULL);
-      }
+    }
 
   return mark_stack_ptr;
 }
 
 static struct GC_ms_entry *
-mark_weak_value_table (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
+mark_weak_value_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
                        struct GC_ms_entry *mark_stack_limit, GC_word env)
 {
-  scm_t_weak_entry *entries = (scm_t_weak_entry*) addr;
-  unsigned long k, size = GC_size (addr) / sizeof (scm_t_weak_entry);
+  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
 
-  for (k = 0; k < size; k++)
-    if (entries[k].hash && entries[k].value)
-      {
-        SCM key = SCM_PACK (entries[k].key);
+  if (entry->next)
+    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
+                                       mark_stack_ptr, mark_stack_limit,
+                                       NULL);
+
+  if (entry->hash && entry->value)
+    {
+      SCM key = SCM_PACK (entry->key);
+      if (SCM_HEAP_OBJECT_P (key))
         mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (key),
                                            mark_stack_ptr, mark_stack_limit,
                                            NULL);
-      }
+    }
+
+  return mark_stack_ptr;
+}
+
+static struct GC_ms_entry *
+mark_doubly_weak_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
+                        struct GC_ms_entry *mark_stack_limit, GC_word env)
+{
+  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
+
+  if (entry->next)
+    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
+                                       mark_stack_ptr, mark_stack_limit,
+                                       NULL);
 
   return mark_stack_ptr;
 }
 
 static scm_t_weak_entry *
-allocate_entries (unsigned long size, scm_t_weak_table_kind kind)
+allocate_entry (scm_t_weak_table_kind kind)
 {
   scm_t_weak_entry *ret;
-  size_t bytes = size * sizeof (*ret);
 
   switch (kind)
     {
     case SCM_WEAK_TABLE_KIND_KEY:
-      ret = GC_generic_malloc (bytes, weak_key_gc_kind);
+      ret = GC_generic_malloc (sizeof (*ret), weak_key_gc_kind);
       break;
     case SCM_WEAK_TABLE_KIND_VALUE:
-      ret = GC_generic_malloc (bytes, weak_value_gc_kind);
+      ret = GC_generic_malloc (sizeof (*ret), weak_value_gc_kind);
       break;
     case SCM_WEAK_TABLE_KIND_BOTH:
-      ret = scm_gc_malloc_pointerless (bytes, "weak-table");
+      ret = GC_generic_malloc (sizeof (*ret), doubly_weak_gc_kind);
       break;
     default:
       abort ();
     }
 
-  memset (ret, 0, bytes);
+  memset (ret, 0, sizeof (*ret));
 
   return ret;
 }
 
+static void
+add_entry (scm_t_weak_table *table, scm_t_weak_entry *entry)
+{
+  unsigned long bucket = entry->hash % table->n_buckets;
+  entry->next = table->buckets[bucket];
+  table->buckets[bucket] = entry;
+  table->n_items++;
+}
+
 
 
 /* Growing or shrinking is triggered when the load factor
  *
  *   L = N / S    (N: number of items in table, S: bucket vector length)
  *
- * passes an upper limit of 0.9 or a lower limit of 0.2.
+ * passes an upper limit of 0.9 or a lower limit of 0.25.
  *
  * The implementation stores the upper and lower number of items which
  * trigger a resize in the hashtable object.
@@ -400,168 +274,91 @@ static unsigned long hashtable_size[] = {
 
 #define HASHTABLE_SIZE_N (sizeof(hashtable_size)/sizeof(unsigned long))
 
-static int
-compute_size_index (scm_t_weak_table *table)
+static void
+resize_table (scm_t_weak_table *table)
 {
-  int i = table->size_index;
+  scm_t_weak_entry **old_buckets, **new_buckets;
+  int new_size_index;
+  unsigned long old_n_buckets, new_n_buckets, old_k;
 
+  new_size_index = table->size_index;
   if (table->n_items < table->lower)
     {
-      /* rehashing is not triggered when i <= min_size */
+      /* Rehashing is not triggered when i <= min_size.  */
       do
-	--i;
-      while (i > table->min_size_index
-	     && table->n_items < hashtable_size[i] / 5);
+	new_size_index -= 1;
+      while (new_size_index > table->min_size_index
+	     && table->n_items < hashtable_size[new_size_index] / 4);
     }
   else if (table->n_items > table->upper)
     {
-      ++i;
-      if (i >= HASHTABLE_SIZE_N)
-        /* The biggest size currently is 230096423, which for a 32-bit
-           machine will occupy 2.3GB of memory at a load of 80%.  There
-           is probably something better to do here, but if you have a
-           weak map of that size, you are hosed in any case.  */
-        abort ();
-    }
-
-  return i;
-}
-
-static int
-is_acceptable_size_index (scm_t_weak_table *table, int size_index)
-{
-  int computed = compute_size_index (table);
-
-  if (size_index == computed)
-    /* We were going to grow or shrink, and allocating the new vector
-       didn't change the target size.  */
-    return 1;
-
-  if (size_index == computed + 1)
-    {
-      /* We were going to enlarge the table, but allocating the new
-         vector finalized some objects, making an enlargement
-         unnecessary.  It might still be a good idea to use the larger
-         table, though.  (This branch also gets hit if, while allocating
-         the vector, some other thread was actively removing items from
-         the table.  That is less likely, though.)  */
-      unsigned long new_lower = hashtable_size[size_index] / 5;
-
-      return table->size > new_lower;
-    }
-
-  if (size_index == computed - 1)
-    {
-      /* We were going to shrink the table, but when we dropped the lock
-         to allocate the new vector, some other thread added elements to
-         the table.  */
-      return 0;
-    }
-
-  /* The computed size differs from our newly allocated size by more
-     than one size index -- recalculate.  */
-  return 0;
-}
-
-static void
-resize_table (scm_t_weak_table *table)
-{
-  scm_t_weak_entry *old_entries, *new_entries;
-  int new_size_index;
-  unsigned long old_size, new_size, old_k;
-
-  do 
-    {
-      new_size_index = compute_size_index (table);
-      if (new_size_index == table->size_index)
+      new_size_index += 1;
+      if (new_size_index >= HASHTABLE_SIZE_N)
+        /* Limit max bucket count.  */
         return;
-      new_size = hashtable_size[new_size_index];
-      new_entries = allocate_entries (new_size, table->kind);
     }
-  while (!is_acceptable_size_index (table, new_size_index));
+  else
+    /* Nothing to do.  */
+    return;
+
+  new_n_buckets = hashtable_size[new_size_index];
+  new_buckets = scm_gc_malloc (sizeof (*new_buckets) * new_n_buckets,
+                               "weak table buckets");
 
-  old_entries = table->entries;
-  old_size = table->size;
+  old_buckets = table->buckets;
+  old_n_buckets = table->n_buckets;
   
   table->size_index = new_size_index;
-  table->size = new_size;
+  table->n_buckets = new_n_buckets;
   if (new_size_index <= table->min_size_index)
     table->lower = 0;
   else
-    table->lower = new_size / 5;
-  table->upper = 9 * new_size / 10;
+    table->lower = new_n_buckets / 4;
+  table->upper = 9 * new_n_buckets / 10;
   table->n_items = 0;
-  table->entries = new_entries;
+  table->buckets = new_buckets;
 
-  for (old_k = 0; old_k < old_size; old_k++)
+  for (old_k = 0; old_k < old_n_buckets; old_k++)
     {
-      scm_t_weak_entry copy;
-      unsigned long new_k, distance;
-
-      if (!old_entries[old_k].hash)
-        continue;
-      
-      copy_weak_entry (&old_entries[old_k], &copy);
-      
-      if (!copy.key || !copy.value)
-        continue;
-      
-      new_k = hash_to_index (copy.hash, new_size);
-
-      for (distance = 0; ; distance++, new_k = (new_k + 1) % new_size)
+      scm_t_weak_entry *entry = old_buckets[old_k];
+      while (entry)
         {
-          unsigned long other_hash = new_entries[new_k].hash;
-
-          if (!other_hash)
-            /* Found an empty entry. */
-            break;
-
-          /* Displace the entry if our distance is less, otherwise keep
-             looking. */
-          if (entry_distance (other_hash, new_k, new_size) < distance)
-            {
-              rob_from_rich (table, new_k);
-              break;
-            }
+          scm_t_weak_entry *next = entry->next;
+          entry->next = NULL;
+          add_entry (table, entry);
+          entry = next;
         }
-          
-      table->n_items++;
-      new_entries[new_k].hash = copy.hash;
-      new_entries[new_k].key = copy.key;
-      new_entries[new_k].value = copy.value;
-
-      register_disappearing_links (&new_entries[new_k],
-                                   SCM_PACK (copy.key), SCM_PACK (copy.value),
-                                   table->kind);
     }
 }
 
 /* Run after GC via do_vacuum_weak_table, this function runs over the
    whole table, removing lost weak references, reshuffling the table as it
-   goes.  It might resize the table if it reaps enough entries.  */
+   goes.  It might resize the table if it reaps enough buckets.  */
 static void
 vacuum_weak_table (scm_t_weak_table *table)
 {
-  scm_t_weak_entry *entries = table->entries;
-  unsigned long size = table->size;
   unsigned long k;
 
-  for (k = 0; k < size; k++)
+  for (k = 0; k < table->n_buckets; k++)
     {
-      unsigned long hash = entries[k].hash;
-      
-      if (hash)
-        {
-          scm_t_weak_entry copy;
+      scm_t_weak_entry **loc = table->buckets + k;
+      scm_t_weak_entry *entry;
 
-          copy_weak_entry (&entries[k], &copy);
+      for (entry = *loc; entry; entry = *loc)
+        {
+          scm_t_bits key, value;
 
-          if (!copy.key || !copy.value)
-            /* Lost weak reference; reshuffle.  */
+          read_weak_entry (entry, &key, &value);
+          if (!key || !value)
+            /* Lost weak reference; prune entry.  */
             {
-              give_to_poor (table, k);
+              *loc = entry->next;
               table->n_items--;
+              entry->next = NULL;
+              unregister_disappearing_links (entry, table->kind);
             }
+          else
+            loc = &entry->next;
         }
     }
 
@@ -577,52 +374,22 @@ weak_table_ref (scm_t_weak_table *table, unsigned long hash,
                 scm_t_table_predicate_fn pred, void *closure,
                 SCM dflt)
 {
-  unsigned long k, distance, size;
-  scm_t_weak_entry *entries;
-  
-  size = table->size;
-  entries = table->entries;
+  unsigned long bucket = hash % table->n_buckets;
+  scm_t_weak_entry *entry;
 
-  hash = (hash << 1) | 0x1;
-  k = hash_to_index (hash, size);
-  
-  for (distance = 0; distance < size; distance++, k = (k + 1) % size)
+  for (entry = table->buckets[bucket]; entry; entry = entry->next)
     {
-      unsigned long other_hash;
-
-    retry:
-      other_hash = entries[k].hash;
-
-      if (!other_hash)
-        /* Not found. */
-        return dflt;
-
-      if (hash == other_hash)
+      if (entry->hash == hash)
         {
-          scm_t_weak_entry copy;
-          
-          copy_weak_entry (&entries[k], &copy);
+          scm_t_bits key, value;
 
-          if (!copy.key || !copy.value)
-            /* Lost weak reference; reshuffle.  */
-            {
-              give_to_poor (table, k);
-              table->n_items--;
-              goto retry;
-            }
-
-          if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure))
+          read_weak_entry (entry, &key, &value);
+          if (key && value && pred (SCM_PACK (key), SCM_PACK (value), closure))
             /* Found. */
-            return SCM_PACK (copy.value);
+            return SCM_PACK (value);
         }
-
-      /* If the entry's distance is less, our key is not in the table.  */
-      if (entry_distance (other_hash, k, size) < distance)
-        return dflt;
     }
 
-  /* If we got here, then we were unfortunate enough to loop through the
-     whole table.  Shouldn't happen, but hey.  */
   return dflt;
 }
 
@@ -632,81 +399,37 @@ weak_table_put_x (scm_t_weak_table *table, unsigned long hash,
                   scm_t_table_predicate_fn pred, void *closure,
                   SCM key, SCM value)
 {
-  unsigned long k, distance, size;
-  scm_t_weak_entry *entries;
-  
-  size = table->size;
-  entries = table->entries;
-
-  hash = (hash << 1) | 0x1;
-  k = hash_to_index (hash, size);
+  unsigned long bucket = hash % table->n_buckets;
+  scm_t_weak_entry *entry;
 
-  for (distance = 0; ; distance++, k = (k + 1) % size)
+  for (entry = table->buckets[bucket]; entry; entry = entry->next)
     {
-      unsigned long other_hash;
-
-    retry:
-      other_hash = entries[k].hash;
-
-      if (!other_hash)
-        /* Found an empty entry. */
-        break;
-
-      if (other_hash == hash)
+      if (entry->hash == hash)
         {
-          scm_t_weak_entry copy;
+          scm_t_bits k, v;
 
-          copy_weak_entry (&entries[k], &copy);
-          
-          if (!copy.key || !copy.value)
-            /* Lost weak reference; reshuffle.  */
+          read_weak_entry (entry, &k, &v);
+          if (k && v && pred (SCM_PACK (k), SCM_PACK (v), closure))
             {
-              give_to_poor (table, k);
-              table->n_items--;
-              goto retry;
+              unregister_disappearing_links (entry, table->kind);
+              key = SCM_PACK (k);
+              entry->value = SCM_UNPACK (value);
+              register_disappearing_links (entry, key, value, table->kind);
+              return;
             }
-
-          if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure))
-            /* Found an entry with this key. */
-            break;
-        }
-
-      if (table->n_items > table->upper)
-        /* Full table, time to resize.  */
-        {
-          resize_table (table);
-          return weak_table_put_x (table, hash >> 1, pred, closure, key, value);
         }
-
-      /* Displace the entry if our distance is less, otherwise keep
-         looking. */
-      if (entry_distance (other_hash, k, size) < distance)
-        {
-          rob_from_rich (table, k);
-          break;
-        }
-    }
-          
-  /* Fast path for updated values for existing entries of weak-key
-     tables.  */
-  if (table->kind == SCM_WEAK_TABLE_KIND_KEY &&
-      entries[k].hash == hash &&
-      entries[k].key == SCM_UNPACK (key))
-    {
-      entries[k].value = SCM_UNPACK (value);
-      return;
     }
 
-  if (entries[k].hash)
-    unregister_disappearing_links (&entries[k], table->kind);
-  else
-    table->n_items++;
-
-  entries[k].hash = hash;
-  entries[k].key = SCM_UNPACK (key);
-  entries[k].value = SCM_UNPACK (value);
+  if (table->n_items > table->upper)
+    /* Full table, time to resize.  */
+    resize_table (table);
 
-  register_disappearing_links (&entries[k], key, value, table->kind);
+  entry = allocate_entry (table->kind);
+  entry->hash = hash;
+  entry->key = SCM_UNPACK (key);
+  entry->value = SCM_UNPACK (value);
+  register_disappearing_links (entry, key, value, table->kind);
+  add_entry (table, entry);
 }
 
 
@@ -714,62 +437,34 @@ static void
 weak_table_remove_x (scm_t_weak_table *table, unsigned long hash,
                    scm_t_table_predicate_fn pred, void *closure)
 {
-  unsigned long k, distance, size;
-  scm_t_weak_entry *entries;
-  
-  size = table->size;
-  entries = table->entries;
-
-  hash = (hash << 1) | 0x1;
-  k = hash_to_index (hash, size);
+  unsigned long bucket = hash % table->n_buckets;
+  scm_t_weak_entry **loc = table->buckets + bucket;
+  scm_t_weak_entry *entry;
 
-  for (distance = 0; distance < size; distance++, k = (k + 1) % size)
+  for (entry = *loc; entry; entry = *loc)
     {
-      unsigned long other_hash;
-
-    retry:
-      other_hash = entries[k].hash;
-
-      if (!other_hash)
-        /* Not found. */
-        return;
-
-      if (other_hash == hash)
+      if (entry->hash == hash)
         {
-          scm_t_weak_entry copy;
-      
-          copy_weak_entry (&entries[k], &copy);
-          
-          if (!copy.key || !copy.value)
-            /* Lost weak reference; reshuffle.  */
-            {
-              give_to_poor (table, k);
-              table->n_items--;
-              goto retry;
-            }
+          scm_t_bits k, v;
 
-          if (pred (SCM_PACK (copy.key), SCM_PACK (copy.value), closure))
-            /* Found an entry with this key. */
+          read_weak_entry (entry, &k, &v);
+          if (k && v && pred (SCM_PACK (k), SCM_PACK (v), closure))
             {
-              entries[k].hash = 0;
-              entries[k].key = 0;
-              entries[k].value = 0;
-
-              unregister_disappearing_links (&entries[k], table->kind);
+              *loc = entry->next;
+              table->n_items--;
+              entry->next = NULL;
+              unregister_disappearing_links (entry, table->kind);
 
-              if (--table->n_items < table->lower)
+              if (table->n_items < table->lower)
                 resize_table (table);
-              else
-                give_to_poor (table, k);
 
               return;
             }
         }
-
-      /* If the entry's distance is less, our key is not in the table.  */
-      if (entry_distance (other_hash, k, size) < distance)
-        return;
+      loc = &entry->next;
     }
+
+  return;
 }
 
 
@@ -785,10 +480,11 @@ make_weak_table (unsigned long k, scm_t_weak_table_kind kind)
   n = hashtable_size[i];
 
   table = scm_gc_malloc (sizeof (*table), "weak-table");
-  table->entries = allocate_entries (n, kind);
+  table->buckets = scm_gc_malloc (sizeof (*table->buckets) * n,
+                                  "weak table buckets");
   table->kind = kind;
   table->n_items = 0;
-  table->size = n;
+  table->n_buckets = n;
   table->lower = 0;
   table->upper = 9 * n / 10;
   table->size_index = i;
@@ -805,7 +501,7 @@ scm_i_weak_table_print (SCM exp, SCM port, scm_print_state *pstate)
   scm_puts ("weak-table ", port);
   scm_uintprint (SCM_WEAK_TABLE (exp)->n_items, 10, port);
   scm_putc ('/', port);
-  scm_uintprint (SCM_WEAK_TABLE (exp)->size, 10, port);
+  scm_uintprint (SCM_WEAK_TABLE (exp)->n_buckets, 10, port);
   scm_puts (">", port);
 }
 
@@ -961,6 +657,8 @@ scm_weak_table_clear_x (SCM table)
 #define FUNC_NAME "weak-table-clear!"
 {
   scm_t_weak_table *t;
+  unsigned long k;
+  scm_t_weak_entry *entry;
 
   SCM_VALIDATE_WEAK_TABLE (1, table);
 
@@ -968,7 +666,12 @@ scm_weak_table_clear_x (SCM table)
 
   scm_i_pthread_mutex_lock (&t->lock);
 
-  memset (t->entries, 0, sizeof (scm_t_weak_entry) * t->size);
+  for (k = 0; k < t->n_buckets; k++)
+    {
+      for (entry = t->buckets[k]; entry; entry = entry->next)
+        unregister_disappearing_links (entry, t->kind);
+      t->buckets[k] = NULL;
+    }
   t->n_items = 0;
 
   scm_i_pthread_mutex_unlock (&t->lock);
@@ -980,38 +683,32 @@ scm_c_weak_table_fold (scm_t_table_fold_fn proc, void *closure,
                        SCM init, SCM table)
 {
   scm_t_weak_table *t;
-  scm_t_weak_entry *entries;
-  unsigned long k, size;
+  unsigned long k;
+  SCM alist = SCM_EOL;
 
   t = SCM_WEAK_TABLE (table);
 
   scm_i_pthread_mutex_lock (&t->lock);
 
-  size = t->size;
-  entries = t->entries;
-
-  for (k = 0; k < size; k++)
+  for (k = 0; k < t->n_buckets; k++)
     {
-      if (entries[k].hash)
+      scm_t_weak_entry *entry;
+      for (entry = t->buckets[k]; entry; entry = entry->next)
         {
-          scm_t_weak_entry copy;
-          
-          copy_weak_entry (&entries[k], &copy);
+          scm_t_bits key, value;
+          read_weak_entry (entry, &key, &value);
       
-          if (copy.key && copy.value)
-            {
-              /* Release table lock while we call the function.  */
-              scm_i_pthread_mutex_unlock (&t->lock);
-              init = proc (closure,
-                           SCM_PACK (copy.key), SCM_PACK (copy.value),
-                           init);
-              scm_i_pthread_mutex_lock (&t->lock);
-            }
+          if (key && value)
+            alist = scm_acons (SCM_PACK (key), SCM_PACK (value), alist);
         }
     }
   
   scm_i_pthread_mutex_unlock (&t->lock);
   
+  /* Call the proc outside the lock.  */
+  for (; !scm_is_null (alist); alist = scm_cdr (alist))
+    init = proc (closure, scm_caar (alist), scm_cdar (alist), init);
+
   return init;
 }
 
@@ -1157,11 +854,15 @@ scm_weak_table_prehistory (void)
 {
   weak_key_gc_kind =
     GC_new_kind (GC_new_free_list (),
-		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_table), 0),
+		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_entry), 0),
 		 0, 0);
   weak_value_gc_kind =
     GC_new_kind (GC_new_free_list (),
-		 GC_MAKE_PROC (GC_new_proc (mark_weak_value_table), 0),
+		 GC_MAKE_PROC (GC_new_proc (mark_weak_value_entry), 0),
+		 0, 0);
+  doubly_weak_gc_kind =
+    GC_new_kind (GC_new_free_list (),
+		 GC_MAKE_PROC (GC_new_proc (mark_doubly_weak_entry), 0),
 		 0, 0);
 }
 
-- 
2.14.1


Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 22:19:02 GMT) Full text and rfc822 format available.

Message #98 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>
Cc: Andy Wingo <wingo <at> igalia.com>, 19180 <at> debbugs.gnu.org,
 Christopher Allan Webber <cwebber <at> dustycloud.org>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Mon, 30 Oct 2017 23:18:51 +0100
Hi,

Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de> skribis:

> previously I wrote:
>
>> The “guile-awesome” package finished compiling (after about 46 minutes).
>> I’m now testing “guix pull” with a version of Guix that uses
>> “guile-awesome”.
>
> I’m sure I’m doing something wrong (see below for guesses).  Here’s what
> I get:
>
> ./pre-inst-env guix pull
> loading...       26.0% of 645 filesrandom seed for tests: 1509382171
> compiling...     18.9% of 645 filesIn thread:
> ERROR: In procedure return: return used outside of 'with-monad'Error while printing exception.
> compiling...     54.7% of 645 files^C

The error above is the other bug you reported, not related (but just as
serious): <https://bugs.gnu.org/27476>.

> I modified build-self.scm to use the modified Guile:
>
> diff --git a/build-aux/build-self.scm b/build-aux/build-self.scm
> index ed8ff5f..9af6504 100644
> --- a/build-aux/build-self.scm
> +++ b/build-aux/build-self.scm
> @@ -126,7 +126,7 @@ running Guile."
>    (package->derivation (cond-expand
>                           (guile-2.2
>                            (canonical-package
> -                           (specification->package "guile <at> 2.2")))
> +                           (specification->package "guile-awesome <at> 2.2")))
>                           (else
>                            (canonical-package
>                             (specification->package "guile <at> 2.0"))))))
>
> I also confirmed that the Guile process that is spawned as “bin/guile
> --no-auto-compile /home/rwurmus/guix/scripts/guix pull” is indeed the
> modified Guile, but I noticed that it spawns yet another Guile process
> to load and compile Guix.
>
> I guess that comes from the daemon?  If that’s the case I can’t really
> test this on this big server, because the daemon is currently in use, so
> I can’t just reconfigure it to use the modified Guile.

Your patch above should have led to the use of “guile-awesome” to
compile Guix; I’m not sure why it didn’t.

> When compiling Guix from source with “make -j 32” using that version of
> Guile I got a segfault.

Oh?

Let’s put this on hold since Andy offers a different solution.

Thanks for testing!

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Mon, 30 Oct 2017 23:14:02 GMT) Full text and rfc822 format available.

Message #101 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> igalia.com>
Cc: 19180 <at> debbugs.gnu.org, Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Tue, 31 Oct 2017 00:13:51 +0100
Hi Andy,

Andy Wingo <wingo <at> igalia.com> skribis:

> As discussed on IRC, what do you think of this patch?  It preserves the
> thread-safety properties of weak tables and just adapts them to be
> bucket-and-chain tables.  Let me know how it works for you.

That was fast!  The code certainly looks nicer than the old entangled
weak hash table code, and it preserves thread-safety, so that’s great.

> If it works, we'll need to adapt weak sets as well.

Yes, though I think weak sets are less critical.

I built libguile with the patch (I haven’t yet taken the time to rebuild
all of Guile).  It works, but unfortunately it still shows quick growth
of the heap on this example (<https://bugs.gnu.org/28590>):

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 format))

(define (display-heap-size)
  (format #t "heap size: ~,2h MiB~%"
          (/ (assoc-ref (gc-stats) 'heap-size) (expt 2. 20))))

(define table
  (make-weak-key-hash-table))

(let loop ((i 0))
  (unless #f
    (when (zero? (modulo i 100000))
      (pk 'table table)
      (display-heap-size))
    (hashq-set! table (make-list 10) (make-list 10))
    (loop (1+ i))))
--8<---------------cut here---------------end--------------->8---

Could it me that some of the disappearing links are not getting
unregistered?

> From 6ec4642516eaabf7a63644463a7836eb3efbcd60 Mon Sep 17 00:00:00 2001
> From: Andy Wingo <wingo <at> pobox.com>
> Date: Mon, 30 Oct 2017 18:19:37 +0100
> Subject: [PATCH] Weak tables are now bucket-and-chain tables
>
> This change should make weak tables work better with libgc, as the weak
> components that need mark functions are smaller, so they don't overflow
> the mark queue.  Also this prevents the need to move disappearing
> links.
>
> * libguile/weak-table.c (scm_t_weak_entry): Change to be a hash table
>   chain entry.
>   (struct weak_entry_data, do_read_weak_entry, read_weak_entry): Read
>   out the key and value directly.
>   (GC_move_disappearing_link, move_disappearing_links, move_weak_entry):
>   Remove.
>   (scm_t_weak_table): Rename "entries" member to "buckets", and "size" to
>   "n_buckets".
>   (hash_to_index, entry_distance, rob_from_rich, give_to_poor): Remove.
>   (mark_weak_key_entry, mark_weak_value_entry): Mark a single link, and
>   the next link.
>   (mark_doubly_weak_entry): New kind.
>   (allocate_entry): Allocate a single entry.
>   (add_entry): New helper.
>   (resize_table): Reimplement more like normal hash tables.
>   (vacuum_weak_table): Adapt to new implementation.
>   (weak_table_ref, weak_table_put_x, weak_table_remove_x): Adapt.
>   (make_weak_table): Adapt.
>   (scm_weak_table_clear_x): Actually unregister the links to prevent a
>   memory leak.
>   (scm_c_weak_table_fold): Collect items in an alist, then fold outside
>   the lock.
>   (scm_weak_table_prehistory): Initialize doubly_weak_gc_kind.

[...]

> +mark_weak_key_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
>                       struct GC_ms_entry *mark_stack_limit, GC_word env)

[...]

>    weak_key_gc_kind =
>      GC_new_kind (GC_new_free_list (),
> -		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_table), 0),
> +		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_entry), 0),
>  		 0, 0);

I think we should avoid custom mark procedures and use a bitmap here, as
recommended in the libgc headers (like ‘wcar_pair_descr’ in weaks.c in
2.0.)

Other than that, it looks good on a cursory look.  We’ll have to do some
more testing afterwards to gain more confidence, like what Ricardo has
been doing.

Thanks a lot for your help!

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Tue, 31 Oct 2017 08:27:02 GMT) Full text and rfc822 format available.

Message #104 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> igalia.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 19180 <at> debbugs.gnu.org, Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Tue, 31 Oct 2017 09:25:53 +0100
[Message part 1 (text/plain, inline)]
On Tue 31 Oct 2017 00:13, ludo <at> gnu.org (Ludovic Courtès) writes:

> I built libguile with the patch (I haven’t yet taken the time to rebuild
> all of Guile).  It works, but unfortunately it still shows quick growth
> of the heap on this example (<https://bugs.gnu.org/28590>):

Fixed in attached patches (on top of the other one).  This was a race
between the periodic vacuum process, which should run after GC via a
finalizer, and the mutator.  If the mutator had the weak table lock, the
vacuum would never be run.  Of course in the test below, the mutator
usually has the table lock, so we ended up often skipping the vacuum,
which causes the table size to grow, which causes the active heap size
to grow, which causes the bytes-allocated-before-GC to increase, and
which ultimately is a vicious circle.

In my tests this patch fixes the issue, though the level at which the
heap stabilizes can vary slightly as there's a bit of nondeterministic
concurrency as the mutator and the vacuum process still race a bit.
Right now for me it stabilizes at about 6.2M of heap.

>>    weak_key_gc_kind =
>>      GC_new_kind (GC_new_free_list (),
>> -		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_table), 0),
>> +		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_entry), 0),
>>  		 0, 0);
>
> I think we should avoid custom mark procedures and use a bitmap here, as
> recommended in the libgc headers (like ‘wcar_pair_descr’ in weaks.c in
> 2.0.)

Good idea; fixed.

Andy

[0002-Refactor-weak-table-to-use-bitmaps-for-weak-entries.patch (text/x-patch, inline)]
From 098c4171ef53791d97b5c675218f302efc7bcf26 Mon Sep 17 00:00:00 2001
From: Andy Wingo <wingo <at> pobox.com>
Date: Tue, 31 Oct 2017 09:10:55 +0100
Subject: [PATCH 2/3] Refactor weak table to use bitmaps for weak entries

---
 libguile/weak-table.c | 107 ++++++++++++--------------------------------------
 1 file changed, 25 insertions(+), 82 deletions(-)

diff --git a/libguile/weak-table.c b/libguile/weak-table.c
index ff8a01fb0..fab98149f 100644
--- a/libguile/weak-table.c
+++ b/libguile/weak-table.c
@@ -25,7 +25,7 @@
 #include <assert.h>
 
 #include "libguile/bdw-gc.h"
-#include <gc/gc_mark.h>
+#include <gc/gc_typed.h>
 
 #include "libguile/_scm.h"
 #include "libguile/hash.h"
@@ -152,70 +152,10 @@ typedef struct {
 
 
 
-/* The GC "kinds" for singly-weak tables.  */
-static int weak_key_gc_kind;
-static int weak_value_gc_kind;
-static int doubly_weak_gc_kind;
-
-static struct GC_ms_entry *
-mark_weak_key_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
-                     struct GC_ms_entry *mark_stack_limit, GC_word env)
-{
-  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
-
-  if (entry->next)
-    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
-                                       mark_stack_ptr, mark_stack_limit,
-                                       NULL);
-
-  if (entry->hash && entry->key)
-    {
-      SCM value = SCM_PACK (entry->value);
-      if (SCM_HEAP_OBJECT_P (value))
-        mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (value),
-                                           mark_stack_ptr, mark_stack_limit,
-                                           NULL);
-    }
-
-  return mark_stack_ptr;
-}
-
-static struct GC_ms_entry *
-mark_weak_value_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
-                       struct GC_ms_entry *mark_stack_limit, GC_word env)
-{
-  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
-
-  if (entry->next)
-    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
-                                       mark_stack_ptr, mark_stack_limit,
-                                       NULL);
-
-  if (entry->hash && entry->value)
-    {
-      SCM key = SCM_PACK (entry->key);
-      if (SCM_HEAP_OBJECT_P (key))
-        mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) SCM2PTR (key),
-                                           mark_stack_ptr, mark_stack_limit,
-                                           NULL);
-    }
-
-  return mark_stack_ptr;
-}
-
-static struct GC_ms_entry *
-mark_doubly_weak_entry (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
-                        struct GC_ms_entry *mark_stack_limit, GC_word env)
-{
-  scm_t_weak_entry *entry = (scm_t_weak_entry*) addr;
-
-  if (entry->next)
-    mark_stack_ptr = GC_MARK_AND_PUSH ((GC_word*) entry->next,
-                                       mark_stack_ptr, mark_stack_limit,
-                                       NULL);
-
-  return mark_stack_ptr;
-}
+/* GC descriptors for the various kinds of scm_t_weak_entry.  */
+static GC_descr weak_key_descr;
+static GC_descr weak_value_descr;
+static GC_descr doubly_weak_descr;
 
 static scm_t_weak_entry *
 allocate_entry (scm_t_weak_table_kind kind)
@@ -225,20 +165,18 @@ allocate_entry (scm_t_weak_table_kind kind)
   switch (kind)
     {
     case SCM_WEAK_TABLE_KIND_KEY:
-      ret = GC_generic_malloc (sizeof (*ret), weak_key_gc_kind);
+      ret = GC_malloc_explicitly_typed (sizeof (*ret), weak_key_descr);
       break;
     case SCM_WEAK_TABLE_KIND_VALUE:
-      ret = GC_generic_malloc (sizeof (*ret), weak_value_gc_kind);
+      ret = GC_malloc_explicitly_typed (sizeof (*ret), weak_value_descr);
       break;
     case SCM_WEAK_TABLE_KIND_BOTH:
-      ret = GC_generic_malloc (sizeof (*ret), doubly_weak_gc_kind);
+      ret = GC_malloc_explicitly_typed (sizeof (*ret), doubly_weak_descr);
       break;
     default:
       abort ();
     }
 
-  memset (ret, 0, sizeof (*ret));
-
   return ret;
 }
 
@@ -852,18 +790,23 @@ SCM_DEFINE (scm_doubly_weak_hash_table_p, "doubly-weak-hash-table?", 1, 0, 0,
 void
 scm_weak_table_prehistory (void)
 {
-  weak_key_gc_kind =
-    GC_new_kind (GC_new_free_list (),
-		 GC_MAKE_PROC (GC_new_proc (mark_weak_key_entry), 0),
-		 0, 0);
-  weak_value_gc_kind =
-    GC_new_kind (GC_new_free_list (),
-		 GC_MAKE_PROC (GC_new_proc (mark_weak_value_entry), 0),
-		 0, 0);
-  doubly_weak_gc_kind =
-    GC_new_kind (GC_new_free_list (),
-		 GC_MAKE_PROC (GC_new_proc (mark_doubly_weak_entry), 0),
-		 0, 0);
+  GC_word weak_key_bitmap[GC_BITMAP_SIZE (scm_t_weak_entry)] = { 0 };
+  GC_word weak_value_bitmap[GC_BITMAP_SIZE (scm_t_weak_entry)] = { 0 };
+  GC_word doubly_weak_bitmap[GC_BITMAP_SIZE (scm_t_weak_entry)] = { 0 };
+
+  GC_set_bit (weak_key_bitmap, GC_WORD_OFFSET (scm_t_weak_entry, next));
+  GC_set_bit (weak_value_bitmap, GC_WORD_OFFSET (scm_t_weak_entry, next));
+  GC_set_bit (doubly_weak_bitmap, GC_WORD_OFFSET (scm_t_weak_entry, next));
+
+  GC_set_bit (weak_key_bitmap, GC_WORD_OFFSET (scm_t_weak_entry, value));
+  GC_set_bit (weak_value_bitmap, GC_WORD_OFFSET (scm_t_weak_entry, key));
+
+  weak_key_descr = GC_make_descriptor (weak_key_bitmap,
+                                       GC_WORD_LEN (scm_t_weak_entry));
+  weak_value_descr = GC_make_descriptor (weak_value_bitmap,
+                                         GC_WORD_LEN (scm_t_weak_entry));
+  doubly_weak_descr = GC_make_descriptor (doubly_weak_bitmap,
+                                          GC_WORD_LEN (scm_t_weak_entry));
 }
 
 void
-- 
2.14.1

[0003-More-robust-vacuuming-of-in-use-weak-tables.patch (text/x-patch, inline)]
From 3304f9dc2cf106426570acc8437b4e39fe5edf91 Mon Sep 17 00:00:00 2001
From: Andy Wingo <wingo <at> pobox.com>
Date: Tue, 31 Oct 2017 08:43:09 +0100
Subject: [PATCH 3/3] More robust vacuuming of in-use weak tables

* libguile/weak-table.c (scm_t_weak_table); Add last_gc_no member.
* libguile/weak-table.c (vacuum_weak_table): Only vacuum if we haven't
  done so since the last GC.
  (scm_c_weak_table_ref, scm_c_weak_table_put_x, scm_c_weak_table_remove_x)
  (scm_c_weak_table_fold): Vacuum the weak table if needed.
  (scm_weak_table_clear_x): Update last_gc_no flag, as no more vacuuming
  will be needed.
---
 libguile/weak-table.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/libguile/weak-table.c b/libguile/weak-table.c
index fab98149f..461d4a47c 100644
--- a/libguile/weak-table.c
+++ b/libguile/weak-table.c
@@ -31,7 +31,6 @@
 #include "libguile/hash.h"
 #include "libguile/eval.h"
 #include "libguile/ports.h"
-
 #include "libguile/validate.h"
 #include "libguile/weak-list.h"
 #include "libguile/weak-table.h"
@@ -141,6 +140,7 @@ typedef struct {
   unsigned long upper;		/* when to grow */
   int size_index;		/* index into hashtable_size */
   int min_size_index;		/* minimum size_index */
+  GC_word last_gc_no;
 } scm_t_weak_table;
 
 
@@ -275,8 +275,14 @@ resize_table (scm_t_weak_table *table)
 static void
 vacuum_weak_table (scm_t_weak_table *table)
 {
+  GC_word gc_no = GC_get_gc_no ();
   unsigned long k;
 
+  if (gc_no == table->last_gc_no)
+    return;
+
+  table->last_gc_no = gc_no;
+
   for (k = 0; k < table->n_buckets; k++)
     {
       scm_t_weak_entry **loc = table->buckets + k;
@@ -427,6 +433,7 @@ make_weak_table (unsigned long k, scm_t_weak_table_kind kind)
   table->upper = 9 * n / 10;
   table->size_index = i;
   table->min_size_index = i;
+  table->last_gc_no = GC_get_gc_no ();
   scm_i_pthread_mutex_init (&table->lock, NULL);
 
   return scm_cell (scm_tc7_weak_table, (scm_t_bits)table);
@@ -456,8 +463,10 @@ do_vacuum_weak_table (SCM table)
      custom predicate, or via finalizers run explicitly by (gc) or in an
      async (for non-threaded Guile).  We add a restriction that
      prohibits the first case, by convention.  But since we can't
-     prohibit the second case, here we trylock instead of lock.  Not so
-     nice.  */
+     prohibit the second case, here we trylock instead of lock.  In any
+     case, if the mutex is held by another thread, then the table is in
+     active use, so the next user of the table will handle the vacuum
+     for us.  */
   if (scm_i_pthread_mutex_trylock (&t->lock) == 0)
     {
       vacuum_weak_table (t);
@@ -513,6 +522,8 @@ scm_c_weak_table_ref (SCM table, unsigned long raw_hash,
 
   scm_i_pthread_mutex_lock (&t->lock);
 
+  vacuum_weak_table (t);
+
   ret = weak_table_ref (t, raw_hash, pred, closure, dflt);
 
   scm_i_pthread_mutex_unlock (&t->lock);
@@ -535,6 +546,8 @@ scm_c_weak_table_put_x (SCM table, unsigned long raw_hash,
 
   scm_i_pthread_mutex_lock (&t->lock);
 
+  vacuum_weak_table (t);
+
   weak_table_put_x (t, raw_hash, pred, closure, key, value);
 
   scm_i_pthread_mutex_unlock (&t->lock);
@@ -555,6 +568,8 @@ scm_c_weak_table_remove_x (SCM table, unsigned long raw_hash,
 
   scm_i_pthread_mutex_lock (&t->lock);
 
+  vacuum_weak_table (t);
+
   weak_table_remove_x (t, raw_hash, pred, closure);
 
   scm_i_pthread_mutex_unlock (&t->lock);
@@ -604,6 +619,8 @@ scm_weak_table_clear_x (SCM table)
 
   scm_i_pthread_mutex_lock (&t->lock);
 
+  t->last_gc_no = GC_get_gc_no ();
+
   for (k = 0; k < t->n_buckets; k++)
     {
       for (entry = t->buckets[k]; entry; entry = entry->next)
@@ -628,6 +645,8 @@ scm_c_weak_table_fold (scm_t_table_fold_fn proc, void *closure,
 
   scm_i_pthread_mutex_lock (&t->lock);
 
+  vacuum_weak_table (t);
+
   for (k = 0; k < t->n_buckets; k++)
     {
       scm_t_weak_entry *entry;
-- 
2.14.1


Information forwarded to bug-guile <at> gnu.org:
bug#19180; Package guile. (Tue, 31 Oct 2017 16:57:01 GMT) Full text and rfc822 format available.

Message #107 received at 19180 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> igalia.com>
Cc: 19180 <at> debbugs.gnu.org, Christopher Allan Webber <cwebber <at> dustycloud.org>,
 Ricardo Wurmus <ricardo.wurmus <at> mdc-berlin.de>, guile-devel <at> gnu.org
Subject: Re: Weak tables harmful to GC?
Date: Tue, 31 Oct 2017 17:56:52 +0100
Heya!

Andy Wingo <wingo <at> igalia.com> skribis:

> On Tue 31 Oct 2017 00:13, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> I built libguile with the patch (I haven’t yet taken the time to rebuild
>> all of Guile).  It works, but unfortunately it still shows quick growth
>> of the heap on this example (<https://bugs.gnu.org/28590>):
>
> Fixed in attached patches (on top of the other one).  This was a race
> between the periodic vacuum process, which should run after GC via a
> finalizer, and the mutator.  If the mutator had the weak table lock, the
> vacuum would never be run.  Of course in the test below, the mutator
> usually has the table lock, so we ended up often skipping the vacuum,
> which causes the table size to grow, which causes the active heap size
> to grow, which causes the bytes-allocated-before-GC to increase, and
> which ultimately is a vicious circle.
>
> In my tests this patch fixes the issue, though the level at which the
> heap stabilizes can vary slightly as there's a bit of nondeterministic
> concurrency as the mutator and the vacuum process still race a bit.
> Right now for me it stabilizes at about 6.2M of heap.

Yes.  I can confirm that it fixes this issue.

One of my benchmarks is to ‘read’ the 26M file that contains the
CPS-as-sexps representation of gnu/packages/python.scm, with
source-location recording on (thus with a big source property table).

Compared to 2.2.2, execution time is almost divided by two; heap size is
usually 540M and 620M, whereas it varies between 560M and 920M with
2.2.2:

--8<---------------cut here---------------start------------->8---
ludo <at> ribbon ~/src/guix$ guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 8.23 10.00  0.06   0.00   0.00   5.17

;;; (#<weak-table 1863537/3595271>)
with Guile 2.2.2:
heap-size from 2.53 MiB to 780.03 MiB
ludo <at> ribbon ~/src/guix$ guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 8.29  9.94  0.04   0.00   0.00   4.97

;;; (#<weak-table 1820831/3595271>)
with Guile 2.2.2:
heap-size from 2.53 MiB to 660.36 MiB
ludo <at> ribbon ~/src/guix$ guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 8.11  9.33  0.05   0.00   0.00   3.90

;;; (#<weak-table 1829710/3595271>)
with Guile 2.2.2:
heap-size from 2.53 MiB to 562.21 MiB
ludo <at> ribbon ~/src/guix$ guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 8.23  9.90  0.08   0.00   0.00   5.22

;;; (#<weak-table 1828628/3595271>)
with Guile 2.2.2:
heap-size from 2.53 MiB to 918.09 MiB
ludo <at> ribbon ~/src/guix$ /data/src/guile-2.1/meta/guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 4.68  6.71  0.02   0.00   0.00   3.88

;;; (#<weak-table 1799062/3595271>)
with Guile 2.2.2.17-8069-dirty:
heap-size from 1.89 MiB to 540.54 MiB
ludo <at> ribbon ~/src/guix$ /data/src/guile-2.1/meta/guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 4.73  6.77  0.05   0.00   0.00   4.03

;;; (#<weak-table 1801035/3595271>)
with Guile 2.2.2.17-8069-dirty:
heap-size from 1.89 MiB to 620.22 MiB
ludo <at> ribbon ~/src/guix$ /data/src/guile-2.1/meta/guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 4.69  6.80  0.06   0.00   0.00   3.94

;;; (#<weak-table 1796895/3595271>)
with Guile 2.2.2.17-8069-dirty:
heap-size from 1.89 MiB to 573.36 MiB
ludo <at> ribbon ~/src/guix$ /data/src/guile-2.1/meta/guile profile-cps-read.scm
clock utime stime cutime cstime gctime
 4.67  6.81  0.06   0.00   0.00   4.05

;;; (#<weak-table 1797102/3595271>)
with Guile 2.2.2.17-8069-dirty:
heap-size from 1.89 MiB to 547.39 MiB
--8<---------------cut here---------------end--------------->8---

Another benchmark is the ‘emit-bytecode’ procedure for this CPS, without
optimizations turned off (like ‘%lightweight-optimizations’ in Guix.)

The heap size after the ‘emit-bytecode’ call is still at 1.3G (338M
before the call), about the same as with 2.2.2.  That’s not surprising
because I think memory consumption comes from the data structures used
at that compilation stage, as discussed before.

The wall-clock time is ~45s, whereas it’s ~54s with 2.2.2.

In other news, I’ve rebuilt 2.2.2 + these patches with Guix, and
everything went fine (Guile processes seem to peak at ~150M resident
when compiling).

So, all the lights are green on my side!

Thanks a whole lot for coming up with this solution!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 29 Nov 2017 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 143 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.