GNU bug report logs - #10641
[2.0.3+] start_signal_delivery_thread failure on x86_64-freebsd8.2

Package: guile

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Sun, 29 Jan 2012 18:01:01 UTC

Severity: normal

Done: Andy Wingo <wingo <at> pobox.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 10641 in the body.
You can then email your comments to 10641 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guile <at> gnu.org:
bug#10641; Package guile. (Sun, 29 Jan 2012 18:01:02 GMT)

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sun, 29 Jan 2012 18:01:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-guile <at> gnu.org
Subject: [2.0.3+] start_signal_delivery_thread failure on x86_64-freebsd8.2
Date: Sun, 29 Jan 2012 19:00:22 +0100
Hello!

For the record, scm_spawn_thread sometimes returns #f instead of a valid
thread when called from start_signal_delivery_thread, which is itself
called from the pthread key destructor.

For this reason, I added an assertion check in commit 0f4f2d9a, which
gets hit systematically on that platform, for instance when running
standalone/test-scm-spawn-thread.

The backtrace looks like this:

--8<---------------cut here---------------start------------->8---
(gdb) thread apply all bt

Thread 3 (Thread 801a041c0 (LWP 100363)):
#0  0x00000008012903cc in __error () from /lib/libthr.so.3
#1  0x000000080128e501 in pthread_cond_signal () from /lib/libthr.so.3
#2  0x00000008007196e1 in scm_pthread_cond_timedwait (cond=0x640e28, mutex=0x640ca8, wt=0x7fffffffd980) at ../../libguile/threads.c:2024
#3  0x00000008007198cd in block_self (queue=0x855e20, sleep_object=Variable "sleep_object" is not available.
) at ../../libguile/threads.c:454
#4  0x0000000800719d65 in scm_join_thread_timed (thread=0x855e50, timeout=0x13c963492, timeoutval=Variable "timeoutval" is not available.
) at ../../libguile/threads.c:1295
#5  0x0000000000400c33 in inner_main (data=Variable "data" is not available.
) at ../../../test-suite/standalone/test-scm-spawn-thread.c:55
#6  0x00000008006a067a in c_body (d=0x7fffffffdbe0) at ../../libguile/continuations.c:518
#7  0x0000000800727803 in vm_regular_engine (vm=0x6add50, program=0x6b00c0, argv=0x1, nargs=8680160) at vm-i-system.c:1007
#8  0x000000080071fbc6 in scm_c_vm_run (vm=0x6add50, program=0x785930, argv=0x7fffffffdb40, nargs=4) at ../../libguile/vm.c:567
#9  0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#10 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffffffdbe0, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffffffdbe0,
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#11 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#12 0x000000080071b470 in with_guile_and_parent (base=0x7fffffffdc40, data=Variable "data" is not available.
) at ../../libguile/threads.c:913
#13 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#14 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#15 0x0000000000400b80 in main (argc=Variable "argc" is not available.
) at ../../../test-suite/standalone/test-scm-spawn-thread.c:68

Thread 2 (Thread 801a0ae40 (LWP 100280)):
#0  0x00000008016250dc in thr_kill () from /lib/libc.so.7
#1  0x00000008016c1dcb in abort () from /lib/libc.so.7
#2  0x00000008016ab1a5 in __assert () from /lib/libc.so.7
#3  0x000000080071ab2d in scm_spawn_thread (body=Variable "body" is not available.
) at ../../libguile/threads.c:1175
#4  0x00000008006f7ab1 in start_signal_delivery_thread () at ../../libguile/scmsigs.c:211
#5  0x000000080128d9c8 in pthread_once () from /lib/libthr.so.3
#6  0x000000080071b1b0 in do_thread_exit (v=Variable "v" is not available.
) at ../../libguile/threads.c:666
#7  0x00000008006a067a in c_body (d=0x7fffffbfedc0) at ../../libguile/continuations.c:518
#8  0x0000000800727803 in vm_regular_engine (vm=0x855e00, program=0x8570c0, argv=0x1, nargs=9272864) at vm-i-system.c:1007
#9  0x000000080071fbc6 in scm_c_vm_run (vm=0x855e00, program=0x785930, argv=0x7fffffbfed20, nargs=4) at ../../libguile/vm.c:567
#10 0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#11 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffffbfedc0, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffffbfedc0,
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#12 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#13 0x0000000801153bc8 in GC_call_with_gc_active () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#14 0x000000080071b411 in with_guile_and_parent (base=0x7fffffbfee60, data=Variable "data" is not available.
) at ../../libguile/threads.c:236
#15 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#16 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#17 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#18 0x000000080071af2e in on_thread_exit (v=Variable "v" is not available.
) at ../../libguile/threads.c:748
#19 0x000000080128985b in pthread_key_delete () from /lib/libthr.so.3
#20 0x000000080128f5f3 in pthread_exit () from /lib/libthr.so.3
#21 0x00000008012864f9 in pthread_getprio () from /lib/libthr.so.3
#22 0x0000000000000000 in ?? ()
Cannot access memory at address 0x7fffffbff000

Thread 1 (Thread 801a35c80 (LWP 100403)):
#0  0x00000008016c4fcc in write () from /lib/libc.so.7
#1  0x00000008016c4a60 in memcpy () from /lib/libc.so.7
#2  0x00000008016c49ab in memcpy () from /lib/libc.so.7
#3  0x00000008016c359d in f_prealloc () from /lib/libc.so.7
#4  0x00000008016c2d5c in fwrite () from /lib/libc.so.7
#5  0x00000008016b5f3b in open () from /lib/libc.so.7
#6  0x00000008016b74e9 in open () from /lib/libc.so.7
#7  0x00000008016b98da in vfprintf () from /lib/libc.so.7
#8  0x00000008016a794a in printf () from /lib/libc.so.7
#9  0x000000080071b040 in really_spawn (d=0x7fffffbfeaa0) at ../../libguile/threads.c:1100
#10 0x00000008006a067a in c_body (d=0x7fffff9fde70) at ../../libguile/continuations.c:518
#11 0x0000000800727803 in vm_regular_engine (vm=0x801fc0, program=0x8d80c0, argv=0x1, nargs=9272448) at vm-i-system.c:1007
#12 0x000000080071fbc6 in scm_c_vm_run (vm=0x801fc0, program=0x785930, argv=0x7fffff9fddd0, nargs=4) at ../../libguile/vm.c:567
#13 0x00000008006a7ce3 in scm_call_4 (proc=0x785930, arg1=Variable "arg1" is not available.
) at ../../libguile/eval.c:506
#14 0x00000008006a09f9 in scm_i_with_continuation_barrier (body=0x8006a0670 <c_body>, body_data=0x7fffff9fde70, handler=0x8006a08b0 <c_handler>, handler_data=0x7fffff9fde70, 
    pre_unwind_handler=0x8006a0710 <pre_unwind_handler>, pre_unwind_handler_data=0x6adc80) at ../../libguile/continuations.c:455
#15 0x00000008006a0ad5 in scm_c_with_continuation_barrier (func=Variable "func" is not available.
) at ../../libguile/continuations.c:552
#16 0x000000080071b470 in with_guile_and_parent (base=0x7fffff9fded0, data=Variable "data" is not available.
) at ../../libguile/threads.c:913
#17 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#18 0x000000080071abf1 in scm_i_with_guile_and_parent (func=Variable "func" is not available.
) at ../../libguile/threads.c:959
#19 0x000000080071ac74 in spawn_thread (d=Variable "d" is not available.
) at ../../libguile/threads.c:1133
#20 0x000000080115337c in GC_inner_start_routine () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#21 0x000000080114ed75 in GC_call_with_stack_base () from /nix/store/rc9flhds6pwpb9wvi55v2f9h7mkzsj0x-boehm-gc-7.2pre20110122/lib/libgc.so.1
#22 0x00000008012864f1 in pthread_getprio () from /lib/libthr.so.3
#23 0x0000000000000000 in ?? ()
Cannot access memory at address 0x7fffff9fe000
#0  0x00000008016250dc in thr_kill () from /lib/libc.so.7
--8<---------------cut here---------------end--------------->8---

Adding printfs shows that the thread calling scm_spawn_thread leaves
cond_wait before the signal thread has signaled the condition (in
really_spawn).

A similar program succeeds, suggesting that it’s not a bug/limitation of
libpthread:

--8<---------------cut here---------------start------------->8---
#include <pthread.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>

#define GC_THREADS 1
#include <gc/gc.h>

struct sync
{
  pthread_cond_t cond;
  pthread_mutex_t mutex;
};

static void *
hello (void *arg)
{
  int err;
  struct sync *s = (struct sync *) arg;

  printf ("hello from %p\n", (void *) pthread_self ());
  GC_MALLOC (123);

  err = pthread_mutex_lock (&s->mutex);
  assert (err == 0);

  err = pthread_cond_signal (&s->cond);
  assert (err == 0);

  err = pthread_mutex_unlock (&s->mutex);
  assert (err == 0);

  return NULL;
}

static void
on_thread_exit (void *unused)
{
  int err;
  pthread_t child;
  struct sync s;

  pthread_mutex_init (&s.mutex, NULL);
  pthread_cond_init (&s.cond, NULL);

  pthread_mutex_lock (&s.mutex);
  err = pthread_create (&child, NULL, hello, &s);
  assert (err == 0);
  err = pthread_cond_wait (&s.cond, &s.mutex);
  assert (err == 0);
  err = pthread_mutex_unlock (&s.mutex);
  assert (err == 0);

  printf ("child %p seen from %p\n", (void *) child, (void *) pthread_self ());
}

static void *
entry (void *unused)
{
  pthread_key_t k;
  pthread_key_create (&k, on_thread_exit);
  pthread_setspecific (k, (void *) 123);
  return NULL;
}

int
main ()
{
  int err;
  pthread_t child;
  void *ret;

  GC_INIT ();

  err = pthread_create (&child, NULL, entry, NULL);
  assert (err == 0);

  err = pthread_join (child, &ret);
  assert (err == 0);

  return 0;
}
--8<---------------cut here---------------end--------------->8---

To be continued...

Ludo’.




Reply sent to Andy Wingo <wingo <at> pobox.com>:
You have taken responsibility. (Wed, 13 Mar 2013 10:04:02 GMT)

Notification sent to ludo <at> gnu.org (Ludovic Courtès):
bug acknowledged by developer. (Wed, 13 Mar 2013 10:04:02 GMT)

Message #10 received at 10641-done <at> debbugs.gnu.org:

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 10641-done <at> debbugs.gnu.org
Subject: Re: bug#10641: [2.0.3+] start_signal_delivery_thread failure on
	x86_64-freebsd8.2
Date: Wed, 13 Mar 2013 11:02:41 +0100
Hi,

On Sun 29 Jan 2012 19:00, ludo <at> gnu.org (Ludovic Courtès) writes:

> Adding printfs shows that the thread calling scm_spawn_thread leaves
> cond_wait before the signal thread has signaled the condition (in
> really_spawn).

From http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_cond_wait.html

  When using condition variables there is always a Boolean predicate
  involving shared variables associated with each condition wait that is
  true if the thread should proceed. Spurious wakeups from the
  pthread_cond_timedwait() or pthread_cond_wait() functions may
  occur. Since the return from pthread_cond_timedwait() or
  pthread_cond_wait() does not imply anything about the value of this
  predicate, the predicate should be re-evaluated upon such return.

It seems this code is not robust in the face of spurious wakeups.  I
pushed a patch that waits for data.thread to become non-false.  That
should fix this issue.
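
To illustrate the kind of fix described above, here is a minimal sketch of
a spawn-and-wait handshake that tolerates spurious wakeups.  It is not the
actual Guile patch: struct spawn_sync, its started field, child_main and
spawn_and_wait are hypothetical names made up for this example.  The point
is only that the parent re-checks a predicate in a loop around
pthread_cond_wait instead of trusting a single return from it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical handshake state shared by parent and child.  */
struct spawn_sync
{
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  int started;                  /* predicate: set by the child */
};

static void *
child_main (void *arg)
{
  struct spawn_sync *s = arg;
  int err;

  err = pthread_mutex_lock (&s->mutex);
  assert (err == 0);

  /* Update the predicate first, then signal, both under the mutex.  */
  s->started = 1;
  err = pthread_cond_signal (&s->cond);
  assert (err == 0);

  err = pthread_mutex_unlock (&s->mutex);
  assert (err == 0);

  return NULL;
}

/* Spawn a child and block until it has announced itself.
   Returns 0 on success, or the pthread_create error code.  */
static int
spawn_and_wait (struct spawn_sync *s)
{
  pthread_t child;
  int err;

  err = pthread_mutex_init (&s->mutex, NULL);
  assert (err == 0);
  err = pthread_cond_init (&s->cond, NULL);
  assert (err == 0);
  s->started = 0;

  err = pthread_mutex_lock (&s->mutex);
  assert (err == 0);

  err = pthread_create (&child, NULL, child_main, s);
  if (err != 0)
    {
      pthread_mutex_unlock (&s->mutex);
      return err;
    }

  /* A return from pthread_cond_wait says nothing about the predicate,
     so loop until the child has really set it.  */
  while (!s->started)
    {
      err = pthread_cond_wait (&s->cond, &s->mutex);
      assert (err == 0);
    }

  err = pthread_mutex_unlock (&s->mutex);
  assert (err == 0);

  return pthread_join (child, NULL);
}
```

A caller would declare a struct spawn_sync, pass its address to
spawn_and_wait, and expect a return value of 0 with started set to 1.
With a bare pthread_cond_wait and no loop, the same caller could resume
before the child had run, which matches the symptom in the original report.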

Cheers,

Andy
-- 
http://wingolog.org/




Information forwarded to bug-guile <at> gnu.org:
bug#10641; Package guile. (Fri, 29 Mar 2013 09:53:01 GMT)

Message #13 received at 10641-done <at> debbugs.gnu.org:

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> pobox.com>
Cc: 10641-done <at> debbugs.gnu.org
Subject: Re: bug#10641: [2.0.3+] start_signal_delivery_thread failure on
	x86_64-freebsd8.2
Date: Fri, 29 Mar 2013 10:49:41 +0100
Andy Wingo <wingo <at> pobox.com> writes:

> On Sun 29 Jan 2012 19:00, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> Adding printfs shows that the thread calling scm_spawn_thread leaves
>> cond_wait before the signal thread has signaled the condition (in
>> really_spawn).
>
> From http://pubs.opengroup.org/onlinepubs/009604599/functions/pthread_cond_wait.html
>
>   When using condition variables there is always a Boolean predicate
>   involving shared variables associated with each condition wait that is
>   true if the thread should proceed. Spurious wakeups from the
>   pthread_cond_timedwait() or pthread_cond_wait() functions may
>   occur. Since the return from pthread_cond_timedwait() or
>   pthread_cond_wait() does not imply anything about the value of this
>   predicate, the predicate should be re-evaluated upon such return.
>
> It seems this code is not robust in the face of spurious wakeups.  I
> pushed a patch that waits for data.thread to become non-false.  That
> should fix this issue.

Good catch, and congratulations!  I can confirm that this fixes
--with-threads builds on FreeBSD 8.2:

  http://hydra.nixos.org/build/4519811

Thanks!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 26 Apr 2013 11:24:03 GMT)

This bug report was last modified 11 years and 1 day ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.