GNU bug report logs - #13611
SEGV during SMOB GC

Package: guile

Reported by: Mike Gran <spk121 <at> yahoo.com>

Date: Sat, 2 Feb 2013 20:53:02 UTC

Severity: normal

Done: Andy Wingo <wingo <at> pobox.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Sat, 02 Feb 2013 20:53:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mike Gran <spk121 <at> yahoo.com>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sat, 02 Feb 2013 20:53:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mike Gran <spk121 <at> yahoo.com>
To: Bug Guile <bug-guile <at> gnu.org>
Subject: SEGV during SMOB GC
Date: Sat, 2 Feb 2013 12:51:40 -0800 (PST)
[Message part 1 (text/plain, inline)]
Hello-

I have a reproducible SEGV during GC of SMOBs on Guile 2.0.7.
It was also present in 2.0.6.


To reproduce, compile main.c as

$ gcc -std=gnu99 -shared -o smobbug.so -Wall -Wextra `pkg-config guile-2.0 --cflags --libs` -fPIC main.c


Then start Guile with
$ LD_PRELOAD=./smobbug.so LD_LIBRARY_PATH=. GUILE_LOAD_PATH=. guile

;; At the REPL, load the library
 (use-modules (smobbug))

;; Make a SMOB to be GC'd
 (handlesmob-init)

;; Trigger a GC from the GC thread
 (string-length (make-string 10000000))

This gives

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0xb7d98b40 (LWP 20488)]
  0xb7f251ab in smob_mark (addr=0x8608ff0, mark_stack_ptr=0xb7d90308, 
      mark_stack_limit=0xb7d982f0, env=0) at smob.c:325
  325           SCM_I_CURRENT_THREAD->current_mark_stack_ptr = mark_stack_ptr;

Here's what's happening internally.  When Guile starts up, it creates 3
threads
* Initial thread
* GC thread from scm_storage_prehistory GC_INIT()
* signal delivery thread

That second thread is the one from which automatic garbage collection
occurs.  The way that thread gets created, it has an
scm_i_current_thread == NULL, apparently.


Dereferencing scm_i_current_thread in that thread is therefore a null
pointer dereference, and smob_mark() does exactly that whenever it marks
a SMOB that has a mark function.

-Mike
[smobbug.scm (text/x-scheme, attachment)]
[main.c (text/x-csrc, attachment)]
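
The failure mode described in the report can be illustrated outside
Guile. What follows is only a minimal sketch, not Guile's code: the
names thread_state, current_thread and mark_object are hypothetical
stand-ins for Guile's per-thread data, SCM_I_CURRENT_THREAD and
smob_mark. It assumes a C11 compiler and POSIX threads. A thread that
never registered itself sees a NULL thread-local pointer, so the mark
routine has to check the pointer before storing through it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for Guile's per-thread state. */
struct thread_state {
    void *current_mark_stack_ptr;
};

/* Stays NULL in any thread that never registered itself, which is the
   situation of a collector-owned marker thread in the report. */
static _Thread_local struct thread_state *current_thread = NULL;

/* Hypothetical stand-in for smob_mark.  Returns 1 if the per-thread
   slot was updated, 0 if the update was skipped because the calling
   thread has no state. */
static int mark_object(void *mark_stack_ptr)
{
    if (current_thread == NULL)
        return 0;   /* an unguarded store here would be the SIGSEGV */
    current_thread->current_mark_stack_ptr = mark_stack_ptr;
    return 1;
}

static void *marker_thread(void *arg)
{
    int *result = arg;
    int dummy;
    *result = mark_object(&dummy);
    return NULL;
}

/* Calls mark_object once from a fresh thread that has no state. */
static int mark_from_unregistered_thread(void)
{
    int result = -1;
    pthread_t t;

    if (pthread_create(&t, NULL, marker_thread, &result) != 0)
        return -1;
    pthread_join(t, NULL);
    return result;
}

/* Calls mark_object from the calling thread after giving it state. */
static int mark_from_registered_thread(void)
{
    static _Thread_local struct thread_state state;
    int dummy;

    current_thread = &state;
    return mark_object(&dummy);
}
```

In this sketch mark_from_unregistered_thread() takes the guarded path
and mark_from_registered_thread() takes the normal one. Removing the
NULL check makes the first call crash in the same way as the backtrace
in the report.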

Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Tue, 05 Feb 2013 10:10:02 GMT) Full text and rfc822 format available.

Message #8 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mike Gran <spk121 <at> yahoo.com>
Cc: 13611 <at> debbugs.gnu.org
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 05 Feb 2013 11:07:51 +0100
[Message part 1 (text/plain, inline)]
Hi Mike,

Mike Gran <spk121 <at> yahoo.com> skribis:

> This gives
>
>   Program received signal SIGSEGV, Segmentation fault.
>   [Switching to Thread 0xb7d98b40 (LWP 20488)]
>   0xb7f251ab in smob_mark (addr=0x8608ff0, mark_stack_ptr=0xb7d90308, 
>       mark_stack_limit=0xb7d982f0, env=0) at smob.c:325
>   325           SCM_I_CURRENT_THREAD->current_mark_stack_ptr = mark_stack_ptr;
>
> Here's what's happening internally.  When Guile starts up, it creates 3
> threads
> * Initial thread
> * GC thread from scm_storage_prehistory GC_INIT()
> * signal delivery thread
>
> That second thread is the one from which automatic garbage collection
> occurs.  The way that thread gets created, it has an
> scm_i_current_thread == NULL, apparently.

Is there any chance that you’re using a GC 7.3 pre-release?

There was a similar report on IRC, and the fix appears to be:

[Message part 2 (text/x-patch, inline)]
--- a/libguile/smob.c
+++ b/libguile/smob.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 1995,1996,1998,1999,2000,2001, 2003, 2004, 2006, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
+/* Copyright (C) 1995,1996,1998,1999,2000,2001, 2003, 2004, 2006, 2009, 2010, 2011, 2012, 2013 Free Software Foundation, Inc.
  * 
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
@@ -318,7 +318,7 @@ smob_mark (GC_word *addr, struct GC_ms_entry *mark_stack_ptr,
 				     mark_stack_ptr,
 				     mark_stack_limit, NULL);
 
-  if (scm_smobs[smobnum].mark)
+  if (scm_smobs[smobnum].mark && SCM_I_CURRENT_THREAD != NULL)
     {
       SCM obj;
 
[Message part 3 (text/plain, inline)]
(Note that on 2.0 SMOB mark procedures are unnecessary.)

Ludo’.

Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Tue, 05 Feb 2013 16:32:01 GMT) Full text and rfc822 format available.

Message #11 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: Mike Gran <spk121 <at> yahoo.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: "13611 <at> debbugs.gnu.org" <13611 <at> debbugs.gnu.org>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 5 Feb 2013 08:29:48 -0800 (PST)
[Message part 1 (text/plain, inline)]
> From: Ludovic Courtès <ludo <at> gnu.org>

> Is there any chance that you’re using a GC 7.3 pre-release?

Using gc-7.2b-2.fc17.i686
on Linux 3.6.10-2.fc17.i686 #1 SMP 

> There was a similar report on IRC, and the fix appears to be:

It does fix my SEGV

> (Note that on 2.0 SMOB mark procedures are unnecessary.)

Cool.  Let's yank it from the manual.  Case closed.

Yet...

For what it is worth, I decided to get some statistics on how
often smob_mark is called from a thread with scm_i_current_thread
== NULL vs how often it is called from a thread where it is 
not null.

I wrote the attached patch, and then, using the same little
library as in my initial report, I ran

 (use-modules (smobbug))
 ;; Create a SMOB type
 (handlesmob-init)
 (for-each (lambda (x) (gc))
       (iota 1000))
 (gc-smob-mark-report)

This returned

 Count of GC SMOB marks from null thread: 176
 Count of GC SMOB marks from current thread: 825

Is that expected that GC is sometimes called from a 
thread where scm_i_current_thread is null and sometimes
called from a thread where scm_i_current_thread is
not null?

-Mike 
[smob_gc_mark_report.patch (text/x-patch, attachment)]
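
The attached patch is not reproduced in this log. As a rough
illustration of the kind of counting described in the message, here is a
minimal sketch under stated assumptions: it is not the attached patch,
and current_thread, count_mark and run_marks are hypothetical names. It
assumes C11 atomics and POSIX threads. Each call is tallied according to
whether the calling thread has a non-NULL thread-local pointer, and the
counters are atomic because the calls may come from several threads at
once.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for scm_i_current_thread: NULL unless the
   calling thread has registered itself. */
static _Thread_local void *current_thread = NULL;

/* Atomic tallies, since marking may happen on several threads. */
static atomic_ulong marks_from_null_thread;
static atomic_ulong marks_from_registered_thread;

/* Called once per simulated mark; records which kind of thread ran it. */
static void count_mark(void)
{
    if (current_thread == NULL)
        atomic_fetch_add(&marks_from_null_thread, 1);
    else
        atomic_fetch_add(&marks_from_registered_thread, 1);
}

static void *unregistered_worker(void *arg)
{
    unsigned long n = *(unsigned long *) arg;

    for (unsigned long i = 0; i < n; i++)
        count_mark();
    return NULL;
}

/* Performs `registered` marks on the calling thread (after registering
   it) and `unregistered` marks on a helper thread that never registers.
   Returns 0 on success, -1 if the helper thread could not be run. */
static int run_marks(unsigned long registered, unsigned long unregistered)
{
    static int token;   /* any non-NULL address serves as "state" here */
    pthread_t helper;

    if (pthread_create(&helper, NULL, unregistered_worker, &unregistered) != 0)
        return -1;

    current_thread = &token;
    for (unsigned long i = 0; i < registered; i++)
        count_mark();

    return pthread_join(helper, NULL) == 0 ? 0 : -1;
}
```

After run_marks(5, 3), the two counters can be read with atomic_load()
and printed in the same "from null thread" / "from current thread" form
as the report above.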

Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Tue, 05 Feb 2013 16:43:01 GMT) Full text and rfc822 format available.

Message #14 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mike Gran <spk121 <at> yahoo.com>
Cc: "13611 <at> debbugs.gnu.org" <13611 <at> debbugs.gnu.org>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 05 Feb 2013 17:41:22 +0100
Hi,

Mike Gran <spk121 <at> yahoo.com> skribis:

>> From: Ludovic Courtès <ludo <at> gnu.org>
>
>> Is there any chance that you’re using a GC 7.3 pre-release?
>
> Using gc-7.2b-2.fc17.i686

OK.

>> There was a similar report on IRC, and the fix appears to be:
>
> It does fix my SEGV

Good.

[...]

>  Count of GC SMOB marks from null thread: 176
>  Count of GC SMOB marks from current thread: 825
>
> Is that expected that GC is sometimes called from a 
> thread where scm_i_current_thread is null and sometimes
> called from a thread where scm_i_current_thread is
> not null?

Can you check whether your GC was built with --enable-parallel-mark?

I’m confident that the SMOB mark procedure is never called with null
scm_i_current_thread with 7.2 compiled with the default options (the
GnuTLS bindings rely on this, and I had not seen any such report until
someone tried with GC 7.3pre, which uses the parallel marker by
default.)

Thanks!

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Tue, 05 Feb 2013 17:07:01 GMT) Full text and rfc822 format available.

Message #17 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: Mike Gran <spk121 <at> yahoo.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: "13611 <at> debbugs.gnu.org" <13611 <at> debbugs.gnu.org>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 5 Feb 2013 09:04:48 -0800 (PST)
>>  Is that expected that GC is sometimes called from a 
>>  thread where scm_i_current_thread is null and sometimes
>>  called from a thread where scm_i_current_thread is
>>  not null?
> 
> Can you check whether your GC was built with --enable-parallel-mark?
> 
> I’m confident that the SMOB mark procedure is never called with null
> scm_i_current_thread with 7.2 compiled with the default options (the
> GnuTLS bindings rely on this, and I had not seen any such report until
> someone tried with GC 7.3pre, which uses the parallel marker by
> default.)

It looks like fedora gc rpms do use --enable-parallel-mark
for x86 architectures.

You can see that here:
  http://pkgs.fedoraproject.org/cgit/gc.git/tree/gc.spec?h=f17

But it looks like it has been that way for a long time.
Since 2005 at least.

-Mike




Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Tue, 05 Feb 2013 21:15:01 GMT) Full text and rfc822 format available.

Message #20 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mike Gran <spk121 <at> yahoo.com>
Cc: "13611 <at> debbugs.gnu.org" <13611 <at> debbugs.gnu.org>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 05 Feb 2013 22:13:09 +0100
Mike Gran <spk121 <at> yahoo.com> skribis:

>>>  Is that expected that GC is sometimes called from a 
>>>  thread where scm_i_current_thread is null and sometimes
>>>  called from a thread where scm_i_current_thread is
>>>  not null?
>> 
>> Can you check whether your GC was built with --enable-parallel-mark?
>> 
>> I’m confident that the SMOB mark procedure is never called with null
>> scm_i_current_thread with 7.2 compiled with the default options (the
>> GnuTLS bindings rely on this, and I had not seen any such report until
>> someone tried with GC 7.3pre, which uses the parallel marker by
>> default.)
>
> It looks like fedora gc rpms do use --enable-parallel-mark
> for x86 architectures.

Then that’s the problem.

> But it looks like it has been that way for a long time.
> Since 2005 at least.

And you did not have the problem before?  That part of Guile hasn’t
changed in a long time, I think.

Ludo’.




Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Wed, 06 Feb 2013 04:58:02 GMT) Full text and rfc822 format available.

Message #23 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: Mike Gran <spk121 <at> yahoo.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: "13611 <at> debbugs.gnu.org" <13611 <at> debbugs.gnu.org>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Tue, 5 Feb 2013 20:56:18 -0800 (PST)
> From: Ludovic Courtès <ludo <at> gnu.org>

>>>  I’m confident that the SMOB mark procedure is never called with null
>>>  scm_i_current_thread with 7.2 compiled with the default options (the
>>>  GnuTLS bindings rely on this, and I had not seen any such report until
>>>  someone tried with GC 7.3pre, which uses the parallel marker by
>>>  default.)
>> 
>>  It looks like fedora gc rpms do use --enable-parallel-mark
>>  for x86 architectures.
> 
> Then that’s the problem.
> 
>>  But it looks like it has been that way for a long time.
>>  Since 2005 at least.
> 
> And you did not have the problem before?  That part of Guile hasn’t
> changed in a long time, I think.

I have a different box than before: more cores.

Well, I guess that, for my libraries, I can make a preprocessor conditional
on SCM_MAJOR_VERSION == 2 to eliminate all smob marking for guile-2.x.

Could parallel marking have other, non-SMOB-related, side effects?
I can disable it by setting the environment variable GC_MARKERS to 1.

Thanks,

Mike
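
The preprocessor conditional mentioned in this message could look
roughly like the following. This is a minimal sketch under stated
assumptions, not code from any of the libraries discussed: mark_fn,
handle_mark and handle_mark_function are hypothetical names, and
SCM_MAJOR_VERSION is given a fallback value only so the fragment
compiles without Guile's headers. In a real extension the macro comes
from <libguile.h>, and the function returned here would be the one
handed to scm_set_smob_mark().

```c
#include <stddef.h>

/* Stand-in so the sketch builds without Guile installed; real code
   gets SCM_MAJOR_VERSION from <libguile.h>. */
#ifndef SCM_MAJOR_VERSION
#define SCM_MAJOR_VERSION 2
#endif

/* Hypothetical simplified signature for a SMOB mark function. */
typedef void *(*mark_fn)(void *obj);

#if SCM_MAJOR_VERSION != 2
/* Only compiled for Guile versions where a mark function is wanted. */
static void *handle_mark(void *obj)
{
    (void) obj;     /* a real mark function would mark referenced objects */
    return NULL;
}
#endif

/* Returns the mark function to register for the SMOB type, or NULL
   when no mark function should be registered at all. */
static mark_fn handle_mark_function(void)
{
#if SCM_MAJOR_VERSION == 2
    return NULL;            /* guile-2.x: skip SMOB marking entirely */
#else
    return handle_mark;
#endif
}
```

The type-setup code would then call the registration function only when
handle_mark_function() returns a non-NULL pointer, so no mark procedure
is ever installed on guile-2.x.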





Information forwarded to bug-guile <at> gnu.org:
bug#13611; Package guile. (Fri, 01 Mar 2013 17:05:01 GMT) Full text and rfc822 format available.

Message #26 received at 13611 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mike Gran <spk121 <at> yahoo.com>
Cc: 13611 <at> debbugs.gnu.org
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Fri, 01 Mar 2013 18:02:32 +0100
Hi Mike,

AFAICS, commit 01b69e7 fixes the problem for me.  I tested with your
test-smob-mark.c program, both with a 7.2ish and 7.3ish libgc.

Can you confirm that it works for you, and commit your test case with
the changes as discussed on the list?

Thanks!

Ludo’.




Reply sent to Andy Wingo <wingo <at> pobox.com>:
You have taken responsibility. (Wed, 13 Mar 2013 12:44:02 GMT) Full text and rfc822 format available.

Notification sent to Mike Gran <spk121 <at> yahoo.com>:
bug acknowledged by developer. (Wed, 13 Mar 2013 12:44:02 GMT) Full text and rfc822 format available.

Message #31 received at 13611-done <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 13611-done <at> debbugs.gnu.org, Mike Gran <spk121 <at> yahoo.com>
Subject: Re: bug#13611: SEGV during SMOB GC
Date: Wed, 13 Mar 2013 13:42:38 +0100
On Fri 01 Mar 2013 18:02, ludo <at> gnu.org (Ludovic Courtès) writes:

> AFAICS, commit 01b69e7 fixes the problem for me.  I tested with your
> test-smob-mark.c program, both with a 7.2ish and 7.3ish libgc.
>
> Can you confirm that it works for you, and commit your test case with
> the changes as discussed on the list?

Changes committed, closing bug.  Thanks, all!

Andy
-- 
http://wingolog.org/




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 11 Apr 2013 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 354 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.