GNU bug report logs - #21694
'clone' syscall binding unreliable

Previous Next

Package: guix;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Fri, 16 Oct 2015 20:41:02 UTC

Severity: normal

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21694 in the body.
You can then email your comments to 21694 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Fri, 16 Oct 2015 20:41:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 16 Oct 2015 20:41:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: David Thompson <davet <at> gnu.org>
Cc: bug-guix <at> gnu.org
Subject: 'clone' syscall binding unreliable
Date: Fri, 16 Oct 2015 22:39:59 +0200
[Message part 1 (text/plain, inline)]
I’m reporting the problem and (hopefully) the solution, but I think we’d
better double-check this.

The problem: Running the test below in a loop sometimes gets a SIGSEGV
in the child process (on x86_64, libc 2.22.)

--8<---------------cut here---------------start------------->8---
(use-modules (guix build syscalls) (ice-9 match))

(match (clone (logior CLONE_NEWUSER
                      CLONE_CHILD_SETTID
                      CLONE_CHILD_CLEARTID
                      SIGCHLD))
  (0
   (throw 'x))                                    ;XXX: sometimes segfaults
  (pid
   (match (waitpid pid)
     ((_ . status)
      (pk 'status status)
      (exit (not (status:term-sig status)))))))
--8<---------------cut here---------------end--------------->8---

Looking at (guix build syscalls) though, I see an ABI mismatch between
our definition and the actual ‘syscall’ C function, and between our
‘clone’ definition and the actual C function.

This leads to the attached patch, which also fixes the above problem for me.

[Message part 2 (text/x-patch, inline)]
diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index 80b9d00..f931f8d 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -322,10 +322,16 @@ string TMPL and return its file name.  TMPL must end with 'XXXXXX'."
 (define CLONE_NEWNET         #x40000000)
 
 ;; The libc interface to sys_clone is not useful for Scheme programs, so the
-;; low-level system call is wrapped instead.
+;; low-level system call is wrapped instead.  The 'syscall' function is
+;; declared in <unistd.h> as a variadic function; in practice, it expects 6
+;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
 (define clone
   (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
-         (proc       (pointer->procedure int ptr (list int int '*)))
+         (proc       (pointer->procedure long ptr
+                                         (list long                   ;sysno
+                                               unsigned-long          ;flags
+                                               '* '* '*
+                                               '*)))
          ;; TODO: Don't do this.
          (syscall-id (match (utsname:machine (uname))
                        ("i686"   120)
@@ -336,7 +342,10 @@ string TMPL and return its file name.  TMPL must end with 'XXXXXX'."
       "Create a new child process by duplicating the current parent process.
 Unlike the fork system call, clone accepts FLAGS that specify which resources
 are shared between the parent and child processes."
-      (let ((ret (proc syscall-id flags %null-pointer))
+      (let ((ret (proc syscall-id flags
+                       %null-pointer               ;child stack
+                       %null-pointer %null-pointer ;ptid & ctid
+                       %null-pointer))             ;unused
             (err (errno)))
         (if (= ret -1)
             (throw 'system-error "clone" "~d: ~A"
[Message part 3 (text/plain, inline)]
Could you test this patch?

Now, there remains the question of CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
that these flags have no effect at all.

Conversely, libc uses these flags to update the thread ID in the child
process (x86_64/arch-fork.h):

--8<---------------cut here---------------start------------->8---
#define ARCH_FORK() \
  INLINE_SYSCALL (clone, 4,                                                   \
                  CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
                  NULL, &THREAD_SELF->tid)
--8<---------------cut here---------------end--------------->8---

This is certainly useful, but we’d have troubles doing it from the FFI…
It may that this is fine if the process doesn’t use threads.

Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Fri, 16 Oct 2015 23:13:02 GMT) Full text and rfc822 format available.

Message #8 received at 21694 <at> debbugs.gnu.org (full text, mbox):

From: "Thompson, David" <dthompson2 <at> worcester.edu>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 21694 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Fri, 16 Oct 2015 19:12:53 -0400
On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:
> I’m reporting the problem and (hopefully) the solution, but I think we’d
> better double-check this.
>
> The problem: Running the test below in a loop sometimes gets a SIGSEGV
> in the child process (on x86_64, libc 2.22.)
>
> --8<---------------cut here---------------start------------->8---
> (use-modules (guix build syscalls) (ice-9 match))
>
> (match (clone (logior CLONE_NEWUSER
>                       CLONE_CHILD_SETTID
>                       CLONE_CHILD_CLEARTID
>                       SIGCHLD))
>   (0
>    (throw 'x))                                    ;XXX: sometimes segfaults
>   (pid
>    (match (waitpid pid)
>      ((_ . status)
>       (pk 'status status)
>       (exit (not (status:term-sig status)))))))
> --8<---------------cut here---------------end--------------->8---
>
> Looking at (guix build syscalls) though, I see an ABI mismatch between
> our definition and the actual ‘syscall’ C function, and between our
> ‘clone’ definition and the actual C function.
>
> This leads to the attached patch, which also fixes the above problem for me.
>
> Could you test this patch?

The patch looks good.  Thanks for catching this!

> Now, there remains the question of CLONE_CHILD_SETTID and
> CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
> that these flags have no effect at all.

I added those flags in commit ee78d02 because they solved a real issue
I ran into.  Adding those flags made 'clone' look like a
'primitive-fork' call when examined with strace.

> Conversely, libc uses these flags to update the thread ID in the child
> process (x86_64/arch-fork.h):
>
> --8<---------------cut here---------------start------------->8---
> #define ARCH_FORK() \
>   INLINE_SYSCALL (clone, 4,                                                   \
>                   CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
>                   NULL, &THREAD_SELF->tid)
> --8<---------------cut here---------------end--------------->8---
>
> This is certainly useful, but we’d have troubles doing it from the FFI…
> It may that this is fine if the process doesn’t use threads.

Right, so here's what 'primitive-fork' does:

    clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7fc5398cea10) = 13247

Here's what 'clone' does:

    clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0)
= 14038

In practice it may not be a problem since most of the time you'd
'exec' after cloning.  Is there any reliable way to get a hold of
whatever THREAD_SELF is?  I wish the libc 'clone' function didn't have
that silly callback and behaved like 'fork', then we could have
avoided these issues altogether.

- Dave




Information forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Sat, 17 Oct 2015 10:15:03 GMT) Full text and rfc822 format available.

Message #11 received at 21694 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: "Thompson\, David" <dthompson2 <at> worcester.edu>
Cc: 21694 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Sat, 17 Oct 2015 12:14:34 +0200
"Thompson, David" <dthompson2 <at> worcester.edu> skribis:

> On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:
>> I’m reporting the problem and (hopefully) the solution, but I think we’d
>> better double-check this.
>>
>> The problem: Running the test below in a loop sometimes gets a SIGSEGV
>> in the child process (on x86_64, libc 2.22.)
>>
>> --8<---------------cut here---------------start------------->8---
>> (use-modules (guix build syscalls) (ice-9 match))
>>
>> (match (clone (logior CLONE_NEWUSER
>>                       CLONE_CHILD_SETTID
>>                       CLONE_CHILD_CLEARTID
>>                       SIGCHLD))
>>   (0
>>    (throw 'x))                                    ;XXX: sometimes segfaults
>>   (pid
>>    (match (waitpid pid)
>>      ((_ . status)
>>       (pk 'status status)
>>       (exit (not (status:term-sig status)))))))
>> --8<---------------cut here---------------end--------------->8---
>>
>> Looking at (guix build syscalls) though, I see an ABI mismatch between
>> our definition and the actual ‘syscall’ C function, and between our
>> ‘clone’ definition and the actual C function.
>>
>> This leads to the attached patch, which also fixes the above problem for me.
>>
>> Could you test this patch?
>
> The patch looks good.  Thanks for catching this!

Great, pushed as 0e3cc31.

>> Now, there remains the question of CLONE_CHILD_SETTID and
>> CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
>> that these flags have no effect at all.
>
> I added those flags in commit ee78d02 because they solved a real issue
> I ran into.  Adding those flags made 'clone' look like a
> 'primitive-fork' call when examined with strace.

Could you check whether removing these flags makes a difference now?
How can we test?  (Preferably not at the REPL.)

>> Conversely, libc uses these flags to update the thread ID in the child
>> process (x86_64/arch-fork.h):
>>
>> --8<---------------cut here---------------start------------->8---
>> #define ARCH_FORK() \
>>   INLINE_SYSCALL (clone, 4,                                                   \
>>                   CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD, 0,     \
>>                   NULL, &THREAD_SELF->tid)
>> --8<---------------cut here---------------end--------------->8---
>>
>> This is certainly useful, but we’d have troubles doing it from the FFI…
>> It may that this is fine if the process doesn’t use threads.
>
> Right, so here's what 'primitive-fork' does:
>
>     clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7fc5398cea10) = 13247
>
> Here's what 'clone' does:
>
>     clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0)
> = 14038

You mean ‘clone’ from libc?

I guess CLONE_CHILD_{CLEARTID,SETTID} don’t hurt here, but they have no
effect either.  That’s what the clone(2) page suggests:

   CLONE_CHILD_CLEARTID (since Linux 2.5.49)
          Erase child thread ID at location ctid in child memory when
          the child exits, and do a wakeup on the futex at that address.
          The address involved may be changed by the set_tid_address(2)
          system call.  This is used by threading libraries.

   CLONE_CHILD_SETTID (since Linux 2.5.49)
          Store child thread ID at location ctid in child memory.

And here ctid == NULL.

And indeed, kernel/fork.c in Linux does:

    p->set_child_tid = (clone_flags & CLONE_CHILD_SETTID) ? child_tidptr : NULL;
    /*
     * Clear TID on mm_release()?
     */
    p->clear_child_tid = (clone_flags & CLONE_CHILD_CLEARTID) ? child_tidptr : NULL;

So in effect, using NULL for ctid equates to not passing the
CLEARTID/SETTID flags.

QED.  :-)

> In practice it may not be a problem since most of the time you'd
> 'exec' after cloning.  Is there any reliable way to get a hold of
> whatever THREAD_SELF is? 

THREAD_SELF is really not something we want to poke at; quoth
x86_64/tls.h:

--8<---------------cut here---------------start------------->8---
# define THREAD_SELF \
  ({ struct pthread *__self;						      \
     asm ("mov %%fs:%c1,%0" : "=r" (__self)				      \
	  : "i" (offsetof (struct pthread, header.self)));	 	      \
     __self;})
--8<---------------cut here---------------end--------------->8---

> I wish the libc 'clone' function didn't have that silly callback and
> behaved like 'fork', then we could have avoided these issues
> altogether.

Is the callback really an issue?  We have ‘procedure->pointer’ after
all.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Thu, 22 Oct 2015 14:39:02 GMT) Full text and rfc822 format available.

Message #14 received at 21694 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 21694 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Thu, 22 Oct 2015 10:38:21 -0400
ludo <at> gnu.org (Ludovic Courtès) writes:

> Looking at (guix build syscalls) though, I see an ABI mismatch between
> our definition and the actual ‘syscall’ C function, and between our
> ‘clone’ definition and the actual C function.

Good catch!  However, please see below.

> This leads to the attached patch, which also fixes the above problem for me.
>
> diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
> index 80b9d00..f931f8d 100644
> --- a/guix/build/syscalls.scm
> +++ b/guix/build/syscalls.scm
> @@ -322,10 +322,16 @@ string TMPL and return its file name.  TMPL must end with 'XXXXXX'."
>  (define CLONE_NEWNET         #x40000000)
>  
>  ;; The libc interface to sys_clone is not useful for Scheme programs, so the
> -;; low-level system call is wrapped instead.
> +;; low-level system call is wrapped instead.  The 'syscall' function is
> +;; declared in <unistd.h> as a variadic function; in practice, it expects 6
> +;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
>  (define clone
>    (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
> -         (proc       (pointer->procedure int ptr (list int int '*)))
> +         (proc       (pointer->procedure long ptr
> +                                         (list long                   ;sysno
> +                                               unsigned-long          ;flags

'long' and 'unsigned long' might not be the same size as a pointer.
Better to use 'size_t' for both of these.  While not strictly guaranteed
to be the same size as a pointer, in practice they should be the same
except on architectures with segmented memory models.

What do you think?

      Mark

PS: 'intptr_t' and 'uintptr_t' would be best, but they are optional in
    C99 and not in (system foreign).  'ptrdiff_t' would be better, but
    was not available in (system foreign) before guile-2.0.9.




Information forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Sun, 25 Oct 2015 21:01:01 GMT) Full text and rfc822 format available.

Message #17 received at 21694 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: 21694 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Sun, 25 Oct 2015 21:59:45 +0100
Mark H Weaver <mhw <at> netris.org> skribis:

> ludo <at> gnu.org (Ludovic Courtès) writes:

[...]

>>  ;; The libc interface to sys_clone is not useful for Scheme programs, so the
>> -;; low-level system call is wrapped instead.
>> +;; low-level system call is wrapped instead.  The 'syscall' function is
>> +;; declared in <unistd.h> as a variadic function; in practice, it expects 6
>> +;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
>>  (define clone
>>    (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
>> -         (proc       (pointer->procedure int ptr (list int int '*)))
>> +         (proc       (pointer->procedure long ptr
>> +                                         (list long                   ;sysno
>> +                                               unsigned-long          ;flags
>
> 'long' and 'unsigned long' might not be the same size as a pointer.
> Better to use 'size_t' for both of these.  While not strictly guaranteed
> to be the same size as a pointer, in practice they should be the same
> except on architectures with segmented memory models.
>
> What do you think?

I had the same reaction, but posix/unistd.h in libc really uses these
types for ‘syscall’ so I thought it’d be best to stick to them.

WDYT?

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#21694; Package guix. (Wed, 28 Oct 2015 04:54:02 GMT) Full text and rfc822 format available.

Message #20 received at 21694 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 21694 <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Wed, 28 Oct 2015 00:53:28 -0400
ludo <at> gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw <at> netris.org> skribis:
>
>> ludo <at> gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>>  ;; The libc interface to sys_clone is not useful for Scheme programs, so the
>>> -;; low-level system call is wrapped instead.
>>> +;; low-level system call is wrapped instead.  The 'syscall' function is
>>> +;; declared in <unistd.h> as a variadic function; in practice, it expects 6
>>> +;; pointer-sized arguments, as shown in, e.g., x86_64/syscall.S.
>>>  (define clone
>>>    (let* ((ptr        (dynamic-func "syscall" (dynamic-link)))
>>> -         (proc       (pointer->procedure int ptr (list int int '*)))
>>> +         (proc       (pointer->procedure long ptr
>>> +                                         (list long                   ;sysno
>>> +                                               unsigned-long          ;flags
>>
>> 'long' and 'unsigned long' might not be the same size as a pointer.
>> Better to use 'size_t' for both of these.  While not strictly guaranteed
>> to be the same size as a pointer, in practice they should be the same
>> except on architectures with segmented memory models.
>>
>> What do you think?
>
> I had the same reaction, but posix/unistd.h in libc really uses these
> types for ‘syscall’ so I thought it’d be best to stick to them.

Okay, makes sense.

    Thanks,
      Mark




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Wed, 28 Oct 2015 14:40:03 GMT) Full text and rfc822 format available.

Notification sent to ludo <at> gnu.org (Ludovic Courtès):
bug acknowledged by developer. (Wed, 28 Oct 2015 14:40:05 GMT) Full text and rfc822 format available.

Message #25 received at 21694-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: "Thompson\, David" <dthompson2 <at> worcester.edu>
Cc: 21694-done <at> debbugs.gnu.org, David Thompson <davet <at> gnu.org>
Subject: Re: bug#21694: 'clone' syscall binding unreliable
Date: Wed, 28 Oct 2015 15:39:49 +0100
ludo <at> gnu.org (Ludovic Courtès) skribis:

> "Thompson, David" <dthompson2 <at> worcester.edu> skribis:
>
>> On Fri, Oct 16, 2015 at 4:39 PM, Ludovic Courtès <ludo <at> gnu.org> wrote:

[...]

>>> Now, there remains the question of CLONE_CHILD_SETTID and
>>> CLONE_CHILD_CLEARTID.  Since we’re passing NULL for ‘ctid’, I expect
>>> that these flags have no effect at all.
>>
>> I added those flags in commit ee78d02 because they solved a real issue
>> I ran into.  Adding those flags made 'clone' look like a
>> 'primitive-fork' call when examined with strace.
>
> Could you check whether removing these flags makes a difference now?

I removed them in commit after confirming that it affects neither the
test suite nor ‘guix system environment’ (on x86_64, with Linux-libre
4.2.3-gnu.)

Thanks,
Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 26 Nov 2015 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 8 years and 151 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.