GNU bug report logs - #15683
[critical] ERROR: ... close-pipe: pipe not in table


Package: guile

Reported by: David Pirotte <david <at> altosw.be>

Date: Tue, 22 Oct 2013 16:26:02 UTC

Severity: important

Done: Mark H Weaver <mhw <at> netris.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15683 in the body.
You can then email your comments to 15683 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Tue, 22 Oct 2013 16:26:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Pirotte <david <at> altosw.be>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Tue, 22 Oct 2013 16:26:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: David Pirotte <david <at> altosw.be>
To: <bug-guile <at> gnu.org>
Subject: [critical] ERROR: ... close-pipe: pipe not in table
Date: Tue, 22 Oct 2013 14:24:43 -0200
Hello guilers,

	GNU Guile 2.0.9.20-10454

[cf. the IRC chat of October 17th, 2013]

I am facing a bug that only occurs on extremely powerful servers:

	in ice-9/popen.scm:
	 106: 1 [close-pipe #<input: #{read pipe}# 69>]
	In unknown file:
	 ?: 0 [scm-error misc-error #f "~A" ("close-pipe: pipe not in table") #f]

I cannot reproduce the bug on my personal computer [i5, 4 cores], nor on the most
powerful server in our lab [i7, 12 cores], but on this customer's server [2
Xeon E5-2687W, 32 cores total], the bug is no longer random: it is _always_ raised,
which is critical to us.

Thank you for debugging this asap,
Cheers,
David

In case it helps, here is an extract of the code that raises the error.  In this code,
rg-ergbd1 is a [heavy] Octave script [it is called between 1000 and 62000 times, depending on
other factors...]

...

(define (ergbd path im-name im-type seeds-dir im-ones x y threshold connectivity mutex log-port)
  (with-mutex mutex ;; (write-log-filename (format #f "(~A, ~A) " x y) log-port)
	      (write-log "." log-port))
  (let* ((cmd (format #f "rg-ergbd1 ~A ~A ~A ~A ~A ~A ~A ~A ~A" path im-name im-type seeds-dir im-ones x y threshold connectivity))
	 (s (open-input-pipe cmd))
	 (results (read-line s)))
    (unless (zero? (status:exit-val (close-pipe s)))
      (error "subprocess returned non-zero result code" cmd))
    results))
...
...
	...
	(par-map (lambda (coord)
	       (ergbd target-dir im-cpol-norm-name im-type seeds-dir im-ones (car coord) (cdr coord) threshold connectivity mutex log-port))
	  coords)
	...




Severity set to 'important' from 'normal' Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Tue, 22 Oct 2013 18:24:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Sun, 17 Nov 2013 09:48:02 GMT) Full text and rfc822 format available.

Message #10 received at 15683 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: David Pirotte <david <at> altosw.be>
Cc: 15683 <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sun, 17 Nov 2013 04:46:37 -0500
[Message part 1 (text/plain, inline)]
Hi David,

Here's a set of patches that should make (ice-9 popen) thread safe.
I've also pushed these to the 'wip-thread-safe-popen' branch in git.

Please let us know if they fix your problem.

    Regards,
      Mark


[0001-Add-mutex-locking-functions-that-also-block-asyncs.patch (text/x-patch, inline)]
From 9b48be7107f3f98cdf2e756d4c1f4c937ff233d7 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 04:00:29 -0500
Subject: [PATCH 1/6] Add mutex locking functions that also block asyncs.

* libguile/async.h (scm_i_pthread_mutex_lock_with_asyncs,
  scm_i_pthread_mutex_unlock_with_asyncs): New macros.

* libguile/threads.c (do_unlock_with_asyncs): New static helper.
  (scm_i_dynwind_pthread_mutex_lock_with_asyncs): New function.

* libguile/threads.h (scm_i_dynwind_pthread_mutex_lock_with_asyncs):
  Add prototype.
---
 libguile/async.h   |   12 ++++++++++++
 libguile/threads.c |   16 ++++++++++++++++
 libguile/threads.h |    1 +
 3 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/libguile/async.h b/libguile/async.h
index ceb2b96..6d0460c 100644
--- a/libguile/async.h
+++ b/libguile/async.h
@@ -78,6 +78,18 @@ SCM_API void scm_critical_section_end (void);
     scm_async_click ();						\
   } while (0)
 
+# define scm_i_pthread_mutex_lock_with_asyncs(m)    \
+  do {                                              \
+    SCM_I_CURRENT_THREAD->block_asyncs++;           \
+    scm_i_pthread_mutex_lock(m);                    \
+  } while (0)
+
+# define scm_i_pthread_mutex_unlock_with_asyncs(m)  \
+  do {                                              \
+    scm_i_pthread_mutex_unlock(m);                  \
+    SCM_I_CURRENT_THREAD->block_asyncs--;           \
+  } while (0)
+
 #else /* !BUILDING_LIBGUILE */
 
 # define SCM_CRITICAL_SECTION_START  scm_critical_section_start ()
diff --git a/libguile/threads.c b/libguile/threads.c
index 8cbe1e2..6aeaeb9 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -2010,6 +2010,22 @@ scm_pthread_cond_timedwait (scm_i_pthread_cond_t *cond,
 
 #endif
 
+static void
+do_unlock_with_asyncs (void *data)
+{
+  scm_i_pthread_mutex_unlock ((scm_i_pthread_mutex_t *)data);
+  SCM_I_CURRENT_THREAD->block_asyncs--;
+}
+
+void
+scm_i_dynwind_pthread_mutex_lock_with_asyncs (scm_i_pthread_mutex_t *mutex)
+{
+  SCM_I_CURRENT_THREAD->block_asyncs++;
+  scm_i_scm_pthread_mutex_lock (mutex);
+  scm_dynwind_unwind_handler (do_unlock_with_asyncs, mutex,
+                              SCM_F_WIND_EXPLICITLY);
+}
+
 unsigned long
 scm_std_usleep (unsigned long usecs)
 {
diff --git a/libguile/threads.h b/libguile/threads.h
index 901c37b..5a2afa2 100644
--- a/libguile/threads.h
+++ b/libguile/threads.h
@@ -143,6 +143,7 @@ SCM_INTERNAL void scm_init_threads (void);
 SCM_INTERNAL void scm_init_thread_procs (void);
 SCM_INTERNAL void scm_init_threads_default_dynamic_state (void);
 
+SCM_INTERNAL void scm_i_dynwind_pthread_mutex_lock_with_asyncs (scm_i_pthread_mutex_t *mutex);
 
 #define SCM_THREAD_SWITCHING_CODE \
   do { } while (0)
-- 
1.7.5.4

[0002-Block-system-asyncs-while-overrides_lock-is-held.patch (text/x-patch, inline)]
From f68f42d2014bb3dfb8a0d7c502f9d3d9593ee458 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 03:19:32 -0500
Subject: [PATCH 2/6] Block system asyncs while 'overrides_lock' is held.

* libguile/procprop.c (scm_set_procedure_property_x): Block system
  asyncs while overrides_lock is held.  Use dynwind block in case
  an exception is thrown.
---
 libguile/procprop.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/libguile/procprop.c b/libguile/procprop.c
index 36228d3..dae3ea7 100644
--- a/libguile/procprop.c
+++ b/libguile/procprop.c
@@ -229,7 +229,8 @@ SCM_DEFINE (scm_set_procedure_property_x, "set-procedure-property!", 3, 0, 0,
     SCM_MISC_ERROR ("arity is a deprecated read-only property", SCM_EOL);
 #endif
 
-  scm_i_pthread_mutex_lock (&overrides_lock);
+  scm_dynwind_begin (0);
+  scm_i_dynwind_pthread_mutex_lock_with_asyncs (&overrides_lock);
   props = scm_hashq_ref (overrides, proc, SCM_BOOL_F);
   if (scm_is_false (props))
     {
@@ -239,7 +240,7 @@ SCM_DEFINE (scm_set_procedure_property_x, "set-procedure-property!", 3, 0, 0,
         props = SCM_EOL;
     }
   scm_hashq_set_x (overrides, proc, scm_assq_set_x (props, key, val));
-  scm_i_pthread_mutex_unlock (&overrides_lock);
+  scm_dynwind_end ();
 
   return SCM_UNSPECIFIED;
 }
-- 
1.7.5.4

[0003-Make-guardians-thread-safe.patch (text/x-patch, inline)]
From 467e1d4c0438d24e310a45bc7370bd19b0e8c659 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 03:35:09 -0500
Subject: [PATCH 3/6] Make guardians thread-safe.

* libguile/guardians.c (t_guardian): Add mutex.
  (finalize_guarded, scm_i_guard, scm_i_get_one_zombie): Lock mutex and
  block system asyncs during critical sections.
  (scm_make_guardian): Initialize mutex.
---
 libguile/guardians.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/libguile/guardians.c b/libguile/guardians.c
index 6ba8c0b..e59e1bb 100644
--- a/libguile/guardians.c
+++ b/libguile/guardians.c
@@ -40,7 +40,6 @@
  * monsters we had...
  *
  * Rewritten for the Boehm-Demers-Weiser GC by Ludovic Courtès.
- * FIXME: This is currently not thread-safe.
  */
 
 /* Uncomment the following line to debug guardian finalization.  */
@@ -72,6 +71,7 @@ static scm_t_bits tc16_guardian;
 
 typedef struct t_guardian
 {
+  scm_i_pthread_mutex_t mutex;
   unsigned long live;
   SCM zombies;
   struct t_guardian *next;
@@ -144,6 +144,9 @@ finalize_guarded (void *ptr, void *finalizer_data)
 	}
 
       g = GUARDIAN_DATA (SCM_CAR (guardian_list));
+
+      scm_i_pthread_mutex_lock_with_asyncs (&g->mutex);
+
       if (g->live == 0)
 	abort ();
 
@@ -157,7 +160,8 @@ finalize_guarded (void *ptr, void *finalizer_data)
       g->zombies = zombies;
 
       g->live--;
-      g->zombies = zombies;
+
+      scm_i_pthread_mutex_unlock_with_asyncs (&g->mutex);
     }
 
   if (scm_is_true (proxied_finalizer))
@@ -208,6 +212,8 @@ scm_i_guard (SCM guardian, SCM obj)
       void *prev_data;
       SCM guardians_for_obj, finalizer_data;
 
+      scm_i_pthread_mutex_lock_with_asyncs (&g->mutex);
+
       g->live++;
 
       /* Note: GUARDIANS_FOR_OBJ is a weak list so that a guardian can be
@@ -249,6 +255,8 @@ scm_i_guard (SCM guardian, SCM obj)
 					PTR2SCM (prev_data));
 	  SCM_SETCAR (finalizer_data, proxied_finalizer);
 	}
+
+      scm_i_pthread_mutex_unlock_with_asyncs (&g->mutex);
     }
 }
 
@@ -258,6 +266,8 @@ scm_i_get_one_zombie (SCM guardian)
   t_guardian *g = GUARDIAN_DATA (guardian);
   SCM res = SCM_BOOL_F;
 
+  scm_i_pthread_mutex_lock_with_asyncs (&g->mutex);
+
   if (!scm_is_null (g->zombies))
     {
       /* Note: We return zombies in reverse order.  */
@@ -265,6 +275,8 @@ scm_i_get_one_zombie (SCM guardian)
       g->zombies = SCM_CDR (g->zombies);
     }
 
+  scm_i_pthread_mutex_unlock_with_asyncs (&g->mutex);
+
   return res;
 }
 
@@ -335,6 +347,8 @@ SCM_DEFINE (scm_make_guardian, "make-guardian", 0, 0, 0,
   t_guardian *g = scm_gc_malloc (sizeof (t_guardian), "guardian");
   SCM z;
 
+  scm_i_pthread_mutex_init (&g->mutex, NULL);
+
   /* A tconc starts out with one tail pair. */
   g->live = 0;
   g->zombies = SCM_EOL;
-- 
1.7.5.4

[0004-Make-port-alists-accessible-from-Scheme.patch (text/x-patch, inline)]
From 527a2938b55fb29b29091b96c5f803238adf42a7 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 01:11:57 -0500
Subject: [PATCH 4/6] Make port alists accessible from Scheme.

* libguile/ports.c (scm_i_port_alist, scm_i_set_port_alist_x): Make
  these available from Scheme, as '%port-alist' and '%set-port-alist!'.
  Validate port argument.

* libguile/ports.h (scm_i_set_port_alist_x): Change return type from
  'void' to 'SCM'.
---
 libguile/ports.c |   17 +++++++++++++----
 libguile/ports.h |    2 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/libguile/ports.c b/libguile/ports.c
index 6f219d6..a20a820 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -254,17 +254,26 @@ scm_i_clear_pending_eof (SCM port)
   SCM_PORT_GET_INTERNAL (port)->pending_eof = 0;
 }
 
-SCM
-scm_i_port_alist (SCM port)
+SCM_DEFINE (scm_i_port_alist, "%port-alist", 0, 1, 0,
+            (SCM port),
+            "Return the alist associated with @var{port}.")
+#define FUNC_NAME s_scm_i_port_alist
 {
+  SCM_VALIDATE_OPPORT (1, port);
   return SCM_PORT_GET_INTERNAL (port)->alist;
 }
+#undef FUNC_NAME
 
-void
-scm_i_set_port_alist_x (SCM port, SCM alist)
+SCM_DEFINE (scm_i_set_port_alist_x, "%set-port-alist!", 0, 2, 0,
+            (SCM port, SCM alist),
+            "Set the alist associated with @var{port} to @var{alist}.")
+#define FUNC_NAME s_scm_i_set_port_alist_x
 {
+  SCM_VALIDATE_OPPORT (1, port);
   SCM_PORT_GET_INTERNAL (port)->alist = alist;
+  return SCM_UNSPECIFIED;
 }
+#undef FUNC_NAME
 
 
 
diff --git a/libguile/ports.h b/libguile/ports.h
index 39317f8..c8d08df 100644
--- a/libguile/ports.h
+++ b/libguile/ports.h
@@ -318,7 +318,7 @@ SCM_API SCM scm_set_port_column_x (SCM port, SCM line);
 SCM_API SCM scm_port_filename (SCM port);
 SCM_API SCM scm_set_port_filename_x (SCM port, SCM filename);
 SCM_INTERNAL SCM scm_i_port_alist (SCM port);
-SCM_INTERNAL void scm_i_set_port_alist_x (SCM port, SCM alist);
+SCM_INTERNAL SCM scm_i_set_port_alist_x (SCM port, SCM alist);
 SCM_INTERNAL const char *scm_i_default_port_encoding (void);
 SCM_INTERNAL void scm_i_set_default_port_encoding (const char *);
 SCM_INTERNAL void scm_i_set_port_encoding_x (SCM port, const char *str);
-- 
1.7.5.4

[0005-Stylistic-improvements-for-ice-9-popen.patch (text/x-patch, inline)]
From 0e9c87402bf309323ebff4def7049572cb11562a Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 02:46:08 -0500
Subject: [PATCH 5/6] Stylistic improvements for (ice-9 popen).

* module/ice-9/popen.scm (close-process, close-process-quietly): Accept
  'port' and 'pid' as separate arguments.  Improve style.
  (close-pipe, read-pipes): Improve style.
---
 module/ice-9/popen.scm |   45 +++++++++++++++++++++------------------------
 1 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/module/ice-9/popen.scm b/module/ice-9/popen.scm
index 7d0549e..f8668cd 100644
--- a/module/ice-9/popen.scm
+++ b/module/ice-9/popen.scm
@@ -74,27 +74,26 @@ port to the process is created: it should be the value of
     (hashq-remove! port/pid-table port)
     pid))
 
-(define (close-process port/pid)
-  (close-port (car port/pid))
-  (cdr (waitpid (cdr port/pid))))
+(define (close-process port pid)
+  (close-port port)
+  (cdr (waitpid pid)))
 
 ;; for the background cleanup handler: just clean up without reporting
 ;; errors.  also avoids blocking the process: if the child isn't ready
 ;; to be collected, puts it back into the guardian's live list so it
 ;; can be tried again the next time the cleanup runs.
-(define (close-process-quietly port/pid)
+(define (close-process-quietly port pid)
   (catch 'system-error
 	 (lambda ()
-	   (close-port (car port/pid)))
+	   (close-port port))
 	 (lambda args #f))
   (catch 'system-error
 	 (lambda ()
-	   (let ((pid/status (waitpid (cdr port/pid) WNOHANG)))
-	     (cond ((= (car pid/status) 0)
-		    ;; not ready for collection
-		    (pipe-guardian (car port/pid))
-		    (hashq-set! port/pid-table
-				(car port/pid) (cdr port/pid))))))
+	   (let ((pid/status (waitpid pid WNOHANG)))
+             (when (zero? (car pid/status))
+               ;; not ready for collection
+               (pipe-guardian port)
+               (hashq-set! port/pid-table port pid))))
 	 (lambda args #f)))
 
 (define (close-pipe p)
@@ -102,19 +101,17 @@ port to the process is created: it should be the value of
 to terminate and returns its status value, @xref{Processes, waitpid}, for
 information on how to interpret this value."
   (let ((pid (fetch-pid p)))
-    (if (not pid)
-        (error "close-pipe: pipe not in table"))
-    (close-process (cons p pid))))
-
-(define reap-pipes
-  (lambda ()
-    (let loop ((p (pipe-guardian)))
-      (cond (p 
-	     ;; maybe removed already by close-pipe.
-	     (let ((pid (fetch-pid p)))
-	       (if pid
-		   (close-process-quietly (cons p pid))))
-	     (loop (pipe-guardian)))))))
+    (unless pid (error "close-pipe: pipe not in table"))
+    (close-process p pid)))
+
+(define (reap-pipes)
+  (let loop ()
+    (let ((p (pipe-guardian)))
+      (when p
+        ;; maybe removed already by close-pipe.
+        (let ((pid (fetch-pid p)))
+          (when pid (close-process-quietly p pid)))
+        (loop)))))
 
 (add-hook! after-gc-hook reap-pipes)
 
-- 
1.7.5.4

[0006-Make-ice-9-popen-thread-safe.patch (text/x-patch, inline)]
From 40676067383d8fef9cc1690154011708c7e8e256 Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw <at> netris.org>
Date: Sun, 17 Nov 2013 02:54:31 -0500
Subject: [PATCH 6/6] Make (ice-9 popen) thread-safe.

* module/ice-9/popen.scm: Import (ice-9 threads).
  (port/pid-table): Mark as deprecated in comment.
  (port/pid-table-mutex): New variable.
  (open-pipe*): Stash the pid in the port's alist.  Lock
  'port/pid-table-mutex' while mutating 'port/pid-table'.
  (fetch-pid): Fetch the pid from the port's alist.  Don't touch
  'port/pid-table'.
  (close-process-quietly): Don't add the port to 'port/pid-table',
  since it was never removed.
  (close-pipe): Improve error message.
  (reap-pipes): Check to see if the port is already closed.
---
 module/ice-9/popen.scm |   27 +++++++++++++++++----------
 1 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/module/ice-9/popen.scm b/module/ice-9/popen.scm
index f8668cd..0e896d7 100644
--- a/module/ice-9/popen.scm
+++ b/module/ice-9/popen.scm
@@ -18,6 +18,7 @@
 ;;;; 
 
 (define-module (ice-9 popen)
+  :use-module (ice-9 threads)
   :export (port/pid-table open-pipe* open-pipe close-pipe open-input-pipe
 	   open-output-pipe open-input-output-pipe))
 
@@ -40,7 +41,10 @@
 (define pipe-guardian (make-guardian))
 
 ;; a weak hash-table to store the process ids.
+;; XXX use of this table is deprecated.  It is no longer used, and is
+;; populated only for backward compatibility (since it is exported).
 (define port/pid-table (make-weak-key-hash-table 31))
+(define port/pid-table-mutex (make-mutex))
 
 (define (open-pipe* mode command . args)
   "Executes the program @var{command} with optional arguments
@@ -57,8 +61,13 @@ port to the process is created: it should be the value of
                       read-port
                       write-port
                       (%make-void-port mode))))
+        (%set-port-alist! port (acons 'popen-pid pid (%port-alist port)))
         (pipe-guardian port)
-        (hashq-set! port/pid-table port pid)
+
+        ;; XXX populate port/pid-table for backward compatibility.
+        (with-mutex port/pid-table-mutex
+          (hashq-set! port/pid-table port pid))
+
         port))))
 
 (define (open-pipe command mode)
@@ -70,9 +79,7 @@ port to the process is created: it should be the value of
   (open-pipe* mode "/bin/sh" "-c" command))
 
 (define (fetch-pid port)
-  (let ((pid (hashq-ref port/pid-table port)))
-    (hashq-remove! port/pid-table port)
-    pid))
+  (assq-ref (%port-alist port) 'popen-pid))
 
 (define (close-process port pid)
   (close-port port)
@@ -92,8 +99,7 @@ port to the process is created: it should be the value of
 	   (let ((pid/status (waitpid pid WNOHANG)))
              (when (zero? (car pid/status))
                ;; not ready for collection
-               (pipe-guardian port)
-               (hashq-set! port/pid-table port pid))))
+               (pipe-guardian port))))
 	 (lambda args #f)))
 
 (define (close-pipe p)
@@ -101,16 +107,17 @@ port to the process is created: it should be the value of
 to terminate and returns its status value, @xref{Processes, waitpid}, for
 information on how to interpret this value."
   (let ((pid (fetch-pid p)))
-    (unless pid (error "close-pipe: pipe not in table"))
+    (unless pid (error "close-pipe: pipe not created by (ice-9 popen)"))
     (close-process p pid)))
 
 (define (reap-pipes)
   (let loop ()
     (let ((p (pipe-guardian)))
       (when p
-        ;; maybe removed already by close-pipe.
-        (let ((pid (fetch-pid p)))
-          (when pid (close-process-quietly p pid)))
+        ;; maybe closed already.
+        (unless (port-closed? p)
+          (let ((pid (fetch-pid p)))
+            (when pid (close-process-quietly p pid))))
         (loop)))))
 
 (add-hook! after-gc-hook reap-pipes)
-- 
1.7.5.4


Information forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Sun, 17 Nov 2013 10:05:02 GMT) Full text and rfc822 format available.

Message #13 received at 15683 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: David Pirotte <david <at> altosw.be>
Cc: 15683 <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sun, 17 Nov 2013 05:04:05 -0500
[Message part 1 (text/plain, inline)]
Mark H Weaver <mhw <at> netris.org> writes:
> Here's a set of patches that should make (ice-9 popen) thread safe.
> I've also pushed these to the 'wip-thread-safe-popen' branch in git.

There was a minor mistake in one of the patches: the new internal scheme
procedures for accessing the port alist declared their arguments as
optional, but they should have been required.

Here are the patches again, with that problem fixed.

      Mark


[0001-Add-mutex-locking-functions-that-also-block-asyncs.patch (text/x-patch, attachment)]
[0002-Block-system-asyncs-while-overrides_lock-is-held.patch (text/x-patch, attachment)]
[0003-Make-guardians-thread-safe.patch (text/x-patch, attachment)]
[0004-Make-port-alists-accessible-from-Scheme.patch (text/x-patch, attachment)]
[0005-Stylistic-improvements-for-ice-9-popen.patch (text/x-patch, attachment)]
[0006-Make-ice-9-popen-thread-safe.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Sun, 17 Nov 2013 11:30:03 GMT) Full text and rfc822 format available.

Message #16 received at 15683 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: David Pirotte <david <at> altosw.be>
Cc: 15683 <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sun, 17 Nov 2013 06:29:12 -0500
To gain some confidence in these patches, I wrote a little test program:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 popen)
             (ice-9 threads)) ; needed for call-with-new-thread on newer Guile
(define threads
  (map (lambda (_)
         (call-with-new-thread
           (lambda ()
             (let loop ()
               (let ((pipe (open-pipe* OPEN_READ "echo" "foo")))
                 (read pipe)
                 (close-pipe pipe))
               (loop)))))
       (iota 4)))
--8<---------------cut here---------------end--------------->8---

Replace the '4' with the number of cores in your machine.  The code
above will create the requested number of threads, which run in the
background forever, rapidly creating and closing pipes.

The above program is able to reproduce the bug within a few seconds on
both of the multicore machines I have access to (a dual-core x86_64 box
and a 4-core MIPS-compatible Loongson 3A box).

With my patches applied, the above program runs indefinitely without a
problem (I let it run for several minutes).

Please try the code above on the largest machines you have access to.

     Thanks,
       Mark
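
The test program above loops forever, so a bounded variant may be more
convenient for unattended runs.  The following is only a sketch and not part
of the patch set: 'stress-popen', its two parameters and the iteration count
are hypothetical names and values chosen for illustration.  Each thread
performs a fixed number of open/read/close cycles, any exception is caught and
returned as that thread's result, and the caller collects the failures with
'join-thread'.

```scheme
(use-modules (ice-9 popen)
             (ice-9 threads))

;; Hypothetical helper, not part of (ice-9 popen): run ITERATIONS
;; open/read/close cycles in each of THREAD-COUNT threads and return the
;; list of failures (empty when every cycle succeeded).
(define (stress-popen thread-count iterations)
  (define (worker)
    (catch #t
      (lambda ()
        (let loop ((i 0))
          (when (< i iterations)
            (let ((pipe (open-pipe* OPEN_READ "echo" "foo")))
              (read pipe)
              (close-pipe pipe))
            (loop (+ i 1))))
        #t)
      ;; On any error, return the key and arguments instead of #t.
      (lambda (key . args) (cons key args))))
  (let* ((threads (map (lambda (_) (call-with-new-thread worker))
                       (iota thread-count)))
         (results (map join-thread threads)))
    ;; Keep only the results that are not #t.
    (delete #t results)))

;; The iteration count of 1000 is an arbitrary, assumed value.
(let ((failures (stress-popen (current-processor-count) 1000)))
  (format #t "~a thread(s) failed~%" (length failures))
  (for-each (lambda (f) (format #t "  ~s~%" f)) failures))
```

On an unpatched Guile where the bug triggers, one would expect at least one
failure carrying the "close-pipe: pipe not in table" message; on machines with
few cores the iteration count may need to be raised before anything shows up.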




Information forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Sun, 17 Nov 2013 16:01:01 GMT) Full text and rfc822 format available.

Message #19 received at 15683 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: David Pirotte <david <at> altosw.be>
Cc: 15683 <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sun, 17 Nov 2013 10:59:16 -0500
[Message part 1 (text/plain, inline)]
Sorry, I discovered that the reaper wasn't working properly, and that
slightly more radical changes were necessary.  I've attached a new
version of the patch set.  The only patch that changed here is the last
one, but I include all of them again for simplicity.

I deleted the old 'wip-thread-safe-popen' branch and pushed a new one
called 'thread-safe-popen', because I believe this one finally does the
entire job correctly.

      Mark


[0001-Add-mutex-locking-functions-that-also-block-asyncs.patch (text/x-patch, attachment)]
[0002-Block-system-asyncs-while-overrides_lock-is-held.patch (text/x-patch, attachment)]
[0003-Make-guardians-thread-safe.patch (text/x-patch, attachment)]
[0004-Make-port-alists-accessible-from-Scheme.patch (text/x-patch, attachment)]
[0005-Stylistic-improvements-for-ice-9-popen.patch (text/x-patch, attachment)]
[0006-Make-ice-9-popen-thread-safe.patch (text/x-patch, attachment)]

Information forwarded to bug-guile <at> gnu.org:
bug#15683; Package guile. (Sun, 17 Nov 2013 17:55:02 GMT) Full text and rfc822 format available.

Message #22 received at 15683 <at> debbugs.gnu.org (full text, mbox):

From: David Pirotte <david <at> altosw.be>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 15683 <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sun, 17 Nov 2013 15:54:23 -0200
Hello Mark,

Thank you for your work on this problem [a tremendous one for us].  I am running the
test on our 12-core machine in the lab and so far so good [it crashed immediately
when running the 'old' guile...]

Not sure whether it interests you, but I made a module version of your test, in which I
also call (current-processor-count); see below...

Many thanks again,
David

;; --

> To gain some confidence in these patches, I wrote a little test program:
> 

--8<---------------cut here---------------start------------->8---
(define-module (tests thread-safe-popen)
  :use-module (ice-9 popen)
  :use-module (ice-9 threads)
  :export (thread-safe-popen-test))

(define (thread-safe-popen-test)
  (map (lambda (_)
         (call-with-new-thread
	  (lambda ()
	    (let loop ()
	      (let ((pipe (open-pipe* OPEN_READ "echo" "foo")))
		(read pipe)
		(close-pipe pipe))
	      (loop)))))
       (iota (current-processor-count))))

#!

(use-modules (tests thread-safe-popen))
(reload-module (resolve-module '(tests thread-safe-popen)))

(thread-safe-popen-test)

!#
--8<---------------cut here---------------end--------------->8---




Reply sent to Mark H Weaver <mhw <at> netris.org>:
You have taken responsibility. (Sat, 23 Nov 2013 23:09:02 GMT) Full text and rfc822 format available.

Notification sent to David Pirotte <david <at> altosw.be>:
bug acknowledged by developer. (Sat, 23 Nov 2013 23:09:02 GMT) Full text and rfc822 format available.

Message #27 received at 15683-done <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: David Pirotte <david <at> altosw.be>
Cc: 15683-done <at> debbugs.gnu.org
Subject: Re: bug#15683: [critical] ERROR: ... close-pipe: pipe not in table
Date: Sat, 23 Nov 2013 18:07:56 -0500
Hi David,

David Pirotte <david <at> altosw.be> writes:
> I am facing a bug that only occurs on extremely powerful servers:
>
> 	in ice-9/popen.scm:
> 	 106: 1 [close-pipe #<input: #{read pipe}# 69>]
> 	In unknown file:
> 	 ?: 0 [scm-error misc-error #f "~A" ("close-pipe: pipe not in table") #f]

This bug should now be fixed on the stable-2.0 branch, which will become
Guile 2.0.10.

  http://git.savannah.gnu.org/cgit/guile.git/commit/?h=stable-2.0&id=e7bd20f7d9b2110fdc0fa25db5a2bfe6b2214923

I'm closing this bug now.

     Thanks!
       Mark




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 22 Dec 2013 12:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 140 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.