GNU bug report logs - #14474
24.3.50; Zombie subprocesses (again)


Package: emacs;

Reported by: michael_heerdegen <at> web.de

Date: Sat, 25 May 2013 23:41:02 UTC

Severity: normal

Found in version 24.3.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 14474 in the body.
You can then email your comments to 14474 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 25 May 2013 23:41:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to michael_heerdegen <at> web.de:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 25 May 2013 23:41:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 01:38:56 +0200
Hello,

dunno if this is related to bug#12980.  Although I had used a fresh
build all the time, I saw the following problem yesterday for the first
time (note: was on a trip before, so the problem could have been
introduced one or two weeks before today).

I'm using emacs-snapshot on Debian, currently a five days old build:

"GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.4.2)
 of 2013-05-21 on dex, modified by Debian"

I'm experiencing the following:

- I start Emacs in X as a different user (via gksu), or

- I start Emacs from an X session that was started with startx

In such an Emacs, any child process seems to become a zombie after being
finished.  E.g., after typing "exit" in a *terminal* running bash, there
is still a running buffer process.  As a symptom, CPU is used
continuously at 100% until I C-x C-c.

However, if I log in via display manager and don't switch to another
user via gksu, this doesn't happen.  And: it happens with the gtk
version as well as with the lucid version, but _not_ with emacs -nw in
an xterm.

Please ask me if you need more info.


Thanks,

Michael.




In GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.4.2)
 of 2013-05-21 on dex, modified by Debian
 (emacs-snapshot package, version 2:20130520-1)
Windowing system distributor `The X.Org Foundation', version 11.0.11204000
System Description:	Debian GNU/Linux testing (jessie)

Configured using:
 `configure --build x86_64-linux-gnu --host x86_64-linux-gnu
 --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var --infodir=/usr/share/info --mandir=/usr/share/man
 --with-pop=yes
 --enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/24.3.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.3.50/site-lisp:/usr/share/emacs/site-lisp
 --without-compress-info --with-crt-dir=/usr/lib/x86_64-linux-gnu/
 --with-x=yes --with-x-toolkit=gtk3 --with-imagemagick=yes
 CFLAGS='-DDEBIAN -DSITELOAD_PURESIZE_EXTRA=5000 -g -O2'
 CPPFLAGS='-D_FORTIFY_SOURCE=2' LDFLAGS='-g -Wl,--as-needed
 -znocombreloc''





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 25 May 2013 23:51:02 GMT) Full text and rfc822 format available.

Message #8 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 01:49:08 +0200
Michael Heerdegen <michael_heerdegen <at> web.de> writes:

> I'm experiencing the following:
>
> - I start Emacs in X as a different user (via gksu), or
>
> - I start Emacs from an X session that was started with startx
>
> In such an Emacs, any child process seems to become a zombie after being
> finished.  E.g., after typing "exit" in a *terminal* running bash, there
> is still a running buffer process.  As a symptom, CPU is used
> continuously at 100% until I C-x C-c.

BTW, this is what Paul Eggert answered in emacs-dev:

> I can reproduce the problem on Ubuntu 13.04.  Apparently when you
> start up a GTK Emacs session that can't talk to dbus (because it's
> su'ed), the dbus library starts up its own service, using dbus-launch.
> This messes up Emacs somehow (I don't know why).


Thanks,

Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sun, 26 May 2013 02:57:02 GMT) Full text and rfc822 format available.

Message #11 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: michael_heerdegen <at> web.de
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 05:55:23 +0300
> From: Michael Heerdegen <michael_heerdegen <at> web.de>
> Date: Sun, 26 May 2013 01:38:56 +0200
> 
> In such an Emacs, any child process seems to become a zombie after being
> finished.  E.g., after typing "exit" in a *terminal* running bash, there
> is still a running buffer process.  As a symptom, CPU is used
> continuously at 100% until I C-x C-c.

Can you attach a debugger and see where Emacs is looping?  etc/DEBUG
tells how to do that.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sun, 26 May 2013 17:40:02 GMT) Full text and rfc822 format available.

Message #14 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 10:37:52 -0700
A workaround, for me at least, is to propagate the
DBUS_SESSION_BUS_ADDRESS environment variable into
the child process with a different userid.  

For example, here is a failing session, where I became the user 'exp'
and later observed the problem in a shell window:

$ env | grep DBUS
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
$ sudo sh
# su exp
$ env | grep DBUS
$ emacs
** (emacs:15115): WARNING **: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

And here is a session that worked.  The key difference is that
I used sudo's '-E' option:

$ env | grep DBUS
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
$ sudo -E sh
# su exp
$ env | grep DBUS
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
$ emacs
** (emacs:15441): WARNING **: Couldn't connect to accessibility bus: Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

In both cases, the dbus library complains to stderr that it can't connect
to /tmp/dbus-x2KgryK9C8 (I don't know where it's getting that name from).
When DBUS_SESSION_BUS_ADDRESS is unset, the dbus library arranges to run
the shell script /usr/bin/dbus-launch, which seems to cause the problem.
But when DBUS_SESSION_BUS_ADDRESS is set, the dbus library falls back
on its contents and doesn't invoke dbus-launch.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sun, 26 May 2013 18:36:02 GMT) Full text and rfc822 format available.

Message #17 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 20:33:55 +0200
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> And here is a session that worked.  The key difference is that
> I used sudo's '-E' option:
>
> $ env | grep DBUS
> DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
> $ sudo -E sh
> # su exp
> $ env | grep DBUS
> DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-bpx4rxPk7z,guid=6e491bf38a5b2b6fce17d0a251a221bf
> $ emacs
> ** (emacs:15441): WARNING **: Couldn't connect to accessibility bus:
> Failed to connect to socket /tmp/dbus-x2KgryK9C8: Connection refused

I see something similar - using the -E flag for sudo works as a
workaround.  However, I don't get this "Failed to connect to socket..."
warning.  Instead, I get

** (emacs:6638): WARNING **: The connection is closed


Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Mon, 27 May 2013 01:38:01 GMT) Full text and rfc822 format available.

Message #20 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Michael Heerdegen <michael_heerdegen <at> web.de>
Cc: Colin Walters <walters <at> verbum.org>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sun, 26 May 2013 18:36:15 -0700
[The bug is that a bleeding-edge GTK Emacs loses child processes
when it's run via sudo; see <http://bugs.gnu.org/14474>.]

I think I may have spotted the problem.
Glib 2.36.2's glib/gmain.c has a function
'ensure_unix_signal_handler_installed_unlocked'
that is run in the dconf worker thread.
This function calls sigaction to replace Emacs's SIGCHLD handler
with glib's own handler g_unix_signal_handler.
Signal handlers are process-wide, so this replacement affects
all threads, including the main (Emacs) thread.

After that happens, Emacs never sees when its children
exit, since g_unix_signal_handler discards Emacs's
child-exit notices, and the Emacs function
deliver_child_signal is never invoked.
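
The effect described above can be reproduced outside Emacs and glib.
The following is only a minimal POSIX sketch, not code from either
project: app_handler stands in for Emacs's SIGCHLD handler and
lib_handler for a handler that a library installs later.  All names
are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t app_calls;  /* bumped by the "application" handler */
static volatile sig_atomic_t lib_calls;  /* bumped by the "library" handler */

static void app_handler(int sig) { (void) sig; app_calls++; }
static void lib_handler(int sig) { (void) sig; lib_calls++; }

/* Install HANDLER for SIGCHLD.  The disposition is per process, so the
   most recent caller wins, whichever thread made the call.  */
static int install_sigchld(void (*handler)(int))
{
  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGCHLD, &sa, NULL);
}

/* Return 1 if only the later ("library") handler saw the child exit,
   0 if the earlier handler ran, -1 on error.  */
int replaced_handler_demo(void)
{
  sigset_t block, old, wait_mask;
  pid_t pid;
  int result = -1;

  app_calls = 0;
  lib_calls = 0;

  /* Block SIGCHLD so it can only be delivered inside sigsuspend.  */
  sigemptyset(&block);
  sigaddset(&block, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
    return -1;
  wait_mask = old;
  sigdelset(&wait_mask, SIGCHLD);

  if (install_sigchld(app_handler) == 0      /* the application's handler */
      && install_sigchld(lib_handler) == 0)  /* silently replaced later */
    {
      pid = fork();
      if (pid == 0)
        _exit(0);
      if (pid > 0)
        {
          /* Wait until some handler has run, then reap the child.  */
          while (app_calls == 0 && lib_calls == 0)
            sigsuspend(&wait_mask);
          waitpid(pid, NULL, 0);
          result = (lib_calls > 0 && app_calls == 0) ? 1 : 0;
        }
    }

  signal(SIGCHLD, SIG_DFL);
  sigprocmask(SIG_SETMASK, &old, NULL);
  return result;
}
```

Calling replaced_handler_demo () from a single-threaded test program
is expected to return 1: the first handler is never invoked once the
second one has been installed.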

The comment for g_child_watch_source_new
says that Emacs isn't supposed to invoke waitpid (-1, ...),
but that's already the case in the Emacs trunk.
Is there another limitation that we
didn't know about, a limitation that says Emacs can't
have signal handlers either?

I'll CC: this to Colin Walters since he seemed to have
a good handle on the situation from the glib point of view; see
<https://bugzilla.gnome.org/show_bug.cgi?id=676167>.

One possibility is to see if we can get Emacs to use
glib's child watcher.  But that's a bit of a delicate balance,
since Emacs must work even when gtk is absent, and it may need
to hand off from its own watcher to glib's watcher, and processes
shouldn't get lost during the handoff.  I don't offhand know how
to do all that.

A simpler but hacky workaround is to not use the graphical interface if
DBUS_SESSION_BUS_ADDRESS is unset.  Something like this:

--- src/xterm.c	2013-05-09 14:49:56 +0000
+++ src/xterm.c	2013-05-27 01:32:44 +0000
@@ -9819,6 +9819,14 @@ x_display_ok (const char *display)
     int dpy_ok = 1;
     Display *dpy;
 
+#ifdef USE_GTK
+    if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
+      {
+	fprintf (stderr, "DBUS_SESSION_BUS_ADDRESS unset, so Gtk is unsafe\n");
+	return 0;
+      }
+#endif
+
     dpy = XOpenDisplay (display);
     if (dpy)
       XCloseDisplay (dpy);






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Mon, 27 May 2013 17:39:02 GMT) Full text and rfc822 format available.

Message #23 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Colin Walters <walters <at> verbum.org>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Jan Djärv <jan.h.d <at> swipnet.se>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Mon, 27 May 2013 10:36:48 -0700
[The context is
http://bugs.gnu.org/14474
]

On 05/27/2013 05:46 AM, Colin Walters wrote:

> Basically it's going to be very hard over time to avoid codepaths
> in the GTK+ stack that call g_spawn_*() indirectly, thus
> installing a SIGCHLD handler

Thanks.  In that case, shouldn't the glib documentation be
changed to warn application developers not to install a SIGCHLD
handler as well?  Currently it warns them only to not call
waitpid(-1, ...).

Are application developers allowed to temporarily mask SIGCHLD?
Emacs does that a lot.

>> One possibility is to see if we can get Emacs to use
>> glib's child watcher.
> That'd be best obviously.

I suspect so too, but it requires more expertise in
glib than I have (which is, basically, nothing).
If I understand things correctly, if Emacs is using
Gtk it should

 * never call sigaction (SIGCHLD, ...) or signal (SIGCHLD, ...)
   or waitpid (-1, ...).
   E.g., remove the current call to sigaction (SIGCHLD, ...),
   in src/process.c's init_process_emacs.
   
 * Whenever Emacs creates a child process, use the
   following pattern:

       block SIGCHLD;
       pid = vfork ();
       if (pid > 0)
         {
           record pid in Emacs's process table, as location 'loc';
           record in *loc that glib is watching this pid;
           g_child_watch_add (pid, watcher, loc);
         }
       unblock SIGCHLD;

  * never call waitpid (pid, ...) if PID is recorded
    in Emacs's process table as something that glib is
    watching.

  * Add a glue function ("watcher", above) that does
    something like this:

      void watcher (GPid pid, gint status, gpointer loc) {
	block SIGCHLD
        record that PID exited with status STATUS, by modifying *LOC,
	  sort of like what's currently done in handle_child_signal;
        if (input_available_clear_time)
	  *input_available_clear_time = make_emacs_time (0, 0);
        unblock SIGCHLD
     }

But this sounds incomplete.  No doubt there's something
about the main loop, or setting up the watchers, that I don't
know about.  E.g., how does one remove the watcher once it
has fired and told us that the process has exited?
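
The "block SIGCHLD, fork, record, unblock" step of the list above can
be sketched in plain POSIX.  This is only an illustration under stated
assumptions: it uses fork rather than vfork, it leaves out glib
entirely (the comment marks where g_child_watch_add would go), and the
table and function names are hypothetical.  The point of blocking is
that no SIGCHLD handler can run before the pid has been recorded.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

struct child_slot { pid_t pid; int recorded; };
static struct child_slot child_table[MAX_CHILDREN];
static int n_children;

/* Start a child that exits with EXIT_CODE.  Return its slot in
   child_table, or -1 on failure.  */
int spawn_recorded(int exit_code)
{
  sigset_t block, old;
  pid_t pid;
  int slot = -1;

  if (n_children == MAX_CHILDREN)
    return -1;

  sigemptyset(&block);
  sigaddset(&block, SIGCHLD);
  if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
    return -1;

  pid = fork();
  if (pid == 0)
    {
      sigprocmask(SIG_SETMASK, &old, NULL);
      _exit(exit_code);
    }
  if (pid > 0)
    {
      slot = n_children++;
      child_table[slot].pid = pid;
      /* This is where the pseudocode above would call
         g_child_watch_add (pid, watcher, loc).  */
      child_table[slot].recorded = 1;
    }

  sigprocmask(SIG_SETMASK, &old, NULL);  /* unblock only after recording */
  return slot;
}

/* Stand-in for the watcher: reap the child in SLOT and return its
   exit code, or -1 if the slot is unknown or already reaped.  */
int reap_recorded(int slot)
{
  int status;
  pid_t r;

  if (slot < 0 || slot >= n_children || !child_table[slot].recorded)
    return -1;
  do
    r = waitpid(child_table[slot].pid, &status, 0);
  while (r < 0 && errno == EINTR);
  child_table[slot].recorded = 0;
  return (r > 0 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

For example, reap_recorded (spawn_recorded (5)) should yield 5, and a
second reap of the same slot should yield -1.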

I'll CC: this to Jan Djärv, who knows about gtk, to
see if he can help.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Mon, 27 May 2013 18:27:06 GMT) Full text and rfc822 format available.

Message #26 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Colin Walters <walters <at> verbum.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Mon, 27 May 2013 08:46:20 -0400
On Sun, 2013-05-26 at 18:36 -0700, Paul Eggert wrote:

> but that's already the case in the Emacs trunk.
> Is there another limitation that we
> didn't know about, a limitation that says Emacs can't
> have signal handlers either?

Basically it's going to be very hard over time to avoid codepaths
in the GTK+ stack that call g_spawn_*() indirectly, thus
installing a SIGCHLD handler, particularly due to the pluggable nature of
Gio.

> I'll CC: this to Colin Walters since he seemed to have
> a good handle on the situation from the glib point of view; see
> <https://bugzilla.gnome.org/show_bug.cgi?id=676167>.

Yeah, I don't think much has changed since then.

> One possibility is to see if we can get Emacs to use
> glib's child watcher.

That'd be best obviously.

>   But that's a bit of a delicate balance,
> since Emacs must work even when gtk is absent,

Bear in mind that GLib is usable without gtk.  Even if you don't
have an X connection, if the GLib mainloop is linked into the process,
I don't see a reason not to use it.

>  and it may need
> to hand off from its own watcher to glib's watcher, and processes
> shouldn't get lost during the handoff. 

Would Emacs really be spawning processes before initializing
the frontend?

> A simpler but hacky workaround is to not use the graphical interface if
> DBUS_SESSION_BUS_ADDRESS is unset.

I don't see a real problem with that as a temporary thing.

Anyways, if there is something I can do GLib side, let me know.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Tue, 28 May 2013 16:58:02 GMT) Full text and rfc822 format available.

Message #29 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Michael Heerdegen <michael_heerdegen <at> web.de>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Tue, 28 May 2013 09:56:06 -0700
In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
something like the following milder workaround was suggested instead.
Michael, does this patch work around the bug for your test case?

=== modified file 'src/xterm.c'
--- src/xterm.c	2013-05-09 14:49:56 +0000
+++ src/xterm.c	2013-05-28 16:34:44 +0000
@@ -9897,6 +9897,13 @@ x_term_init (Lisp_Object display_name, c
 
         XSetLocaleModifiers ("");
 
+	/* If D-Bus is not already configured, inhibit D-Bus autolaunch,
+	   as autolaunch can mess up Emacs's SIGCHLD handler.
+	   FIXME: Rewrite subprocess handlers to use glib's child watchers.
+	   See Bug#14474.  */
+	if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
+	  xputenv ("DBUS_SESSION_BUS_ADDRESS=");
+
         /* Emacs can only handle core input events, so make sure
            Gtk doesn't use Xinput or Xinput2 extensions.  */
 	xputenv ("GDK_CORE_DEVICE_EVENTS=1");
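
Outside the Emacs sources, the "set it only if it is unset" idea in
this patch can be sketched with the standard getenv and setenv calls
in place of Emacs's egetenv and xputenv.  The function name is
hypothetical, and the placeholder value is a parameter because which
value reliably inhibits autolaunch is an assumption that needs testing.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* If DBUS_SESSION_BUS_ADDRESS is unset, set it to PLACEHOLDER so that
   the variable is at least present in the environment.
   Return 1 if the variable was set here, 0 if it was already present
   (and left untouched), -1 on error.  */
int inhibit_dbus_autolaunch(const char *placeholder)
{
  if (getenv("DBUS_SESSION_BUS_ADDRESS") != NULL)
    return 0;
  /* Final argument 0: never overwrite an existing value.  */
  return setenv("DBUS_SESSION_BUS_ADDRESS", placeholder, 0) == 0 ? 1 : -1;
}
```

A caller would run this once, before the toolkit is initialized, for
instance as inhibit_dbus_autolaunch ("").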




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Tue, 28 May 2013 17:07:02 GMT) Full text and rfc822 format available.

Message #32 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Jan Djärv <jan.h.d <at> swipnet.se>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>, 14474 <at> debbugs.gnu.org,
	Colin Walters <walters <at> verbum.org>
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Tue, 28 May 2013 19:04:59 +0200
Hello.

27 maj 2013 kl. 19:36 skrev Paul Eggert <eggert <at> cs.ucla.edu>:

> [The context is
> http://bugs.gnu.org/14474
> ]
> 
> On 05/27/2013 05:46 AM, Colin Walters wrote:
> 
>> Basically it's going to be very hard over time to avoid codepaths
>> in the GTK+ stack that call g_spawn_*() indirectly, thus
>> installing a SIGCHLD handler
> 
> Thanks.  In that case, shouldn't the glib documentation be
> changed to warn application developers not to install a SIGCHLD
> handler as well?  Currently it warns them only to not call
> waitpid(-1, ...).
> 
> Are application developers allowed to temporarily mask SIGCHLD?
> Emacs does that a lot.
> 
>>> One possibility is to see if we can get Emacs to use
>>> glib's child watcher.
>> That'd be best obviously.
> 

> I suspect so too, but it requires more expertise in
> glib than I have (which is, basically, nothing).
> If I understand things correctly, if Emacs is using
> Gtk it should
> 

Actually GLib is linked in whenever one of GSettings, GConf, Gtk or rsvg is used.
I see that the rsvg-only case is not handled in xgselect.c, an oversight.


> * never call sigaction (SIGCHLD, ...) or signal (SIGCHLD, ...)
>   or waitpid (-1, ...).
>   E.g., remove the current call to sigaction (SIGCHLD, ...),
>   in src/process.c's init_process_emacs.
> 
> * Whenever Emacs creates a child process, use the
>   following pattern:
> 
>       block SIGCHLD;
>       pid = vfork ();
>       if (pid > 0)
>         {
>           record pid in Emacs's process table, as location 'loc';
>           record in *loc that glib is watching this pid;
>           g_child_watch_add (pid, watcher, loc);
>         }
>       unblock SIGCHLD;
> 
>  * never call waitpid (pid, ...) if PID is recorded
>    in Emacs's process table as something that glib is
>    watching.
> 
>  * Add a glue function ("watcher", above) that does
>    something like this:
> 
>      void watcher (GPid pid, gint status, gpointer loc) {
> 	block SIGCHLD
>        record that PID exited with status STATUS, by modifying *LOC,
> 	  sort of like what's currently done in handle_child_signal;
>        if (input_available_clear_time)
> 	  *input_available_clear_time = make_emacs_time (0, 0);
>        unblock SIGCHLD
>     }
> 
> But this sounds incomplete.  No doubt there's something
> about the main loop, or setting up the watchers, that I don't
> know about.  E.g., how does one remove the watcher once it
> has fired and told us that the process has exited?
> 

Keep track of the return value from g_child_watch_add and pass it to g_source_remove.
I think g_source_remove can be called in the callback function.

We kind of use GLib's main loop in xgselect.c, so child watches should be called from there.
As GLib's main loop is an "all or nothing" approach, we could also move the file descriptor and timeout handling there.  Then xgselect.c could more or less go away.  But there is no real gain in doing that; xgselect works well enough.

	Jan D.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Tue, 28 May 2013 20:44:02 GMT) Full text and rfc822 format available.

Message #35 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Tue, 28 May 2013 22:42:14 +0200
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
> something like the following milder workaround was suggested instead.
> Michael, does this patch work around the bug for your test case?

Thanks for that, but I currently use a precompiled package for my OS
(emacs-snapshot), so I can neither debug C nor test patches.

It would be great if someone else that can reproduce this bug could try
that.  If not, I'll try to build Emacs myself in the next days, hoping
that the problem manifests there, too.


Regards,

Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Fri, 31 May 2013 01:45:02 GMT) Full text and rfc822 format available.

Message #38 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Michael Albinus <michael.albinus <at> gmx.de>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: Using glib's g_file_monitor_file and g_file_monitor_directory
Date: Thu, 30 May 2013 18:42:49 -0700
On 05/28/2013 11:12 PM, Michael Albinus wrote:
> DBUS_SESSION_BUS_ADDRESS="unix:path=/dev/null" seems to be reliable.

OK, thanks, I committed a patch along those lines to the trunk
(bzr 112795); please give it a try.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Fri, 31 May 2013 18:41:02 GMT) Full text and rfc822 format available.

Message #41 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Michael Albinus <michael.albinus <at> gmx.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: Using glib's g_file_monitor_file and g_file_monitor_directory
Date: Fri, 31 May 2013 20:39:04 +0200
Paul Eggert <eggert <at> cs.ucla.edu> writes:

Hi Paul,

> On 05/28/2013 11:12 PM, Michael Albinus wrote:
>> DBUS_SESSION_BUS_ADDRESS="unix:path=/dev/null" seems to be reliable.
>
> OK, thanks, I committed a patch along those lines to the trunk
> (bzr 112795); please give it a try.

You've put the lines inside #ifdef USE_GTK. However, glib could be
linked to Emacs without using gtk. We might need another check, like
#ifdef USE_GLIB. This would be useful in other places, too.

Furthermore, I'd wrap these lines with #ifdef HAVE_DBUS. If Emacs is
compiled without D-Bus support, external processes could still use
D-Bus via autolaunch. You suppress this possibility by overwriting
$DBUS_SESSION_BUS_ADDRESS.

Best regards, Michael.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Fri, 31 May 2013 19:27:01 GMT) Full text and rfc822 format available.

Message #44 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Michael Albinus <michael.albinus <at> gmx.de>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: Using glib's g_file_monitor_file and g_file_monitor_directory
Date: Fri, 31 May 2013 12:24:52 -0700
On 05/31/13 11:39, Michael Albinus wrote:

> You've put the lines inside #ifdef USE_GTK. However, glib could be
> linked to Emacs without using gtk. We might need another check, like
> #ifdef USE_GLIB. This would be useful in other places, too.

That might make sense in the long run, if Emacs is built that way.
But as I understand it, currently Emacs links to glib only when
using Gtk, so the workaround is OK under the current framework.

> Furthermore, I'd wrap these lines with #ifdef HAVE_DBUS. If Emacs is
> compiled without D-Bus support, external processes could still use
> D-Bus via autolaunch. You suppress this possibility by overwriting
> $DBUS_SESSION_BUS_ADDRESS.

Can't the Emacs process autolaunch D-Bus subprocesses even when
HAVE_DBUS is not defined?  That is, some other toolkit that Emacs links to
could autolaunch D-Bus.  So it wouldn't be safe to wrap the lines
with #ifdef HAVE_DBUS.

I do see the problem that you mention.  I think a better fix, though,
is to redo Emacs to use the glib child watcher code -- that will fix
the problem that you mention, along with the problem for Emacs itself.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Fri, 31 May 2013 21:02:01 GMT) Full text and rfc822 format available.

Message #47 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Michael Albinus <michael.albinus <at> gmx.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: Using glib's g_file_monitor_file and g_file_monitor_directory
Date: Fri, 31 May 2013 22:59:59 +0200
Paul Eggert <eggert <at> cs.ucla.edu> writes:

>> You've put the lines inside #ifdef USE_GTK. However, glib could be
>> linked to Emacs without using gtk. We might need another check, like
>> #ifdef USE_GLIB. This would be useful in other places, too.
>
> That might make sense in the long run, if Emacs is built that way.
> But as I understand it, currently Emacs links to glib only when
> using Gtk, so the workaround is OK under the current framework.

The g_file_monitor patch I've shown uses glib. No gtk. And the same is
true for gconf and gsettings, IIUC.

> I do see the problem that you mention.  I think a better fix, though,
> is to redo Emacs to use the glib child watcher code -- that will fix
> the problem that you mention, along with the problem for Emacs itself.

Yep, this might be better.

Best regards, Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Fri, 31 May 2013 22:11:01 GMT) Full text and rfc822 format available.

Message #50 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Jan Djärv <jan.h.d <at> swipnet.se>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Michael Albinus <michael.albinus <at> gmx.de>,
	"14474 <at> debbugs.gnu.org" <14474 <at> debbugs.gnu.org>
Subject: Re: bug#14474: Using glib's g_file_monitor_file and
	g_file_monitor_directory
Date: Sat, 1 Jun 2013 00:08:40 +0200
Hello.

31 maj 2013 kl. 21:24 skrev Paul Eggert <eggert <at> cs.ucla.edu>:

> On 05/31/13 11:39, Michael Albinus wrote:
> 
>> You've put the lines inside #ifdef USE_GTK. However, glib could be
>> linked to Emacs without using gtk. We might need another check, like
>> #ifdef USE_GLIB. This would be useful in other places, too.
> 
> That might make sense in the long run, if Emacs is built that way.
> But as I understand it, currently Emacs links to glib only when
> using Gtk, so the workaround is OK under the current framework.
> 

This is false. Emacs links with glib if one of GConf, GSettings or rsvg is linked in. They are independent of Gtk.

      Jan D.



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 01 Jun 2013 01:06:02 GMT) Full text and rfc822 format available.

Message #53 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Colin Walters <walters <at> verbum.org>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Michael Albinus <michael.albinus <at> gmx.de>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Fri, 31 May 2013 18:03:55 -0700
On 05/27/2013 05:46 AM, Colin Walters wrote:
>> One possibility is to see if we can get Emacs to use
>> glib's child watcher.
> That'd be best obviously.

I looked into this a bit, and found a problem.
Emacs wants to be notified about child processes
that are stopped, so it invokes waitpid with the
WUNTRACED option, but glib never uses WUNTRACED
when invoking waitpid.  If Emacs used glib to watch for
child processes, Emacs would not be informed about
a child process changing state because it has
stopped.  (Similarly for WCONTINUED and processes
that have been continued.)
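
To make the difference concrete, here is a small POSIX sketch (not
Emacs or glib code; the function names are hypothetical) in which a
parent observes a child stopping and continuing.  Those two events
are reported only because WUNTRACED and WCONTINUED are passed to
waitpid; with options 0, only termination would be reported.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* waitpid that retries when interrupted by a signal.
   Return 0 on success, -1 on failure.  */
static int wait_retry(pid_t pid, int *status, int options)
{
  pid_t r;
  do
    r = waitpid(pid, status, options);
  while (r < 0 && errno == EINTR);
  return r == pid ? 0 : -1;
}

/* Return a bit mask of the state changes the parent observed:
   1 = child stopped, 2 = child continued, 4 = child killed by a signal.
   Return -1 if fork fails.  */
int observe_stop_and_continue(void)
{
  int seen = 0, status;
  pid_t pid = fork();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      raise(SIGSTOP);   /* the child stops itself */
      for (;;)
        pause();        /* once continued, wait to be killed */
    }

  if (wait_retry(pid, &status, WUNTRACED) == 0 && WIFSTOPPED(status))
    seen |= 1;
  kill(pid, SIGCONT);
  if (wait_retry(pid, &status, WCONTINUED) == 0 && WIFCONTINUED(status))
    seen |= 2;
  kill(pid, SIGKILL);
  if (wait_retry(pid, &status, 0) == 0 && WIFSIGNALED(status))
    seen |= 4;
  return seen;
}
```

On a system that supports WCONTINUED, the function should return 7,
meaning all three state changes were seen.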

Perhaps glib needs a new function, which lets the
caller specify additional options to be given to
waitpid?  Something like this, say:

   g_child_watch_source_new_full (pid, WUNTRACED | WCONTINUED)

Then, g_child_watch_source_new (pid) would be equivalent to
g_child_watch_source_new_full (pid, 0).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 01 Jun 2013 01:25:01 GMT) Full text and rfc822 format available.

Message #56 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Colin Walters <walters <at> verbum.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Michael Albinus <michael.albinus <at> gmx.de>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Fri, 31 May 2013 21:22:17 -0400
On Fri, 2013-05-31 at 18:03 -0700, Paul Eggert wrote:
> On 05/27/2013 05:46 AM, Colin Walters wrote:
> >> One possibility is to see if we can get Emacs to use
> >> glib's child watcher.
> > That'd be best obviously.
> 
> I looked into this a bit, and found a problem.
> Emacs wants to be notified about child processes
> that are stopped, 

Why, out of curiosity?

>    g_child_watch_source_new_full (pid, WUNTRACED | WCONTINUED)

We could add that to glib-unix.h probably, yeah.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 01 Jun 2013 06:17:02 GMT) Full text and rfc822 format available.

Message #59 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Colin Walters <walters <at> verbum.org>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Michael Albinus <michael.albinus <at> gmx.de>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Fri, 31 May 2013 23:14:54 -0700
On 05/31/2013 06:22 PM, Colin Walters wrote:
> Why, out of curiosity?

Emacs has a function process-status that returns
a process's status.  Possible statuses include

run  -- for a process that is running.
stop -- for a process stopped but continuable.
exit -- for a process that has exited.
signal -- for a process that has got a fatal signal.

To implement this, Emacs keeps track, for each of its
child processes, what that process's status is.
Emacs updates the information that it records about
a child process whenever it's notified about
a child process status change.
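
As a rough illustration of that bookkeeping, a waitpid status word
can be mapped to the four status names listed above.  This is a
hedged sketch and not the code in src/process.c; status_name and
child_outcome are hypothetical names, and treating a continued
process as "run" is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a waitpid status word to one of the status names above.  */
const char *status_name(int status)
{
  if (WIFSTOPPED(status))
    return "stop";
  if (WIFCONTINUED(status))
    return "run";      /* assumption: a continued process is running again */
  if (WIFEXITED(status))
    return "exit";
  if (WIFSIGNALED(status))
    return "signal";
  return "unknown";
}

/* Fork a child that kills itself with SIG (when SIG is nonzero) or
   otherwise exits with EXIT_CODE; reap it and return its status name.  */
const char *child_outcome(int exit_code, int sig)
{
  int status;
  pid_t r, pid = fork();

  if (pid < 0)
    return "error";
  if (pid == 0)
    {
      if (sig != 0)
        raise(sig);
      _exit(exit_code);
    }
  do
    r = waitpid(pid, &status, 0);
  while (r < 0 && errno == EINTR);
  return r == pid ? status_name(status) : "error";
}
```

For instance, child_outcome (3, 0) should give "exit" and
child_outcome (0, SIGKILL) should give "signal".  The "stop" and "run"
results can only appear if waitpid is called with WUNTRACED and
WCONTINUED.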




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Sat, 01 Jun 2013 14:36:02 GMT) Full text and rfc822 format available.

Message #62 received at 14474 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>, 14474 <at> debbugs.gnu.org,
	Michael Albinus <michael.albinus <at> gmx.de>,
	Colin Walters <walters <at> verbum.org>
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Sat, 01 Jun 2013 10:33:29 -0400
> Emacs has a function process-status that returns
> a process's status.

Not only that, but the process-sentinel is called when the status
changes.  This said, I don't know if there are any process-sentinels out
there that need to be told when a process is stopped or "continued".


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Mon, 03 Jun 2013 16:12:02 GMT)

Message #65 received at 14474 <at> debbugs.gnu.org:

From: Colin Walters <walters <at> verbum.org>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Paul Eggert <eggert <at> cs.ucla.edu>,
	Michael Albinus <michael.albinus <at> gmx.de>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Mon, 03 Jun 2013 12:09:39 -0400
On Sat, 2013-06-01 at 10:33 -0400, Stefan Monnier wrote:
> > Emacs has a function process-status that returns
> > a process's status.
> 
> Not only that, but the process-sentinel is called when the status
> changes.  This said, I don't know if there are any process-sentinels out
> there that need to be told when a process is stopped or "continued".

Right; I kind of doubt it.  Regardless though, I filed:

https://bugzilla.gnome.org/show_bug.cgi?id=701538

Are there any other blocking issues for Emacs using the GLib mainloop?
If that's the only one I can probably get around to doing a patch this
week.

I suspect though you could simply not report stopped status, and not
break any real world programs.  The only thing I can think of is a
multiprocess application which sends SIGSTOP to children (but why would
they do that?).






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Tue, 04 Jun 2013 07:23:02 GMT)

Message #68 received at 14474 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Colin Walters <walters <at> verbum.org>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Michael Albinus <michael.albinus <at> gmx.de>,
	Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Tue, 04 Jun 2013 00:20:19 -0700
On 06/03/2013 09:09 AM, Colin Walters wrote:
> Are there any other blocking issues for Emacs using the GLib mainloop?
> If that's the only one I can probably get around to doing a patch this
> week.

Don't know of any.  But I haven't implemented it yet.

If it's the only problem, perhaps the Emacs code should be written
to run on older glibs, where it'll ignore child-process stops and continues.
If this turns out to be a real problem we can disable it (i.e., use the
current godawful workaround) on older glibs.  But anyway, the idea is
to prevent this from being a blocking issue for Emacs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Tue, 04 Jun 2013 17:15:02 GMT)

Message #71 received at 14474 <at> debbugs.gnu.org:

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Tue, 04 Jun 2013 19:12:41 +0200
Hi Paul,

> In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
> something like the following milder workaround was suggested instead.
> Michael, does this patch work around the bug for your test case?

Have you already installed it to trunk?  The issue is fixed for me after
upgrading my emacs-snapshot to

(emacs-version) ==>

GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.2)
of 2013-06-03 on dex, modified by Debian


Thanks,

Michael.

>
> === modified file 'src/xterm.c'
> --- src/xterm.c	2013-05-09 14:49:56 +0000
> +++ src/xterm.c	2013-05-28 16:34:44 +0000
> @@ -9897,6 +9897,13 @@ x_term_init (Lisp_Object display_name, c
>  
>          XSetLocaleModifiers ("");
>  
> +	/* If D-Bus is not already configured, inhibit D-Bus autolaunch,
> +	   as autolaunch can mess up Emacs's SIGCHLD handler.
> +	   FIXME: Rewrite subprocess handlers to use glib's child watchers.
> +	   See Bug#14474.  */
> +	if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
> +	  xputenv ("DBUS_SESSION_BUS_ADDRESS=");
> +
>          /* Emacs can only handle core input events, so make sure
>             Gtk doesn't use Xinput or Xinput2 extensions.  */
>  	xputenv ("GDK_CORE_DEVICE_EVENTS=1");




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Wed, 05 Jun 2013 17:25:01 GMT)

Message #74 received at 14474 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Colin Walters <walters <at> verbum.org>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
	Michael Albinus <michael.albinus <at> gmx.de>,
	Stefan Monnier <monnier <at> IRO.UMontreal.CA>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Wed, 05 Jun 2013 10:21:46 -0700
I found another problem with trying to have Emacs use glib's child watcher.
glib's signal handling code uses SA_RESTART and SA_NOCLDSTOP.
Both flags are non-starters for Emacs.  SA_NOCLDSTOP, I suppose,
could be conditionalized based on the discussion in Gnome bug
reports 701538 and 562501.  But SA_RESTART is more of a worry.
An interactive Emacs doesn't want SA_RESTART, because Emacs wants
long-running syscalls to be interrupted after a signal, not
restarted.
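
The difference SA_RESTART makes can be sketched in a few lines of C.
This is an illustration only, not code from Emacs or glib;
read_across_signal and on_alarm are hypothetical names, and the 100 ms
timer is an arbitrary choice that assumes the process reaches read
before the signal fires.  The handler makes a byte available on the
pipe, so a restarted read completes instead of blocking forever.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;

/* On SIGALRM, make one byte available on the pipe.  write is
   async-signal-safe.  */
static void
on_alarm (int sig)
{
  (void) sig;
  int saved_errno = errno;
  char c = 'x';
  if (write (wake_fd, &c, 1) < 0)
    { /* Nothing useful can be done here.  */ }
  errno = saved_errno;
}

/* Block in read on an empty pipe while a timer signal arrives.
   FLAGS becomes the handler's sa_flags.  Return read's result and
   store the errno it left behind in *ERR.  */
static ssize_t
read_across_signal (int flags, int *err)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -2;
  wake_fd = fds[1];

  struct sigaction sa, old;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_alarm;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = flags;
  sigaction (SIGALRM, &sa, &old);

  /* One-shot timer: deliver SIGALRM after 100 ms.  */
  struct itimerval tv;
  memset (&tv, 0, sizeof tv);
  tv.it_value.tv_usec = 100000;
  setitimer (ITIMER_REAL, &tv, NULL);

  char c;
  errno = 0;
  ssize_t n = read (fds[0], &c, 1);
  *err = errno;

  sigaction (SIGALRM, &old, NULL);
  close (fds[0]);
  close (fds[1]);
  return n;
}
```

With flags of 0 the read is expected to fail with EINTR, which is the
behavior an interactive program wants when it needs to regain control;
with SA_RESTART the kernel restarts the read and the caller never sees
the interruption.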

I thought of a way to work around this problem: have Emacs catch
SIGCHLD using its own flags, and call glib's SIGCHLD handler as part
of Emacs's SIGCHLD handler.  So I installed the patch quoted at the
end of this message into the Emacs trunk as bzr 112859.  If you've
had D-bus problems please try this new approach.
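
In isolation, the chaining idea looks roughly like the following sketch.
It is a simplified stand-alone illustration, not the patch itself:
lib_handler stands in for a library's private SIGCHLD handler, and all
names here are hypothetical.  The application installs its handler
second, with its own flags, and remembers the handler it displaced so
that it can call it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>

typedef void (*handler_fn) (int);

static volatile sig_atomic_t lib_calls, app_calls;

/* Stand-in for a library's private SIGCHLD handler.  */
static void lib_handler (int sig) { (void) sig; lib_calls++; }

/* Default chained handler: does nothing.  */
static void no_op (int sig) { (void) sig; }
static handler_fn volatile chained = no_op;

/* The application's handler does its own work, then calls the
   handler it displaced.  */
static void
app_handler (int sig)
{
  app_calls++;
  chained (sig);
}

/* Pretend to be the library: install its handler with its flags.  */
static void
install_lib_handler (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = lib_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
  sigaction (SIGCHLD, &sa, NULL);
}

/* Install the application's handler with its own flags (no
   SA_RESTART), saving any previous plain handler for chaining.
   A previous SA_SIGINFO handler has a different signature and is
   not chained.  */
static void
install_app_handler (void)
{
  struct sigaction sa, old;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = app_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction (SIGCHLD, &sa, &old);
  if (! (old.sa_flags & SA_SIGINFO)
      && old.sa_handler != SIG_DFL && old.sa_handler != SIG_IGN
      && old.sa_handler != app_handler)
    chained = old.sa_handler;
}
```

After install_lib_handler and then install_app_handler, a single
SIGCHLD runs both handlers, but the flags in force are the
application's, so system calls are interrupted rather than restarted.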

This raises three more questions for glib, though.  First,
why does glib use SA_RESTART?  If it's to avoid having application
syscalls fail with errno==EINTR, then we're OK.  But if it's to
avoid having glib's internal syscalls fail with errno==EINTR, then
we have a problem, as that can happen with the following patch
(and it can also happen with vanilla Emacs 24.3).

Second, should there be a more robust way for Emacs to invoke
glib's SIGCHLD handler?  The code below is a bit of a hack:
it uses g_source_unref (g_child_watch_source_new (0)) to
create and free a dummy SIGCHLD source, the only reason being
to trick glib into installing its SIGCHLD handler.  It also assumes
that glib does not use SA_SIGINFO.  This all seems fairly fragile.

Third, if a glib memory allocation fails, what does Emacs do?
Emacs tries hard not to exit when there's a memory allocation failure,
but I worry that glib will simply call 'exit' if malloc fails, which
is not good.

=== modified file 'src/ChangeLog'
--- src/ChangeLog	2013-06-05 12:17:02 +0000
+++ src/ChangeLog	2013-06-05 17:04:13 +0000
@@ -1,3 +1,17 @@
+2013-06-05  Paul Eggert  <eggert <at> cs.ucla.edu>
+
+	Chain glib's SIGCHLD handler from Emacs's (Bug#14474).
+	* process.c (dummy_handler): New function.
+	(lib_child_handler): New static var.
+	(handle_child_signal): Invoke it.
+	(catch_child_signal): If a library has set up a signal handler,
+	save it into lib_child_handler.
+	(init_process_emacs): If using glib and not on Windows, tickle glib's
+	child-handling code so that it initializes its private SIGCHLD handler.
+	* syssignal.h (SA_SIGINFO): Default to 0.
+	* xterm.c (x_term_init): Remove D-bus hack that I installed on May
+	31; it should no longer be needed now.
+
 2013-06-05  Michael Albinus  <michael.albinus <at> gmx.de>
 
 	* emacs.c (main) [HAVE_GFILENOTIFY]: Call globals_of_gfilenotify.

=== modified file 'src/process.c'
--- src/process.c	2013-06-03 18:47:35 +0000
+++ src/process.c	2013-06-05 17:04:13 +0000
@@ -6100,6 +6100,12 @@
    might inadvertently reap a GTK-created process that happened to
    have the same process ID.  */
 
+/* LIB_CHILD_HANDLER is a SIGCHLD handler that Emacs calls while doing
+   its own SIGCHLD handling.  On POSIXish systems, glib needs this to
+   keep track of its own children.  The default handler does nothing.  */
+static void dummy_handler (int sig) {}
+static signal_handler_t volatile lib_child_handler = dummy_handler;
+
 /* Handle a SIGCHLD signal by looking for known child processes of
    Emacs whose status have changed.  For each one found, record its
    new status.
@@ -6184,6 +6190,8 @@
 	    }
 	}
     }
+
+  lib_child_handler (sig);
 }
 
 static void
@@ -7035,9 +7043,13 @@
 void
 catch_child_signal (void)
 {
-  struct sigaction action;
+  struct sigaction action, old_action;
   emacs_sigaction_init (&action, deliver_child_signal);
-  sigaction (SIGCHLD, &action, 0);
+  sigaction (SIGCHLD, &action, &old_action);
+  eassert (! (old_action.sa_flags & SA_SIGINFO));
+  if (old_action.sa_handler != SIG_DFL && old_action.sa_handler != SIG_IGN
+      && old_action.sa_handler != deliver_child_signal)
+    lib_child_handler = old_action.sa_handler;
 }
 
 
@@ -7055,6 +7067,11 @@
   if (! noninteractive || initialized)
 #endif
     {
+#if defined HAVE_GLIB && !defined WINDOWSNT
+      /* Tickle glib's child-handling code so that it initializes its
+	 private SIGCHLD handler.  */
+      g_source_unref (g_child_watch_source_new (0));
+#endif
       catch_child_signal ();
     }
 

=== modified file 'src/syssignal.h'
--- src/syssignal.h	2013-01-02 16:13:04 +0000
+++ src/syssignal.h	2013-06-05 17:04:13 +0000
@@ -50,6 +50,10 @@
 # define NSIG NSIG_MINIMUM
 #endif
 
+#ifndef SA_SIGINFO
+# define SA_SIGINFO 0
+#endif
+
 #ifndef emacs_raise
 # define emacs_raise(sig) raise (sig)
 #endif

=== modified file 'src/xterm.c'
--- src/xterm.c	2013-05-31 01:41:52 +0000
+++ src/xterm.c	2013-06-05 17:04:13 +0000
@@ -9897,13 +9897,6 @@
 
         XSetLocaleModifiers ("");
 
-	/* If D-Bus is not already configured, inhibit D-Bus autolaunch,
-	   as autolaunch can mess up Emacs's SIGCHLD handler.
-	   FIXME: Rewrite subprocess handlers to use glib's child watchers.
-	   See Bug#14474.  */
-	if (! egetenv ("DBUS_SESSION_BUS_ADDRESS"))
-	  xputenv ("DBUS_SESSION_BUS_ADDRESS=unix:path=/dev/null");
-
         /* Emacs can only handle core input events, so make sure
            Gtk doesn't use Xinput or Xinput2 extensions.  */
 	xputenv ("GDK_CORE_DEVICE_EVENTS=1");






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#14474; Package emacs. (Wed, 09 Sep 2020 13:53:02 GMT)

Message #77 received at 14474 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Michael Heerdegen <michael_heerdegen <at> web.de>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 14474 <at> debbugs.gnu.org
Subject: Re: bug#14474: 24.3.50; Zombie subprocesses (again)
Date: Wed, 09 Sep 2020 15:52:15 +0200
Michael Heerdegen <michael_heerdegen <at> web.de> writes:

>> In <http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00628.html>
>> something like the following milder workaround was suggested instead.
>> Michael, does this patch work around the bug for your test case?
>
> Have you already installed it to trunk?  The issue is fixed for me after
> upgrading my emacs-snapshot to
>
> (emacs-version) ==>
>
> GNU Emacs 24.3.50.1 (x86_64-pc-linux-gnu, GTK+ Version 3.8.2)
> of 2013-06-03 on dex, modified by Debian

There was some followup talk here about other possible glib problems,
but it looks like Paul fixed those two?  (I just skimmed the patch,
which was applied at the time.)

So I'm closing this bug report; if there are any further issues here,
please respond to the debbugs address and we'll reopen.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 14474 <at> debbugs.gnu.org and michael_heerdegen <at> web.de Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 09 Sep 2020 13:53:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 08 Oct 2020 11:24:09 GMT)

