GNU bug report logs - #21245
25.0.50; [PATCH] SIGSEGV when misusing (backtrace-frame) from custom debugger

Package: emacs;

Reported by: Pip Cet <pipcet <at> gmail.com>

Date: Wed, 12 Aug 2015 22:47:02 UTC

Severity: normal

Tags: patch

Found in version 25.0.50

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#21245; Package emacs. (Wed, 12 Aug 2015 22:47:02 GMT)

Acknowledgement sent to Pip Cet <pipcet <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 12 Aug 2015 22:47:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Pip Cet <pipcet <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.50; [PATCH] SIGSEGV when misusing (backtrace-frame) from custom debugger
Date: Wed, 12 Aug 2015 22:45:10 +0000
[Message part 1 (text/plain, inline)]
With the current Git tree, I am running into reproducible segfaults
that happen when a custom debugger routine is installed with (let
((debugger #'org-elisp-debugger)) ...code...) and the right (wrong)
code is run.  I believe I have traced the segfault to a bug in
eval.c that I have been able to fix, but in the process I discovered
two more potentially problematic scenarios for which I am not yet
including a fix.

Unfortunately, this is another bug report for which the backtrace
information isn't very helpful, but I have included it for
completeness.

In GNU Emacs 25.0.50.23 (x86_64-unknown-linux-gnu, GTK+ Version 3.16.6)
 of 2015-08-12 on ...
Repository revision: e4de91d8dd2a06125140fb42772ec84a2f7ab290
Windowing system distributor `The X.Org Foundation', version 11.0.11702000
System Description:    Debian GNU/Linux unstable (sid)

Configured using:
 `configure 'CFLAGS=-O0 -g3''

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GCONF GSETTINGS NOTIFY
LIBSELINUX GNUTLS LIBXML2 FREETYPE XFT ZLIB TOOLKIT_SCROLL_BARS GTK3 X11

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Quit
completing-read-default: Command attempted to use minibuffer while in minibuffer
Quit [2 times]

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util help-fns help-mode easymenu cl-loaddefs pcase cl-lib mail-prsvr
mail-utils time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment
elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind gfilenotify
dynamic-setting system-font-setting font-render-setting move-toolbar gtk
x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 81334 6405)
 (symbols 48 19075 0)
 (miscs 40 42 86)
 (strings 32 13247 4299)
 (string-bytes 1 376617)
 (vectors 16 11220)
 (vector-slots 8 413595 4312)
 (floats 8 129 124)
 (intervals 56 180 0)
 (buffers 976 11)
 (heap 1024 18494 1047))

-----

This is arguably a case of "don't do that, then", but I think it's a
bug. However, it's a bug that is triggered exclusively by installing a
custom debugger that is itself buggy.

I invested quite a bit of effort in finding this bug and attempting to
reproduce it without using too much local Emacs Lisp code that isn't
public at the moment, but have succeeded only in the first of those. I
do have a test case here, but it relies on a modified version of the
org-mode code and some extra code.

The problem, to proceed with the analysis, is an invalid "arguments"
pointer in the backtrace structure on the specpdl stack.  Several code
paths in and below eval_sub allocate the arguments array in temporary
storage and free that storage before control reaches the end of
eval_sub, even though eval_sub calls the debugger next, which might
still need the arguments residing in the now-freed storage.
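
The lifetime rule at stake can be shown outside Emacs. What follows is a
minimal standalone sketch, not code from eval.c: struct frame,
debugger_sum_args and call_with_temp_args are hypothetical stand-ins for
the backtrace record, the debugger and a caller that keeps its arguments
in temporary storage. It shows one safe ordering, assuming the exit hook
is the only remaining reader of the array: run the hook while the array
is alive, then drop the borrowed pointer, then free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a backtrace record on the specpdl stack.  */
struct frame
{
  const int *args;    /* borrowed pointer into the caller's temp array */
  size_t nargs;
  int debug_on_exit;  /* nonzero: run the debugger when the call exits */
};

/* Hypothetical debugger hook: reads every argument through the frame.  */
static int
debugger_sum_args (const struct frame *f)
{
  int sum = 0;
  for (size_t i = 0; i < f->nargs; i++)
    sum += f->args[i];
  return sum;
}

/* Hold the arguments 1..N in temporary storage for the duration of a
   call.  Returns what the debugger saw, or -1 if it did not run.  */
static int
call_with_temp_args (struct frame *f, size_t n)
{
  int *vals = calloc (n ? n : 1, sizeof *vals);
  if (!vals)
    return -1;
  for (size_t i = 0; i < n; i++)
    vals[i] = (int) (i + 1);

  f->args = vals;
  f->nargs = n;

  int seen = -1;
  /* Run the exit hook now, while VALS still exists.  */
  if (f->debug_on_exit)
    {
      seen = debugger_sum_args (f);
      f->debug_on_exit = 0;
    }

  /* Drop the borrowed pointer before the storage goes away, so that
     nothing can follow it afterwards.  */
  f->args = NULL;
  f->nargs = 0;
  free (vals);
  return seen;
}
```

With debug_on_exit set, call_with_temp_args (&f, 4) returns 10, and the
frame no longer points at the freed array once the call returns.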

The culprit in my case was apply_lambda, which has this code:

  /* Do the debug-on-exit now, while arg_vector still exists.  */
  if (backtrace_debug_on_exit (specpdl + count))
    {
      /* Don't do it again when we return to eval.  */
      set_backtrace_debug_on_exit (specpdl + count, false);
      tem = call_debugger (list2 (Qexit, tem));
    }

Which I believe should read:


  /* Do the debug-on-exit now, while arg_vector still exists.  */
  if (backtrace_debug_on_exit (specpdl + count))
    {
      tem = call_debugger (list2 (Qexit, tem));
      /* Don't do it again when we return to eval.  */
      set_backtrace_debug_on_exit (specpdl + count, false);
    }

In my case, the debugger (again, this is a debugger bug) called
(backtrace-debug n t), which set the debug-on-exit flag of the specpdl
entry that had just had it cleared. The debugger would then call
(backtrace-frame n), and random stack data would be returned in the
resulting list, causing the segfault. This happened only when garbage
collection ran after the debugger had been started but before it had
called (backtrace-frame n).
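
The effect of the ordering can be checked in isolation. The sketch below
is a simplified model, not Emacs code: struct frame holds only the
debug-on-exit flag, and rearming_debugger is a hypothetical debugger
that sets the flag again on the frame being exited. Each function
returns the flag value that the caller would see afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a backtrace record: only the flag matters here.  */
struct frame
{
  bool debug_on_exit;
};

/* Hypothetical misbehaving debugger: re-arms the flag on the frame
   that is being exited.  */
static void
rearming_debugger (struct frame *f)
{
  f->debug_on_exit = true;
}

/* Ordering as in the first excerpt above: clear, then call.  The
   debugger's write to the flag survives the exit handling.  */
static bool
exit_clear_then_call (struct frame *f)
{
  if (f->debug_on_exit)
    {
      f->debug_on_exit = false;
      rearming_debugger (f);
    }
  return f->debug_on_exit;
}

/* Ordering as in the proposed change: call, then clear.  Whatever the
   debugger did to the flag is overwritten.  */
static bool
exit_call_then_clear (struct frame *f)
{
  if (f->debug_on_exit)
    {
      rearming_debugger (f);
      f->debug_on_exit = false;
    }
  return f->debug_on_exit;
}
```

In this model the first ordering leaves the flag set, so a later exit
check would run the debugger a second time; the second ordering leaves
it clear.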

I am also very suspicious of the two cases of eval_sub that read:

      else if (XSUBR (fun)->max_args == MANY)
        {
          /* Pass a vector of evaluated arguments.  */
          Lisp_Object *vals;
          ptrdiff_t argnum = 0;
          USE_SAFE_ALLOCA;

          SAFE_ALLOCA_LISP (vals, XINT (numargs));

          GCPRO3 (args_left, fun, fun);
          gcpro3.var = vals;
          gcpro3.nvars = 0;

          while (!NILP (args_left))
            {
              vals[argnum++] = eval_sub (Fcar (args_left));
              args_left = Fcdr (args_left);
              gcpro3.nvars = argnum;
            }

          set_backtrace_args (specpdl + count, vals, XINT (numargs));

          val = (XSUBR (fun)->function.aMANY) (XINT (numargs), vals);
          UNGCPRO;
          SAFE_FREE ();
        }
      else

(which SAFE_FREEs the array that might yet be used by the debugger) and:

    {
      Lisp_Object numargs;
      Lisp_Object argvals[8];

      ...

      else
        {
          GCPRO3 (args_left, fun, fun);
          gcpro3.var = argvals;
          gcpro3.nvars = 0;

          maxargs = XSUBR (fun)->max_args;
          for (i = 0; i < maxargs; args_left = Fcdr (args_left))
            {
              argvals[i] = eval_sub (Fcar (args_left));
              gcpro3.nvars = ++i;
            }

          UNGCPRO;

          set_backtrace_args (specpdl + count, argvals, XINT (numargs));

          ...
        }
    }

which uses a stack array that goes out of scope before the debugger is
called.
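
For the block-scoped array, one possible remedy is to give the array a
lifetime that covers the debugger call. The sketch below is only an
illustration under that assumption and is not taken from eval.c:
eval_call, MAX_ARGS and first_arg_seen_by_debugger are hypothetical
names, and the array is declared at function scope so that the pointer
stored in the frame is still valid when the hook runs after the inner
block.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_ARGS = 8 };

/* Hypothetical stand-in for a backtrace record.  */
struct frame
{
  const int *args;
  size_t nargs;
};

/* Hypothetical debugger hook: looks at the first recorded argument.  */
static int
first_arg_seen_by_debugger (const struct frame *f)
{
  return f->nargs > 0 ? f->args[0] : -1;
}

static int
eval_call (size_t nargs)
{
  struct frame f = { NULL, 0 };
  int argvals[MAX_ARGS];  /* function scope: outlives the block below */

  if (nargs > MAX_ARGS)
    nargs = MAX_ARGS;

  {
    /* Had ARGVALS been declared inside this block, f.args would
       dangle as soon as the block ended.  */
    for (size_t i = 0; i < nargs; i++)
      argvals[i] = (int) (i + 1) * 10;
    f.args = argvals;
    f.nargs = nargs;
  }

  /* The hook runs after the block; ARGVALS is still alive here.  */
  return first_arg_seen_by_debugger (&f);
}
```

Running the hook inside the block, before the array goes out of scope,
would be an alternative with the same effect.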

While I strongly believe it would be best to focus our energies on
fixing this bug rather than reproducing it (which is going to be
unreliable as it relies on the contents of the C stack being modified
at the right time), here is the buggy debugger routine and the file
emacs-bug-002.el that triggered the bug in the gdb log I've attached:

---- debugger (not doing anything useful, crippled version that
reproduces the bug)

(defun org-elisp-debugger (&rest args)
  ;; `org-elisp-frames' is not defined in this excerpt.
  (message "args %S %S" args (backtrace-frame 1 #'org-elisp-debugger))
  (if (eq (car args) 'error)
      (apply #'debug args)
    (let ((count 0))
      ;; Count the frames below this debugger.
      (while (and (eq (car args) 'exit)
                  (backtrace-frame count #'org-elisp-debugger))
        (setq count (1+ count)))
      ;;(message "%S frames on stack, type %S" count (car args))
      ;(when (eq (car args) 'exit)
      ;; Compare recorded frames against the live ones and report mismatches.
      (dotimes (delta0 10)
        (dotimes (delta1 10)
          (let ((a (nth (- count delta0) org-elisp-frames))
                (b (backtrace-frame (+ 0 delta1) #'org-elisp-debugger)))
            (when (and (not (equal (car a) (car b)))
                       (equal (cadr a) (cadr b))
                       (equal (length a) (length b)))
              (message "Backtrace:")
              (backtrace)
              (garbage-collect)
              (message "eval %S %S %S %S -> %S" delta0 delta1 (car args) a b)
              ;;(sleep-for 1.0)
              ))))
      (dotimes (i count)
        ;; Set the debug-on-exit flag on a range of frames.
        (if (and (> i 6) (< i (- count 93)))
            (backtrace-debug i t))
        (if (and (eq (car args) 'exit) (> i 0))
            (setf (nth (- count i) org-elisp-frames)
                  (list i count
                        (length (backtrace-frame i #'org-elisp-debugger))
                        (backtrace-frame i #'org-elisp-debugger))))))
    (prog1
        (if (eq (car args) 'exit)
            (cadr args)
          t))))

----- emacs-bug-002.el

(add-to-list 'load-path "/home/pip/git/org-mode/lisp")
(require 'org)
(find-file "/home/pip/git/org-mode/lisp/org.el")
(eval-buffer)
(find-file "/home/pip/git/org-mode/lisp/org-colview.el")
(eval-buffer)
(find-file "/home/pip/emacs-bug-002.org")
(org-columns)

-----

Please contact me if it is absolutely necessary for you to reproduce
the bug, so I can work more to isolate the test case from my local
code or find a way of sharing the code with interested parties.
However, again, this bug is going to be hard to reproduce as it relies
on what intervening C code does with the stack, so it's possible a
test case would only work with my local compiler/library/org-mode
setup.

I've attached the patch for the case that I've actually seen, but
would like to repeat that I strongly suspect the other two cases to be
problematic as well.

Thanks!
[emacs-bug-005.diff (text/plain, attachment)]
[emacs-bug-info-005.txt (text/plain, attachment)]
[emacs-bug-002.el (text/x-emacs-lisp, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Tue, 17 Nov 2015 07:18:02 GMT)

Notification sent to Pip Cet <pipcet <at> gmail.com>:
bug acknowledged by developer. (Tue, 17 Nov 2015 07:18:03 GMT)

Message #10 received at 21245-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21245-done <at> debbugs.gnu.org
Subject: Re: 25.0.50; [PATCH] SIGSEGV when misusing (backtrace-frame) from custom debugger
Date: Mon, 16 Nov 2015 23:16:46 -0800
[Message part 1 (text/plain, inline)]
Thanks for the bug report in <http://bugs.gnu.org/21245>. Good eye, noticing 
those dangling pointers -- these could cause problems in obscure circumstances 
even without custom debuggers.

I installed the attached patch into the emacs-25 branch, and (if I understand 
things correctly) it should address the issues raised in the bug report so I'll 
close the bug report for now; if I'm wrong and this patch doesn't fix the bugs 
we can always reopen it.
[0001-eval_sub-followed-dangling-pointer-when-debugging.txt (text/plain, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 15 Dec 2015 12:24:04 GMT)

This bug report was last modified 8 years and 144 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.