NOTE: This patch has been forward-ported to RHEL-7.2.  It is originally
from RHEL-6.7.

Message-ID: <54E37CE7.50703@redhat.com>
Date: Tue, 17 Feb 2015 17:39:51 +0000
From: Pedro Alves
To: Sergio Durigan Junior
Subject: [debug-list] [PATCH] RH BZ #1162264 - gdb/linux-nat.c:1411: internal-error:, linux_nat_post_attach_wait: Assertion `pid == new_pid' failed.

Hi.

Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1162264

So I spent a few more hours today trying to reproduce the EACCES, to
no avail.  Also, unfortunately, none of the attach bugs exposed by
the attach-many-short-lived-threads.exp test can explain this.

It seems to me that really the best we can do is cope with the error,
like in the patch below.

Note that the backtrace at
https://bugzilla.redhat.com/show_bug.cgi?id=1162264#c3 shows that this
triggers for the main thread already:

 ...
 #6  0x000000000044fd2e in linux_nat_post_attach_wait (ptid=..., first=1, cloned=0x1d84368,
 ...

(note "first=1").

For upstream, I think linux_nat_attach should be adjusted to work like
gdbserver -- that is, leave the initial waitpid to the main wait code,
like all other events, instead of synchronously doing waitpid(PID).
That'll get rid of linux_nat_post_attach_wait altogether.  But that's
too invasive for a bug fix.

>From 072c61aeb9adc64e1eb45c120061b85fbf6f4d25 Mon Sep 17 00:00:00 2001
From: Pedro Alves
Date: Tue, 17 Feb 2015 17:11:05 +0000
Subject: [PATCH] RH BZ #1162264 - gdb/linux-nat.c:1411: internal-error: linux_nat_post_attach_wait: Assertion `pid == new_pid' failed.

According to BZ #1162264, it can happen that we manage to attach to a
process, but then waitpid on it fails with EACCES.  That's unexpected,
and gdb hits an assertion.  But given this is an error that is out of
our control, we should handle it gracefully.

I wasn't able to reproduce the EACCES, but hacking in the error, like:

| --- a/gdb/linux-nat.c
| +++ b/gdb/linux-nat.c
| @@ -1409,7 +1409,7 @@ linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned,
|          *cloned = 1;
|      }
|
| -  if (new_pid != pid)
| +  if (new_pid != pid || 1)
|      {
|        int saved_errno = errno;
|
| @@ -1423,6 +1423,7 @@ linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned,
|        ptrace (PTRACE_DETACH, pid, 0, 0);
|
|        errno = saved_errno;
| +      errno = EACCES;
|        perror_with_name (_("waitpid"));
|      }

... I could confirm that the error handling works properly.  On the
EACCES case, we get:

 (gdb) attach 1202
 Attaching to process 1202
 Unable to attach: waitpid: Permission denied.
 (gdb) info inferiors
   Num  Description       Executable
 * 1
 (gdb)

No test because the conditions that lead to the waitpid error are
unknown.

gdb/ChangeLog:
2015-02-17  Pedro Alves

        * linux-nat.c: Include "exceptions.h".
        (linux_nat_post_attach_wait): If waitpid returns an unexpected
        result, detach and error out instead of asserting.
        (linux_nat_attach): Wrap linux_nat_post_attach_wait in TRY_CATCH.
        Mourn inferior and rethrow in case of error while waiting for the
        initial stop.
---
 gdb/linux-nat.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

Index: gdb-7.6.1/gdb/linux-nat.c
===================================================================
--- gdb-7.6.1.orig/gdb/linux-nat.c
+++ gdb-7.6.1/gdb/linux-nat.c
@@ -1397,7 +1397,22 @@ linux_nat_post_attach_wait (ptid_t ptid,
         *cloned = 1;
     }
 
-  gdb_assert (pid == new_pid);
+  if (new_pid != pid)
+    {
+      int saved_errno = errno;
+
+      /* Unexpected waitpid result.  EACCES has been observed on RHEL
+         6.5 (RH BZ #1162264).  This is most likely a kernel bug, thus
+         out of our control, so treat it as invalid input.  The LWP's
+         state is indeterminate at this point, so best we can do is
+         error out, otherwise we'd probably end up wedged later on.
+
+         In case we're still attached.  */
+      ptrace (PTRACE_DETACH, pid, 0, 0);
+
+      errno = saved_errno;
+      perror_with_name (_("waitpid"));
+    }
 
   if (!WIFSTOPPED (status))
     {
@@ -1621,7 +1636,7 @@ static void
 linux_nat_attach (struct target_ops *ops, char *args, int from_tty)
 {
   struct lwp_info *lp;
-  int status;
+  int status = 0;
   ptid_t ptid;
   volatile struct gdb_exception ex;
 
@@ -1659,8 +1674,19 @@ linux_nat_attach (struct target_ops *ops
   /* Add the initial process as the first LWP to the list.  */
   lp = add_initial_lwp (ptid);
 
-  status = linux_nat_post_attach_wait (lp->ptid, 1, &lp->cloned,
-                                       &lp->signalled);
+  TRY_CATCH (ex, RETURN_MASK_ERROR)
+    {
+      status = linux_nat_post_attach_wait (lp->ptid, 1, &lp->cloned,
+                                           &lp->signalled);
+    }
+  if (ex.reason < 0)
+    {
+      target_terminal_ours ();
+      target_mourn_inferior ();
+
+      error (_("Unable to attach: %s"), ex.message);
+    }
+
   if (!WIFSTOPPED (status))
     {
       if (WIFEXITED (status))