NOTE: This patch has been forward-ported to RHEL-7.2. It is originally |
|
from RHEL-6.7. |
|
|
|
Message-ID: <54E37CE7.50703@redhat.com> |
|
Date: Tue, 17 Feb 2015 17:39:51 +0000 |
|
From: Pedro Alves <palves@redhat.com> |
|
To: Sergio Durigan Junior <sergiodj@redhat.com> |
|
Subject: [debug-list] [PATCH] RH BZ #1162264 - gdb/linux-nat.c:1411: |
|
internal-error: |
|
linux_nat_post_attach_wait: Assertion `pid == new_pid' failed. |
|
|
|
Hi. |
|
|
|
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1162264 |
|
|
|
So I spent a few more hours today trying to reproduce the |
|
EACCES, to no avail. Also, unfortunately, none of the attach |
|
bugs exposed by attach-many-short-lived-threads.exp test |
|
can explain this. |
|
|
|
It seems to me that really the best we can do is cope with |
|
the error, like in the patch below. |
|
|
|
Note that the backtrace at |
|
|
|
https://bugzilla.redhat.com/show_bug.cgi?id=1162264#c3 : |
|
|
|
shows that this triggers for the main thread already: |
|
|
|
... |
|
#6 0x000000000044fd2e in linux_nat_post_attach_wait (ptid=..., first=1, cloned=0x1d84368, |
|
... |
|
|
|
(note "first=1"). |
|
|
|
For upstream, I think linux_nat_attach should be adjusted to work |
|
like gdbserver -- that is, leave the initial waitpid to the main |
|
wait code, like all other events, instead of synchronously |
|
doing waitpid(PID). That'll get rid of linux_nat_post_attach_wait |
|
altogether. But that's too invasive for a bug fix. |
|
|
|
>From 072c61aeb9adc64e1eb45c120061b85fbf6f4d25 Mon Sep 17 00:00:00 2001 |
|
From: Pedro Alves <palves@redhat.com> |
|
Date: Tue, 17 Feb 2015 17:11:05 +0000 |
|
Subject: [PATCH] RH BZ #1162264 - gdb/linux-nat.c:1411: internal-error: |
|
linux_nat_post_attach_wait: Assertion `pid == new_pid' failed. |
|
|
|
According to BZ #1162264, it can happen that we manage to attach to a |
|
process, but then waitpid on it fails with EACCES. That's unexpected, |
|
and gdb hits an assertion. But given this is an error that is out of |
|
our control, we should handle it gracefully. I wasn't able to |
|
reproduce the EACCES, but hacking in the error, like: |
|
|
|
| --- a/gdb/linux-nat.c |
|
| +++ b/gdb/linux-nat.c |
|
| @@ -1409,7 +1409,7 @@ linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned, |
|
| *cloned = 1; |
|
| } |
|
| |
|
| - if (new_pid != pid) |
|
| + if (new_pid != pid || 1) |
|
| { |
|
| int saved_errno = errno; |
|
| |
|
| @@ -1423,6 +1423,7 @@ linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned, |
|
| ptrace (PTRACE_DETACH, pid, 0, 0); |
|
| |
|
| errno = saved_errno; |
|
| + errno = EACCES; |
|
| perror_with_name (_("waitpid")); |
|
| } |
|
|
|
... I could confirm that the error handling works properly. On the |
|
EACCES case, we get: |
|
|
|
(gdb) attach 1202 |
|
Attaching to process 1202 |
|
Unable to attach: waitpid: Permission denied. |
|
(gdb) info inferiors |
|
Num Description Executable |
|
* 1 <null> |
|
(gdb) |
|
|
|
No test because the conditions that lead to the waitpid error are |
|
unknown. |
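
For illustration only, the error path described above (waitpid returns something other than the expected PID, errno is saved, cleanup runs, and the original error is reported) can be sketched as a standalone helper. This is not GDB code: wait_or_errno is a hypothetical name, and close (-1) is only an assumed stand-in for a cleanup call such as ptrace (PTRACE_DETACH, ...) that may overwrite errno.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of GDB.  Wait for PID and return 0 on
   success, or the errno value left by waitpid on an unexpected result.  */
static int
wait_or_errno (pid_t pid, int *status)
{
  pid_t new_pid;

  /* Retry if the call is interrupted by a signal.  */
  do
    new_pid = waitpid (pid, status, 0);
  while (new_pid == -1 && errno == EINTR);

  if (new_pid != pid)
    {
      /* Save errno first: the cleanup call below may overwrite it.  */
      int saved_errno = errno;

      /* Stand-in for cleanup such as ptrace (PTRACE_DETACH, pid, 0, 0).
         close (-1) fails and sets errno to EBADF.  */
      close (-1);

      /* Restore the waitpid error so the caller reports the right one.  */
      errno = saved_errno;
      return saved_errno;
    }

  return 0;
}
```

Calling wait_or_errno (getpid (), &status) fails with ECHILD, because a process is not its own child, and errno still holds ECHILD afterwards even though the cleanup call set EBADF in between. This mirrors why the patch saves and restores errno around the detach before calling perror_with_name.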
|
|
|
gdb/ChangeLog: |
|
2015-02-17 Pedro Alves <palves@redhat.com> |
|
|
|
* linux-nat.c: Include "exceptions.h". |
|
	(linux_nat_post_attach_wait): If waitpid returns an unexpected |
|
result, detach and error out instead of asserting. |
|
(linux_nat_attach): Wrap linux_nat_post_attach_wait in TRY_CATCH. |
|
Mourn inferior and rethrow in case of error while waiting for the |
|
initial stop. |
|
--- |
|
gdb/linux-nat.c | 34 +++++++++++++++++++++++++++++++--- |
|
1 file changed, 31 insertions(+), 3 deletions(-) |
|
|
|
Index: gdb-7.6.1/gdb/linux-nat.c |
|
=================================================================== |
|
--- gdb-7.6.1.orig/gdb/linux-nat.c |
|
+++ gdb-7.6.1/gdb/linux-nat.c |
|
@@ -1397,7 +1397,22 @@ linux_nat_post_attach_wait (ptid_t ptid, |
|
*cloned = 1; |
|
} |
|
|
|
- gdb_assert (pid == new_pid); |
|
+ if (new_pid != pid) |
|
+ { |
|
+ int saved_errno = errno; |
|
+ |
|
+ /* Unexpected waitpid result. EACCES has been observed on RHEL |
|
+ 6.5 (RH BZ #1162264). This is most likely a kernel bug, thus |
|
+ out of our control, so treat it as invalid input. The LWP's |
|
+ state is indeterminate at this point, so best we can do is |
|
+ error out, otherwise we'd probably end up wedged later on. |
|
+ |
|
+ In case we're still attached. */ |
|
+ ptrace (PTRACE_DETACH, pid, 0, 0); |
|
+ |
|
+ errno = saved_errno; |
|
+ perror_with_name (_("waitpid")); |
|
+ } |
|
|
|
if (!WIFSTOPPED (status)) |
|
{ |
|
@@ -1621,7 +1636,7 @@ static void |
|
linux_nat_attach (struct target_ops *ops, char *args, int from_tty) |
|
{ |
|
struct lwp_info *lp; |
|
- int status; |
|
+ int status = 0; |
|
ptid_t ptid; |
|
volatile struct gdb_exception ex; |
|
|
|
@@ -1659,8 +1674,19 @@ linux_nat_attach (struct target_ops *ops |
|
/* Add the initial process as the first LWP to the list. */ |
|
lp = add_initial_lwp (ptid); |
|
|
|
- status = linux_nat_post_attach_wait (lp->ptid, 1, &lp->cloned, |
|
- &lp->signalled); |
|
+ TRY_CATCH (ex, RETURN_MASK_ERROR) |
|
+ { |
|
+ status = linux_nat_post_attach_wait (lp->ptid, 1, &lp->cloned, |
|
+ &lp->signalled); |
|
+ } |
|
+ if (ex.reason < 0) |
|
+ { |
|
+ target_terminal_ours (); |
|
+ target_mourn_inferior (); |
|
+ |
|
+ error (_("Unable to attach: %s"), ex.message); |
|
+ } |
|
+ |
|
if (!WIFSTOPPED (status)) |
|
{ |
|
if (WIFEXITED (status))
|
|
|