Re: [PATCH 1/5] ptrace: Prepare to fix racy accesses on taskbreakpoints

From: Frederic Weisbecker
Date: Tue Apr 12 2011 - 13:54:52 EST


On Mon, Apr 11, 2011 at 11:47:57AM +0100, Will Deacon wrote:
> Hi Frederic,
>
> On Fri, 2011-04-08 at 18:34 +0100, Frederic Weisbecker wrote:
> > When a task is traced and is in a stopped state, the tracer
> > may execute a ptrace request to examine the tracee state and
> > get its task struct. Right after, the tracee can be killed
> > and thus its breakpoints released.
> > This can happen concurrently when the tracer is in the middle
> > of reading or modifying these breakpoints, leading to dereferencing
> > a freed pointer.
>
> Oo, that's nasty. Would an alternative solution be to free the
> breakpoints only when the task_struct usage count is zero?

Yeah my solution may look a bit gross. But the problem is
that perf events hold a ref on their task context. Thus the
task_struct usage will never be 0 until you release all the
perf events attached to it.

Normal perf events are released in two ways in the exit path:

- explicitly if they are inherited
- from the file release path if they are a parent

Now breakpoints are a bit specific because neither are they inherited,
nor do they have a file associated.

So we need to release them explicitly to release the task. And after that
we also need to ensure nobody will play with the breakpoints, otherwise there
will be a memory leak because those will never be freed.

So that patch protects against concurrent release of the breakpoints and
also against the possible memory leak.

May be we can think about a solution that involves not taking a ref
to the task when we allocate breakpoints, and then finally release
from the task_struct rcu release. But that may involve many corner
cases. Perhaps we can think about this later and for now opt for the
current solution that looks safe and without surprise. This fix needs
to be backported so it should stay KISS I think.


> > diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> > index 0fc1eed..dc7ab65 100644
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -22,6 +22,7 @@
> > #include <linux/syscalls.h>
> > #include <linux/uaccess.h>
> > #include <linux/regset.h>
> > +#include <linux/hw_breakpoint.h>
> >
> >
> > /*
> > @@ -879,3 +880,19 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid,
> > return ret;
> > }
> > #endif /* CONFIG_COMPAT */
> > +
> > +#ifdef CONFIG_HAVE_HW_BREAKPOINT
> > +int ptrace_get_breakpoints(struct task_struct *tsk)
> > +{
> > + if (atomic_inc_not_zero(&tsk->ptrace_bp_refcnt))
> > + return 0;
> > +
> > + return -1;
> > +}
>
>
> Would it be better to return -ESRCH here instead?

So I'm going to be nitpicking there :)
The ptrace_get_breakpoints() function tells us whether
we can take a ref on the breakpoints or not.

Returning -ERSCH there would mean that the task struct doesn't exist,
or something confusing like this. Which is not true: the task exists.

OTOH, the caller, which is ptrace, needs to take a decision when he
can't take a reference to the breakpoints. The behaviour is
to act as if the process does not exist anymore, which is about to
happen for real but we anticipate because the task has reached a
state in its exiting path where we can't manipulate the breakpoints
anymore.

So the rationale behind it is that -ERSCH is an interpretation
of the caller.

Right?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/