Re: hit a KASan bug related to Perf during stress test

From: Peter Zijlstra
Date: Mon Oct 24 2016 - 10:36:55 EST


On Mon, Oct 24, 2016 at 03:25:55PM +0200, Oleg Nesterov wrote:
> Well, if we add that PIDTYPE_TGID hack, I think we can do something
> like below...
>
> Or do you think we should add a pid_alive() check into perf_event_pid()
> for a quick fix?

That is what I was thinking. Then we don't need to do the TGID hack,
which I suspect some people might object to.

> Either way it's a pity we can't report at least the valid tid, perhaps
> perf_event_tid() could use task_pid_nr() if event->ns == init_pid_ns,
> I dunno.

Right, but after unhash is there really still the notion of a valid TID?
I mean, the TID can be reused, at which point you'll end up with two
tasks reporting the same TID, etc.

But yes, very tedious.

I was thinking something like so?

---

kernel/events/core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6e47e97b33f..2c9a22485e9e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
 	if (event->parent)
 		event = event->parent;

-	return task_tgid_nr_ns(p, event->ns);
+	/*
+	 * It is possible the task already got unhashed, in which case we
+	 * cannot determine the current->group_leader/real_parent.
+	 *
+	 * Also, report -1 to indicate unhashed, so as not to be confused with
+	 * 0 for the idle task.
+	 */
+	return pid_alive(p) ? task_tgid_nr_ns(p, event->ns) : ~0;
 }

static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
@@ -1268,7 +1275,7 @@ static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
 	if (event->parent)
 		event = event->parent;

-	return task_pid_nr_ns(p, event->ns);
+	return pid_alive(p) ? task_pid_nr_ns(p, event->ns) : ~0;
 }

/*
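
For anyone consuming these values, here is a rough userspace sketch of what
the ~0 fallback amounts to once it goes through the u32 return type. It is
not kernel code: struct mock_task, MOCK_PID_UNHASHED and the mock_* helpers
are made-up names for illustration, and the alive flag merely stands in for
whatever pid_alive() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct task_struct. */
struct mock_task {
	bool alive;     /* models what pid_alive(p) would report */
	uint32_t tgid;  /* models what task_tgid_nr_ns() would report */
};

/* Hypothetical name for the sentinel; the patch itself just writes ~0. */
#define MOCK_PID_UNHASHED UINT32_MAX

/* Mirrors the shape of the patched perf_event_pid(). */
static uint32_t mock_perf_event_pid(const struct mock_task *p)
{
	/* ~0 is the int -1; converted to a 32-bit unsigned it is 0xffffffff. */
	return p->alive ? p->tgid : ~0;
}

/* Consumer-side check: neither a real pid nor the idle task's 0. */
static bool mock_pid_is_unhashed(uint32_t pid)
{
	return pid == MOCK_PID_UNHASHED;
}

/* Small self-check covering the two cases the comment distinguishes. */
static void mock_selfcheck(void)
{
	struct mock_task idle = { .alive = true, .tgid = 0 };
	struct mock_task dead = { .alive = false, .tgid = 1234 };

	assert(mock_perf_event_pid(&idle) == 0);
	assert(mock_perf_event_pid(&dead) == 0xffffffffu);
	assert(!mock_pid_is_unhashed(mock_perf_event_pid(&idle)));
	assert(mock_pid_is_unhashed(mock_perf_event_pid(&dead)));
}
```

The point is only that a reader of the sample can tell "unhashed" (all ones)
apart from the idle task (0); a stale tgid on the dead task is never reported.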