Re: [PATCH] avoid race condition in pick_next_task_fair in kernel/sched_fair.c

From: Peter Zijlstra
Date: Thu Dec 23 2010 - 07:12:28 EST


On Thu, 2010-12-23 at 10:08 +0800, Yong Zhang wrote:
> > systemd--1251 0d..5. 2015398us : enqueue_task_fair <-enqueue_task
> > systemd--1251 0d..5. 2015398us : print_runqueue <-enqueue_task_fair
> > systemd--1251 0d..5. 2015399us : __print_runqueue: cfs_rq: c2407c34, nr: 3, load: 3072
> > systemd--1251 0d..5. 2015400us : __print_runqueue: curr: f6a8de5c, comm: systemd-cgroups/1251, load: 1024
> > systemd--1251 0d..5. 2015401us : __print_runqueue: se: f69e6300, load: 1024,
> > systemd--1251 0d..5. 2015401us : __print_runqueue: cfs_rq: f69e6540, nr: 2, load: 2048
> > systemd--1251 0d..5. 2015402us : __print_runqueue: curr: (null)
> > systemd--1251 0d..5. 2015402us : __print_runqueue: se: f69e65a0, load: 4137574976,
>
> the load == f69e65a0 == address of se, odd

This appears to be consistently true. I've also found that, in between
these two prints, there is a free_sched_group() freeing that exact
entry, so the post-print is a use-after-free artifact.

What's interesting is that it's freeing a cfs_rq struct with
nr_running=1, that should not be possible...

/me goes stare at the whole cgroup task attach vs cgroup destruction
muck.