Re: [PATCH] avoid race condition in pick_next_task_fair in kernel/sched_fair.c

From: Miklos Vajna
Date: Sat Dec 18 2010 - 21:21:32 EST


On Tue, Jun 29, 2010 at 12:43:35PM +0200, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, 2010-06-29 at 15:10 +0800, shenghui wrote:
> > I think some lock on the metadata can fix this issue, but we may
> > change plenty of code to add support for lock. I think the easiest
> > way is just subtracting nr_running before dequeuing tasks.
>
> But all that is fully serialized by the rq->lock.. so I'm really not
> seeing how this can happen.

Hi,

Here is a panic I got today:

http://frugalware.org/~vmiklos/pics/bug/2.6.37-rc6.png

More details:

I get this sometimes on boot or shutdown when testing systemd. I did not
get it with sysvinit, so I guess it may be related to systemd's heavy
cgroups usage, but I'm not sure. Sadly it isn't 100% reproducible but I
usually hit it at least once a day.

The config is here:
http://frugalware.org/~vmiklos/logs/2.6.37-rc6.config (I just did a
yes "" | make config to update it to 2.6.37-rc6.)

I got something similar with 2.6.36.1 as well:

http://frugalware.org/~vmiklos/pics/bug/2.6.36.1.png

Ah, and this is on i686 in VMware - though given that I never had this
problem with sysvinit, I guess it won't be an emulator bug. :)

I'm not familiar with the sched code; is it possible that this is
related?

Thanks,

Miklos
