Re: Is it a workqueue related issue in 2.6.37 (Was: Re: [libvirt]blkio cgroup [solved])

From: Tejun Heo
Date: Thu Feb 24 2011 - 09:31:16 EST


On Thu, Feb 24, 2011 at 09:23:03AM -0500, Vivek Goyal wrote:
> On Thu, Feb 24, 2011 at 10:18:00AM +0100, Dominik Klein wrote:
> Hi Dominik,
>
> Thanks for the tests and reports. I checked the latest logs also and
> I see that cfq has scheduled a work but that work never gets scheduled.
> I never see the trace message which says cfq_kick_queue().
>
> I am ccing it to lkml and tejun to see if he has any suggestions.
>
> Tejun,
>
> I will give you some details about what we have discussed so far.
>
> Dominik is trying blkio throttling feature and trying to throttle some
> virtual machines. He is using 2.6.37 kernels and once he launches 3
> virtual machines he notices that system is kind of frozen. After running
> some traces we noticed that CFQ has requests but it is not dispatching
> these to devices any more.
>
> This problem does not show up with deadline scheduler and also goes away
> with 2.6.38-rc6 kernels.

Hmmm... Maybe the following commit?

commit 7576958a9d5a4a677ad7dd40901cdbb6c1110c98
Author: Tejun Heo <tj@xxxxxxxxxx>
Date: Mon Feb 14 14:04:46 2011 +0100

    workqueue: wake up a worker when a rescuer is leaving a gcwq

    After executing the matching works, a rescuer leaves the gcwq
    whether there are more pending works or not. This may decrease
    the concurrency level to zero and stall execution until a new work
    item is queued on the gcwq.

    Make rescuer wake up a regular worker when it leaves a gcwq if
    there are more works to execute, so that execution isn't stalled.

    Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
    Reported-by: Ray Jui <rjui@xxxxxxxxxxxx>
    Cc: stable@xxxxxxxxxx
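
To illustrate the stall described in the commit message, here is a rough
userspace model in C. It is only a sketch and not the kernel code: the
struct, its fields, and every toy_* function are invented for this example,
and real details such as locking, per-CPU state, and worker creation are
left out. It assumes a single idle regular worker that runs only once
something wakes it up.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a gcwq (hypothetical; only what this sketch needs). */
struct toy_gcwq {
    int pending;     /* work items still queued */
    int nr_running;  /* awake workers, i.e. the concurrency level */
    int nr_idle;     /* regular workers asleep, waiting for a wakeup */
};

/* True while there is queued work left to execute. */
static bool toy_keep_working(const struct toy_gcwq *g)
{
    return g->pending > 0;
}

/* Move one idle regular worker to the running state, if any exists. */
static void toy_wake_up_worker(struct toy_gcwq *g)
{
    if (g->nr_idle > 0) {
        g->nr_idle--;
        g->nr_running++;
    }
}

/* Awake workers drain the queue, then go back to sleep. */
static void toy_run_workers(struct toy_gcwq *g)
{
    while (g->nr_running > 0 && g->pending > 0)
        g->pending--;
    g->nr_idle += g->nr_running;
    g->nr_running = 0;
}

/*
 * The rescuer executes only its `matching` works and then leaves.
 * With wake_on_leave set, it wakes a regular worker if work remains,
 * which is the behavior the commit message describes.
 */
static void toy_rescuer_visit(struct toy_gcwq *g, int matching,
                              bool wake_on_leave)
{
    assert(matching >= 0 && matching <= g->pending);
    g->pending -= matching;
    if (wake_on_leave && toy_keep_working(g))
        toy_wake_up_worker(g);
}

/* Returns how many works are left stalled after the rescuer has gone. */
static int toy_stalled_after_rescue(int pending, int matching,
                                    bool wake_on_leave)
{
    struct toy_gcwq g = { .pending = pending, .nr_running = 0, .nr_idle = 1 };

    toy_rescuer_visit(&g, matching, wake_on_leave);
    toy_run_workers(&g);
    return g.pending;
}
```

In this model, toy_stalled_after_rescue(5, 2, false) leaves 3 works stuck
with nobody awake to run them, while toy_stalled_after_rescue(5, 2, true)
leaves 0 because the departing rescuer wakes the idle worker.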

