Re: regression 4.4: deadlock in with cgroup percpu_rwsem

From: Heiko Carstens
Date: Wed Jan 20 2016 - 02:08:08 EST


On Tue, Jan 19, 2016 at 02:38:45PM -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Jan 19, 2016 at 08:36:18PM +0100, Christian Borntraeger wrote:
> > No, it's not a task_struct. Activating some more debug information did indeed
> > reveal several other issues (overwritten redzones etc). Unfortunately I
> > only saw the broken things after the fact, so I do not know which code did that.
> > When I disabled the cgroup controllers in libvirt I was no longer able to trigger
> > the bugs. Still trying to narrow things down.
>
> Hmmm... that's worrying. CONFIG_DEBUG_PAGEALLOC sometimes can catch
> these sorts of bugs red-handed. Might be worth trying.

Christian, just so you don't get surprised like I did: building with
CONFIG_DEBUG_PAGEALLOC is no longer enough on its own. It now also
needs the kernel parameter "debug_pagealloc=on" to become active.

That change was introduced a year ago, so it was probably only me who
wasn't aware of that change :)
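
In case it helps, here is a minimal sketch for checking whether the running
kernel was actually booted with that parameter. It only looks for the literal
token "debug_pagealloc=on" on the kernel command line. The helper names
(debug_pagealloc_requested, read_cmdline) are made up for illustration, and
reading /proc/cmdline assumes a Linux system with procfs mounted. It does not
check whether CONFIG_DEBUG_PAGEALLOC was compiled in.

```python
def debug_pagealloc_requested(cmdline: str) -> bool:
    """Return True if the exact token debug_pagealloc=on is on the command line.

    Hypothetical helper: it only inspects the boot parameters and says
    nothing about whether CONFIG_DEBUG_PAGEALLOC is built into the kernel.
    """
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # instead of doing a substring search.
    return "debug_pagealloc=on" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (assumes procfs)."""
    with open(path) as f:
        return f.read()
```

On the test machine, debug_pagealloc_requested(read_cmdline()) should return
True once the parameter has been added to the boot loader configuration and
the system has been rebooted.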